dialogy.plugins.text.duckling_plugin package¶
Module contents¶
We use Duckling to parse values such as dates, times, numbers, and currency from natural language. The parser expects Duckling to be running as an HTTP service and provides the means to connect to it from the implementation here. Here’s a big example covering various Duckling entities.
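To make the "HTTP service" part concrete, here is a minimal sketch of how a request to Duckling's parse endpoint could be assembled with the standard library only. The helper name build_duckling_request is hypothetical, and the form field names (text, locale, tz, dims, reftime) are an assumption based on Duckling's example server, not something confirmed by this page. The request is only built, never sent.

```python
import json
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

# Default address taken from the DucklingPlugin signature on this page.
DUCKLING_URL = "http://0.0.0.0:8000/parse"


def build_duckling_request(text, dimensions, locale="en_IN", tz="UTC",
                           reftime_ms=None, url=DUCKLING_URL):
    """Hypothetical helper: build (but do not send) a form-encoded POST."""
    fields = {
        "text": text,
        "locale": locale,
        "tz": tz,
        # Assumed field: dimensions are sent as a JSON-encoded list.
        "dims": json.dumps(dimensions),
    }
    if reftime_ms is not None:
        # Assumed field: reference time in milliseconds since the epoch.
        fields["reftime"] = str(reftime_ms)
    body = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"}
    return Request(url, data=body, headers=headers, method="POST")


request = build_duckling_request("can I pay $5 instead?", ["amount-of-money"])
decoded = parse_qs(request.data.decode("utf-8"))
print(request.get_method(), request.full_url)
print(decoded["text"], decoded["dims"])
```

Sending it would be a matter of passing the request to urllib.request.urlopen with a timeout, which needs a running Duckling instance.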
Mother of all examples¶
In [1]: from dialogy.workflow import Workflow
...: from dialogy.base import Input
...: from dialogy.plugins import DucklingPlugin
...:
In [2]: duckling_plugin = DucklingPlugin(
...: dest="output.entities",
...: dimensions=[
...: "number",
...: "people",
...: "time",
...: "duration",
...: "intersect",
...: "amount-of-money",
...: "credit-card-number",
...: ],
...: # Duckling supports multiple dimensions, by specifying a list, we make sure
...: # we are searching for a match within the expected dimensions only.
...: )
...:
In [3]: workflow = Workflow([duckling_plugin])
In [4]: utterances = [
...: "between today 7pm and tomorrow 9pm",
...: "can we get up at 6 am",
...: "we are 7 people",
...: "2 children 1 man 3 girls",
...: "can I come now?",
...: "can I come tomorrow",
...: "how about monday then",
...: "call me on 5th march",
...: "I want 4 pizzas",
...: "2 hours",
...: "can I pay $5 instead?",
...: "my credit card number is 4111111111111111",
...: ]
...:
In [5]: %%time
...: input_, output = workflow.run(Input(utterances=utterances))
...:
CPU times: user 17.4 ms, sys: 2.84 ms, total: 20.3 ms
Wall time: 21.9 ms
In [6]: output
Out[6]:
{'intents': [],
'entities': [{'range': {'start': 0, 'end': 34},
'body': 'between today 7pm and tomorrow 9pm',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 0,
'entity_type': 'datetime',
'grain': 'hour',
'type': 'interval',
'from_value': '2022-08-24T19:00:00.000+00:00',
'to_value': '2022-08-25T22:00:00.000+00:00',
'value': {'from': '2022-08-24T19:00:00.000+00:00',
'to': '2022-08-25T22:00:00.000+00:00'}},
{'range': {'start': 14, 'end': 21},
'body': 'at 6 am',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 1,
'value': '2022-08-25T06:00:00.000+00:00',
'entity_type': 'time',
'grain': 'hour'},
{'range': {'start': 7, 'end': 15},
'body': '7 people',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 2,
'value': 7,
'unit': 'person',
'entity_type': 'people'},
{'range': {'start': 0, 'end': 10},
'body': '2 children',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 3,
'value': 2,
'unit': 'child',
'entity_type': 'people'},
{'range': {'start': 11, 'end': 16},
'body': '1 man',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 3,
'value': 1,
'unit': 'male',
'entity_type': 'people'},
{'range': {'start': 17, 'end': 24},
'body': '3 girls',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 3,
'value': 3,
'unit': 'female',
'entity_type': 'people'},
{'range': {'start': 11, 'end': 14},
'body': 'now',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 4,
'value': '2022-08-24T18:28:17.690+00:00',
'entity_type': 'datetime',
'grain': 'second'},
{'range': {'start': 11, 'end': 19},
'body': 'tomorrow',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 5,
'value': '2022-08-25T00:00:00.000+00:00',
'entity_type': 'date',
'grain': 'day'},
{'range': {'start': 4, 'end': 16},
'body': 'about monday',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 6,
'value': '2022-08-29T00:00:00.000+00:00',
'entity_type': 'date',
'grain': 'day'},
{'range': {'start': 8, 'end': 20},
'body': 'on 5th march',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 7,
'value': '2023-03-05T00:00:00.000+00:00',
'entity_type': 'date',
'grain': 'day'},
{'range': {'start': 7, 'end': 8},
'body': '4',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 8,
'value': 4,
'entity_type': 'number'},
{'range': {'start': 0, 'end': 7},
'body': '2 hours',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 9,
'value': 7200,
'unit': 'hour',
'normalized': {'value': 7200, 'unit': 'second'},
'_meta': {},
'entity_type': 'duration'},
{'range': {'start': 10, 'end': 12},
'body': '$5',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 10,
'value': 5,
'unit': '$',
'entity_type': 'amount-of-money'},
{'range': {'start': 25, 'end': 41},
'body': '4111111111111111',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 11,
'entity_type': 'credit-card-number',
'issuer': 'visa',
'value': '4111111111111111'},
{'range': {'start': 25, 'end': 41},
'body': '4111111111111111',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 0.08333333333333333,
'alternative_index': 11,
'value': 4111111111111111,
'entity_type': 'number'}],
'original_intent': {}}
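In the output above, each entity's alternative_index matches the position of its utterance in the input list. A small sketch of using that to group entities back under their utterances follows. The helper group_by_utterance is hypothetical and works on plain dicts trimmed from the output above, so it needs neither dialogy nor Duckling.

```python
from collections import defaultdict


def group_by_utterance(utterances, entities):
    """Hypothetical helper: map each utterance to the entities found in it."""
    grouped = defaultdict(list)
    for entity in entities:
        utterance = utterances[entity["alternative_index"]]
        grouped[utterance].append((entity["entity_type"], entity["body"]))
    return dict(grouped)


utterances = [
    "between today 7pm and tomorrow 9pm",
    "can we get up at 6 am",
    "we are 7 people",
    "2 children 1 man 3 girls",
]
# Trimmed copies of entities from the output above.
entities = [
    {"body": "7 people", "alternative_index": 2, "entity_type": "people"},
    {"body": "2 children", "alternative_index": 3, "entity_type": "people"},
    {"body": "1 man", "alternative_index": 3, "entity_type": "people"},
    {"body": "3 girls", "alternative_index": 3, "entity_type": "people"},
]

grouped = group_by_utterance(utterances, entities)
for utterance, found in grouped.items():
    print(utterance, "->", found)
```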
Testing¶
Connect to the Duckling API either via a Docker container or a local setup.
Boot an IPython session and set up an instance of DucklingPlugin.
from pprint import pprint
from dialogy.workflow import Workflow
from dialogy.base import Input
from dialogy.plugins import DucklingPlugin

duckling_plugin = DucklingPlugin(
    dest="output.entities",
    dimensions=["people", "time"],
    locale="en_IN",
    timezone="Asia/Kolkata",
)

entities = duckling_plugin.parse(["We are 2 children coming tomorrow at 5 pm"])
pprint(entities)
# [PeopleEntity(
#   body='2 children',
#   type='value',
#   parsers=['DucklingPlugin'],
#   score=1.0,
#   alternative_index=0,
#   alternative_indices=[0, 0],
#   latent=False,
#   value=2,
#   origin='value',
#   unit='child',
#   entity_type='people'),
#  TimeEntity(
#   body='tomorrow at 5 pm',
#   type='value',
#   parsers=['DucklingPlugin'],
#   score=1.0,
#   alternative_index=0,
#   alternative_indices=[0],
#   latent=False,
#   value='2022-02-11T17:00:00.000+05:30',
#   origin='value',
#   entity_type='datetime',
#   grain='hour')]

# To convert these to dicts:

entity_dicts = [entity.json() for entity in entities]
pprint(entity_dicts)
# [{'range': {'start': 7, 'end': 17},
#   'body': '2 children',
#   'type': 'value',
#   'parsers': ['DucklingPlugin'],
#   'score': 1.0,
#   'alternative_index': 0,
#   'value': 2,
#   'unit': 'child',
#   'entity_type': 'people'},
#  {'range': {'start': 38, 'end': 54},
#   'body': 'tomorrow at 5 pm',
#   'type': 'value',
#   'parsers': ['DucklingPlugin'],
#   'score': 1.0,
#   'alternative_index': 0,
#   'value': '2022-02-11T17:00:00.000+05:30',
#   'entity_type': 'datetime',
#   'grain': 'hour'
#  }]
Guards¶
Like all plugins, the Duckling plugin can be guarded by conditions that prevent its execution.
You may need this if you want to save on latency and are sure that no entities are expected in those turns. We will reuse the mother of all examples from above to make the point, using current_state to tell the plugin to guard itself in the “SMALL_TALK” state.
In [7]: from dialogy.workflow import Workflow
...: from dialogy.base import Input
...: from dialogy.plugins import DucklingPlugin
...:
In [8]: duckling_plugin = DucklingPlugin(
...: dest="output.entities",
...: dimensions=[
...: "number",
...: "people",
...: "date",
...: "time",
...: "duration",
...: "intersect",
...: "amount-of-money",
...: "credit-card-number",
...: ],
...: guards=[lambda i, o: i.current_state == "SMALL_TALK"]
...: )
...:
In [9]: workflow = Workflow([duckling_plugin])
In [10]: utterances = [
....: "between today 7pm and tomorrow 9pm",
....: "can we get up at 6 am",
....: "we are 7 people",
....: "2 children 1 man 3 girls",
....: "can I come now?",
....: "can I come tomorrow",
....: "how about monday then",
....: "call me on 5th march",
....: "I want 4 pizzas",
....: "2 hours",
....: "can I pay $5 instead?",
....: "my credit card number is 4111111111111111",
....: ]
....:
In [11]: %%time
....: input_, output = workflow.run(Input(utterances=utterances, current_state="SMALL_TALK"))
....:
CPU times: user 123 us, sys: 0 ns, total: 123 us
Wall time: 126 us
In [12]: output
Out[12]: {'intents': [], 'entities': [], 'original_intent': {}}
Notice that the time drops from milliseconds to microseconds and that no entities are produced.
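The guard mechanism can be pictured with a small standalone sketch. This is not dialogy's implementation: is_guarded, run_plugin and fake_plugin are hypothetical names, and the only thing taken from the example above is the guard's shape, a callable receiving the input and output and returning True when the plugin should be skipped.

```python
from types import SimpleNamespace


def is_guarded(guards, input_, output):
    """Return True if any guard asks for the plugin to be skipped."""
    return any(guard(input_, output) for guard in guards)


def run_plugin(plugin_fn, guards, input_, output):
    """Hypothetical runner: skip the (possibly slow) plugin when guarded."""
    if is_guarded(guards, input_, output):
        return output
    return plugin_fn(input_, output)


def fake_plugin(input_, output):
    # Stand-in for entity extraction; a real plugin would call Duckling here.
    return {**output, "entities": [{"body": "$5", "entity_type": "amount-of-money"}]}


guards = [lambda i, o: i.current_state == "SMALL_TALK"]
empty = {"intents": [], "entities": [], "original_intent": {}}

small_talk = SimpleNamespace(current_state="SMALL_TALK")
payment = SimpleNamespace(current_state="COLLECT_PAYMENT")  # hypothetical state name

skipped = run_plugin(fake_plugin, guards, small_talk, empty)
executed = run_plugin(fake_plugin, guards, payment, empty)
print(skipped["entities"], executed["entities"])
```

Because the guard is checked before any work is done, a guarded turn costs only the guard evaluation itself, which is in line with the timing drop shown above.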
Filtering Date and Time¶
There are cases where we run into ambiguity in entity resolution. A fun case usually happens with the Hindi language when someone says “कल”. The problem is that “कल” could mean “tomorrow” or “yesterday”.
Problem¶
In [13]: import pytz
....: from datetime import datetime
....: from dialogy.workflow import Workflow
....: from dialogy.base import Input
....: from dialogy.plugins import DucklingPlugin
....: from dialogy.utils import dt2timestamp
....:
In [14]: duckling_plugin = DucklingPlugin(
....: dest="output.entities",
....: dimensions=["time"],
....: )
....:
In [15]: workflow = Workflow([duckling_plugin])
In [16]: utterances = ["कल"]
In [17]: reference_dt = datetime(2022, 1, 1, 12, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
....: reference_time = dt2timestamp(reference_dt)
....:
In [18]: input_, output = workflow.run(Input(utterances=utterances, locale="hi_IN", reference_time=reference_time))
In [19]: output
Out[19]:
{'intents': [],
'entities': [{'range': {'start': 0, 'end': 2},
'body': 'कल',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 1.0,
'alternative_index': 0,
'value': '2021-12-31T00:00:00.000+00:00',
'entity_type': 'date',
'grain': 'day'},
{'range': {'start': 0, 'end': 2},
'body': 'कल',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 1.0,
'alternative_index': 0,
'value': '2022-01-02T00:00:00.000+00:00',
'entity_type': 'date',
'grain': 'day'}],
'original_intent': {}}
See the two entities? We froze our reference time at 12:00:00 on 1st Jan 2022, and we get both 31st December 2021 and 2nd Jan 2022.
This could be resolved using additional context. Let’s assume it is a given that dates can only be in the future, say in a flight-booking case.
In [20]: import pytz
....: from datetime import datetime
....: from dialogy.workflow import Workflow
....: from dialogy.base import Input
....: from dialogy.plugins import DucklingPlugin
....: from dialogy.utils import dt2timestamp
....:
In [21]: duckling_plugin = DucklingPlugin(
....: dest="output.entities",
....: dimensions=["time"],
....: datetime_filters=DucklingPlugin.FUTURE
....: )
....:
In [22]: workflow = Workflow([duckling_plugin])
In [23]: utterances = ["कल"]
In [24]: reference_dt = datetime(2022, 1, 1, 12, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
....: reference_time = dt2timestamp(reference_dt)
....:
In [25]: input_, output = workflow.run(Input(utterances=utterances, locale="hi_IN", reference_time=reference_time))
In [26]: output
Out[26]:
{'intents': [],
'entities': [{'range': {'start': 0, 'end': 2},
'body': 'कल',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 1.0,
'alternative_index': 0,
'value': '2022-01-02T00:00:00.000+00:00',
'entity_type': 'date',
'grain': 'day'}],
'original_intent': {}}
We can use DucklingPlugin.PAST for the inverse.
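The effect of the two filters can be sketched without Duckling. The function below is a hypothetical stand-in for the plugin's filtering, assuming that "future" keeps values at or after the reference time and "past" keeps values at or before it. The candidate values are the two from the output above.

```python
from datetime import datetime, timedelta, timezone

FUTURE = "future"
PAST = "past"


def select_datetime(values, reference, filter_type):
    """Hypothetical filter: keep ISO-8601 values on one side of the reference."""
    kept = []
    for value in values:
        moment = datetime.fromisoformat(value)
        if filter_type == FUTURE and moment >= reference:
            kept.append(value)
        elif filter_type == PAST and moment <= reference:
            kept.append(value)
    return kept


# Fixed +05:30 offset for Indian Standard Time.
ist = timezone(timedelta(hours=5, minutes=30))
reference = datetime(2022, 1, 1, 12, 0, 0, tzinfo=ist)
candidates = ["2021-12-31T00:00:00.000+00:00", "2022-01-02T00:00:00.000+00:00"]

future_values = select_datetime(candidates, reference, FUTURE)
past_values = select_datetime(candidates, reference, PAST)
print(future_values)
print(past_values)
```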
What if the context was within the utterance? Say for payment verification cases, we receive these utterances:
“कल ही कर दी थी” – “I have already paid yesterday”
“कल कर दूंगा” – “I will pay tomorrow”
We can’t filter out either future or past values. In these cases, the intent carries the context for resolving the entity value. See Temporal Intents for more details.
Temporal Intents¶
Filtering¶
There are cases where intents carry a notion of time, repeating an example from above:
“कल ही कर दी थी” – “I have already paid yesterday”
“कल कर दूंगा” – “I will pay tomorrow”
While we can’t decode the entity value for “कल” from the word alone in the above utterances, the correct intent predictions are, respectively:
already_paid
pay_later
If we could use this information, we could resolve the entity value for “कल”. We will simulate the output to contain the appropriate intents using the set method of the workflow.
In [27]: import pytz
....: from datetime import datetime
....: from dialogy.workflow import Workflow
....: from dialogy.base import Input
....: from dialogy.types import Intent
....: from dialogy.plugins import DucklingPlugin
....: from dialogy.utils import dt2timestamp
....:
In [28]: duckling_plugin = DucklingPlugin(
....: dest="output.entities",
....: dimensions=["time"],
....: temporal_intents={
....: "already_paid": DucklingPlugin.PAST,
....: "pay_later": DucklingPlugin.FUTURE
....: }
....: )
....:
In [29]: workflow = Workflow([duckling_plugin])
In [30]: utterances = ["कल ही कर दी थी"]
....: workflow.set("output.intents", [Intent(name="already_paid", score=1.0)])
....:
Out[30]: Workflow(plugins=[<dialogy.plugins.text.duckling_plugin.DucklingPlugin object at 0x7fbaf91780d0>], input=None, output=Output(intents=[Intent(name='already_paid', score=1.0, alternative_index=None)], entities=[], original_intent={}), debug=False)
In [31]: reference_dt = datetime(2022, 1, 1, 12, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
....: reference_time = dt2timestamp(reference_dt)
....: input_, output = workflow.run(Input(utterances=utterances, locale="hi_IN", reference_time=reference_time))
....:
In [32]: output
Out[32]:
{'intents': [{'name': 'already_paid',
'alternative_index': None,
'score': 1.0,
'parsers': [],
'slots': []}],
'entities': [{'range': {'start': 0, 'end': 2},
'body': 'कल',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 1.0,
'alternative_index': 0,
'value': '2021-12-31T00:00:00.000+00:00',
'entity_type': 'date',
'grain': 'day'}],
'original_intent': {}}
In [33]: utterances = ["कल ही कर दी थी"]
....: workflow.set("output.intents", [Intent(name="pay_later", score=1.0)])
....: input_, output = workflow.run(Input(utterances=utterances, locale="hi_IN", reference_time=reference_time))
....:
In [34]: output
Out[34]:
{'intents': [{'name': 'pay_later',
'alternative_index': None,
'score': 1.0,
'parsers': [],
'slots': []}],
'entities': [{'range': {'start': 0, 'end': 2},
'body': 'कल',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 1.0,
'alternative_index': 0,
'value': '2022-01-02T00:00:00.000+00:00',
'entity_type': 'date',
'grain': 'day'}],
'original_intent': {}}
Casting¶
We can also use temporal intents to cast certain entities. Machines can act on absolute time values but natural language often has relative units like “for 2 hours”. These produce a duration entity.
In [35]: import pytz
....: from datetime import datetime
....: from dialogy.workflow import Workflow
....: from dialogy.base import Input
....: from dialogy.plugins import DucklingPlugin
....: from dialogy.utils import dt2timestamp
....:
In [36]: duckling_plugin = DucklingPlugin(
....: dest="output.entities",
....: dimensions=["duration", "time"],
....: )
....:
In [37]: workflow = Workflow([duckling_plugin])
In [38]: utterances = ["for 2h"]
In [39]: reference_dt = datetime(2022, 1, 1, 12, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
....: reference_time = dt2timestamp(reference_dt)
....:
In [40]: input_, output = workflow.run(Input(utterances=utterances, reference_time=reference_time))
In [41]: output
Out[41]:
{'intents': [],
'entities': [{'range': {'start': 4, 'end': 6},
'body': '2h',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 1.0,
'alternative_index': 0,
'value': 7200,
'unit': 'hour',
'normalized': {'value': 7200, 'unit': 'second'},
'_meta': {},
'entity_type': 'duration'}],
'original_intent': {}}
Duckling understands the duration, and we could turn it into an absolute time value using the reference time, but we don’t know whether the duration should be added to or subtracted from it.
In [42]: import pytz
....: from datetime import datetime
....: from dialogy.workflow import Workflow
....: from dialogy.base import Input
....: from dialogy.types import Intent
....: from dialogy.plugins import DucklingPlugin
....: from dialogy.utils import dt2timestamp
....:
In [43]: duckling_plugin = DucklingPlugin(
....: dest="output.entities",
....: dimensions=["time", "duration"],
....: timezone="Asia/Kolkata",
....: temporal_intents={
....: "already_paid": DucklingPlugin.PAST,
....: "pay_later": DucklingPlugin.FUTURE
....: }
....: )
....:
In [44]: workflow = Workflow([duckling_plugin])
In [45]: reference_dt = datetime(2022, 1, 1, 12, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
....: reference_time = dt2timestamp(reference_dt)
....: utterances = ["for 2h"]
....:
In [46]: workflow.set("output.intents", [Intent(name="already_paid", score=1.0)])
....: input_, output = workflow.run(Input(utterances=utterances, reference_time=reference_time))
....:
In [47]: output
Out[47]:
{'intents': [{'name': 'already_paid',
'alternative_index': None,
'score': 1.0,
'parsers': [],
'slots': []}],
'entities': [{'range': {'start': 4, 'end': 6},
'body': '2h',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 1.0,
'alternative_index': 0,
'value': '2022-01-01T09:37:00+05:30',
'entity_type': 'datetime',
'grain': 'hour'}],
'original_intent': {}}
In [48]: workflow.set("output.intents", [Intent(name="pay_later", score=1.0)])
....: input_, output = workflow.run(Input(utterances=utterances, reference_time=reference_time))
....:
In [49]: output
Out[49]:
{'intents': [{'name': 'pay_later',
'alternative_index': None,
'score': 1.0,
'parsers': [],
'slots': []}],
'entities': [{'range': {'start': 4, 'end': 6},
'body': '2h',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 1.0,
'alternative_index': 0,
'value': '2022-01-01T13:37:00+05:30',
'entity_type': 'datetime',
'grain': 'hour'}],
'original_intent': {}}
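The arithmetic behind casting can be sketched with the standard library alone. cast_duration is a hypothetical helper, not the plugin's code: it adds the duration for a future direction and subtracts it for a past one. The second half explores why the recorded values end in :37 rather than :00. One plausible cause, which the source does not confirm, is that attaching a pytz zone through tzinfo= picks up the zone's historical local-mean-time offset (about +05:53 for Asia/Kolkata) instead of +05:30. The sketch only reproduces that arithmetic with fixed offsets.

```python
from datetime import datetime, timedelta, timezone

FUTURE = "future"
PAST = "past"

IST = timezone(timedelta(hours=5, minutes=30))
# Assumption: +05:53 approximates the historical local mean time offset
# that a pytz zone may report when passed directly as tzinfo=.
LMT = timezone(timedelta(hours=5, minutes=53))


def cast_duration(reference, seconds, direction):
    """Hypothetical helper: turn a duration into an absolute datetime."""
    delta = timedelta(seconds=seconds)
    if direction == FUTURE:
        return reference + delta
    if direction == PAST:
        return reference - delta
    raise ValueError(f"unknown direction: {direction!r}")


# Reference at noon with the intended +05:30 offset.
noon_ist = datetime(2022, 1, 1, 12, 0, tzinfo=IST)
print(cast_duration(noon_ist, 7200, FUTURE).isoformat())  # 2022-01-01T14:00:00+05:30
print(cast_duration(noon_ist, 7200, PAST).isoformat())    # 2022-01-01T10:00:00+05:30

# The same wall-clock time tagged with +05:53 is 11:37 in IST.
shifted = datetime(2022, 1, 1, 12, 0, tzinfo=LMT).astimezone(IST)
print(shifted.isoformat())                                # 2022-01-01T11:37:00+05:30
print(cast_duration(shifted, 7200, FUTURE).isoformat())   # 2022-01-01T13:37:00+05:30
print(cast_duration(shifted, 7200, PAST).isoformat())     # 2022-01-01T09:37:00+05:30
```

If that assumption holds, building the reference with the zone's localize method, or with a fixed offset as above, would give values on the hour.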
If we always need to cast durations as future or past values, regardless of the intent, we can use the “__any__” key:
In [50]: import pytz
....: from datetime import datetime
....: from dialogy.workflow import Workflow
....: from dialogy.base import Input
....: from dialogy.types import Intent
....: from dialogy.plugins import DucklingPlugin
....: from dialogy.utils import dt2timestamp
....:
In [51]: duckling_plugin = DucklingPlugin(
....: dest="output.entities",
....: dimensions=["time", "duration"],
....: timezone="Asia/Kolkata",
....: temporal_intents={
....: "__any__": DucklingPlugin.FUTURE,
....: }
....: )
....:
In [52]: workflow = Workflow([duckling_plugin])
In [53]: reference_dt = datetime(2022, 1, 1, 12, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
....: reference_time = dt2timestamp(reference_dt)
....: utterances = ["for 2h"]
....:
In [54]: input_, output = workflow.run(Input(utterances=utterances, reference_time=reference_time))
In [55]: output
Out[55]:
{'intents': [],
'entities': [{'range': {'start': 4, 'end': 6},
'body': '2h',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 1.0,
'alternative_index': 0,
'value': '2022-01-01T13:37:00+05:30',
'entity_type': 'datetime',
'grain': 'hour'}],
'original_intent': {}}
In [56]: duckling_plugin = DucklingPlugin(
....: dest="output.entities",
....: dimensions=["time", "duration"],
....: timezone="Asia/Kolkata",
....: temporal_intents={
....: "__any__": DucklingPlugin.PAST,
....: }
....: )
....:
In [57]: workflow = Workflow([duckling_plugin])
....: input_, output = workflow.run(Input(utterances=utterances, reference_time=reference_time))
....: output
....:
Out[57]:
{'intents': [],
'entities': [{'range': {'start': 4, 'end': 6},
'body': '2h',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 1.0,
'alternative_index': 0,
'value': '2022-01-01T09:37:00+05:30',
'entity_type': 'datetime',
'grain': 'hour'}],
'original_intent': {}}
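One way to picture how a direction might be chosen from temporal_intents is a lookup with a fallback. The sketch below is an assumption about the behaviour, not the plugin's code: resolve_direction is a hypothetical helper, the intent names are made up, and the rule that a named intent takes precedence over "__any__" is a guess.

```python
FUTURE = "future"
PAST = "past"
ANY = "__any__"


def resolve_direction(temporal_intents, intent_names):
    """Hypothetical lookup: first matching intent wins, else the catch-all."""
    for name in intent_names:
        if name in temporal_intents:
            return temporal_intents[name]
    # Assumed precedence: "__any__" applies only when no named intent matches.
    return temporal_intents.get(ANY)


catch_all_only = {ANY: FUTURE}
mixed = {"some_intent": PAST, ANY: FUTURE}  # "some_intent" is a made-up name

print(resolve_direction(catch_all_only, []))           # no intents predicted
print(resolve_direction(mixed, ["some_intent"]))       # named intent matches
print(resolve_direction(mixed, ["another_intent"]))    # falls back to the catch-all
print(resolve_direction({"some_intent": PAST}, []))    # nothing applies
```

The last call returns None, which a caller could treat as "leave the duration entity as it is".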
Setting Time Range Constraints¶
Sometimes you’d expect the user to say a time only within a particular acceptable range. Suppose the user says “कल 12:00 बजे” (“tomorrow at 12:00”) while the current time is 6pm. Given our context, we mostly know that the user meant 12pm the next day and not 12am. So if we say our acceptable time range constraint is >= 7am and before 11pm, we can capture the time entity correctly.
Problem¶
In [58]: import pytz
....: from datetime import datetime
....: from dialogy.workflow import Workflow
....: from dialogy.base import Input
....: from dialogy.plugins import DucklingPlugin
....: from dialogy.utils import dt2timestamp
....:
In [59]: duckling_plugin = DucklingPlugin(
....: dest="output.entities",
....: dimensions=["datetime"],
....: timezone="Asia/Kolkata",
....: locale="hi_IN"
....: )
....:
In [60]: workflow = Workflow([duckling_plugin])
In [61]: utterances = ["कल 12:00 बजे"]
In [62]: reference_dt = datetime(2022, 3, 16, 18, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
....: reference_time = dt2timestamp(reference_dt)
....:
In [63]: input_, output = workflow.run(Input(utterances=utterances, reference_time=reference_time))
In [64]: output
Out[64]:
{'intents': [],
'entities': [{'range': {'start': 3, 'end': 8},
'body': '12:00',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 1.0,
'alternative_index': 0,
'value': '2022-03-17T00:00:00.000+05:30',
'entity_type': 'datetime',
'grain': 'minute'}],
'original_intent': {}}
Given that the current datetime is “2022-03-16T18:00:00+05:30”, we can see that Duckling gives the value “2022-03-17T00:00:00+05:30”, but we wanted to capture “2022-03-17T12:00:00+05:30”. For that, we can define a time range constraint.
In [65]: import pytz
....: from datetime import datetime
....: from dialogy.workflow import Workflow
....: from dialogy.base import Input
....: from dialogy.plugins import DucklingPlugin
....: from dialogy.utils import dt2timestamp
....:
In [66]: duckling_plugin = DucklingPlugin(
....: dest="output.entities",
....: dimensions=["datetime"],
....: timezone="Asia/Kolkata",
....: locale="hi_IN",
....: constraints={
....: "time": {
....: "gte": {"hour": 7, "minute": 0},
....: "lte": {"hour": 22, "minute": 59}
....: }
....: }
....: )
....:
In [67]: workflow = Workflow([duckling_plugin])
In [68]: utterances = ["कल 12:00 बजे"]
In [69]: reference_dt = datetime(2022, 3, 16, 18, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
....: reference_time = dt2timestamp(reference_dt)
....:
In [70]: input_, output = workflow.run(Input(utterances=utterances, reference_time=reference_time))
In [71]: output
Out[71]:
{'intents': [],
'entities': [{'range': {'start': 3, 'end': 8},
'body': '12:00',
'type': 'value',
'parsers': ['DucklingPlugin'],
'score': 1.0,
'alternative_index': 0,
'value': '2022-03-17T12:00:00.000+05:30',
'entity_type': 'datetime',
'grain': 'minute'}],
'original_intent': {}}
Now we can see that it gives the correct entity value, i.e. “2022-03-17T12:00:00+05:30”.
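The constraint check itself can be sketched in a few lines. This assumes, without confirmation from the source, that both the midnight and the noon readings are available as candidate values and that the constraint simply keeps those whose clock time lies within the gte and lte bounds. satisfies is a hypothetical helper. The constraint dictionary is the one from the example above.

```python
from datetime import datetime


def satisfies(value, constraint):
    """Hypothetical check: is the value's clock time within [gte, lte]?"""
    moment = datetime.fromisoformat(value)
    clock = (moment.hour, moment.minute)
    lower = (constraint["gte"]["hour"], constraint["gte"]["minute"])
    upper = (constraint["lte"]["hour"], constraint["lte"]["minute"])
    return lower <= clock <= upper


constraints = {
    "time": {
        "gte": {"hour": 7, "minute": 0},
        "lte": {"hour": 22, "minute": 59},
    }
}
# Assumed candidates: the two readings of "12:00" on the next day.
candidates = [
    "2022-03-17T00:00:00.000+05:30",
    "2022-03-17T12:00:00.000+05:30",
]

accepted = [value for value in candidates if satisfies(value, constraints["time"])]
print(accepted)  # ['2022-03-17T12:00:00.000+05:30']
```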
- class DucklingPlugin(dimensions, timezone='UTC', timeout=0.5, url='http://0.0.0.0:8000/parse', locale='en_IN', constraints=None, temporal_intents=None, dest=None, guards=None, datetime_filters=None, threshold=None, activate_latent_entities=False, reference_time_column='reftime', input_column='alternatives', output_column=None, use_transform=False, debug=False)[source]¶
Bases: dialogy.base.entity_extractor.EntityScoringMixin, dialogy.base.plugin.Plugin
- Parameters
dimensions (List[str]) – Dimensions. Of the listed dimensions, we support:
Numeral
Time
TimeInterval
Duration
People - This isn’t part of the standard; we have a private fork to support this.
Do note that passing more dimensions is not free: Duckling searches an extra set of patterns for every dimension that is expected.
- Parameters
locale (str) – The locale is expressed as a language code followed by a country code, using ISO 639 codes for languages and ISO 3166 alpha-2 codes for countries. Examples: “en_IN”, “en_US”, “en_GB”.
timezone (str) – pytz timezone. This is especially important when services are deployed across different geographies and consistency is expected in the responses. Get a valid value from the list of tz database timezones. Example: “Asia/Kolkata”.
- Parameters
timeout (float) – Certain strings tend to stall Duckling (example). In such cases, to keep the overall experience from slowing down as well, provide a timeout value. Defaults to 0.5.
- Parameters
url (Optional[str]) – The address where Duckling’s entity parser can be reached, defaults to “http://0.0.0.0:8000/parse”.
- FUTURE = 'future'¶
Select time entity values greater than or equal to the reference time.
- PAST = 'past'¶
Select time entity values less than or equal to the reference time.
- apply_entity_classes(list_of_entities, reference_time=None, timezone='UTC', duration_cast_operator=None)[source]¶
- Return type
List[BaseEntity]
- apply_filters(entities)[source]¶
Conditionally remove entities.
The utility of this method is tracked here: https://github.com/Vernacular-ai/dialogy/issues/42
We needed a way to express that not all datetime entities are needed. Some needs can be expressed as filters, such as wanting values greater or lesser than the reference time. There are more applications where we expect sorting or filtering by other attributes as well. This method is where such expressions are applied.
- Parameters
entities (List[BaseEntity]) – A list of entities.
- Returns
A list of entities obtained after applying filters.
- Return type
List[BaseEntity]
- parse(transcripts, locale=None, reference_time=None, use_latent=False, intents=None)[source]¶
- Return type
List[BaseEntity]
- select_datetime(entities, filter_type)[source]¶
Select datetime entities as per the filters provided in the configuration.
- Parameters
entities (List[BaseEntity]) – A list of entities.
filter_type (str) –
- Returns
List of entities obtained after applying comparator functions.
- Return type
List[BaseEntity]
- transform(training_data)[source]¶
Transform training data.
- Parameters
training_data (pd.DataFrame) – Training data.
- Returns
Transformed training data.
- Return type
pd.DataFrame
- utility(input, output)[source]¶
Produces Duckling entities, runs with a Workflow’s run method.
- Parameters
args – Expects a tuple of Tuple[natural language for parsing entities, reference time in milliseconds, locale]
- Returns
A list of duckling entities.
- Return type
List[BaseEntity]