dialogy.plugins.text.duckling_plugin package

Module contents

We use Duckling to parse values such as dates, times, numbers and currency amounts from natural language. The plugin expects Duckling to be running as an HTTP service and provides the means to connect to it. Here’s a big example covering various Duckling entities.

Mother of all examples

In [1]: from dialogy.workflow import Workflow
   ...: from dialogy.base import Input
   ...: from dialogy.plugins import DucklingPlugin
   ...: 

In [2]: duckling_plugin = DucklingPlugin(
   ...:     dest="output.entities",
   ...:     dimensions=[
   ...:         "number",
   ...:         "people",
   ...:         "time",
   ...:         "duration",
   ...:         "intersect",
   ...:         "amount-of-money",
   ...:         "credit-card-number",
   ...:     ],
   ...:     # Duckling supports multiple dimensions. By specifying a list, we make sure
   ...:     # we are searching for a match within the expected dimensions only.
   ...: )
   ...: 

In [3]: workflow = Workflow([duckling_plugin])

In [4]: utterances = [
   ...:    "between today 7pm and tomorrow 9pm",
   ...:    "can we get up at 6 am",
   ...:    "we are 7 people",
   ...:    "2 children 1 man 3 girls",
   ...:    "can I come now?",
   ...:    "can I come tomorrow",
   ...:    "how about monday then",
   ...:    "call me on 5th march",
   ...:    "I want 4 pizzas",
   ...:    "2 hours",
   ...:    "can I pay $5 instead?",
   ...:    "my credit card number is 4111111111111111",
   ...: ]
   ...: 

In [5]: %%time
   ...: input_, output = workflow.run(Input(utterances=utterances))
   ...: 
CPU times: user 17.4 ms, sys: 2.84 ms, total: 20.3 ms
Wall time: 21.9 ms

In [6]: output
Out[6]: 
{'intents': [],
 'entities': [{'range': {'start': 0, 'end': 34},
   'body': 'between today 7pm and tomorrow 9pm',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 0,
   'entity_type': 'datetime',
   'grain': 'hour',
   'type': 'interval',
   'from_value': '2022-08-24T19:00:00.000+00:00',
   'to_value': '2022-08-25T22:00:00.000+00:00',
   'value': {'from': '2022-08-24T19:00:00.000+00:00',
    'to': '2022-08-25T22:00:00.000+00:00'}},
  {'range': {'start': 14, 'end': 21},
   'body': 'at 6 am',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 1,
   'value': '2022-08-25T06:00:00.000+00:00',
   'entity_type': 'time',
   'grain': 'hour'},
  {'range': {'start': 7, 'end': 15},
   'body': '7 people',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 2,
   'value': 7,
   'unit': 'person',
   'entity_type': 'people'},
  {'range': {'start': 0, 'end': 10},
   'body': '2 children',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 3,
   'value': 2,
   'unit': 'child',
   'entity_type': 'people'},
  {'range': {'start': 11, 'end': 16},
   'body': '1 man',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 3,
   'value': 1,
   'unit': 'male',
   'entity_type': 'people'},
  {'range': {'start': 17, 'end': 24},
   'body': '3 girls',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 3,
   'value': 3,
   'unit': 'female',
   'entity_type': 'people'},
  {'range': {'start': 11, 'end': 14},
   'body': 'now',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 4,
   'value': '2022-08-24T18:28:17.690+00:00',
   'entity_type': 'datetime',
   'grain': 'second'},
  {'range': {'start': 11, 'end': 19},
   'body': 'tomorrow',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 5,
   'value': '2022-08-25T00:00:00.000+00:00',
   'entity_type': 'date',
   'grain': 'day'},
  {'range': {'start': 4, 'end': 16},
   'body': 'about monday',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 6,
   'value': '2022-08-29T00:00:00.000+00:00',
   'entity_type': 'date',
   'grain': 'day'},
  {'range': {'start': 8, 'end': 20},
   'body': 'on 5th march',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 7,
   'value': '2023-03-05T00:00:00.000+00:00',
   'entity_type': 'date',
   'grain': 'day'},
  {'range': {'start': 7, 'end': 8},
   'body': '4',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 8,
   'value': 4,
   'entity_type': 'number'},
  {'range': {'start': 0, 'end': 7},
   'body': '2 hours',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 9,
   'value': 7200,
   'unit': 'hour',
   'normalized': {'value': 7200, 'unit': 'second'},
   '_meta': {},
   'entity_type': 'duration'},
  {'range': {'start': 10, 'end': 12},
   'body': '$5',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 10,
   'value': 5,
   'unit': '$',
   'entity_type': 'amount-of-money'},
  {'range': {'start': 25, 'end': 41},
   'body': '4111111111111111',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 11,
   'entity_type': 'credit-card-number',
   'issuer': 'visa',
   'value': '4111111111111111'},
  {'range': {'start': 25, 'end': 41},
   'body': '4111111111111111',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 0.08333333333333333,
   'alternative_index': 11,
   'value': 4111111111111111,
   'entity_type': 'number'}],
 'original_intent': {}}
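
Each entity's alternative_index points back to the position of the utterance it was found in. The sketch below works on a trimmed copy of three entities from the output above and groups entity bodies by that index; the helper name group_by_alternative is hypothetical and not part of dialogy. It also shows that the reported score equals 1/12, which is consistent with each entity being found in one of the twelve utterances, although the exact scoring rule is an assumption here.

```python
from collections import defaultdict

# A trimmed copy of three entities from the output above.
entities = [
    {"body": "2 children", "alternative_index": 3, "entity_type": "people", "value": 2},
    {"body": "1 man", "alternative_index": 3, "entity_type": "people", "value": 1},
    {"body": "$5", "alternative_index": 10, "entity_type": "amount-of-money", "value": 5},
]


def group_by_alternative(entities):
    """Map each utterance position to the bodies of the entities found in it."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["alternative_index"]].append(entity["body"])
    return dict(grouped)


grouped = group_by_alternative(entities)
print(grouped)  # {3: ['2 children', '1 man'], 10: ['$5']}

# Assumption: the score is the share of utterances that contain the entity.
score = 1 / 12
print(score)  # 0.08333333333333333
```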

Testing

  1. Connect to the Duckling API either via a docker container or a local setup.

  2. Boot an IPython session and set up an instance of DucklingPlugin.

from pprint import pprint
from dialogy.workflow import Workflow
from dialogy.base import Input
from dialogy.plugins import DucklingPlugin

duckling_plugin = DucklingPlugin(
    dest="output.entities",
    dimensions=["people", "time"],
    locale="en_IN",
    timezone="Asia/Kolkata",
)

# parse() talks to Duckling directly, without going through a Workflow.
entities = duckling_plugin.parse(["We are 2 children coming tomorrow at 5 pm"])
pprint(entities)
# [PeopleEntity(
#     body='2 children',
#     type='value',
#     parsers=['DucklingPlugin'],
#     score=1.0,
#     alternative_index=0,
#     alternative_indices=[0, 0],
#     latent=False,
#     value=2,
#     origin='value',
#     unit='child',
#     entity_type='people'),
# TimeEntity(
#     body='tomorrow at 5 pm',
#     type='value',
#     parsers=['DucklingPlugin'],
#     score=1.0,
#     alternative_index=0,
#     alternative_indices=[0],
#     latent=False,
#     value='2022-02-11T17:00:00.000+05:30',
#     origin='value',
#     entity_type='datetime',
#     grain='hour')]

# To convert these to dicts:

entity_dicts = [entity.json() for entity in entities]
pprint(entity_dicts)
# [{'range': {'start': 7, 'end': 17},
#     'body': '2 children',
#     'type': 'value',
#     'parsers': ['DucklingPlugin'],
#     'score': 1.0,
#     'alternative_index': 0,
#     'value': 2,
#     'unit': 'child',
#     'entity_type': 'people'},
# {'range': {'start': 38, 'end': 54},
#     'body': 'tomorrow at 5 pm',
#     'type': 'value',
#     'parsers': ['DucklingPlugin'],
#     'score': 1.0,
#     'alternative_index': 0,
#     'value': '2022-02-11T17:00:00.000+05:30',
#     'entity_type': 'datetime',
#     'grain': 'hour'
# }]

Guards

Like all plugins, the Duckling plugin can be guarded by conditions that prevent its execution. This is useful when you want to save on latency and are sure that no entities are expected in certain turns. We will reuse the mother of all examples from above and use current_state to tell the plugin to guard itself in the “SMALL_TALK” state.

In [7]: from dialogy.workflow import Workflow
   ...: from dialogy.base import Input
   ...: from dialogy.plugins import DucklingPlugin
   ...: 

In [8]: duckling_plugin = DucklingPlugin(
   ...:     dest="output.entities",
   ...:     dimensions=[
   ...:         "number",
   ...:         "people",
   ...:         "date",
   ...:         "time",
   ...:         "duration",
   ...:         "intersect",
   ...:         "amount-of-money",
   ...:         "credit-card-number",
   ...:     ],
   ...:     guards=[lambda i, o: i.current_state == "SMALL_TALK"]
   ...: )
   ...: 

In [9]: workflow = Workflow([duckling_plugin])

In [10]: utterances = [
   ....:    "between today 7pm and tomorrow 9pm",
   ....:    "can we get up at 6 am",
   ....:    "we are 7 people",
   ....:    "2 children 1 man 3 girls",
   ....:    "can I come now?",
   ....:    "can I come tomorrow",
   ....:    "how about monday then",
   ....:    "call me on 5th march",
   ....:    "I want 4 pizzas",
   ....:    "2 hours",
   ....:    "can I pay $5 instead?",
   ....:    "my credit card number is 4111111111111111",
   ....: ]
   ....: 

In [11]: %%time
   ....: input_, output = workflow.run(Input(utterances=utterances, current_state="SMALL_TALK"))
   ....: 
CPU times: user 123 us, sys: 0 ns, total: 123 us
Wall time: 126 us

In [12]: output
Out[12]: {'intents': [], 'entities': [], 'original_intent': {}}

Notice that the wall time drops from milliseconds to microseconds and that no entities are produced.
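
A guard is a predicate over the input and output, as the lambda passed to guards shows. The sketch below is one way to picture the check, assuming the plugin is skipped as soon as any guard returns True. The helper is_guarded and the "BOOKING" state are hypothetical, and SimpleNamespace stands in for dialogy's Input and Output objects.

```python
from types import SimpleNamespace


def is_guarded(guards, input_, output):
    """Return True when at least one guard asks the plugin to step aside."""
    return any(guard(input_, output) for guard in guards)


guards = [lambda i, o: i.current_state == "SMALL_TALK"]
output = SimpleNamespace(entities=[])

small_talk = SimpleNamespace(current_state="SMALL_TALK")
booking = SimpleNamespace(current_state="BOOKING")  # hypothetical state name

print(is_guarded(guards, small_talk, output))  # True
print(is_guarded(guards, booking, output))     # False
```

Because the check is a cheap comparison and no HTTP call is made when it succeeds, a guarded turn costs almost nothing.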

Filtering Date and Time

There are cases where we run into ambiguity in entity resolution. A fun case comes up with Hindi when someone says “कल”: it could mean either “tomorrow” or “yesterday”.

Problem

In [13]: import pytz
   ....: from datetime import datetime
   ....: from dialogy.workflow import Workflow
   ....: from dialogy.base import Input
   ....: from dialogy.plugins import DucklingPlugin
   ....: from dialogy.utils import dt2timestamp
   ....: 

In [14]: duckling_plugin = DucklingPlugin(
   ....:     dest="output.entities",
   ....:     dimensions=["time"],
   ....: )
   ....: 

In [15]: workflow = Workflow([duckling_plugin])

In [16]: utterances = ["कल"]

In [17]: reference_dt = datetime(2022, 1, 1, 12, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
   ....: reference_time = dt2timestamp(reference_dt)
   ....: 

In [18]: input_, output = workflow.run(Input(utterances=utterances, locale="hi_IN", reference_time=reference_time))

In [19]: output
Out[19]: 
{'intents': [],
 'entities': [{'range': {'start': 0, 'end': 2},
   'body': 'कल',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 1.0,
   'alternative_index': 0,
   'value': '2021-12-31T00:00:00.000+00:00',
   'entity_type': 'date',
   'grain': 'day'},
  {'range': {'start': 0, 'end': 2},
   'body': 'कल',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 1.0,
   'alternative_index': 0,
   'value': '2022-01-02T00:00:00.000+00:00',
   'entity_type': 'date',
   'grain': 'day'}],
 'original_intent': {}}

See the two entities?

  • We froze our reference time for 1st Jan 2022 at 12:00:00.

  • We get 31st December 2021

  • and 2nd Jan 2022.

This could be decoded using additional context. Let’s assume it is a given that dates can only be in the future, say in a flight-booking case.

In [20]: import pytz
   ....: from datetime import datetime
   ....: from dialogy.workflow import Workflow
   ....: from dialogy.base import Input
   ....: from dialogy.plugins import DucklingPlugin
   ....: from dialogy.utils import dt2timestamp
   ....: 

In [21]: duckling_plugin = DucklingPlugin(
   ....:     dest="output.entities",
   ....:     dimensions=["time"],
   ....:     datetime_filters=DucklingPlugin.FUTURE
   ....: )
   ....: 

In [22]: workflow = Workflow([duckling_plugin])

In [23]: utterances = ["कल"]

In [24]: reference_dt = datetime(2022, 1, 1, 12, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
   ....: reference_time = dt2timestamp(reference_dt)
   ....: 

In [25]: input_, output = workflow.run(Input(utterances=utterances, locale="hi_IN", reference_time=reference_time))

In [26]: output
Out[26]: 
{'intents': [],
 'entities': [{'range': {'start': 0, 'end': 2},
   'body': 'कल',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 1.0,
   'alternative_index': 0,
   'value': '2022-01-02T00:00:00.000+00:00',
   'entity_type': 'date',
   'grain': 'day'}],
 'original_intent': {}}

We can use DucklingPlugin.PAST for the inverse.
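
The filter can be thought of as a comparison of each candidate value against the reference time. The following is a minimal sketch under that assumption, using the two values Duckling returned for “कल”. The helpers pick_operator and keep_values are hypothetical names and do not mirror dialogy's internals; the reference is noon IST expressed in UTC.

```python
import operator
from datetime import datetime, timezone

FUTURE, PAST = "future", "past"


def pick_operator(filter_type):
    """FUTURE keeps values at or after the reference, PAST keeps values at or before it."""
    return {FUTURE: operator.ge, PAST: operator.le}[filter_type]


def keep_values(values, reference, filter_type):
    """Keep the ISO 8601 values that satisfy the comparison against the reference."""
    compare = pick_operator(filter_type)
    return [v for v in values if compare(datetime.fromisoformat(v), reference)]


values = ["2021-12-31T00:00:00.000+00:00", "2022-01-02T00:00:00.000+00:00"]
reference = datetime(2022, 1, 1, 6, 30, tzinfo=timezone.utc)  # 12:00 in Asia/Kolkata

print(keep_values(values, reference, FUTURE))  # ['2022-01-02T00:00:00.000+00:00']
print(keep_values(values, reference, PAST))    # ['2021-12-31T00:00:00.000+00:00']
```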

What if the context was within the utterance? Say for payment verification cases, we receive these utterances:

  • “कल ही कर दी थी” (“I already paid yesterday”)

  • “कल कर दूंगा” (“I will pay tomorrow”)

We can’t filter out either future or past values. In these cases, the intent carries the context for resolving the entity value. The Temporal Intents section covers this in more detail.

Temporal Intents

Filtering

There are cases where intents carry a notion of time. Repeating an example from above:

  1. “कल ही कर दी थी” (“I already paid yesterday”)

  2. “कल कर दूंगा” (“I will pay tomorrow”)

While we can’t decode the entity value for “कल” from the utterances alone, the correct intent predictions are:

  1. already_paid

  2. pay_later

If we could use this information, we could resolve the entity value for “कल”. We will simulate the output to contain appropriate intents using the set method of workflow.

In [27]: import pytz
   ....: from datetime import datetime
   ....: from dialogy.workflow import Workflow
   ....: from dialogy.base import Input
   ....: from dialogy.types import Intent
   ....: from dialogy.plugins import DucklingPlugin
   ....: from dialogy.utils import dt2timestamp
   ....: 

In [28]: duckling_plugin = DucklingPlugin(
   ....:     dest="output.entities",
   ....:     dimensions=["time"],
   ....:     temporal_intents={
   ....:         "already_paid": DucklingPlugin.PAST,
   ....:         "pay_later": DucklingPlugin.FUTURE
   ....:     }
   ....: )
   ....: 

In [29]: workflow = Workflow([duckling_plugin])

In [30]: utterances = ["कल ही कर दी थी"]
   ....: workflow.set("output.intents", [Intent(name="already_paid", score=1.0)])
   ....: 
Out[30]: Workflow(plugins=[<dialogy.plugins.text.duckling_plugin.DucklingPlugin object at 0x7fbaf91780d0>], input=None, output=Output(intents=[Intent(name='already_paid', score=1.0, alternative_index=None)], entities=[], original_intent={}), debug=False)

In [31]: reference_dt = datetime(2022, 1, 1, 12, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
   ....: reference_time = dt2timestamp(reference_dt)
   ....: input_, output = workflow.run(Input(utterances=utterances, locale="hi_IN", reference_time=reference_time))
   ....: 

In [32]: output
Out[32]: 
{'intents': [{'name': 'already_paid',
   'alternative_index': None,
   'score': 1.0,
   'parsers': [],
   'slots': []}],
 'entities': [{'range': {'start': 0, 'end': 2},
   'body': 'कल',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 1.0,
   'alternative_index': 0,
   'value': '2021-12-31T00:00:00.000+00:00',
   'entity_type': 'date',
   'grain': 'day'}],
 'original_intent': {}}

In [33]: utterances = ["कल कर दूंगा"]
   ....: workflow.set("output.intents", [Intent(name="pay_later", score=1.0)])
   ....: input_, output = workflow.run(Input(utterances=utterances, locale="hi_IN", reference_time=reference_time))
   ....: 

In [34]: output
Out[34]: 
{'intents': [{'name': 'pay_later',
   'alternative_index': None,
   'score': 1.0,
   'parsers': [],
   'slots': []}],
 'entities': [{'range': {'start': 0, 'end': 2},
   'body': 'कल',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 1.0,
   'alternative_index': 0,
   'value': '2022-01-02T00:00:00.000+00:00',
   'entity_type': 'date',
   'grain': 'day'}],
 'original_intent': {}}

Casting

We can also use temporal intents to cast certain entities. Machines can act on absolute time values but natural language often has relative units like “for 2 hours”. These produce a duration entity.

In [35]: import pytz
   ....: from datetime import datetime
   ....: from dialogy.workflow import Workflow
   ....: from dialogy.base import Input
   ....: from dialogy.plugins import DucklingPlugin
   ....: from dialogy.utils import dt2timestamp
   ....: 

In [36]: duckling_plugin = DucklingPlugin(
   ....:     dest="output.entities",
   ....:     dimensions=["duration", "time"],
   ....: )
   ....: 

In [37]: workflow = Workflow([duckling_plugin])

In [38]: utterances = ["for 2h"]

In [39]: reference_dt = datetime(2022, 1, 1, 12, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
   ....: reference_time = dt2timestamp(reference_dt)
   ....: 

In [40]: input_, output = workflow.run(Input(utterances=utterances, reference_time=reference_time))

In [41]: output
Out[41]: 
{'intents': [],
 'entities': [{'range': {'start': 4, 'end': 6},
   'body': '2h',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 1.0,
   'alternative_index': 0,
   'value': 7200,
   'unit': 'hour',
   'normalized': {'value': 7200, 'unit': 'second'},
   '_meta': {},
   'entity_type': 'duration'}],
 'original_intent': {}}

Duckling understands the duration, and we could turn it into an absolute time value using the reference time. However, we don’t know whether the duration should be added to or subtracted from the reference time. Temporal intents supply that direction.

In [42]: import pytz
   ....: from datetime import datetime
   ....: from dialogy.workflow import Workflow
   ....: from dialogy.base import Input
   ....: from dialogy.types import Intent
   ....: from dialogy.plugins import DucklingPlugin
   ....: from dialogy.utils import dt2timestamp
   ....: 

In [43]: duckling_plugin = DucklingPlugin(
   ....:     dest="output.entities",
   ....:     dimensions=["time", "duration"],
   ....:     timezone="Asia/Kolkata",
   ....:     temporal_intents={
   ....:         "already_paid": DucklingPlugin.PAST,
   ....:         "pay_later": DucklingPlugin.FUTURE
   ....:     }
   ....: )
   ....: 

In [44]: workflow = Workflow([duckling_plugin])

In [45]: reference_dt = datetime(2022, 1, 1, 12, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
   ....: reference_time = dt2timestamp(reference_dt)
   ....: utterances = ["for 2h"]
   ....: 

In [46]: workflow.set("output.intents", [Intent(name="already_paid", score=1.0)])
   ....: input_, output = workflow.run(Input(utterances=utterances, reference_time=reference_time))
   ....: 

In [47]: output
Out[47]: 
{'intents': [{'name': 'already_paid',
   'alternative_index': None,
   'score': 1.0,
   'parsers': [],
   'slots': []}],
 'entities': [{'range': {'start': 4, 'end': 6},
   'body': '2h',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 1.0,
   'alternative_index': 0,
   'value': '2022-01-01T09:37:00+05:30',
   'entity_type': 'datetime',
   'grain': 'hour'}],
 'original_intent': {}}

In [48]: workflow.set("output.intents", [Intent(name="pay_later", score=1.0)])
   ....: input_, output = workflow.run(Input(utterances=utterances, reference_time=reference_time))
   ....: 

In [49]: output
Out[49]: 
{'intents': [{'name': 'pay_later',
   'alternative_index': None,
   'score': 1.0,
   'parsers': [],
   'slots': []}],
 'entities': [{'range': {'start': 4, 'end': 6},
   'body': '2h',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 1.0,
   'alternative_index': 0,
   'value': '2022-01-01T13:37:00+05:30',
   'entity_type': 'datetime',
   'grain': 'hour'}],
 'original_intent': {}}
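
Casting a duration amounts to shifting the reference time forward or backward by the duration. The sketch below illustrates that arithmetic with the standard library only; the helper cast is hypothetical. It also offers a likely explanation for the 13:37 and 09:37 values in the sessions above, which may look surprising for a noon reference. The pytz documentation warns that passing a pytz timezone through the tzinfo argument of the datetime constructor does not work for many zones and recommends localize() instead. Assuming that the constructor form attaches Kolkata's local mean time offset of +05:53 rather than +05:30, the reference is really 11:37 IST, and the recorded values follow.

```python
from datetime import datetime, timedelta, timezone

IST = timezone(timedelta(hours=5, minutes=30))
# Assumption: the offset pytz attaches when the zone is passed via tzinfo=.
KOLKATA_LMT = timezone(timedelta(hours=5, minutes=53))


def cast(reference, seconds, direction):
    """Shift the reference by the duration: forward for 'future', back for 'past'."""
    delta = timedelta(seconds=seconds)
    shifted = reference + delta if direction == "future" else reference - delta
    return shifted.astimezone(IST).isoformat()


clean_reference = datetime(2022, 1, 1, 12, 0, tzinfo=IST)
lmt_reference = datetime(2022, 1, 1, 12, 0, tzinfo=KOLKATA_LMT)

print(cast(clean_reference, 7200, "future"))  # 2022-01-01T14:00:00+05:30
print(cast(lmt_reference, 7200, "future"))    # 2022-01-01T13:37:00+05:30
print(cast(lmt_reference, 7200, "past"))      # 2022-01-01T09:37:00+05:30
```

If an exact noon reference is wanted with pytz, pytz.timezone("Asia/Kolkata").localize(datetime(2022, 1, 1, 12, 0, 0)) is the form its documentation recommends.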

If we always need to cast durations as future or past values, regardless of the intent, we can use the “__any__” key:

In [50]: import pytz
   ....: from datetime import datetime
   ....: from dialogy.workflow import Workflow
   ....: from dialogy.base import Input
   ....: from dialogy.types import Intent
   ....: from dialogy.plugins import DucklingPlugin
   ....: from dialogy.utils import dt2timestamp
   ....: 

In [51]: duckling_plugin = DucklingPlugin(
   ....:     dest="output.entities",
   ....:     dimensions=["time", "duration"],
   ....:     timezone="Asia/Kolkata",
   ....:     temporal_intents={
   ....:         "__any__": DucklingPlugin.FUTURE,
   ....:     }
   ....: )
   ....: 

In [52]: workflow = Workflow([duckling_plugin])

In [53]: reference_dt = datetime(2022, 1, 1, 12, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
   ....: reference_time = dt2timestamp(reference_dt)
   ....: utterances = ["for 2h"]
   ....: 

In [54]: input_, output = workflow.run(Input(utterances=utterances, reference_time=reference_time))

In [55]: output
Out[55]: 
{'intents': [],
 'entities': [{'range': {'start': 4, 'end': 6},
   'body': '2h',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 1.0,
   'alternative_index': 0,
   'value': '2022-01-01T13:37:00+05:30',
   'entity_type': 'datetime',
   'grain': 'hour'}],
 'original_intent': {}}

In [56]: duckling_plugin = DucklingPlugin(
   ....:     dest="output.entities",
   ....:     dimensions=["time", "duration"],
   ....:     timezone="Asia/Kolkata",
   ....:     temporal_intents={
   ....:         "__any__": DucklingPlugin.PAST,
   ....:     }
   ....: )
   ....: 

In [57]: workflow = Workflow([duckling_plugin])
   ....: input_, output = workflow.run(Input(utterances=utterances, reference_time=reference_time))
   ....: output
   ....: 
Out[57]: 
{'intents': [],
 'entities': [{'range': {'start': 4, 'end': 6},
   'body': '2h',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 1.0,
   'alternative_index': 0,
   'value': '2022-01-01T09:37:00+05:30',
   'entity_type': 'datetime',
   'grain': 'hour'}],
 'original_intent': {}}
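
The temporal_intents mapping, including the “__any__” key, can be pictured as a lookup from intent name to direction. This is a sketch of one plausible resolution order, assuming a named intent takes precedence over “__any__”; the helper filter_for_intent is hypothetical, and dialogy's actual precedence is not shown in this document. The mapping below pairs already_paid with past values and pay_later with future values, which is the natural reading of the two intents.

```python
FUTURE, PAST, ANY = "future", "past", "__any__"


def filter_for_intent(temporal_intents, intent_names):
    """Return the direction for the first known intent, else the '__any__' default, else None."""
    for name in intent_names:
        if name in temporal_intents:
            return temporal_intents[name]
    return temporal_intents.get(ANY)


per_intent = {"already_paid": PAST, "pay_later": FUTURE}
always_future = {ANY: FUTURE}

print(filter_for_intent(per_intent, ["already_paid"]))  # past
print(filter_for_intent(per_intent, ["greeting"]))      # None
print(filter_for_intent(always_future, []))             # future
```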

Setting Time Range Constraints

Sometimes you expect the user to mention a time only within a particular acceptable range. Suppose the user says “कल 12:00 बजे” (“tomorrow at 12:00”) when the current time is 6 pm. Given the context, we can usually tell that the user meant 12 pm the next day and not 12 am. If we constrain acceptable times to 7 am or later and before 11 pm, we can capture the time entity correctly.

Problem

In [58]: import pytz
   ....: from datetime import datetime
   ....: from dialogy.workflow import Workflow
   ....: from dialogy.base import Input
   ....: from dialogy.plugins import DucklingPlugin
   ....: from dialogy.utils import dt2timestamp
   ....: 

In [59]: duckling_plugin = DucklingPlugin(
   ....:     dest="output.entities",
   ....:     dimensions=["datetime"],
   ....:     timezone="Asia/Kolkata",
   ....:     locale="hi_IN"
   ....: )
   ....: 

In [60]: workflow = Workflow([duckling_plugin])

In [61]: utterances = ["कल 12:00 बजे"]

In [62]: reference_dt = datetime(2022, 3, 16, 18, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
   ....: reference_time = dt2timestamp(reference_dt)
   ....: 

In [63]: input_, output = workflow.run(Input(utterances=utterances, reference_time=reference_time))

In [64]: output
Out[64]: 
{'intents': [],
 'entities': [{'range': {'start': 3, 'end': 8},
   'body': '12:00',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 1.0,
   'alternative_index': 0,
   'value': '2022-03-17T00:00:00.000+05:30',
   'entity_type': 'datetime',
   'grain': 'minute'}],
 'original_intent': {}}

Given that the current datetime is “2022-03-16T18:00:00+05:30”, Duckling gives the value “2022-03-17T00:00:00+05:30”, whereas we wanted to capture “2022-03-17T12:00:00+05:30”. To get that, we can define a time range constraint.

In [65]: import pytz
   ....: from datetime import datetime
   ....: from dialogy.workflow import Workflow
   ....: from dialogy.base import Input
   ....: from dialogy.plugins import DucklingPlugin
   ....: from dialogy.utils import dt2timestamp
   ....: 

In [66]: duckling_plugin = DucklingPlugin(
   ....:     dest="output.entities",
   ....:     dimensions=["datetime"],
   ....:     timezone="Asia/Kolkata",
   ....:     locale="hi_IN",
   ....:     constraints={
   ....:         "time": {
   ....:             "gte": {"hour": 7, "minute": 0},
   ....:             "lte": {"hour": 22, "minute": 59}
   ....:         }
   ....:     }
   ....: )
   ....: 

In [67]: workflow = Workflow([duckling_plugin])

In [68]: utterances = ["कल 12:00 बजे"]

In [69]: reference_dt = datetime(2022, 3, 16, 18, 0, 0, tzinfo=pytz.timezone("Asia/Kolkata"))
   ....: reference_time = dt2timestamp(reference_dt)
   ....: 

In [70]: input_, output = workflow.run(Input(utterances=utterances, reference_time=reference_time))

In [71]: output
Out[71]: 
{'intents': [],
 'entities': [{'range': {'start': 3, 'end': 8},
   'body': '12:00',
   'type': 'value',
   'parsers': ['DucklingPlugin'],
   'score': 1.0,
   'alternative_index': 0,
   'value': '2022-03-17T12:00:00.000+05:30',
   'entity_type': 'datetime',
   'grain': 'minute'}],
 'original_intent': {}}

Now it gives the correct entity value, i.e. “2022-03-17T12:00:00+05:30”.
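
One way to picture the constraint is as a time-of-day window applied to candidate values. The sketch below assumes both readings of “12:00” are available as candidates and keeps the one inside the window. The helper within is hypothetical, and how dialogy actually applies constraints is not shown in this document.

```python
from datetime import datetime, time


def within(value, gte, lte):
    """Check that the time of day of an ISO 8601 value lies inside the window."""
    time_of_day = datetime.fromisoformat(value).time()
    lower = time(gte["hour"], gte["minute"])
    upper = time(lte["hour"], lte["minute"])
    return lower <= time_of_day <= upper


# Same shape as the "time" entry of the constraints argument above.
constraint = {"gte": {"hour": 7, "minute": 0}, "lte": {"hour": 22, "minute": 59}}
candidates = ["2022-03-17T00:00:00.000+05:30", "2022-03-17T12:00:00.000+05:30"]

kept = [value for value in candidates if within(value, **constraint)]
print(kept)  # ['2022-03-17T12:00:00.000+05:30']
```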

class DucklingPlugin(dimensions, timezone='UTC', timeout=0.5, url='http://0.0.0.0:8000/parse', locale='en_IN', constraints=None, temporal_intents=None, dest=None, guards=None, datetime_filters=None, threshold=None, activate_latent_entities=False, reference_time_column='reftime', input_column='alternatives', output_column=None, use_transform=False, debug=False)[source]

Bases: dialogy.base.entity_extractor.EntityScoringMixin, dialogy.base.plugin.Plugin

Parameters

dimensions (List[str]) – Dimensions. Of the listed dimensions, we support:

  • Numeral

  • Time

  • TimeInterval

  • Duration

  • People - This isn’t part of the standard, we have a private fork to support this.

Do note that passing more dimensions is not free: Duckling searches an extra set of patterns for every dimension that is expected.

Parameters
  • locale (str) – The locale is expressed as a language code and a country code joined by an underscore. Language codes follow ISO 639 and country codes follow ISO 3166 alpha-2. Examples: “en_IN”, “en_US”, “en_GB”.

  • timezone (str) – pytz timezone. This is especially important when services are deployed across different geographies and consistency is expected in the responses. Get a valid value from a list of tz database timezones. Example: “Asia/Kolkata”.

  • timeout (float) – Certain strings tend to stall Duckling. In such cases, to prevent the overall experience from slowing down as well, provide a timeout value. Defaults to 0.5.

  • url (Optional[str]) – The address where Duckling’s entity parser can be reached, defaults to “http://0.0.0.0:8000/parse”.
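
To make these parameters concrete, here is a sketch of the form-encoded body a client could send to Duckling's /parse endpoint. The field names text, locale, tz, dims and reftime follow Duckling's example HTTP server as far as I know; the helper build_payload is hypothetical and is not how dialogy necessarily builds its request. The body is only constructed here, not sent.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlencode


def build_payload(text, locale="en_IN", tz="UTC", dimensions=(), reference_time=None):
    """Build a form-encoded request body for Duckling's /parse endpoint."""
    fields = {
        "text": text,
        "locale": locale,
        "tz": tz,
        "dims": json.dumps(list(dimensions)),  # dimensions travel as a JSON list
    }
    if reference_time is not None:
        # The reference time is expressed in milliseconds since the Unix epoch.
        fields["reftime"] = int(reference_time.timestamp() * 1000)
    return urlencode(fields)


ist = timezone(timedelta(hours=5, minutes=30))
body = build_payload(
    "tomorrow at 5 pm",
    tz="Asia/Kolkata",
    dimensions=["time", "people"],
    reference_time=datetime(2022, 1, 1, 12, 0, tzinfo=ist),
)
decoded = parse_qs(body)
print(decoded["dims"])     # ['["time", "people"]']
print(decoded["reftime"])  # ['1641018600000']
```

An HTTP client would POST this body to the configured url and give up after timeout has elapsed.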

FUTURE = 'future'

Select time entity values greater than or equal to the reference time, i.e. at or after it.

PAST = 'past'

Select time entity values less than or equal to the reference time, i.e. at or before it.

apply_entity_classes(list_of_entities, reference_time=None, timezone='UTC', duration_cast_operator=None)[source]
Return type

List[BaseEntity]

apply_filters(entities)[source]

Conditionally remove entities.

The utility of this method is tracked here: https://github.com/Vernacular-ai/dialogy/issues/42

We needed a way to express that not all datetime entities are needed. Some needs can be expressed as filters, such as keeping only values greater or lesser than the reference time. There are more applications where we expect sorting or filtering by other attributes as well. This method is the starting point for such expressions.

Parameters

entities (List[BaseEntity]) – A list of entities.

Returns

A list of entities obtained after applying filters.

Return type

List[BaseEntity]

get_operator(filter_type)[source]
Return type

Any

parse(transcripts, locale=None, reference_time=None, use_latent=False, intents=None)[source]
Return type

List[BaseEntity]

select_datetime(entities, filter_type)[source]

Select datetime entities as per the filters provided in the configuration.

Parameters
  • entities (List[BaseEntity]) – A list of entities.

  • filter_type (str) –

Returns

List of entities obtained after applying comparator functions.

Return type

List[BaseEntity]

transform(training_data)[source]

Transform training data.

Parameters

training_data (pd.DataFrame) – Training data.

Returns

Transformed training data.

Return type

pd.DataFrame

utility(input, output)[source]

Produces Duckling entities, runs with a Workflow’s run method.

Parameters

args – Expects a Tuple[natural language for parsing entities, reference time in milliseconds, locale].

Returns

A list of duckling entities.

Return type

List[BaseEntity]

validate(input_, reference_time)[source]
Return type

DucklingPlugin