dialogy.plugins.text.list_entity_plugin package

Module contents

Spacy NER

The plugin can also use spacy’s NER. To enable it, pass a loaded spacy model as the spacy_nlp argument and set style to “spacy”.

In [1]: import spacy
   ...: from dialogy.workflow import Workflow
   ...: from dialogy.plugins import ListEntityPlugin
   ...: from dialogy.base import Input
   ...: 

In [2]: nlp = spacy.load("en_core_web_sm")

In [3]: list_entity_plugin = ListEntityPlugin(
   ...:   spacy_nlp=nlp,
   ...:   style="spacy",
   ...:   dest="output.entities"
   ...: )
   ...: workflow = Workflow([list_entity_plugin])
   ...: _, output = workflow.run(Input(utterances="Need a place to stay in New Delhi."))
   ...: 

In [4]: output
Out[4]: 
{'intents': [],
 'entities': [{'range': {'start': 24, 'end': 33},
   'body': 'New Delhi',
   'type': 'GPE',
   'parsers': ['ListEntityPlugin'],
   'score': 1.0,
   'alternative_index': 0,
   'value': 'New Delhi',
   'entity_type': 'GPE',
   '_meta': {}}],
 'original_intent': {}}
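
The range offsets in the output above index into the utterance string, so slicing the utterance with them should give back body. A minimal check using only the values shown in Out[4]; the variable names are illustrative and nothing here calls dialogy:

```python
# Values copied from the session above; no dialogy import is needed for this check.
utterance = "Need a place to stay in New Delhi."
entity = {
    "range": {"start": 24, "end": 33},
    "body": "New Delhi",
    "type": "GPE",
}

# For these values the range behaves as a half-open [start, end) character span.
span = utterance[entity["range"]["start"]:entity["range"]["end"]]
print(span)  # New Delhi
```
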
class ListEntityPlugin(style=None, candidates=None, spacy_nlp=None, dest=None, guards=None, labels=None, threshold=None, input_column='alternatives', output_column=None, use_transform=True, flags=RegexFlag.None, debug=False)

Bases: dialogy.base.entity_extractor.EntityScoringMixin, dialogy.base.plugin.Plugin

A Plugin for extracting entities using spacy or a list of regex patterns.

Parameters
  • style (Optional[str]) – One of [“regex”, “spacy”]

  • candidates (Optional[Dict[str, List[str]]]) – Required if style is “regex”. A dict that maps entity values to their patterns.

  • spacy_nlp (Any) – Required if style is “spacy”. Must be a spacy model.

  • labels (Optional[List[str]]) – Required if style is “spacy” and only a few of the available labels should be extracted.

  • debug (bool) – A flag to set debugging on the plugin methods
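
To make the idea of a candidates mapping concrete, here is a rough sketch built only on Python's re module. It is not the plugin's implementation: find_candidates, the sample patterns, and the tuple it returns are hypothetical, and the mapping shape simply follows the Dict[str, List[str]] type given above.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical candidates mapping: entity value -> regex patterns that signal it.
candidates: Dict[str, List[str]] = {
    "New Delhi": [r"\bnew\s+delhi\b"],
    "Mumbai": [r"\bmumbai\b", r"\bbombay\b"],
}


def find_candidates(
    transcript: str,
    candidates: Dict[str, List[str]],
    flags: int = re.IGNORECASE,
) -> List[Tuple[str, int, int, str]]:
    """Return (value, start, end, matched_text) for every pattern match."""
    matches = []
    for value, patterns in candidates.items():
        for pattern in patterns:
            for m in re.finditer(pattern, transcript, flags):
                matches.append((value, m.start(), m.end(), m.group()))
    return matches


matches = find_candidates("Flying from Bombay to New Delhi.", candidates)
for match in matches:
    print(match)
```

In this sketch the text "Bombay" is reported under the value "Mumbai", which is the point of mapping one value to several patterns.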

get_entities(transcripts)

Parse entities using regex or spacy NER, depending on style.

Parameters

transcripts (List[str]) – A list of strings within which to search for entities.

Returns

List of entities from regex matches or spacy ner.

Return type

List[KeywordEntity]
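
Because get_entities receives a list of transcripts, each match needs to record which transcript it came from. The sketch below assumes that alternative_index in the output is the position of the transcript in that list; this is an inference from the example output, not something the source confirms. get_entities_sketch and its plain-dict result are hypothetical stand-ins and do not produce KeywordEntity objects.

```python
import re
from typing import Dict, List


def get_entities_sketch(transcripts: List[str], patterns: Dict[str, str]) -> List[dict]:
    """Search every transcript and tag each match with the transcript's index."""
    entities = []
    for index, transcript in enumerate(transcripts):
        for value, pattern in patterns.items():
            for m in re.finditer(pattern, transcript, re.IGNORECASE):
                entities.append(
                    {
                        "range": {"start": m.start(), "end": m.end()},
                        "body": m.group(),
                        "value": value,
                        # Assumption: index of the transcript within the input list.
                        "alternative_index": index,
                    }
                )
    return entities


# Two hypothetical ASR alternatives for the same utterance.
alternatives = ["need a place in new delhi", "need a place in you delhi"]
entities = get_entities_sketch(alternatives, {"Delhi": r"\bdelhi\b"})
for entity in entities:
    print(entity["alternative_index"], entity["body"], entity["range"])
```
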

Wrapper over spacy’s ner search.

Parameters

transcript (str) – A string to search entities within.

Returns

NER parsing via spacy.

Return type

MatchType

Wrapper over regex searches.

Parameters

transcript (str) – A string to search entities within.

Returns

Matches found by the regex search.

Return type

MatchType

transform(training_data)

Transform training data.

Parameters

training_data (pd.DataFrame) – Training data.

Returns

Transformed training data.

Return type

pd.DataFrame

utility(input, _)

An abstract method that describes the plugin’s functionality.

Parameters
  • input (Input) – The workflow’s input.

  • output (Output) – The workflow’s output.

Returns

The value returned by the plugin.

Return type

Any