dialogy.plugins.text.list_search_plugin package
Module contents
This module needs a refactor: all search strategies are currently bundled as methods instead of separate SearchStrategyClasses.
Within dialogy, we extract entities using Duckling, pattern lists, and spaCy. We could ship individual plugins, but the main difference between them is the configuration of each tool or service. The other difference is the intermediate structure that the DucklingPlugin expects. We need to keep that structure from affecting the other entities, so that their from_dict(...) methods stay pristine and involve no shape hacking.
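The refactor described above could look roughly like the following. This is a minimal sketch and not dialogy code: `SearchStrategy`, `KeywordStrategy`, `RegexStrategy`, `run_strategies`, and the `Match` tuple are all hypothetical names chosen for illustration, and only the standard library is used.

```python
import re
from typing import Dict, List, Protocol, Tuple

# Hypothetical result shape: (entity value, matched text, (start, end)).
Match = Tuple[str, str, Tuple[int, int]]


class SearchStrategy(Protocol):
    """Anything with a `search` method can act as a strategy."""

    def search(self, transcript: str) -> List[Match]: ...


class KeywordStrategy:
    """Case-insensitive substring lookup for each keyword."""

    def __init__(self, keywords: Dict[str, List[str]]) -> None:
        self.keywords = keywords

    def search(self, transcript: str) -> List[Match]:
        lowered = transcript.lower()
        matches: List[Match] = []
        for value, words in self.keywords.items():
            for word in words:
                start = lowered.find(word.lower())
                while start != -1:
                    end = start + len(word)
                    matches.append((value, transcript[start:end], (start, end)))
                    start = lowered.find(word.lower(), end)
        return matches


class RegexStrategy:
    """Regex lookup; patterns are compiled once at construction."""

    def __init__(self, candidates: Dict[str, List[str]]) -> None:
        self.compiled = {
            value: [re.compile(p, re.IGNORECASE) for p in patterns]
            for value, patterns in candidates.items()
        }

    def search(self, transcript: str) -> List[Match]:
        return [
            (value, m.group(0), m.span())
            for value, patterns in self.compiled.items()
            for pattern in patterns
            for m in pattern.finditer(transcript)
        ]


def run_strategies(transcript: str, strategies: List[SearchStrategy]) -> List[Match]:
    """Run every strategy and concatenate the results in order."""
    results: List[Match] = []
    for strategy in strategies:
        results.extend(strategy.search(transcript))
    return results


strategies = [
    KeywordStrategy({"mumbai": ["bombay", "mumbai"]}),
    RegexStrategy({"pnr": [r"\b\d{10}\b"]}),
]
found = run_strategies("Bombay flight, PNR 1234567890", strategies)
```

With this layout, each strategy owns its configuration and the caller only depends on the `search` method, so adding a new strategy would not require changing the plugin class itself.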
- class ListSearchPlugin(fuzzy_dp_config, threshold=None, dest=None, guards=None, input_column='alternatives', output_column=None, use_transform=True, flags=RegexFlag.None, debug=False, fuzzy_threshold=0.1)
Bases: dialogy.base.entity_extractor.EntityScoringMixin, dialogy.base.plugin.Plugin
A Plugin for extracting entities using spacy or a list of regex patterns.
- Parameters
style (Optional[str]) – One of [“regex”, “spacy”].
candidates (Optional[Dict[str, List[str]]]) – Required if style is “regex”. A dict that maps entity values to their patterns.
spacy_nlp (Any) – Required if style is “spacy”. A spaCy model.
labels (Optional[List[str]]) – Required if style is “spacy”. Use this to extract only a few labels out of all the available labels.
debug (bool) – A flag to enable debugging on the plugin methods.
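To make the shape of `candidates` concrete, here is a minimal sketch of a pattern-list search using only the standard library `re` module. It does not call dialogy: `search_candidates` and the example patterns are hypothetical, and the default flags are an assumption made for this example.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical result shape: (entity value, matched text, (start, end)).
Match = Tuple[str, str, Tuple[int, int]]


def search_candidates(
    transcript: str,
    candidates: Dict[str, List[str]],
    flags: int = re.IGNORECASE | re.UNICODE,  # assumed defaults for this sketch
) -> List[Match]:
    """Return every pattern match in `transcript`, labelled with its entity value."""
    matches: List[Match] = []
    for value, patterns in candidates.items():
        for pattern in patterns:
            for m in re.finditer(pattern, transcript, flags):
                matches.append((value, m.group(0), m.span()))
    return matches


# Each key is an entity value; each list holds the patterns that map to it.
candidates = {
    "new_delhi": [r"\bdelhi\b", r"\bnew delhi\b"],
    "mumbai": [r"\bmumbai\b", r"\bbombay\b"],
}
results = search_candidates("I want to fly from Bombay to Delhi", candidates)
```

Here `results` holds one match per entity value, in the order the values appear in `candidates`, each with the matched text and its character span.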
- dp_search(query, nlp, entity_type='', entity_patterns=None, match_dict=None)
- Return type
Tuple[str, str, str, Tuple[int, int], float]
- fuzzy_init()
Initialize the parameters for fuzzy DP search with their values.
- Return type
None
- get_entities(transcripts, lang)
Parse entities using regex and spaCy NER.
- Parameters
transcripts (List[str]) – A list of strings within which to search for entities.
- Returns
List of entities from regex matches or spacy ner.
- Return type
List[KeywordEntity]
- get_fuzzy_dp_search(transcript, lang='')
Search for an entity in the transcript within a defined list search space.
- Parameters
transcripts (List[str]) – A list of transcripts.
lang (str) – Language code of the transcript.
- Returns
Token matches with the transcript.
- Return type
List[MatchType]
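The general idea of fuzzy searching a transcript against a list of known values can be sketched as below. This is not the plugin's algorithm: it swaps in `difflib.SequenceMatcher` from the standard library, and `fuzzy_search`, the `FuzzyMatch` tuple, and the similarity `threshold` are hypothetical and unrelated to the `fuzzy_threshold` parameter above.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

# Hypothetical result shape: (entity value, matched text, (start, end), score).
FuzzyMatch = Tuple[str, str, Tuple[int, int], float]


def fuzzy_search(
    transcript: str,
    search_space: List[str],
    threshold: float = 0.8,  # minimum similarity ratio, assumed for this sketch
) -> Optional[FuzzyMatch]:
    """Return the best-scoring fuzzy match, or None if nothing reaches the threshold."""
    # Record every whitespace-separated token with its character span.
    tokens: List[Tuple[str, int, int]] = []
    pos = 0
    for word in transcript.split():
        start = transcript.index(word, pos)
        end = start + len(word)
        tokens.append((word, start, end))
        pos = end

    best: Optional[FuzzyMatch] = None
    for candidate in search_space:
        width = len(candidate.split())
        # Slide a window with as many tokens as the candidate has words.
        for i in range(len(tokens) - width + 1):
            window = tokens[i : i + width]
            start, end = window[0][1], window[-1][2]
            text = transcript[start:end]
            score = SequenceMatcher(None, candidate.lower(), text.lower()).ratio()
            if score >= threshold and (best is None or score > best[3]):
                best = (candidate, text, (start, end), score)
    return best


match = fuzzy_search("book a ticket to new dehli please", ["new delhi", "mumbai"])
```

In this run the misspelled "new dehli" is matched to the value "new delhi" along with its character span in the transcript, while a transcript with no close match returns `None`.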