dialogy.plugins.text.canonicalization package

Module contents

class CanonicalizationPlugin(serializer=<function get_entity_type>, mask='MASK', mask_tokens=None, dest=None, guards=None, entity_column='entities', input_column='alternatives', output_column=None, use_transform=False, threshold=0.0, debug=False)

Bases: dialogy.base.plugin.Plugin

This plugin canonicalizes text.

mask_transcript(entities, transcripts)

Return type

List[str]
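No description is given for mask_transcript, so the following is only a minimal sketch of one way a method with this shape (entities and transcripts in, List[str] out) could work, not dialogy's implementation. It assumes each entity records its type, its character offsets, and the index of the transcript those offsets refer to. SketchEntity, sketch_serializer and sketch_mask_transcript are hypothetical names introduced for illustration, and the angle-bracket token format is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SketchEntity:
    # Hypothetical stand-in for an entity; not dialogy's entity class.
    entity_type: str
    start: int              # offset where the entity text begins
    end: int                # offset just past the entity text
    alternative_index: int  # which transcript the offsets refer to


def sketch_serializer(entity: SketchEntity) -> str:
    # Assumed token format: the entity type wrapped in angle brackets.
    return f"<{entity.entity_type}>"


def sketch_mask_transcript(
    entities: List[SketchEntity],
    transcripts: List[str],
    serializer: Callable[[SketchEntity], str] = sketch_serializer,
) -> List[str]:
    masked = list(transcripts)  # copy so the caller's list is untouched
    # Substitute right-to-left so earlier offsets stay valid after each edit.
    for entity in sorted(entities, key=lambda e: e.start, reverse=True):
        i = entity.alternative_index
        text = masked[i]
        masked[i] = text[: entity.start] + serializer(entity) + text[entity.end :]
    return masked


text = "book a table at 7 pm for 4 people"
time_start = text.index("7 pm")
count_start = text.index("4 people")
entities = [
    SketchEntity("time", time_start, time_start + len("7 pm"), 0),
    SketchEntity("number", count_start, count_start + 1, 0),
]
print(sketch_mask_transcript(entities, [text]))
# ['book a table at <time> for <number> people']
```

Processing entities from the highest offset down keeps the remaining offsets valid without recomputing them after each substitution.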

transform(training_data)

Transform data for a plugin in the workflow.

Return type

DataFrame

utility(input, output)

An abstract method that describes the plugin’s functionality.

Parameters
  • input (Input) – The workflow’s input.

  • output (Output) – The workflow’s output.

Returns

The value returned by the plugin.

Return type

Any

get_entity_type(entity)

Return type

str
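get_entity_type is the default value of the serializer parameter, which suggests that a serializer is any callable that takes an entity and returns a str. The sketch below illustrates that shape only. The entity_type attribute, the token formats, and the function names are assumptions made for the example, not dialogy's actual code.

```python
from types import SimpleNamespace
from typing import Any


def sketch_get_entity_type(entity: Any) -> str:
    # Assumption: the entity exposes an `entity_type` attribute and the
    # canonical token wraps it in angle brackets.
    return f"<{entity.entity_type}>"


def sketch_upper_serializer(entity: Any) -> str:
    # Hypothetical alternative with the same entity -> str shape.
    return f"[{entity.entity_type.upper()}]"


entity = SimpleNamespace(entity_type="time")  # stand-in, not a dialogy entity
print(sketch_get_entity_type(entity))   # <time>
print(sketch_upper_serializer(entity))  # [TIME]
```

Any function with this shape could, under the same assumption, be supplied in place of the default serializer.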