dialogy.plugins.text.calibration package

Submodules

dialogy.plugins.text.calibration.xgb module

Trains a calibration model. This contains two models:

- Vectorizer: TfIdf
- Classifier: XGBoostRegressor

class CalibrationModel(threshold, dest=None, guards=None, debug=False, input_column='alternatives', output_column='alternatives', use_transform=False, model_name='calibration.pkl')

Bases: dialogy.base.plugin.Plugin

This plugin provides a calibration model that sits between ASR and SLU. It trains a model that learns to classify alternatives from their text and their acoustic-model (AM) and language-model (LM) scores. Bad alternatives are removed both before training SLU and during inference.

filter_asr_output(utterances)

Filters outputs from ASR based on calibration model prediction.

Parameters: asr_output – Output dictionary from ASR. Should have an alternatives key.
Returns: Filtered alternatives, in the same format as input.
Return type: Dict[str, Any]
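The filtering step can be illustrated with a minimal sketch. It is not the plugin's implementation: the helper filter_alternatives, the toy_score stand-in for the trained model, the am_score and lm_score keys, and the convention that a lower score is better are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List

Alternative = Dict[str, Any]
Utterance = List[Alternative]


def filter_alternatives(
    utterances: List[Utterance],
    score: Callable[[Alternative], float],
    threshold: float,
) -> List[Utterance]:
    """Keep alternatives whose score does not exceed the threshold.

    Assumption: the score is an error estimate, so lower is better.
    The input is not modified; a new nested list is returned.
    """
    return [
        [alt for alt in utterance if score(alt) <= threshold]
        for utterance in utterances
    ]


def toy_score(alt: Alternative) -> float:
    """Hypothetical stand-in for a trained calibration model."""
    return alt["lm_score"] / 10.0


# One utterance with two alternatives (hypothetical keys and values).
utterances = [[
    {"transcript": "yes", "am_score": -100.0, "lm_score": 5.0},
    {"transcript": "yes s", "am_score": -300.0, "lm_score": 40.0},
]]

# Only the first alternative scores at or below the threshold.
filtered = filter_alternatives(utterances, toy_score, threshold=1.0)
```

Passing the scorer as a callable keeps the sketch independent of any particular model; in the plugin, the prediction presumably comes from the trained pipeline.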

inference(transcripts, utterances)

Return type: List[str]

predict(alternatives)

Return type: Any

save(fname)

Return type: None

train(df)

Trains the calibration pipeline.

Parameters

df (pd.DataFrame) – Dataframe to train on. Should be a valid transcription tagging job.
model_name (str) – Saves the pipeline as {model_name}.pkl

Return type

None

transform(training_data)

Transform data for a plugin in the workflow.

Return type: DataFrame

utility(input, _)

An abstract method that describes the plugin’s functionality.

Parameters

input (Input) – The workflow’s input.
output (Output) – The workflow’s output.

Returns

The value returned by the plugin.

Return type

Any

validate(df)

Returns whether df is a valid transcription tagging job; returns False for intent tagging jobs. Example: ‘{“text”: “I want to change and set my <INAUDIBLE>”, “type”: “TRANSCRIPT”}’

Sharp bits:

- All rows in df should have the same format. Only the first row is considered for sanity checks.

Parameters: df (pd.DataFrame) – Input dataframe.
Returns: True if the dataframe is valid for training the calibration model, False otherwise.
Return type: bool
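A first-row check of this kind might look like the sketch below. It uses a plain list of dicts in place of a dataframe, and the helper name is_transcription_job and the column name "tag" are hypothetical; only the JSON shape with a "type" of "TRANSCRIPT" comes from the example above.

```python
import json
from typing import Any, Dict, List


def is_transcription_job(rows: List[Dict[str, Any]], tag_column: str = "tag") -> bool:
    """Return True if the first row holds a transcription tag.

    Only the first row is inspected, so all rows are assumed to
    share the same format. The column name is an assumption.
    """
    if not rows:
        return False
    try:
        tag = json.loads(rows[0][tag_column])
    except (KeyError, TypeError, json.JSONDecodeError):
        # Missing column, non-string value, or malformed JSON.
        return False
    return isinstance(tag, dict) and tag.get("type") == "TRANSCRIPT"


transcription_rows = [
    {"tag": '{"text": "I want to change and set my <INAUDIBLE>", "type": "TRANSCRIPT"}'}
]
intent_rows = [{"tag": '{"type": "INTENT"}'}]

is_valid = is_transcription_job(transcription_rows)
is_intent_valid = is_transcription_job(intent_rows)
```

Because only the first row is read, a file that mixes tag formats would pass or fail based on that row alone, which is the caveat noted under "Sharp bits".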

class FeatureExtractor

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

features(alternatives)

Return type: List[List[float]]

fit(df, y=None)

Return type: Any

transform(df)

Return type: Tuple[Any, Any]
