dialogy.utils package

Submodules

dialogy.utils.datetime module

dt2timestamp(date_time)[source]

Converts a python datetime object to unix-timestamp.

Parameters

date_time (datetime) – An instance of datetime.

Returns

Unix timestamp integer.

Return type

int
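
A minimal standard-library sketch of this kind of conversion. The unit returned by dt2timestamp is not stated here, so the hypothetical helper to_unix_ms below assumes milliseconds (in line with the 13-digit timestamps that is_unix_ts expects), and treating naive datetimes as UTC is likewise an assumption.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_ms(date_time: datetime) -> int:
    """Hypothetical stand-in: datetime -> integer milliseconds since the epoch."""
    if date_time.tzinfo is None:
        # Assumption: a naive datetime is interpreted as UTC, not local time.
        date_time = date_time.replace(tzinfo=timezone.utc)
    # Dividing one timedelta by another with // yields an exact integer.
    return (date_time - EPOCH) // timedelta(milliseconds=1)


print(to_unix_ms(datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc)))  # 1000
```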

is_unix_ts(ts)[source]

Check if the input is a unix timestamp.

Parameters

ts (int) – A unix timestamp (13-digit).

Returns

True if ts is a unix timestamp, else False.

Return type

bool
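
The exact criteria used by is_unix_ts are not shown here. As a rough sketch, assuming the check is "an integer with 13 decimal digits", a hypothetical equivalent named looks_like_unix_ms could be written as:

```python
def looks_like_unix_ms(ts: object) -> bool:
    """Hypothetical check: an int (but not a bool) with exactly 13 decimal digits."""
    if isinstance(ts, bool) or not isinstance(ts, int):
        return False
    # 13 digits means 10**12 <= ts <= 10**13 - 1.
    return 10**12 <= ts < 10**13


print(looks_like_unix_ms(1644241599537))  # True
print(looks_like_unix_ms(1644241599))     # False (10 digits, i.e. seconds)
```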

make_unix_ts(tz='UTC')[source]

Convert date in ISO 8601 format to unix ms timestamp.

In [1]: from dialogy.utils.datetime import make_unix_ts

In [2]: ts = make_unix_ts("Asia/Kolkata")("2022-02-07T19:39:39.537827")

In [3]: ts == 1644241599537
Out[3]: True
Parameters

tz (Optional[str], optional) – A timezone string, defaults to “UTC”

Returns

A callable that converts a date in ISO 8601 format to unix ms timestamp.

Return type

Callable[[str], int]
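
The "configure once, then call" pattern above can be sketched with a closure. This is not the library's implementation: the hypothetical make_ms_converter below takes a fixed UTC offset instead of a timezone name so that it runs without a timezone database, and it only illustrates how a factory can return a Callable[[str], int].

```python
from datetime import datetime, timedelta, timezone
from typing import Callable

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def make_ms_converter(utc_offset: timedelta = timedelta(0)) -> Callable[[str], int]:
    """Hypothetical analogue of make_unix_ts that takes a fixed UTC offset."""
    tz = timezone(utc_offset)

    def convert(iso_date: str) -> int:
        parsed = datetime.fromisoformat(iso_date)
        if parsed.tzinfo is None:
            # Naive input is interpreted in the configured offset.
            parsed = parsed.replace(tzinfo=tz)
        # Exact integer milliseconds since the Unix epoch.
        return (parsed - EPOCH) // timedelta(milliseconds=1)

    return convert


to_ms_utc = make_ms_converter()
print(to_ms_utc("1970-01-01T00:00:01.500000"))  # 1500
```

Returning a function keeps the timezone choice out of every call site, which is convenient when the converter is passed to map or stored in a configuration object.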

unix_ts_to_datetime(reference_time, timezone='UTC')[source]
Return type

datetime

dialogy.utils.file_handler module

create_timestamps_path(directory, file_name, timestamp=None, dry_run=False)[source]
Return type

str

load_file(file_path=None, mode='r', loader=None)[source]

Safely load a file.

Parameters
  • file_path (str) – The path to the file to load.

  • mode (str, optional) – The mode to use when opening the file, defaults to “r”

Returns

The file contents.

Return type

Any

read_from_json(params, dir_path, file_name)[source]
Return type

Dict[str, Any]

save_file(file_path=None, content=None, mode='w', encoding='utf-8', newline='\\n', writer=None)[source]

Save a file.

Parameters
  • file_path (str) – The path to the file to save.

  • content (Any) – The content to save.

  • mode (str, optional) – The mode to use when opening the file, defaults to “w”

  • encoding (str, optional) – The encoding to use when writing the file, defaults to “utf-8”

  • newline (str, optional) – The newline character to use when writing the file, defaults to “\n”

Return type

None
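
How the loader and writer arguments are invoked is not documented here. The sketch below assumes they are callables that receive the open file handle (for example json.load and json.dump); save_text and load_text are hypothetical stand-ins, not the library functions.

```python
import json
import os
import tempfile
from typing import Any, Callable, Optional


def save_text(file_path: str, content: Any,
              writer: Optional[Callable[[Any, Any], None]] = None) -> None:
    """Hypothetical stand-in for save_file; writer is assumed to take (content, handle)."""
    with open(file_path, "w", encoding="utf-8", newline="\n") as handle:
        if writer is None:
            handle.write(content)
        else:
            writer(content, handle)


def load_text(file_path: str, loader: Optional[Callable[[Any], Any]] = None) -> Any:
    """Hypothetical stand-in for load_file; loader is assumed to take the handle."""
    with open(file_path, "r", encoding="utf-8") as handle:
        return handle.read() if loader is None else loader(handle)


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "config.json")
    save_text(path, {"lang": "en"}, writer=json.dump)
    restored = load_text(path, loader=json.load)

print(restored)  # {'lang': 'en'}
```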

save_to_json(params, dir_path, file_name)[source]
Return type

None

dialogy.utils.logger module

Module provides access to logger.

Use this sparingly; prefer raising specific exceptions instead.

dialogy.utils.misc module

Module provides utility functions for entities.

Import functions:
  • traverse_dict

  • validate_type

traverse_dict(obj, properties)[source]

Traverse a dictionary for a given list of properties.

This is useful for traversing a deeply nested dictionary. Instead of recursion, reduce is used to step through the path one property at a time. Missing keys lead to a KeyError; as the session below shows, an out-of-range list index raises an IndexError instead.

In [1]: from dialogy.utils import traverse_dict

In [2]: input_ = {
   ...:     "planets": {
   ...:         "mars": [{
   ...:             "name": "",
   ...:             "languages": [{
   ...:                 "beep": {"speakers": 11},
   ...:             }, {
   ...:                 "bop": {"speakers": 30},
   ...:             }]
   ...:         }]
   ...:     }
   ...: }
   ...: 

In [3]: traverse_dict(input_, ["planets", "mars", 0 , "languages", 1, "bop"])
Out[3]: {'speakers': 30}

# element with index 3 doesn't exist!
In [4]: traverse_dict(input_, ["planets", "mars", 0 , "languages", 3, "bop"])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Input In [4], in <cell line: 1>()
----> 1 traverse_dict(input_, ["planets", "mars", 0 , "languages", 3, "bop"])

File ~/Programs/pythoncode/dialogy/dialogy/utils/misc.py:52, in traverse_dict(obj, properties)
     13 """
     14 Traverse a dictionary for a given list of properties.
     15 
   (...)
     49 :raises TypeError: Properties don't describe a path due to possible type error.
     50 """
     51 try:
---> 52     return reduce(lambda o, k: o[k], properties, obj)
     53 except KeyError as key_error:
     54     raise KeyError(
     55         f"Missing property {key_error} in {obj}. Check the types. Failed for path {properties}"
     56     ) from key_error

File ~/Programs/pythoncode/dialogy/dialogy/utils/misc.py:52, in traverse_dict.<locals>.<lambda>(o, k)
     13 """
     14 Traverse a dictionary for a given list of properties.
     15 
   (...)
     49 :raises TypeError: Properties don't describe a path due to possible type error.
     50 """
     51 try:
---> 52     return reduce(lambda o, k: o[k], properties, obj)
     53 except KeyError as key_error:
     54     raise KeyError(
     55         f"Missing property {key_error} in {obj}. Check the types. Failed for path {properties}"
     56     ) from key_error

IndexError: list index out of range
Parameters
  • obj (Dict[Any, Any]) – The dict to traverse.

  • properties (List[int]) – Ordered list of keys and list indices that form the path to follow through the dict; the example above mixes string keys with integer indices.

Returns

A value within a deeply nested dict.

Return type

Any

Raises
  • KeyError – Missing property in the dictionary.

  • TypeError – Properties don’t describe a path due to possible type error.
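
The reduce call visible in the traceback above can be tried in isolation. The sketch below uses a hypothetical helper named walk and simplified sample data; it shows which built-in exception each kind of bad path produces when nothing is caught or re-raised.

```python
from functools import reduce
from typing import Any, List, Union


def walk(obj: Any, path: List[Union[str, int]]) -> Any:
    """Follow path through nested dicts and lists, one key or index at a time."""
    return reduce(lambda current, key: current[key], path, obj)


data = {
    "planets": {
        "mars": [{"languages": [{"beep": {"speakers": 11}}, {"bop": {"speakers": 30}}]}]
    }
}

print(walk(data, ["planets", "mars", 0, "languages", 1, "bop"]))  # {'speakers': 30}

# A missing key, an out-of-range index, and a string used as a list index.
for bad_path in (["planets", "venus"], ["planets", "mars", 3], ["planets", "mars", "first"]):
    try:
        walk(data, bad_path)
    except (KeyError, IndexError, TypeError) as error:
        print(type(error).__name__)
```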

validate_type(obj, obj_type)[source]

Raise TypeError on object type mismatch.

This is syntactic sugar for instance type checks.

The check is by exclusion of types. Wraps exception raising logic.

Parameters
  • obj (Any) – An object available for type assertion.

  • obj_type (Union[type, Tuple[type]]) – This must match the type of the object.

Raises

TypeError – If the type obj_type doesn’t match the type of obj.

Return type

None
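
A guard of this kind is typically a thin wrapper around isinstance. The following is a sketch under that assumption, with a hypothetical name (ensure_type) and an illustrative error message; the library's own message and checks may differ.

```python
from typing import Any, Tuple, Union


def ensure_type(obj: Any, obj_type: Union[type, Tuple[type, ...]]) -> None:
    """Hypothetical stand-in: raise TypeError unless obj is an instance of obj_type."""
    if not isinstance(obj, obj_type):
        raise TypeError(f"Expected {obj_type}, got {type(obj)} for {obj!r}")


ensure_type("hello", str)      # passes silently and returns None
ensure_type(3, (int, float))   # a tuple of types is accepted as well

try:
    ensure_type("3", int)
except TypeError as error:
    print(f"rejected: {error}")
```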

dialogy.utils.naive_lang_detect module

lang_detect_from_text(text)[source]
Return type

str

dialogy.utils.normalize_utterance module

This module was created in response to https://github.com/Vernacular-ai/dialogy/issues/9. It ships functions that assist in normalizing ASR output, which we refer to as Utterances.

dict_get(prop, obj)[source]

Get value of prop within obj.

This simple function exists to facilitate a partial function defined here.

Parameters
  • prop (str) – A property within a dict.

  • obj (Dict[str, Any]) – A dict.

Returns

Value of a property within a dict.

Return type

Any
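
Putting prop before obj is what makes partial application convenient: the key can be fixed once and the resulting function mapped over many dicts. A sketch with a hypothetical get_prop helper follows; whether dict_get itself uses obj[prop] or obj.get(prop) is not stated here, so plain indexing is an assumption.

```python
from functools import partial
from typing import Any, Dict


def get_prop(prop: str, obj: Dict[str, Any]) -> Any:
    """Hypothetical stand-in for dict_get (assumes plain indexing)."""
    return obj[prop]


# Fix the key once, then reuse the resulting one-argument function.
get_transcript = partial(get_prop, "transcript")

alternatives = [{"transcript": "this"}, {"transcript": "works"}]
print(list(map(get_transcript, alternatives)))  # ['this', 'works']
```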

get_best_transcript(transcripts)[source]

Select the best transcript from a list of transcripts. The best transcript is the first transcript given by ASR (20220803).

Parameters

transcripts (List[str]) – List of transcripts

Returns

A string containing the best transcript

Return type

str

is_each_element(type_, input_, transform=<function <lambda>>)[source]

Check if each element in a list is of a given type.

Parameters
  • type_ (Type) – Expected type of each element in input_, which is a list.

  • input_ (List[Any]) – A list.

  • transform (Callable[[Any], Any]) – We may apply some transforms to each element before making these checks. This is to check if a certain key in a Dict matches the expected type. In case this is not needed, leave the argument unset and an identity transform is applied. Defaults to lambda x:x.

Returns

True if every element in the list matches type_; False if any element fails the check.

Return type

bool
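
The role of transform is easiest to see with a small example. The sketch below is a hypothetical re-implementation (each_is), assuming the check is an isinstance test applied to the transformed element.

```python
from typing import Any, Callable, List


def each_is(type_: type, input_: List[Any],
            transform: Callable[[Any], Any] = lambda x: x) -> bool:
    """Hypothetical stand-in: True if every transformed element is an instance of type_."""
    return all(isinstance(transform(item), type_) for item in input_)


# Identity transform: check the elements themselves.
print(each_is(str, ["a", "b"]))  # True

# Custom transform: check the value stored under a key in each dict.
items = [{"transcript": "a"}, {"transcript": 1}]
print(each_is(str, items, transform=lambda d: d["transcript"]))  # False
```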

is_list(input_)[source]

Check whether the input is a list.

Parameters

input_ (Any) – Any arbitrary input.

Returns

True if input_ is a list, else False.

Return type

bool

is_list_of_string(maybe_utterance)[source]

Check input to be of List[str].

In [1]: from dialogy.utils.normalize_utterance import is_list_of_string

In [2]: is_list_of_string(["this", "works"])
Out[2]: True
Parameters

maybe_utterance (Any) – Arbitrary input.

Returns

True if maybe_utterance is a list of strings, else False.

Return type

bool

is_string(maybe_utterance)[source]

Check input’s type is str.

Parameters

maybe_utterance (Any) – Arbitrary type input.

Returns

True if maybe_utterance is a str, else False.

Return type

bool

is_unsqueezed_utterance(maybe_utterance, key='transcript')[source]

Check input to be of List[Dict].

In [1]: from dialogy.utils.normalize_utterance import is_unsqueezed_utterance

# 1. This fails
In [2]: is_unsqueezed_utterance([[{"transcript": "this"}, {"transcript": "works"}]])
Out[2]: False

# 2. key is configurable
In [3]: is_unsqueezed_utterance([{"text": "this"}, {"text": "works"}], key="text")
Out[3]: True
Parameters
  • maybe_utterance (Any) – Arbitrary type input.

  • key (str) – The key within which transcription string resides. Defaults to const.TRANSCRIPT.

Returns

True, if the input is of type List[Dict[str, Any]] else False.

Return type

bool

is_utterance(maybe_utterance, key='transcript')[source]

Check input to be of List[List[Dict]].

In [1]: from dialogy.utils.normalize_utterance import is_utterance

# 1. :code:`List[List[Dict[str, str]]]`
In [2]: is_utterance([[{"transcript": "this"}, {"transcript": "works"}]])
Out[2]: True

# 2. key is configurable
In [3]: is_utterance([[{"text": "this"}, {"text": "works"}]], key="text")
Out[3]: True

# 3. Hope for everything else... you have a mastercard.
# Or use this lib, works just fine 🍷.
In [4]: is_utterance([{"transcript": "this"}, {"transcript": "doesn't"}, {"transcript": "work"}])
Out[4]: False
Parameters
  • maybe_utterance (Any) – Arbitrary input.

  • key (str) – The key within which transcription string resides. Defaults to const.TRANSCRIPT.

Returns

True if the input is List[List[Dict[str, str]]], else False.

Return type

bool
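
The difference between the two shapes checked by is_unsqueezed_utterance and is_utterance is one level of list nesting. A sketch with hypothetical helpers (is_alternatives, is_nested_alternatives) follows; the library's real checks may be stricter or looser, for instance in how empty lists are treated.

```python
from typing import Any


def is_alternatives(value: Any, key: str = "transcript") -> bool:
    """Hypothetical check for List[Dict] where each dict holds a str under key."""
    return isinstance(value, list) and all(
        isinstance(item, dict) and isinstance(item.get(key), str) for item in value
    )


def is_nested_alternatives(value: Any, key: str = "transcript") -> bool:
    """Hypothetical check for List[List[Dict]]: every inner list passes is_alternatives."""
    return isinstance(value, list) and all(is_alternatives(item, key) for item in value)


flat = [{"transcript": "this"}, {"transcript": "works"}]
nested = [flat]

print(is_alternatives(flat), is_alternatives(nested))                # True False
print(is_nested_alternatives(nested), is_nested_alternatives(flat))  # True False
```

Note that all() is True for an empty list, so in this sketch an empty list passes both checks.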

normalize(maybe_utterance, key='transcript')[source]

Adapt various non-standard ASR alternative forms.

The output will be a list of strings since models will expect that.

In [1]: from dialogy.utils.normalize_utterance import normalize

# A popular case
In [2]: normalize([[{"transcript": "this"}, {"transcript": "works"}]])
Out[2]: ['this', 'works']

# A case with multiple utterances
In [3]: normalize([
   ...:     [{"transcript": "hello hello?", "transcript": "yellow yellow?"}],
   ...:     [{"transcript": "I wanted to check"}],
   ...:     [{"transcript": "if you have space for us?"}]
   ...: ])
Out[3]: ['yellow yellow? I wanted to check if you have space for us?']

In [4]: normalize([{"transcript": "I wanted to know umm hello?"}])
Out[4]: ['I wanted to know umm hello?']

In [5]: normalize(["I wanted to know umm hello?"])
Out[5]: ['I wanted to know umm hello?']

In [6]: normalize("I wanted to know umm hello?")
Out[6]: ['I wanted to know umm hello?']
Parameters
  • maybe_utterance (Any) – Arbitrary input.

  • key (str) – The key to look up in List[List[Dict[str, str]]] and List[Dict[str, str]] inputs.

Returns

A flattened list of strings parsed from various formats.

Return type

List[str]

Raises

TypeError – If maybe_utterance is none of the expected types.
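
The dispatch over input shapes can be sketched as follows. This is a hypothetical re-implementation (normalize_sketch) inferred only from the examples above; in particular, combining one alternative per utterance with itertools.product is an assumption that happens to reproduce those outputs, not a documented algorithm.

```python
from itertools import product
from typing import Any, List


def _has_text(item: Any, key: str) -> bool:
    return isinstance(item, dict) and isinstance(item.get(key), str)


def normalize_sketch(maybe_utterance: Any, key: str = "transcript") -> List[str]:
    """Hypothetical stand-in: flatten several ASR output shapes to List[str]."""
    if isinstance(maybe_utterance, str):
        return [maybe_utterance]
    if isinstance(maybe_utterance, list):
        if all(isinstance(item, str) for item in maybe_utterance):
            return list(maybe_utterance)
        if all(_has_text(item, key) for item in maybe_utterance):
            return [item[key] for item in maybe_utterance]
        if all(isinstance(item, list) and all(_has_text(alt, key) for alt in item)
               for item in maybe_utterance):
            # Assumption: pick one alternative per utterance and join with spaces.
            texts = [[alt[key] for alt in item] for item in maybe_utterance]
            return [" ".join(combo) for combo in product(*texts)]
    raise TypeError(f"Unexpected utterance format: {type(maybe_utterance)}")


print(normalize_sketch([[{"transcript": "this"}, {"transcript": "works"}]]))
print(normalize_sketch("I wanted to know umm hello?"))
```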

utterance2alternatives(utterances, key='transcript')[source]

Convert a list of utterances to a list of alternatives.

Return type

List[str]

dialogy.utils.temperature_scaling module

T_scaling(logits, temperature)[source]
Return type

Tensor

calc_bins(preds, labels_oneh)[source]
Return type

Any

fit_ts_parameter(logits_list, labels_list, lr=0.001, max_iter=10000, device=device(type='cuda'))[source]
Return type

float

get_metrics(preds, labels_oneh)[source]
Return type

Tuple[float, float]

save_reliability_graph(preds, labels_oneh, dir_path, prefix)[source]
Return type

None

Module contents