dialogy.utils package¶

Submodules¶

dialogy.utils.datetime module¶

dt2timestamp(date_time)[source]¶

Converts a Python datetime object to a Unix timestamp.

Parameters: date_time (datetime) – An instance of datetime.
Returns: Unix timestamp integer.
Return type: int
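A minimal sketch of this kind of conversion, assuming the result is expressed in milliseconds (consistent with the 13-digit timestamps used elsewhere in this module). The name dt2timestamp_sketch is hypothetical and is not part of dialogy.

```python
from datetime import datetime, timezone


def dt2timestamp_sketch(date_time: datetime) -> int:
    """Hypothetical stand-in: whole milliseconds since the Unix epoch."""
    # datetime.timestamp() returns seconds as a float. Naive datetimes are
    # interpreted in the local timezone, so an aware datetime is used below.
    return int(date_time.timestamp() * 1000)


moment = datetime(2022, 2, 7, 14, 9, 39, tzinfo=timezone.utc)
print(dt2timestamp_sketch(moment))  # 1644242979000
```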

is_unix_ts(ts)[source]¶

Check if the input is a unix timestamp.

Parameters: ts (int) – A unix timestamp (13-digit).
Returns: True if ts is a unix timestamp, else False.
Return type: bool
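A rough sketch of a check matching the parameter description above, assuming that "unix timestamp" here means a 13-digit integer count of milliseconds. The real function may validate differently; is_unix_ts_sketch is a hypothetical name.

```python
def is_unix_ts_sketch(ts) -> bool:
    """Hypothetical stand-in: accept only 13-digit integers."""
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(ts, bool) or not isinstance(ts, int):
        return False
    return len(str(abs(ts))) == 13


print(is_unix_ts_sketch(1644241599537))    # True
print(is_unix_ts_sketch(1644241599))       # False (seconds, 10 digits)
print(is_unix_ts_sketch("1644241599537"))  # False (not an int)
```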

make_unix_ts(tz='UTC')[source]¶

Convert date in ISO 8601 format to unix ms timestamp.

In [1]: from dialogy.utils.datetime import make_unix_ts

In [2]: ts = make_unix_ts("Asia/Kolkata")("2022-02-07T19:39:39.537827")

In [3]: ts == 1644241599537
Out[3]: True

Parameters: tz (Optional[str], optional) – A timezone string, defaults to “UTC”
Returns: A callable that converts a date in ISO 8601 format to unix ms timestamp.
Return type: Callable[[str], int]
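The two-step call shape (configure a timezone, then convert a date string) can be sketched with a closure. This is an illustration only, assuming Python 3.9+ and that naive inputs are local times in tz; make_unix_ts_sketch is a hypothetical name. Results for named zones depend on the timezone library and on how the offset is attached, so they may differ from the value in the session above.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable
from zoneinfo import ZoneInfo  # Python 3.9+

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def make_unix_ts_sketch(tz: str = "UTC") -> Callable[[str], int]:
    """Hypothetical stand-in: return a converter bound to one timezone."""
    # Resolve the timezone once; the returned closure reuses it.
    tzinfo = timezone.utc if tz == "UTC" else ZoneInfo(tz)

    def to_unix_ms(iso_date: str) -> int:
        parsed = datetime.fromisoformat(iso_date)
        if parsed.tzinfo is None:
            # Assumption: naive inputs are local times in `tz`.
            parsed = parsed.replace(tzinfo=tzinfo)
        # Integer division by one millisecond avoids float rounding.
        return (parsed - EPOCH) // timedelta(milliseconds=1)

    return to_unix_ms


to_ms = make_unix_ts_sketch()
print(to_ms("2022-02-07T14:09:39.537827"))  # 1644242979537
```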

unix_ts_to_datetime(reference_time, timezone='UTC')[source]¶

Return type: datetime

dialogy.utils.file_handler module¶

create_timestamps_path(directory, file_name, timestamp=None, dry_run=False)[source]¶

Return type: str

load_file(file_path=None, mode='r', loader=None)[source]¶

Safely load a file.

Parameters

file_path (str, optional) – The path to the file to load.
mode (str, optional) – The mode to use when opening the file, defaults to “r”

Returns

The file contents.

Return type

Any
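A minimal sketch of a loader-based file read, assuming that loader receives the open file handle and that the raw contents are returned when no loader is given. Both points are assumptions, and load_file_sketch is a hypothetical name.

```python
import json
import os
import tempfile
from typing import Any, Callable, Optional


def load_file_sketch(
    file_path: Optional[str] = None,
    mode: str = "r",
    loader: Optional[Callable[[Any], Any]] = None,
) -> Any:
    """Hypothetical stand-in: read a file, optionally through a loader."""
    if file_path is None:
        raise ValueError("file_path is required")
    # The context manager closes the handle even if the loader raises.
    with open(file_path, mode) as handle:
        return loader(handle) if loader else handle.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    with open(path, "w") as handle:
        handle.write('{"lang": "en"}')
    raw = load_file_sketch(path)                      # plain text
    parsed = load_file_sketch(path, loader=json.load) # parsed JSON

print(raw)     # {"lang": "en"}
print(parsed)  # {'lang': 'en'}
```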

read_from_json(params, dir_path, file_name)[source]¶

Return type: Dict[str, Any]

save_file(file_path=None, content=None, mode='w', encoding='utf-8', newline='\\n', writer=None)[source]¶

Save a file.

Parameters

file_path (str) – The path to the file to save.
content (Any) – The content to save.
mode (str, optional) – The mode to use when opening the file, defaults to “w”
encoding (str, optional) – The encoding to use when writing the file, defaults to “utf-8”
newline (str, optional) – The newline character to use when writing the file, defaults to “\n”

Return type: None
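A minimal sketch of a writer-based save, assuming that writer is called as writer(content, handle) (the same argument order as json.dump) and that content is written directly when no writer is given. save_file_sketch is a hypothetical name, and the sketch only covers text modes.

```python
import json
import os
import tempfile
from typing import Any, Callable, Optional


def save_file_sketch(
    file_path: Optional[str] = None,
    content: Any = None,
    mode: str = "w",
    encoding: str = "utf-8",
    newline: str = "\n",
    writer: Optional[Callable[[Any, Any], Any]] = None,
) -> None:
    """Hypothetical stand-in: write content, optionally through a writer."""
    if file_path is None:
        raise ValueError("file_path is required")
    with open(file_path, mode, encoding=encoding, newline=newline) as handle:
        if writer is not None:
            writer(content, handle)
        else:
            handle.write(content)


with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "notes.txt")
    json_path = os.path.join(tmp, "data.json")
    save_file_sketch(text_path, "hello\n")
    save_file_sketch(json_path, {"lang": "en"}, writer=json.dump)
    with open(text_path, encoding="utf-8") as handle:
        saved_text = handle.read()
    with open(json_path, encoding="utf-8") as handle:
        saved_json = json.load(handle)

print(repr(saved_text))  # 'hello\n'
print(saved_json)        # {'lang': 'en'}
```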

save_to_json(params, dir_path, file_name)[source]¶

Return type: None

dialogy.utils.logger module¶

Module provides access to logger.

This needs to be used sparingly, prefer to raise specific exceptions instead.

dialogy.utils.misc module¶

Module provides utility functions for entities.

Import functions:

traverse_dict
validate_type

traverse_dict(obj, properties)[source]¶

Traverse a dictionary for a given list of properties.

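This is useful for traversing a deeply nested dictionary. Instead of recursion, reduce walks the path one property at a time, indexing into the result of the previous step. Missing properties lead to KeyErrors, and an out-of-range list index raises an IndexError.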

In [1]: from dialogy.utils import traverse_dict

In [2]: input_ = {
   ...:     "planets": {
   ...:         "mars": [{
   ...:             "name": "",
   ...:             "languages": [{
   ...:                 "beep": {"speakers": 11},
   ...:             }, {
   ...:                 "bop": {"speakers": 30},
   ...:             }]
   ...:         }]
   ...:     }
   ...: }
   ...: 

In [3]: traverse_dict(input_, ["planets", "mars", 0 , "languages", 1, "bop"])
Out[3]: {'speakers': 30}

# element with index 3 doesn't exist!
In [4]: traverse_dict(input_, ["planets", "mars", 0 , "languages", 3, "bop"])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Input In [4], in <cell line: 1>()
----> 1 traverse_dict(input_, ["planets", "mars", 0 , "languages", 3, "bop"])

File ~/Programs/pythoncode/dialogy/dialogy/utils/misc.py:52, in traverse_dict(obj, properties)
     13 """
     14 Traverse a dictionary for a given list of properties.
     15 
   (...)
     49 :raises TypeError: Properties don't describe a path due to possible type error.
     50 """
     51 try:
---> 52     return reduce(lambda o, k: o[k], properties, obj)
     53 except KeyError as key_error:
     54     raise KeyError(
     55         f"Missing property {key_error} in {obj}. Check the types. Failed for path {properties}"
     56     ) from key_error

File ~/Programs/pythoncode/dialogy/dialogy/utils/misc.py:52, in traverse_dict.<locals>.<lambda>(o, k)
     13 """
     14 Traverse a dictionary for a given list of properties.
     15 
   (...)
     49 :raises TypeError: Properties don't describe a path due to possible type error.
     50 """
     51 try:
---> 52     return reduce(lambda o, k: o[k], properties, obj)
     53 except KeyError as key_error:
     54     raise KeyError(
     55         f"Missing property {key_error} in {obj}. Check the types. Failed for path {properties}"
     56     ) from key_error

IndexError: list index out of range

Parameters

obj (Dict[Any, Any]) – The dict to traverse.
properties (List[int]) – List of properties to be parsed as a path to be navigated in the dict.

Returns

A value within a deeply nested dict.

Return type

Any

Raises

KeyError – Missing property in the dictionary.
TypeError – Properties don’t describe a path due to possible type error.
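The reduce-based walk described above can be sketched in a few lines. This is an illustration of the technique rather than the library's code: traverse_sketch is a hypothetical name, and it lets the underlying exceptions propagate unchanged.

```python
from functools import reduce
from typing import Any, Sequence


def traverse_sketch(obj: Any, path: Sequence[Any]) -> Any:
    """Hypothetical stand-in: follow `path` through nested dicts and lists."""
    # Each step indexes into the result of the previous step.
    return reduce(lambda current, key: current[key], path, obj)


data = {
    "planets": {
        "mars": [{
            "languages": [
                {"beep": {"speakers": 11}},
                {"bop": {"speakers": 30}},
            ]
        }]
    }
}

found = traverse_sketch(data, ["planets", "mars", 0, "languages", 1, "bop"])
print(found)  # {'speakers': 30}

errors = []
for bad_path in (
    ["planets", "venus"],         # missing dict key
    ["planets", "mars", 3],       # list index out of range
    ["planets", "mars", "name"],  # string index into a list
):
    try:
        traverse_sketch(data, bad_path)
    except (KeyError, IndexError, TypeError) as error:
        errors.append(type(error).__name__)

print(errors)  # ['KeyError', 'IndexError', 'TypeError']
```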

validate_type(obj, obj_type)[source]¶

Raise TypeError on object type mismatch.

This is syntactic sugar for instance type checks.

The check is by exclusion of types. Wraps exception raising logic.

Parameters

obj (Any) – An object available for type assertion.
obj_type (Union[type, Tuple[type]]) – This must match the type of the object.

Raises

TypeError – If the type obj_type doesn’t match the type of obj.

Return type: None
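A minimal sketch of such a guard, assuming it is built on isinstance (which accepts either a single type or a tuple of types). validate_type_sketch is a hypothetical name and the error message is illustrative.

```python
from typing import Any, Tuple, Union


def validate_type_sketch(obj: Any, obj_type: Union[type, Tuple[type, ...]]) -> None:
    """Hypothetical stand-in: raise TypeError unless obj matches obj_type."""
    if not isinstance(obj, obj_type):
        raise TypeError(f"Expected {obj_type}, got {type(obj)} for {obj!r}")


validate_type_sketch(3, int)             # passes silently
validate_type_sketch(3.5, (int, float))  # a tuple of types also works

try:
    validate_type_sketch("3", (int, float))
except TypeError as error:
    message = str(error)
    print("rejected:", message)
```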

dialogy.utils.naive_lang_detect module¶

lang_detect_from_text(text)[source]¶

Return type: str

dialogy.utils.normalize_utterance module¶

This module was created in response to https://github.com/Vernacular-ai/dialogy/issues/9. It provides functions that assist in normalizing ASR output; we refer to these outputs as Utterances.

dict_get(prop, obj)[source]¶

Get value of prop within obj.

This simple function exists to facilitate a partial function defined here.

Parameters

prop (str) – A property within a dict.
obj (Dict[str, Any]) – A dict.

Returns

Value of a property within a dict.

Return type

Any

get_best_transcript(transcripts)[source]¶

Select the best transcript from a list of transcripts. The best transcript is the first transcript given by ASR (20220803)

Parameters: transcripts (List[str]) – List of transcripts
Returns: A string containing the best transcript
Return type: str

is_each_element(type_, input_, transform=<function <lambda>>)[source]¶

Check if each element in a list is of a given type.

Parameters

type_ (Type) – Expected Type of each element in the input_ which is a list.
input_ (List[Any]) – A list.
transform (Callable[[Any], Any]) – We may apply some transforms to each element before making these checks. This is to check if a certain key in a Dict matches the expected type. In case this is not needed, leave the argument unset and an identity transform is applied. Defaults to lambda x:x.

Returns

True if every element in the list matches type_; False if any element fails the check.

Return type

bool
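A short sketch of the element check with an optional transform, assuming an isinstance test per element. is_each_element_sketch is a hypothetical name.

```python
from typing import Any, Callable, List


def is_each_element_sketch(
    type_: type,
    input_: List[Any],
    transform: Callable[[Any], Any] = lambda x: x,
) -> bool:
    """Hypothetical stand-in: check every (transformed) element's type."""
    return all(isinstance(transform(item), type_) for item in input_)


print(is_each_element_sketch(str, ["this", "works"]))  # True
print(is_each_element_sketch(str, ["this", 1]))        # False

# The transform lets the check look inside each element, e.g. at a dict key.
alternatives = [{"transcript": "this"}, {"transcript": "works"}]
print(is_each_element_sketch(str, alternatives, transform=lambda d: d["transcript"]))  # True
```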

is_list(input_)[source]¶

Check if the input is a list.

Parameters: input_ (Any) – Any arbitrary input
Returns: True if input_ is a list, else False
Return type: bool

is_list_of_string(maybe_utterance)[source]¶

Check input to be of List[str].

In [1]: from dialogy.utils.normalize_utterance import is_list_of_string

In [2]: is_list_of_string(["this", "works"])
Out[2]: True

Parameters: maybe_utterance (Any) – Arbitrary input.
Returns: True if maybe_utterance is a list of strings, else False.
Return type: bool

is_string(maybe_utterance)[source]¶

Check input’s type is str.

Parameters: maybe_utterance (Any) – Arbitrary type input.
Returns: True if maybe_utterance is a str, else False.
Return type: bool

is_unsqueezed_utterance(maybe_utterance, key='transcript')[source]¶

Check input to be of List[Dict].

In [1]: from dialogy.utils.normalize_utterance import is_unsqueezed_utterance

# 1. This fails
In [2]: is_unsqueezed_utterance([[{"transcript": "this"}, {"transcript": "works"}]])
Out[2]: False

# 2. key is configurable
In [3]: is_unsqueezed_utterance([{"text": "this"}, {"text": "works"}], key="text")
Out[3]: True

Parameters

maybe_utterance (Any) – Arbitrary type input.
key (str) – The key within which transcription string resides. Defaults to const.TRANSCRIPT.

Returns

True, if the input is of type List[Dict[str, Any]] else False.

Return type

bool

is_utterance(maybe_utterance, key='transcript')[source]¶

Check input to be of List[List[Dict]].

In [1]: from dialogy.utils.normalize_utterance import is_utterance

# 1. :code:`List[List[Dict[str, str]]]`
In [2]: is_utterance([[{"transcript": "this"}, {"transcript": "works"}]])
Out[2]: True

# 2. key is configurable
In [3]: is_utterance([[{"text": "this"}, {"text": "works"}]], key="text")
Out[3]: True

# 3. Hope for everything else... you have a mastercard.
# Or use this lib, works just fine 🍷.
In [4]: is_utterance([{"transcript": "this"}, {"transcript": "doesn't"}, {"transcript": "work"}])
Out[4]: False

Parameters

maybe_utterance (Any) – Arbitrary input.
key (str) – The key within which transcription string resides. Defaults to const.TRANSCRIPT.

Returns

True if the input is List[List[Dict[str, str]]], else False.

Return type

bool
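The nested shape check can be sketched as two levels of type tests. This is an assumption about how such a check might be written, not the library's implementation; is_utterance_sketch is a hypothetical name, and an empty list passes vacuously in this sketch.

```python
from typing import Any


def is_utterance_sketch(maybe_utterance: Any, key: str = "transcript") -> bool:
    """Hypothetical stand-in: True for List[List[Dict]] with str values at key."""
    if not isinstance(maybe_utterance, list):
        return False
    for alternatives in maybe_utterance:
        # Each utterance must itself be a list of alternatives.
        if not isinstance(alternatives, list):
            return False
        for alternative in alternatives:
            if not isinstance(alternative, dict):
                return False
            if not isinstance(alternative.get(key), str):
                return False
    return True


print(is_utterance_sketch([[{"transcript": "this"}, {"transcript": "works"}]]))  # True
print(is_utterance_sketch([[{"text": "this"}]], key="text"))                     # True
print(is_utterance_sketch([{"transcript": "this"}, {"transcript": "fails"}]))    # False
```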

normalize(maybe_utterance, key='transcript')[source]¶

Adapt various non-standard ASR alternative forms.

The output will be a list of strings since models will expect that.

In [1]: from dialogy.utils.normalize_utterance import normalize

# A popular case
In [2]: normalize([[{"transcript": "this"}, {"transcript": "works"}]])
Out[2]: ['this', 'works']

# A case with multiple utterances
In [3]: normalize([
   ...:     [{"transcript": "hello hello?", "transcript": "yellow yellow?"}],
   ...:     [{"transcript": "I wanted to check"}],
   ...:     [{"transcript": "if you have space for us?"}]
   ...: ])
Out[3]: ['yellow yellow? I wanted to check if you have space for us?']

In [4]: normalize([{"transcript": "I wanted to know umm hello?"}])
Out[4]: ['I wanted to know umm hello?']

In [5]: normalize(["I wanted to know umm hello?"])
Out[5]: ['I wanted to know umm hello?']

In [6]: normalize("I wanted to know umm hello?")
Out[6]: ['I wanted to know umm hello?']

Parameters

maybe_utterance (Any) – Arbitrary input.
key (str) – The key to look up in List[List[Dict[str, str]]] and List[Dict[str, str]] inputs.

Returns

A flattened list of strings parsed from various formats.

Return type

List[str]

Raises

TypeError – If maybe_utterance is none of the expected types.
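A sketch of how the documented input forms could be dispatched. The nested case assumes that alternatives are combined across utterances as a Cartesian product and joined with spaces; that behavior is inferred from the sample outputs above, not confirmed from the source. normalize_sketch is a hypothetical name, and rejecting an empty list is a choice made only for this sketch.

```python
from itertools import product
from typing import Any, List


def normalize_sketch(maybe_utterance: Any, key: str = "transcript") -> List[str]:
    """Hypothetical stand-in: flatten several ASR output shapes to List[str]."""
    if isinstance(maybe_utterance, str):
        return [maybe_utterance]
    if isinstance(maybe_utterance, list) and maybe_utterance:
        if all(isinstance(item, str) for item in maybe_utterance):
            return list(maybe_utterance)
        if all(isinstance(item, dict) for item in maybe_utterance):
            return [item[key] for item in maybe_utterance]
        if all(isinstance(item, list) for item in maybe_utterance):
            # One list of transcripts per utterance, then every combination.
            per_utterance = [[alt[key] for alt in alts] for alts in maybe_utterance]
            return [" ".join(combo) for combo in product(*per_utterance)]
    raise TypeError(f"Unsupported utterance form: {type(maybe_utterance)}")


print(normalize_sketch([[{"transcript": "this"}, {"transcript": "works"}]]))
# ['this', 'works']
print(normalize_sketch([
    [{"transcript": "yellow yellow?"}],
    [{"transcript": "I wanted to check"}],
    [{"transcript": "if you have space for us?"}],
]))
# ['yellow yellow? I wanted to check if you have space for us?']
print(normalize_sketch("I wanted to know umm hello?"))
# ['I wanted to know umm hello?']
```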

utterance2alternatives(utterances, key='transcript')[source]¶

Convert a list of utterances to a list of alternatives.

Return type: List[str]

dialogy.utils.temperature_scaling module¶

T_scaling(logits, temperature)[source]¶

Return type: Tensor

calc_bins(preds, labels_oneh)[source]¶

Return type: Any

fit_ts_parameter(logits_list, labels_list, lr=0.001, max_iter=10000, device=device(type='cuda'))[source]¶

Return type: float

get_metrics(preds, labels_oneh)[source]¶

Return type: Tuple[float, float]

save_reliability_graph(preds, labels_oneh, dir_path, prefix)[source]¶

Return type: None

Module contents¶