skit_pipelines.pipelines.generate_and_tag_conversations package

Module contents

generate_and_tag_conversations(*, situations: str = '', scenario: str = '', scenario_category: str = '', s3_links_to_prompts: str = '', llm_trainer_repo_name: str = 'LLMtrainer', llm_trainer_repo_branch: str = 'main', model: str = 'gpt-4', n_iter: int = 1, n_choice: int = 2, temperature: float = 0.99, client_id: str, template_id: str, labelstudio_project_id: str, data_label: str = '', notify: str = '', channel: str = '', slack_thread: str = '')[source]

A pipeline to generate and tag conversations given a situation

Example payload to invoke via slack integrations:

A minimal example:

@charon run generate_and_tag_conversations

{   "situations" : "The user disputes the debt, so the agent transfers the call to the agent :: The user cannot pay any amount as they have a difficult situation, so the agent hangs up the call. ",
    "scenario" : "Test scenario",
    "scenario_category" : "Test scenario category",
    "llm_trainer_repo_branch" : "refactor-data-gen-script",
    "client_id" : "85",
    "template_id" : "0",
    "labelstudio_project_id" : "95",
    "s3_links_to_prompts": "s3://kubeflow-us-cluster/pipeline_uploads/prompt/test_prompt.txt",
    "data_label" : "UAT"
}
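
Hand-writing the JSON payload is error-prone; a small sketch (illustrative only, not a client library call) shows how the minimal payload above could be built and serialized in Python so quoting is always valid:

```python
import json

# Build the minimal example payload as a dict; keys mirror the pipeline
# signature above. Serializing with json.dumps guarantees valid JSON.
payload = {
    "situations": (
        "The user disputes the debt, so the agent transfers the call to the agent"
        " :: The user cannot pay any amount as they have a difficult situation,"
        " so the agent hangs up the call."
    ),
    "scenario": "Test scenario",
    "scenario_category": "Test scenario category",
    "llm_trainer_repo_branch": "refactor-data-gen-script",
    "client_id": "85",
    "template_id": "0",
    "labelstudio_project_id": "95",
    "s3_links_to_prompts": "s3://kubeflow-us-cluster/pipeline_uploads/prompt/test_prompt.txt",
    "data_label": "UAT",
}

serialized = json.dumps(payload, indent=4)
print(serialized)
```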

An example with all available parameters:

@charon run generate_and_tag_conversations

{   "situations" : "The user disputes the debt, so the agent transfers the call to the agent :: The user cannot pay any amount as they have a difficult situation, so the agent hangs up the call. ",
    "scenario" : "Test scenario",
    "scenario_category" : "Test scenario category",
    "s3_links_to_prompts": "s3://kubeflow-us-cluster/pipeline_uploads/prompt/test_prompt.txt",
    "llm_trainer_repo_name" : "LLMtrainer",
    "llm_trainer_repo_branch" : "refactor-data-gen-script",
    "model" : "gpt-4",
    "n_iter" : 1,
    "n_choice" : 2,
    "temperature" : 0.99,
    "client_id" : "85",
    "template_id" : "0",
    "labelstudio_project_id" : "95",
    "data_label" : "UAT",
    "notify" : "",
    "channel" : "",
    "slack_thread" : ""
}
Parameters
  • situations (str, optional) – The situations for generating the conversations; use the delimiter :: to pass multiple situations

  • scenario (str, optional) – The scenario linked to the situation

  • scenario_category (str, optional) – The scenario's category

  • prompt (str) – Prompt to the model for data generation

  • s3_links_to_prompts (str) – S3 links to the prompt to the model for data generation

  • output_dir (str) – The output directory where the generated conversations are stored

  • filename (str) – Acts as a prefix to the default naming used

  • llm_trainer_repo_name (str) – The conversation generation repo name on Github

  • llm_trainer_repo_branch (str, optional) – The branch name to use in the conversation generation repo, defaults to main

  • model (str, optional) – The model to be used for generating data

  • n_iter (int) – Number of times to iterate over the scenarios list to generate conversations

  • n_choice (int) – Number of conversations generated at a time from a scenario

  • temperature (float) – Sampling temperature for the model

  • client_id (str) – Id of the client for which data is being generated

  • template_id (str) – Template id for which data is being generated

  • notify (str, optional) – Whether to send a slack notification, defaults to “”

  • channel (str, optional) – The slack channel to send the notification, defaults to “”

  • slack_thread (str, optional) – The slack thread to send the notification, defaults to “”
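
The :: delimiter in situations can be illustrated with a minimal sketch (this split logic is illustrative, not the pipeline's actual parser):

```python
# Illustrative sketch: splitting a "::"-delimited situations string
# into individual situations, as accepted by the pipeline payload.
situations = (
    "The user disputes the debt, so the agent transfers the call to the agent"
    " :: The user cannot pay any amount as they have a difficult situation,"
    " so the agent hangs up the call."
)

# Split on the delimiter and drop surrounding whitespace and empty pieces.
parsed = [part.strip() for part in situations.split("::") if part.strip()]

for situation in parsed:
    print(situation)
```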