skit_pipelines.pipelines.generate_and_tag_conversations package
Module contents
- generate_and_tag_conversations(*, situations: str = '', scenario: str = '', scenario_category: str = '', s3_links_to_prompts: str = '', llm_trainer_repo_name: str = 'LLMtrainer', llm_trainer_repo_branch: str = 'main', model: str = 'gpt-4', n_iter: int = 1, n_choice: int = 2, temperature: float = 0.99, client_id: str, template_id: str, labelstudio_project_id: str, data_label: str = '', notify: str = '', channel: str = '', slack_thread: str = '')
A pipeline to generate and tag conversations for one or more given situations.
Example payloads to invoke the pipeline via the Slack integration:
A minimal example:
@charon run generate_and_tag_conversations
{ "situations" : "The user disputes the debt, so the agent transfers the call to the agent :: The user cannot pay any amount as they have a difficult situation, so the agent hangs up the call. ", "scenario" : "Test scenario", "scenario_category" : "Test scenario category", "llm_trainer_repo_branch" : "refactor-data-gen-script", "client_id" : "85", "template_id" : "0", "labelstudio_project_id" : "95", "s3_links_to_prompts": "s3://kubeflow-us-cluster/pipeline_uploads/prompt/test_prompt.txt", "data_label" : "UAT" }
An example with all available parameters:
@charon run generate_and_tag_conversations
{ "situations" : "The user disputes the debt, so the agent transfers the call to the agent :: The user cannot pay any amount as they have a difficult situation, so the agent hangs up the call. ", "scenario" : "Test scenario", "scenario_category" : "Test scenario category", "llm_trainer_repo_branch" : "refactor-data-gen-script", "client_id" : "85", "template_id" : "0", "labelstudio_project_id" : "95", "s3_links_to_prompts": "s3://kubeflow-us-cluster/pipeline_uploads/prompt/test_prompt.txt", "data_label" : "UAT" }
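The payloads above can also be assembled programmatically. The following is a minimal sketch, not part of skit_pipelines: `build_slack_command` and `REQUIRED_KEYS` are hypothetical names, and the required keys are assumed from the signature, where `client_id`, `template_id`, and `labelstudio_project_id` have no default values. The situations text is shortened for illustration.

```python
import json

# Assumption: these are the parameters without defaults in the documented signature.
REQUIRED_KEYS = ("client_id", "template_id", "labelstudio_project_id")


def build_slack_command(payload: dict) -> str:
    """Return the Slack message text that invokes the pipeline (hypothetical helper)."""
    missing = [key for key in REQUIRED_KEYS if not payload.get(key)]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    # First line is the command, followed by the JSON payload.
    return "@charon run generate_and_tag_conversations\n" + json.dumps(payload, indent=2)


payload = {
    "situations": "The user disputes the debt :: The user cannot pay any amount",
    "scenario": "Test scenario",
    "scenario_category": "Test scenario category",
    "client_id": "85",
    "template_id": "0",
    "labelstudio_project_id": "95",
    "data_label": "UAT",
}

print(build_slack_command(payload))
```

Checking the required keys before posting the message catches an incomplete payload early, instead of waiting for the pipeline run to fail.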
- Parameters
situations (optional) – The situations used to generate the conversations; use the delimiter :: to pass multiple situations
scenario (optional) – The scenario linked to the situation
scenario_category (optional) – The scenario's category
prompt (str) – Prompt to the model for data generation
s3_links_to_prompts (str) – S3 links to the prompts passed to the model for data generation
output_dir (str) – The output directory where the generated conversations are stored
filename (str) – Acts as a prefix to the default naming used
llm_trainer_repo_name (str) – The name of the conversation generation repo on GitHub.
llm_trainer_repo_branch (str, optional) – The branch name in the conversation generation repo to use, defaults to main.
model (str, optional) – The model to be used for generating data, defaults to gpt-4
n_iter (int) – Number of times to iterate over the scenarios list to generate conversations
n_choice (int) – Number of conversations generated at a time from a single scenario
temperature (float) – Temperature used by the model when generating conversations, defaults to 0.99
client_id (str) – ID of the client for which data is being generated
template_id (str) – ID of the template for which data is being generated
notify (str, optional) – Whether to send a slack notification, defaults to “”
channel (str, optional) – The slack channel to send the notification, defaults to “”
slack_thread (str, optional) – The slack thread to send the notification, defaults to “”
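To illustrate how `situations`, `n_iter`, and `n_choice` might interact, here is a rough sketch. It is not the pipeline's actual implementation: `split_situations` and `expected_conversation_count` are hypothetical helpers, and the count assumes every situation is processed once per iteration with `n_choice` conversations generated each time.

```python
from typing import List


def split_situations(situations: str, delimiter: str = "::") -> List[str]:
    """Split a '::'-delimited string into trimmed, non-empty situations."""
    return [part.strip() for part in situations.split(delimiter) if part.strip()]


def expected_conversation_count(situations: str, n_iter: int = 1, n_choice: int = 2) -> int:
    """Estimate the conversations produced (assumption: situations x n_iter x n_choice)."""
    return len(split_situations(situations)) * n_iter * n_choice


situations = (
    "The user disputes the debt, so the agent transfers the call "
    ":: The user cannot pay any amount, so the agent hangs up the call. "
)

print(split_situations(situations))
print(expected_conversation_count(situations))  # 2 situations * 1 * 2 = 4
```

Under this assumption, raising `n_iter` repeats the whole situations list, while raising `n_choice` increases how many conversations each pass yields per situation.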