LLM

LLM transform plugin

Description

Sends data to a large language model (LLM) and receives the generated results. Use the LLM to label, clean, or enrich data, perform data inference, and more.

Options

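| name             | type   | required | default value                              |
|------------------|--------|----------|--------------------------------------------|
| model_provider   | enum   | yes      |                                            |
| output_data_type | enum   | no       | String                                     |
| prompt           | string | yes      |                                            |
| model            | string | yes      |                                            |
| api_key          | string | yes      |                                            |
| openai.api_path  | string | no       | https://api.openai.com/v1/chat/completions |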

model_provider

The model provider to use. Available options: OPENAI.

output_data_type

The data type of the output data. The available options are: STRING, INT, BIGINT, DOUBLE, BOOLEAN. The default value is STRING.
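As a minimal sketch of how this option might be set, the fragment below asks a yes/no question and declares the output as BOOLEAN. The prompt wording and the `sk-xxx` key are placeholders chosen for illustration, and whether the model's reply converts cleanly to a boolean depends on the model and the prompt, so treat this as an untested illustration rather than a reference configuration.

```hocon
transform {
  LLM {
    model_provider = OPENAI
    model = gpt-4o-mini
    # Placeholder key; replace with a real API key.
    api_key = sk-xxx
    # Illustrative prompt (an assumption, not from the plugin documentation).
    prompt = "Determine whether the name is a Chinese name. Answer only true or false."
    # One of STRING, INT, BIGINT, DOUBLE, BOOLEAN; STRING is used when omitted.
    output_data_type = BOOLEAN
  }
}
```

Constraining the answer format in the prompt is a design choice here: the narrower the reply the model is asked for, the more likely it matches the declared output type.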

prompt

The prompt to send to the LLM. This parameter defines how the LLM processes and returns data. For example:

The data read from the source is a table like this:

| name          | age |
|---------------|-----|
| Jia Fan       |     |
| Hailin Wang   |     |
| Eric          |     |
| Guangdong Liu |     |

The prompt can be:

Determine whether someone is Chinese or American by their name

The result will be:

| name          | age | llm_output |
|---------------|-----|------------|
| Jia Fan       |     | Chinese    |
| Hailin Wang   |     | Chinese    |
| Eric          |     | American   |
| Guangdong Liu |     | Chinese    |

model

The model to use. Each model provider offers different models. For example, an OpenAI model can be gpt-4o-mini. If you use an OpenAI model, refer to https://platform.openai.com/docs/models/model-endpoint-compatibility for the models supported by the /v1/chat/completions endpoint.

api_key

The API key to use for the model provider. If you use an OpenAI model, refer to https://platform.openai.com/docs/api-reference/api-keys for how to get the API key.

openai.api_path

The API path to use for the OpenAI model provider. In most cases, you do not need to change this setting. If you access the API through a proxy service, you may need to set it to the proxy's API address.
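A minimal sketch of overriding this option when requests go through an intermediary endpoint instead of the default OpenAI address. The URL `https://llm-proxy.example.com/v1/chat/completions` is hypothetical and stands in for whatever address your service exposes; the sketch assumes that service accepts the same request format as the OpenAI chat completions endpoint.

```hocon
transform {
  LLM {
    model_provider = OPENAI
    model = gpt-4o-mini
    # Placeholder key; use the key issued for the endpoint you call.
    api_key = sk-xxx
    prompt = "Determine whether someone is Chinese or American by their name"
    # Hypothetical address; omit this line to use the default
    # https://api.openai.com/v1/chat/completions
    openai.api_path = "https://llm-proxy.example.com/v1/chat/completions"
  }
}
```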

common options [string]

For the common transform plugin parameters, refer to Transform Plugin.

Example

Determine the user's country using an LLM.

env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  FakeSource {
    row.num = 5
    schema = {
      fields {
        id = "int"
        name = "string"
      }
    }
    rows = [
      {fields = [1, "Jia Fan"], kind = INSERT}
      {fields = [2, "Hailin Wang"], kind = INSERT}
      {fields = [3, "Tomas"], kind = INSERT}
      {fields = [4, "Eric"], kind = INSERT}
      {fields = [5, "Guangdong Liu"], kind = INSERT}
    ]
  }
}

transform {
  LLM {
    model_provider = OPENAI
    model = gpt-4o-mini
    api_key = sk-xxx
    prompt = "Determine whether someone is Chinese or American by their name"
  }
}

sink {
  console {
  }
}


