LLM Enrichment Transforms

LLM Enrichment Transforms enhance your existing Shaped Datasets using large language models (LLMs). They take columns from a source dataset, apply an LLM with a prompt, and output new columns with enriched information. Use them specifically to enrich data that will power semantic search.

Prerequisites

  • You have already created a Shaped Dataset to use as the source.
  • You have an API key for the Shaped API.

How it works

  1. Select columns from an existing Shaped Dataset.
  2. Pass those columns to an LLM with task-specific instructions.
  3. The transform writes a new enriched dataset with additional features (e.g., semantic attributes, improved text). For example, with fashion products, you can enrich titles/descriptions into detailed attributes like style, material, fit, and use-case.
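
The three steps above can be sketched locally in Python. This is only an illustration of the data flow, not Shaped's implementation: `enrich_rows`, `fake_llm`, and the prompt template are hypothetical names, and `fake_llm` is a stand-in that returns canned attributes instead of calling a real model.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def enrich_rows(
    rows: List[Row],
    source_columns: List[str],
    prompt_template: str,
    llm: Callable[[str], Dict[str, str]],
) -> List[Row]:
    """Return copies of `rows` with the columns produced by `llm` added."""
    enriched = []
    for row in rows:
        # Step 1: select only the source columns.
        selected = {column: row[column] for column in source_columns}
        # Step 2: pass them to the LLM together with the instructions.
        prompt = prompt_template.format(**selected)
        new_columns = llm(prompt)
        # Step 3: write a new row with the original and enriched columns.
        new_row = dict(row)
        new_row.update(new_columns)
        enriched.append(new_row)
    return enriched


def fake_llm(prompt: str) -> Dict[str, str]:
    """Hypothetical stand-in for a real LLM call (keyword matching only)."""
    text = prompt.lower()
    return {
        "material": "cotton" if "cotton" in text else "unknown",
        "style": "casual" if "t-shirt" in text else "unknown",
    }


products = [
    {"product_id": "1", "title": "Blue cotton t-shirt"},
    {"product_id": "2", "title": "Leather ankle boots"},
]
enriched = enrich_rows(
    products,
    source_columns=["title"],
    prompt_template="Extract style and material from this product title: {title}",
    llm=fake_llm,
)
```

The source rows are left untouched and the enriched rows carry both the original and the new columns, which mirrors the transform writing a separate enriched dataset.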

You can use an enriched dataset like any other Shaped Dataset to build and train Shaped models.

API

All requests require these headers:

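import os

# Read the API key from the environment rather than hard-coding it.
SHAPED_API_KEY = os.environ["SHAPED_API_KEY"]

headers = {
    "x-api-key": SHAPED_API_KEY,
    "Content-Type": "application/json",
}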

Create a transform

curl -X POST 'https://api.shaped.ai/v1/transforms' \
  -H 'x-api-key: '"$SHAPED_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "movielens_items_llm_enrichment_transform",
    "description": "LLM enrichment transforms on Movielens items data.",
    "transform_type": "LLM_ENRICHMENT",
    "source_dataset_name": "movielens_items",
    "source_columns": [
      {"name": "movie_title", "include_in_output": true},
      {"name": "movie_id", "include_in_output": true}
    ]
  }'

Body parameters

  • name (string, required): Unique name for the transform. Example: movielens_items_llm_enrichment_transform.
  • description (string, optional): Human-readable description. Example: LLM enrichment transforms on Movielens items data.
  • transform_type (string, required): Must be LLM_ENRICHMENT for LLM enrichment transforms.
  • source_dataset_name (string, required): Name of the existing Shaped Dataset to enrich. Example: movielens_items.
  • source_columns (array of objects, required): Columns from the source dataset used by the transform.
    • source_columns[].name (string, required): Column name. Example: movie_title.
    • source_columns[].include_in_output (boolean, optional): Whether to copy the original column into the enriched dataset output. Example: true.
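
The request body described above can also be assembled in Python. The following is a minimal sketch using only the standard library; `build_create_transform_request` is a hypothetical helper name, and the sketch covers only the body parameters listed in this section. It builds the request without sending it.

```python
import json
import urllib.request
from typing import Dict, Optional

TRANSFORMS_URL = "https://api.shaped.ai/v1/transforms"


def build_create_transform_request(
    api_key: str,
    name: str,
    source_dataset_name: str,
    source_columns: Dict[str, bool],
    description: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a transform.

    `source_columns` maps each column name to its include_in_output flag.
    """
    body = {
        "name": name,
        "transform_type": "LLM_ENRICHMENT",
        "source_dataset_name": source_dataset_name,
        "source_columns": [
            {"name": column, "include_in_output": include}
            for column, include in source_columns.items()
        ],
    }
    # description is optional, so it is only sent when provided.
    if description is not None:
        body["description"] = description
    return urllib.request.Request(
        TRANSFORMS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_create_transform_request(
    api_key="YOUR_API_KEY",  # placeholder, not a real key
    name="movielens_items_llm_enrichment_transform",
    source_dataset_name="movielens_items",
    source_columns={"movie_title": True, "movie_id": True},
    description="LLM enrichment transforms on Movielens items data.",
)
```

To send the request, pass it to `urllib.request.urlopen(request)`. The response format is not described in this section, so the sketch does not attempt to parse it.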

List transforms

curl -X GET 'https://api.shaped.ai/v1/transforms' \
-H 'x-api-key: '"$SHAPED_API_KEY" \
-H 'Content-Type: application/json'

Get transform details

curl -X GET 'https://api.shaped.ai/v1/transforms/movielens_items_llm_enrichment_transform/' \
-H 'x-api-key: '"$SHAPED_API_KEY" \
-H 'Content-Type: application/json'

Delete a transform

curl -X DELETE 'https://api.shaped.ai/v1/transforms/movielens_items_llm_enrichment_transform' \
-H 'x-api-key: '"$SHAPED_API_KEY" \
-H 'Content-Type: application/json'
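
The list, get, and delete calls differ only in HTTP method and in whether a transform name is appended to the URL, so they can share one helper. This is a minimal sketch with a hypothetical function name, `build_transform_request`. It omits the trailing slash shown in the get example, on the assumption that both URL forms are accepted, and it builds requests without sending them.

```python
import urllib.parse
import urllib.request
from typing import Optional

TRANSFORMS_URL = "https://api.shaped.ai/v1/transforms"


def build_transform_request(
    api_key: str,
    method: str,
    transform_name: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a list, get, or delete request.

    GET without a name lists transforms; GET with a name fetches one;
    DELETE requires a name.
    """
    if method not in ("GET", "DELETE"):
        raise ValueError(f"unsupported method: {method}")
    if method == "DELETE" and transform_name is None:
        raise ValueError("DELETE requires a transform name")

    url = TRANSFORMS_URL
    if transform_name is not None:
        # Percent-encode the name so it is safe as a single path segment.
        url = f"{TRANSFORMS_URL}/{urllib.parse.quote(transform_name, safe='')}"

    return urllib.request.Request(
        url,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method=method,
    )


list_request = build_transform_request("YOUR_API_KEY", "GET")
get_request = build_transform_request(
    "YOUR_API_KEY", "GET", "movielens_items_llm_enrichment_transform"
)
delete_request = build_transform_request(
    "YOUR_API_KEY", "DELETE", "movielens_items_llm_enrichment_transform"
)
```

Requiring a name for DELETE guards against accidentally issuing a delete against the collection URL.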

Usage

  • Use LLM Enrichment Transforms only to enrich data.
  • The semantic features added by enrichment can improve search performance.
  • The enriched dataset can be used like any other Shaped Dataset to build and train Shaped models.