LLM Enrichment Transforms
LLM Enrichment Transforms enhance your existing Shaped Datasets using large language models (LLMs). They take columns from a source dataset, apply an LLM with a prompt, and output new columns with enriched information. Use them specifically to enrich data that will power semantic search.
Prerequisites
- You have already created a Shaped Dataset to use as the source.
- You have an API key for the Shaped API.
How it works
- Select columns from an existing Shaped Dataset.
- Pass those columns to an LLM with task-specific instructions.
- The transform writes a new enriched dataset with additional features (e.g., semantic attributes, improved text). For example, with fashion products, you can enrich titles/descriptions into detailed attributes like style, material, fit, and use-case.
These enriched datasets can be used like Shaped Datasets for building and training Shaped models.
API
All requests require these headers:
headers = {
    "x-api-key": SHAPED_API_KEY,
    "Content-Type": "application/json"
}
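As a minimal sketch, the headers above could be assembled in Python by reading the key from an environment variable. The variable name SHAPED_API_KEY mirrors the curl examples on this page; the helper name build_headers is hypothetical and not part of any Shaped SDK.

```python
import os


def build_headers(api_key=None):
    """Return the two headers that every request on this page sends.

    build_headers is a hypothetical helper; only the header names
    ("x-api-key", "Content-Type") come from the documentation.
    """
    # Fall back to the SHAPED_API_KEY environment variable when no key is passed.
    key = api_key if api_key is not None else os.environ.get("SHAPED_API_KEY", "")
    if not key:
        raise ValueError("Set SHAPED_API_KEY or pass api_key explicitly")
    return {"x-api-key": key, "Content-Type": "application/json"}
```

Failing early on a missing key avoids sending a request that would be rejected for lack of authentication.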
Create a transform
curl -X POST 'https://api.shaped.ai/v1/transforms' \
-H 'x-api-key: '"$SHAPED_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
  "name": "movielens_items_llm_enrichment_transform",
  "description": "LLM enrichment transforms on Movielens items data.",
  "transform_type": "LLM_ENRICHMENT",
  "source_dataset_name": "movielens_items",
  "source_columns": [
    {"name": "movie_title", "include_in_output": true},
    {"name": "movie_id", "include_in_output": true}
  ]
}'
Body parameters
- name (string, required): Unique name for the transform. Example: movielens_items_llm_enrichment_transform
- description (string, optional): Human-readable description. Example: LLM enrichment transforms on Movielens items data.
- transform_type (string, required): Must be LLM_ENRICHMENT for LLM enrichment transforms.
- source_dataset_name (string, required): Name of the existing Shaped Dataset to enrich. Example: movielens_items
- source_columns (array of objects, required): Columns from the source dataset used by the transform.
  - source_columns[].name (string, required): Column name. Example: movie_title
  - source_columns[].include_in_output (boolean, optional): Copy the original column into the enriched dataset output. Example: true
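The body parameters above can be assembled programmatically. The following is a sketch using only the Python standard library; it builds the request without sending it. The function name build_create_request and the (column, include) tuple format are hypothetical; the URL and field names are taken from the curl example.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
TRANSFORMS_URL = "https://api.shaped.ai/v1/transforms"


def build_create_request(api_key, name, source_dataset_name,
                         source_columns, description=None):
    """Build (but do not send) a POST request that creates a transform.

    source_columns is a list of (column_name, include_in_output) tuples;
    that tuple format is an assumption made for this sketch.
    """
    body = {
        "name": name,
        "transform_type": "LLM_ENRICHMENT",
        "source_dataset_name": source_dataset_name,
        "source_columns": [
            {"name": col, "include_in_output": include}
            for col, include in source_columns
        ],
    }
    # description is optional, so only include it when provided.
    if description is not None:
        body["description"] = description
    return urllib.request.Request(
        TRANSFORMS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_create_request(
    api_key="YOUR_API_KEY",  # placeholder, not a real key
    name="movielens_items_llm_enrichment_transform",
    source_dataset_name="movielens_items",
    source_columns=[("movie_title", True), ("movie_id", True)],
    description="LLM enrichment transforms on Movielens items data.",
)
```

To send the request, pass it to urllib.request.urlopen(request). The shape of the response is not described on this page, so the sketch does not parse it.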
List transforms
curl -X GET 'https://api.shaped.ai/v1/transforms' \
-H 'x-api-key: '"$SHAPED_API_KEY" \
-H 'Content-Type: application/json'
Get transform details
curl -X GET 'https://api.shaped.ai/v1/transforms/movielens_items_llm_enrichment_transform/' \
-H 'x-api-key: '"$SHAPED_API_KEY" \
-H 'Content-Type: application/json'
Delete a transform
curl -X DELETE 'https://api.shaped.ai/v1/transforms/transform_name' \
-H 'x-api-key: '"$SHAPED_API_KEY" \
-H 'Content-Type: application/json'
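The list, get, and delete calls above differ only in HTTP method and in whether a transform name is appended to the URL. A sketch of one hypothetical helper covering all three follows; build_transform_request is not part of any Shaped SDK. Note that the get example above ends its URL with a trailing slash while the delete example does not; this sketch omits the slash, which is an assumption.

```python
import urllib.request
from urllib.parse import quote

# Base endpoint taken from the curl examples on this page.
TRANSFORMS_URL = "https://api.shaped.ai/v1/transforms"


def build_transform_request(api_key, method, transform_name=None):
    """Build (but do not send) a list, get, or delete request.

    - GET with no name: list transforms
    - GET with a name: get transform details
    - DELETE with a name: delete a transform
    """
    if method not in ("GET", "DELETE"):
        raise ValueError("method must be 'GET' or 'DELETE'")
    if method == "DELETE" and transform_name is None:
        raise ValueError("DELETE requires a transform name")
    url = TRANSFORMS_URL
    if transform_name is not None:
        # Percent-encode the name so it is safe as a single path segment.
        url = f"{TRANSFORMS_URL}/{quote(transform_name, safe='')}"
    return urllib.request.Request(
        url,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method=method,
    )


list_request = build_transform_request("YOUR_API_KEY", "GET")
delete_request = build_transform_request("YOUR_API_KEY", "DELETE", "transform_name")
```

As with any request object built this way, urllib.request.urlopen sends it.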
Usage
- LLM Enrichment Transforms are used only to enrich data.
- The enriched data adds semantic features, which can improve search performance.
- The enriched dataset can be used like Shaped Datasets to build and train Shaped models.