AI Enrichment Transforms
AI Enrichment Transforms enhance your existing Shaped Datasets using large language models (LLMs). They take columns from a source dataset, apply an LLM with a prompt, and output new columns with enriched information. Use them specifically to enrich data that will power semantic search.
Prerequisites
- You have already created a Shaped Dataset to use as the source.
- You have an API key for the Shaped API.
How it works
- Select columns from an existing Shaped Dataset.
- Pass those columns to an LLM with task-specific instructions.
- The transform writes a new enriched dataset with additional features (e.g., semantic attributes, improved text). For example, with fashion products, you can enrich titles/descriptions into detailed attributes like style, material, fit, and use-case.
These enriched datasets can be used like Shaped Datasets for building and training Shaped models.
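The three steps above can be pictured with a small, purely conceptual sketch. It is not Shaped's implementation: `enrich_rows` and `fake_llm` are hypothetical names, the model call is a stand-in that returns fixed placeholder JSON, and the way the prompt and row are combined is an assumption made for illustration.

```python
import json

def enrich_rows(rows, source_columns, source_columns_in_output,
                enriched_output_columns, prompt, llm):
    """Conceptual sketch of an enrichment pass (hypothetical, not Shaped's code)."""
    enriched = []
    for row in rows:
        # 1. Select only the columns the model should see.
        model_input = {col: row[col] for col in source_columns}
        # 2. Pass those columns to the LLM together with the task instructions.
        raw = llm(prompt + "\n" + json.dumps(model_input))
        generated = json.loads(raw)
        # 3. Build the output row: copied source columns plus generated columns.
        out = {col: row[col] for col in source_columns_in_output}
        out.update({col: generated.get(col) for col in enriched_output_columns})
        enriched.append(out)
    return enriched

def fake_llm(text):
    # Hypothetical stand-in for a real model call; returns placeholder values.
    return json.dumps({
        "generated_genre": "placeholder_genre",
        "generated_era": "placeholder_era",
        "generated_director": "placeholder_director",
    })

rows = [{"movie_id": 1, "movie_title": "Example Movie", "release_date": "1995-01-01"}]
result = enrich_rows(
    rows,
    source_columns=["movie_title", "release_date"],
    source_columns_in_output=["movie_id", "movie_title"],
    enriched_output_columns=["generated_genre", "generated_era", "generated_director"],
    prompt="Output generated_genre, generated_era and generated_director as JSON.",
    llm=fake_llm,
)
print(result)
```

Note that `release_date` is sent to the model but does not appear in the output row, because it is listed only in `source_columns` and not in `source_columns_in_output`.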
API
All requests require these headers:
import os

SHAPED_API_KEY = os.environ["SHAPED_API_KEY"]  # read the key from the environment
headers = {
    "x-api-key": SHAPED_API_KEY,
    "Content-Type": "application/json",
}
Create a transform
curl -X POST 'https://api.shaped.ai/v1/transforms' \
  -H 'x-api-key: '"$SHAPED_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "movielens_items_ai_enrichment_transform",
    "description": "AI enrichment transforms on Movielens items data.",
    "transform_type": "AI_ENRICHMENT",
    "source_dataset": "movielens_items",
    "source_columns": ["movie_title", "release_date"],
    "source_columns_in_output": ["movie_id", "movie_title"],
    "enriched_output_columns": ["generated_genre", "generated_era", "generated_director"],
    "prompt": "Given the movie_title and release_date as input data, you will have to output generated_genre, generated_era and generated_director in json format."
  }'
Body parameters
- name (string, required): Unique name for the transform. Example: movielens_items_ai_enrichment_transform
- description (string, optional): Human-readable description. Example: AI enrichment transforms on Movielens items data.
- transform_type (string, required): Must be AI_ENRICHMENT.
- source_dataset (string, required): Name of the existing Shaped Dataset to enrich. Example: movielens_items
- source_columns (array of strings, required): Source dataset columns passed to the model. Example: ["movie_title", "release_date"]
- source_columns_in_output (array of strings, optional): Source columns to copy into the enriched output dataset. Example: ["movie_id", "movie_title"]
- enriched_output_columns (array of strings, required): New columns to be generated by the model. Example: ["generated_genre", "generated_era", "generated_director"]
- prompt (string, required): Task instructions, including how to map inputs to outputs and the expected output format.
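As a rough Python counterpart to the curl call, the sketch below checks the required body parameters listed above and builds the POST request using only the standard library. The helper name `build_create_request` and the client-side checks are assumptions for illustration; the API performs its own validation, and nothing here is sent until you call `urllib.request.urlopen`.

```python
import json
import os
import urllib.request

API_URL = "https://api.shaped.ai/v1/transforms"
# Parameters marked "required" in the list above.
REQUIRED = ("name", "transform_type", "source_dataset",
            "source_columns", "enriched_output_columns", "prompt")

def build_create_request(payload, api_key):
    """Hypothetical helper: validate the body locally and build the POST request."""
    missing = [key for key in REQUIRED if not payload.get(key)]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if payload["transform_type"] != "AI_ENRICHMENT":
        raise ValueError("transform_type must be AI_ENRICHMENT")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

payload = {
    "name": "movielens_items_ai_enrichment_transform",
    "description": "AI enrichment transforms on Movielens items data.",
    "transform_type": "AI_ENRICHMENT",
    "source_dataset": "movielens_items",
    "source_columns": ["movie_title", "release_date"],
    "source_columns_in_output": ["movie_id", "movie_title"],
    "enriched_output_columns": ["generated_genre", "generated_era", "generated_director"],
    "prompt": "Given the movie_title and release_date as input data, you will have "
              "to output generated_genre, generated_era and generated_director in json format.",
}

# Falls back to a placeholder so the sketch can be built without a real key.
request = build_create_request(payload, os.environ.get("SHAPED_API_KEY", "<your-api-key>"))
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Failing early on a missing field gives a clearer error than a rejected request, but the server response remains the source of truth for what is valid.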
List transforms
curl -X GET 'https://api.shaped.ai/v1/transforms' \
  -H 'x-api-key: '"$SHAPED_API_KEY" \
  -H 'Content-Type: application/json'
Get transform details
curl -X GET 'https://api.shaped.ai/v1/transforms/movielens_items_ai_enrichment_transform/' \
  -H 'x-api-key: '"$SHAPED_API_KEY" \
  -H 'Content-Type: application/json'
Delete a transform
curl -X DELETE 'https://api.shaped.ai/v1/transforms/movielens_items_ai_enrichment_transform' \
  -H 'x-api-key: '"$SHAPED_API_KEY" \
  -H 'Content-Type: application/json'
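The list, get, and delete calls differ only in HTTP method and whether a transform name is appended to the path, so they can be sketched with one small Python helper. `transform_request` is a hypothetical name, and the path form without a trailing slash follows the delete example above; the requests are only constructed here, not sent.

```python
import os
import urllib.request
from typing import Optional
from urllib.parse import quote

BASE_URL = "https://api.shaped.ai/v1/transforms"

def transform_request(method: str, api_key: str,
                      name: Optional[str] = None) -> urllib.request.Request:
    """Hypothetical helper: build a list/get/delete request for transforms."""
    # Append the URL-encoded transform name when one is given.
    url = f"{BASE_URL}/{quote(name, safe='')}" if name else BASE_URL
    return urllib.request.Request(
        url,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method=method,
    )

api_key = os.environ.get("SHAPED_API_KEY", "<your-api-key>")
transform_name = "movielens_items_ai_enrichment_transform"

list_req = transform_request("GET", api_key)                     # list transforms
get_req = transform_request("GET", api_key, transform_name)      # get details
delete_req = transform_request("DELETE", api_key, transform_name)  # delete

for req in (list_req, get_req, delete_req):
    print(req.get_method(), req.full_url)
# To actually send one: urllib.request.urlopen(get_req)
```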
Usage
- AI Enrichment Transforms are for data enrichment only: they generate new columns and write them to a new enriched dataset.
- The added semantic features can improve the performance of search systems built on the enriched data.
- The enriched dataset can be used like Shaped Datasets to build and train Shaped models.