AI Enrichment Transforms

AI Enrichment Transforms enhance your existing Shaped Datasets using large language models (LLMs). They take columns from a source dataset, apply an LLM with a prompt, and output new columns with enriched information. Use them specifically to enrich data that will power semantic search.

Prerequisites

  • You have already created a Shaped Dataset to use as the source
  • You have an API key for the Shaped API

How it works

  1. Select columns from an existing Shaped Dataset.
  2. Pass those columns to an LLM with task-specific instructions.
  3. The transform writes a new enriched dataset with additional features (e.g., semantic attributes, improved text). For example, with fashion products, you can enrich titles/descriptions into detailed attributes like style, material, fit, and use-case.

These enriched datasets can be used like Shaped Datasets for building and training Shaped models.
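
To make the three steps concrete, here is a minimal local sketch of the same flow. It does not call the Shaped API or a real LLM: `fake_llm` is a hypothetical stand-in, the `enrich` helper and the sample row (including its date format) are made up for illustration, and the actual transform may handle prompts and model output differently.

```python
import json


def fake_llm(prompt, model_input):
    """Hypothetical stand-in for an LLM call; returns a JSON string."""
    year = int(model_input["release_date"][:4])
    return json.dumps({"generated_era": f"{year // 10 * 10}s"})


def enrich(rows, source_columns, source_columns_in_output,
           enriched_output_columns, prompt, llm=fake_llm):
    enriched = []
    for row in rows:
        # Step 1: select only the columns the model should see.
        model_input = {col: row[col] for col in source_columns}
        # Step 2: pass them to the model together with the instructions.
        generated = json.loads(llm(prompt, model_input))
        # Step 3: write pass-through columns plus the generated columns.
        out = {col: row[col] for col in source_columns_in_output}
        out.update({col: generated.get(col) for col in enriched_output_columns})
        enriched.append(out)
    return enriched


# Illustrative row only; not taken from a real dataset.
rows = [{"movie_id": 1, "movie_title": "Toy Story", "release_date": "1995-01-01"}]
result = enrich(
    rows,
    source_columns=["movie_title", "release_date"],
    source_columns_in_output=["movie_id", "movie_title"],
    enriched_output_columns=["generated_era"],
    prompt="Given movie_title and release_date, output generated_era in JSON format.",
)
print(result)
```

Note how `release_date` is passed to the model but does not appear in the output, because it is listed in `source_columns` and not in `source_columns_in_output`.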

API

All requests require these headers:

import os

headers = {
    "x-api-key": os.environ["SHAPED_API_KEY"],  # API key read from the environment
    "Content-Type": "application/json",
}

Create a transform

curl -X POST 'https://api.shaped.ai/v1/transforms' \
  -H "x-api-key: $SHAPED_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "movielens_items_ai_enrichment_transform",
    "description": "AI enrichment transforms on Movielens items data.",
    "transform_type": "AI_ENRICHMENT",
    "source_dataset": "movielens_items",
    "source_columns": ["movie_title", "release_date"],
    "source_columns_in_output": ["movie_id", "movie_title"],
    "enriched_output_columns": ["generated_genre", "generated_era", "generated_director"],
    "prompt": "Given the movie_title and release_date as input data, you will have to output generated_genre, generated_era and generated_director in json format."
  }'

Body parameters

  • name (string, required): Unique name for the transform. Example: movielens_items_ai_enrichment_transform.
  • description (string, optional): Human-readable description. Example: AI enrichment transforms on Movielens items data.
  • transform_type (string, required): Must be AI_ENRICHMENT.
  • source_dataset (string, required): Name of the existing Shaped Dataset to enrich. Example: movielens_items.
  • source_columns (array of strings, required): Source dataset columns passed to the model. Example: ["movie_title", "release_date"].
  • source_columns_in_output (array of strings, optional): Source columns to copy into the enriched output dataset. Example: ["movie_id", "movie_title"].
  • enriched_output_columns (array of strings, required): New columns to be generated by the model. Example: ["generated_genre", "generated_era", "generated_director"].
  • prompt (string, required): Task instructions, including how to map inputs to outputs and the expected output format.
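
As a rough illustration of the parameters above, the following sketch assembles the request body in Python and runs a few client-side checks before anything is sent. The helper name `build_transform_payload` and its checks are assumptions made for this example; they are not part of the Shaped API, and the server may apply different or stricter validation.

```python
import json


def build_transform_payload(name, source_dataset, source_columns,
                            enriched_output_columns, prompt,
                            description=None, source_columns_in_output=None):
    """Assemble the request body for creating an AI enrichment transform."""
    payload = {
        "name": name,
        "transform_type": "AI_ENRICHMENT",  # the only value this page documents
        "source_dataset": source_dataset,
        "source_columns": list(source_columns),
        "enriched_output_columns": list(enriched_output_columns),
        "prompt": prompt,
    }
    # Optional fields are included only when provided.
    if description is not None:
        payload["description"] = description
    if source_columns_in_output is not None:
        payload["source_columns_in_output"] = list(source_columns_in_output)

    # Client-side checks only (an assumption of this sketch, not documented API behavior).
    for field in ("name", "source_dataset", "prompt"):
        if not isinstance(payload[field], str) or not payload[field]:
            raise ValueError(f"{field} must be a non-empty string")
    for field in ("source_columns", "enriched_output_columns"):
        if not payload[field]:
            raise ValueError(f"{field} must contain at least one column name")
    return payload


payload = build_transform_payload(
    name="movielens_items_ai_enrichment_transform",
    source_dataset="movielens_items",
    source_columns=["movie_title", "release_date"],
    source_columns_in_output=["movie_id", "movie_title"],
    enriched_output_columns=["generated_genre", "generated_era", "generated_director"],
    prompt=(
        "Given the movie_title and release_date as input data, you will have to "
        "output generated_genre, generated_era and generated_director in json format."
    ),
)
body = json.dumps(payload)  # string suitable for use as the POST body
```

Catching an empty `source_columns` or a missing `prompt` locally avoids a round trip for mistakes that are easy to spot before the request is made.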

List transforms

curl -X GET 'https://api.shaped.ai/v1/transforms' \
  -H "x-api-key: $SHAPED_API_KEY" \
  -H 'Content-Type: application/json'

Get transform details

curl -X GET 'https://api.shaped.ai/v1/transforms/movielens_items_ai_enrichment_transform/' \
  -H "x-api-key: $SHAPED_API_KEY" \
  -H 'Content-Type: application/json'

Delete a transform

curl -X DELETE 'https://api.shaped.ai/v1/transforms/movielens_items_ai_enrichment_transform' \
  -H "x-api-key: $SHAPED_API_KEY" \
  -H 'Content-Type: application/json'
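
The curl calls above can also be expressed in Python. The sketch below uses only the standard library and builds the request objects without sending them; the `build_request` helper is hypothetical, `demo-key` is a placeholder, and the URLs are built without the trailing slash that appears in the get-details example, which is an assumption about how the endpoint treats both forms.

```python
import json
import urllib.request

BASE_URL = "https://api.shaped.ai/v1/transforms"


def build_request(method, api_key, name=None, body=None):
    """Build (but do not send) a request for the transforms endpoint."""
    url = BASE_URL if name is None else f"{BASE_URL}/{name}"
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"x-api-key": api_key, "Content-Type": "application/json"}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


transform_name = "movielens_items_ai_enrichment_transform"

list_req = build_request("GET", "demo-key")
get_req = build_request("GET", "demo-key", name=transform_name)
delete_req = build_request("DELETE", "demo-key", name=transform_name)

# To actually send one of these, pass it to urllib.request.urlopen, e.g.:
#     with urllib.request.urlopen(list_req) as resp:
#         print(resp.status, resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect the method, URL, and headers before any call reaches the API.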

Usage

  • Use AI Enrichment Transforms only for data enrichment.
  • The semantic features they add can improve search performance.
  • The enriched dataset can be used like Shaped Datasets to build and train Shaped models.