Quickstart

Shaped is an AI-native retrieval engine for agents, search and feeds. In this guide you will upload some data and create a searchable index of semantic text and image features.

Tip: Using AI? Start with our Agent Quickstart instead.

1. Install the Shaped SDK

pip install shaped

2. Initialize the client with your API key

Add your API key to authenticate into your account. If you don't have an API key yet, see Get your API key.

from shaped import Client

client = Client(api_key="YOUR_KEY_HERE")
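Hard-coding the key is fine for a quickstart, but you may prefer to keep it out of source control. The sketch below reads the key from an environment variable instead. The variable name SHAPED_API_KEY and the load_api_key helper are assumptions made for this example; they are not part of the Shaped SDK.

```python
import os


def load_api_key(env=None, var_name="SHAPED_API_KEY"):
    """Return the API key stored in an environment variable.

    SHAPED_API_KEY is a hypothetical name chosen for this sketch;
    the SDK itself does not require it.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before creating the client.")
    return key


# Usage (requires the shaped package and a real key):
# client = Client(api_key=load_api_key())
```

Failing early with a clear message tends to be easier to debug than an authentication error raised later by the first API call.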

3. Create table and insert data

Shaped stores data in tables. Data can be inserted manually or synced from an external source. Start by declaring a table of movie data and inserting a few rows of sample data.

table_config = {
    "schema_type": "CUSTOM",
    "name": "pixar_movies",
    "column_schema": {
        "item_id": "Int64",
        "movie_title": "String",
        "image_url": "String",
        "description": "String",
        "release_date": "String",
        "cast": "Array(String)",
    },
}

client.create_table(table_config)

Once the table schema is created, you can upload your data directly. Declare the rows as a list of dictionaries, one per row, with keys that match the column names in the schema, then use the table insert method to write them to the table.

records = [
{"item_id": 187541, "movie_title": "Incredibles 2 (2018)", "image_url": "https://m.media-amazon.com/images/M/MV5BMTEzNzY0OTg0NTdeQTJeQWpwZ15BbWU4MDU3OTg3MjUz._V1_QL75_UX380_CR0,0,380,562_.jpg", "description": "The Incredibles family takes on a new mission which involves a change in family roles: Bob Parr (Mr. Incredible) must manage the house while his wife Helen (Elastigirl) goes out to save the world.", "release_date": "2018-06-15", "cast": ["Craig T. Nelson", "Holly Hunter", "Sarah Vowell", "Huck Milner", "Catherine Keener", "Eli Fucile", "Bob Odenkirk", "Samuel L. Jackson", "Michael Bird", "Sophia Bush", "Brad Bird", "Brad Bird", "Nicole Paradis Grindle", "John Walker", "Michael Giacchino", "Stephen Schaffer", "Natalie Lyon", "Kevin Reher", "Ralph Eggleston"]},
{"item_id": 177765, "movie_title": "Coco (2017)", "image_url": "https://m.media-amazon.com/images/M/MV5BMDIyM2E2NTAtMzlhNy00ZGUxLWI1NjgtZDY5MzhiMDc5NGU3XkEyXkFqcGc@._V1_QL75_UY562_CR7,0,380,562_.jpg", "description": "Aspiring musician Miguel, confronted with his family's ancestral ban on music, enters the Land of the Dead to find his great-great-grandfather, a legendary singer.", "release_date": "2017-11-22", "cast": ["Anthony Gonzalez", "Gael García Bernal", "Benjamin Bratt", "Alanna Ubach", "Renee Victor", "Jaime Camil", "Alfonso Arau", "Herbert Siguenza", "Gabriel Iglesias", "Lombardo Boyar", "Lee Unkrich", "Lee Unkrich", "Jason Katz", "Matthew Aldrich", "Adrian Molina", "Darla K. Anderson", "Michael Giacchino", "Steve Bloom", "Lee Unkrich", "Carla Hool", "Natalie Lyon", "Kevin Reher", "Harley Jessup"]},
]

Use the insert table rows method to add your data to the table you created:

client.insert_table_rows("pixar_movies", records)
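Before inserting, it can help to check locally that each row matches the declared column_schema. The following is a minimal sketch, not Shaped's own validation: the find_schema_errors helper and the TYPE_CHECKS mapping are hypothetical, and whether Shaped itself rejects missing or extra columns is not stated in this guide.

```python
# Maps the column types used in this guide to Python checks.
# This mapping is an assumption for illustration only; a type that is
# not listed here raises KeyError.
TYPE_CHECKS = {
    "Int64": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "String": lambda v: isinstance(v, str),
    "Array(String)": lambda v: isinstance(v, list)
    and all(isinstance(x, str) for x in v),
}


def find_schema_errors(records, column_schema):
    """Return a list of problems; an empty list means the rows look consistent."""
    errors = []
    for i, row in enumerate(records):
        # Report columns that are declared but absent, then undeclared extras.
        for col in sorted(set(column_schema) - set(row)):
            errors.append(f"row {i}: missing column '{col}'")
        for col in sorted(set(row) - set(column_schema)):
            errors.append(f"row {i}: unexpected column '{col}'")
        # Check the value type of every declared column that is present.
        for col, type_name in column_schema.items():
            if col in row and not TYPE_CHECKS[type_name](row[col]):
                errors.append(f"row {i}: '{col}' is not a valid {type_name}")
    return errors


schema = {"item_id": "Int64", "movie_title": "String", "cast": "Array(String)"}
rows = [
    {"item_id": 177765, "movie_title": "Coco (2017)", "cast": ["Anthony Gonzalez"]},
    {"item_id": "177765", "movie_title": "Coco (2017)"},  # wrong type, no cast
]
errors = find_schema_errors(rows, schema)
print(errors)
```

In your own script you would pass table_config["column_schema"] and records instead of the shortened schema and rows used here.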

4. Create a view to enrich your data

Create an AI Enrichment View that reads the description, image_url, and movie_title columns and generates a keyword_tags column for better search and filtering.

from shaped import View

view_config = View.AI(
    name="pixar_movie_keyword_tags",
    source_dataset="pixar_movies",
    source_columns=["description", "image_url", "movie_title", "item_id"],
    output_columns=["keyword_tags"],
    prompt="Given information about movies such as title, description and a poster image, create a new feature keyword_tags which is a set of descriptive tags that capture the movie's setting and subject matter. Return a comma-separated list of tags enclosed in [ and ].",
)
client.create_view(view_config)
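The prompt asks for a comma-separated list of tags inside square brackets. If you read keyword_tags back as a plain string, a small parser can turn it into a Python list. This is a sketch under that assumption; the parse_keyword_tags helper is hypothetical, and the exact format the view returns may differ.

```python
def parse_keyword_tags(raw):
    """Turn a string like "[space, family]" into ["space", "family"]."""
    text = raw.strip()
    # Drop the surrounding brackets if both are present.
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]
    tags = []
    for part in text.split(","):
        # Trim whitespace and stray quotes, and normalize case.
        tag = part.strip().strip("\"'").lower()
        # Skip empty entries and duplicates while keeping first-seen order.
        if tag and tag not in tags:
            tags.append(tag)
    return tags


print(parse_keyword_tags("[Superheroes, family, 'Suburbia', family]"))
```

Lowercasing and de-duplicating are design choices for this example; they make tags easier to use as exact-match filters.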

You can create engines at different levels of complexity. Each section below builds an engine and calls create_engine() so you can run it as soon as you finish the step.

5. Create a basic retrieval engine

Create a simple engine that contains your pixar_movies table and no embeddings yet. Call create_engine() to deploy it.

from shaped.autogen.models.engine_config_v2 import EngineConfigV2
from shaped.autogen.models.data_config import DataConfig
from shaped.autogen.models.data_config_interaction_table import DataConfigInteractionTable
from shaped.autogen.models.reference_table_config import ReferenceTableConfig

basic_engine = EngineConfigV2(
    name="basic_engine",
    data=DataConfig(
        item_table=DataConfigInteractionTable(
            ReferenceTableConfig(name="pixar_movies")
        )
    ),
)
client.create_engine(engine_config=basic_engine)

6. Create an engine with text embeddings

Shaped supports any Hugging Face embedding model, as well as multiple in-house embedding models. Add an embedding to enable semantic text search on your engine. This example uses sentence-transformers/all-MiniLM-L6-v2 and encodes the movie_title, description, and cast columns.

from shaped.autogen.models.engine_config_v2 import EngineConfigV2
from shaped.autogen.models.data_config import DataConfig
from shaped.autogen.models.data_config_interaction_table import DataConfigInteractionTable
from shaped.autogen.models.reference_table_config import ReferenceTableConfig
from shaped.autogen.models.index_config import IndexConfig
from shaped.autogen.models.embedding_config import EmbeddingConfig
from shaped.autogen.models.encoder import Encoder
from shaped.autogen.models.hugging_face_encoder import HuggingFaceEncoder

pixar_engine = EngineConfigV2(
    name="pixar_text_search_engine",
    data=DataConfig(
        item_table=DataConfigInteractionTable(
            ReferenceTableConfig(name="pixar_movies")
        )
    ),
    index=IndexConfig(
        embeddings=[
            EmbeddingConfig(
                name="movie_text_embedding",
                encoder=Encoder(
                    HuggingFaceEncoder(
                        model_name="sentence-transformers/all-MiniLM-L6-v2",
                        item_fields=["movie_title", "description", "cast"],
                    )
                ),
            )
        ]
    ),
)
client.create_engine(engine_config=pixar_engine)

7. Create an engine with image embeddings

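This example adds a CLIP image embedding to enable text-to-image search. CLIP models such as jinaai/jina-clip-v2 or openai/clip-vit-base-patch32 encode both images and text into the same vector space, so a text query can be compared directly against image vectors. The code below uses openai/clip-vit-base-patch32 on the image_url column.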

from shaped.autogen.models.engine_config_v2 import EngineConfigV2
from shaped.autogen.models.data_config import DataConfig
from shaped.autogen.models.data_config_interaction_table import DataConfigInteractionTable
from shaped.autogen.models.reference_table_config import ReferenceTableConfig
from shaped.autogen.models.index_config import IndexConfig
from shaped.autogen.models.embedding_config import EmbeddingConfig
from shaped.autogen.models.encoder import Encoder
from shaped.autogen.models.hugging_face_encoder import HuggingFaceEncoder

image_engine = EngineConfigV2(
    name="image_search_engine",
    data=DataConfig(
        item_table=DataConfigInteractionTable(
            ReferenceTableConfig(name="pixar_movies")
        )
    ),
    index=IndexConfig(
        embeddings=[
            EmbeddingConfig(
                name="image_embedding",
                encoder=Encoder(
                    HuggingFaceEncoder(
                        model_name="openai/clip-vit-base-patch32",
                        item_fields=["image_url"],
                    )
                ),
            )
        ]
    ),
)
client.create_engine(engine_config=image_engine)
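To see why a shared vector space enables text-to-image search, consider ranking image vectors by their cosine similarity to a text query vector. The sketch below is a toy illustration only: the 3-dimensional vectors are made up (real CLIP embeddings are much larger and come from the model), and the helper names are hypothetical. It does not describe how Shaped performs retrieval internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, item_vectors):
    """Return (item_id, score) pairs sorted from most to least similar."""
    scored = [
        (item_id, cosine_similarity(query_vector, vector))
        for item_id, vector in item_vectors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up vectors standing in for poster image embeddings.
poster_vectors = {
    187541: [0.9, 0.1, 0.0],
    177765: [0.1, 0.8, 0.3],
}
# Made-up vector standing in for an embedded text query.
query_vector = [0.2, 0.9, 0.2]

ranking = rank_by_similarity(query_vector, poster_vectors)
print(ranking[0][0])  # item_id of the closest poster
```

Because the query and the posters live in the same space, no separate text-to-image translation step is needed; similarity between vectors is enough to order the results.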

8. Create an engine that learns from user behavior

This is where Shaped really shines. Unlike a vector DB or plain semantic search, Shaped lets you train behavioral models on your own data. Turn clicks, likes, and purchases into collaborative embeddings and scoring models that predict engagement. This is how you get truly personalized results that beat generic vector or keyword search.

Add interaction data

Create an interactions table with user_id, item_id, label, created_at, and optional event_value columns. The sample data uses a 5-star rating scale: rated_1_stars through rated_5_stars with matching labels 1–5 so the model learns from explicit feedback strength. Insert the sample data to train your engine.

interaction_table_config = {
    "schema_type": "CUSTOM",
    "name": "pixar_interactions",
    "column_schema": {
        "user_id": "String",
        "item_id": "Int64",
        "label": "Int64",
        "created_at": "String",
        "event_value": "String",
    },
}
client.create_table(interaction_table_config)

interactions = [
    {"user_id": "user_1", "item_id": 1, "label": 5, "created_at": "2024-01-15", "event_value": "rated_5_stars"},
    {"user_id": "user_1", "item_id": 3114, "label": 4, "created_at": "2024-01-16", "event_value": "rated_4_stars"},
    {"user_id": "user_1", "item_id": 4886, "label": 3, "created_at": "2024-01-17", "event_value": "rated_3_stars"},
    {"user_id": "user_1", "item_id": 6377, "label": 2, "created_at": "2024-01-18", "event_value": "rated_2_stars"},
    {"user_id": "user_1", "item_id": 8961, "label": 1, "created_at": "2024-01-19", "event_value": "rated_1_stars"},
    {"user_id": "user_2", "item_id": 45517, "label": 5, "created_at": "2024-01-20", "event_value": "rated_5_stars"},
    {"user_id": "user_2", "item_id": 50872, "label": 4, "created_at": "2024-01-21", "event_value": "rated_4_stars"},
    {"user_id": "user_2", "item_id": 68954, "label": 3, "created_at": "2024-01-22", "event_value": "rated_3_stars"},
    {"user_id": "user_2", "item_id": 87876, "label": 2, "created_at": "2024-01-23", "event_value": "rated_2_stars"},
    {"user_id": "user_2", "item_id": 95167, "label": 5, "created_at": "2024-01-24", "event_value": "rated_5_stars"},
    {"user_id": "user_3", "item_id": 103141, "label": 4, "created_at": "2024-01-25", "event_value": "rated_4_stars"},
    {"user_id": "user_3", "item_id": 134853, "label": 5, "created_at": "2024-01-26", "event_value": "rated_5_stars"},
    {"user_id": "user_3", "item_id": 136016, "label": 1, "created_at": "2024-01-27", "event_value": "rated_1_stars"},
    {"user_id": "user_3", "item_id": 157296, "label": 3, "created_at": "2024-01-28", "event_value": "rated_3_stars"},
    {"user_id": "user_3", "item_id": 170957, "label": 2, "created_at": "2024-01-29", "event_value": "rated_2_stars"},
]
client.insert_table_rows("pixar_interactions", interactions)
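Since the model learns from the label column, it is worth confirming that each label is in the 1 to 5 range and agrees with its event_value before training. The check below is a minimal sketch based on the conventions of the sample data above; find_interaction_errors is a hypothetical helper, not an SDK function.

```python
from datetime import date


def find_interaction_errors(rows):
    """Return a list of problems found in interaction rows; empty means none."""
    errors = []
    for i, row in enumerate(rows):
        label = row.get("label")
        event = row.get("event_value")  # optional column, may be absent
        if not isinstance(label, int) or not 1 <= label <= 5:
            errors.append(f"row {i}: label {label!r} is outside 1-5")
        elif event is not None and event != f"rated_{label}_stars":
            errors.append(f"row {i}: event_value does not match label {label}")
        try:
            # created_at is stored as a string; confirm it parses as an ISO date.
            date.fromisoformat(str(row.get("created_at", "")))
        except ValueError:
            errors.append(f"row {i}: created_at is not an ISO date")
    return errors


sample = [
    {"user_id": "user_1", "item_id": 1, "label": 5,
     "created_at": "2024-01-15", "event_value": "rated_5_stars"},
    {"user_id": "user_1", "item_id": 3114, "label": 4,
     "created_at": "2024-01-16", "event_value": "rated_2_stars"},
]
print(find_interaction_errors(sample))
```

In your own script you would call find_interaction_errors(interactions) and only insert the rows when the returned list is empty.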

Create the scoring engine

Point the engine at your item table and interaction table, add a GBDT scoring model, and create the engine.

from shaped.autogen.models.engine_config_v2 import EngineConfigV2
from shaped.autogen.models.data_config import DataConfig
from shaped.autogen.models.data_config_interaction_table import DataConfigInteractionTable
from shaped.autogen.models.reference_table_config import ReferenceTableConfig
from shaped.autogen.models.training_config import TrainingConfig
from shaped.autogen.models.models_inner import ModelsInner
from shaped.autogen.models.shaped_internal_recsys_policies_gbdt_gbdt_policy_config import (
    ShapedInternalRecsysPoliciesGbdtGBDTPolicyConfig,
)

pixar_scoring_engine = EngineConfigV2(
    name="pixar_scoring_engine",
    data=DataConfig(
        item_table=DataConfigInteractionTable(
            ReferenceTableConfig(name="pixar_movies")
        ),
        interaction_table=DataConfigInteractionTable(
            ReferenceTableConfig(name="pixar_interactions")
        ),
    ),
    training=TrainingConfig(
        models=[
            ModelsInner(
                ShapedInternalRecsysPoliciesGbdtGBDTPolicyConfig(
                    policy_type="gbdt",
                    name="click_through_rate",
                )
            )
        ],
    ),
)
client.create_engine(engine_config=pixar_scoring_engine)

9. Next steps

Once your engine is up and running, you’re ready to integrate it into your application. Read the Queries guide to learn how to get search results using your engine.