Python SDK
The Python SDK is generated from the Shaped API schema, with rich types and inline descriptions so you can use your IDE’s IntelliSense/auto‑completion to discover methods, parameters, and return types as you work.
Install
Install the SDK from PyPI:
pip install shaped
Verify the installation:
pip show shaped
Instantiate the client
The core entry point is the Client class. You typically construct it once at
application startup and reuse it.
from shaped import Client
client = Client(api_key="YOUR_API_KEY")
To keep the key out of source code, store it in an environment variable and read it at runtime:
import os
from shaped import Client
api_key = os.environ["SHAPED_API_KEY"]
client = Client(api_key=api_key)
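Reading the variable with `os.environ[...]` raises a bare `KeyError` when it is unset. A minimal sketch of a friendlier lookup follows; `load_api_key` is a hypothetical helper, not part of the SDK, and it uses only the standard library.

```python
import os


def load_api_key(env_var: str = "SHAPED_API_KEY") -> str:
    """Return the API key from the environment, or raise a descriptive error.

    Hypothetical helper (not part of the SDK): it only wraps os.environ.
    """
    # Treat a missing variable and a blank value the same way.
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Environment variable {env_var} is not set or is empty; "
            "export it before starting the application."
        )
    return api_key
```

The returned value can then be passed to the constructor shown above, for example `Client(api_key=load_api_key())`.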
Load data from raw records
This section shows how to:
- Define a custom table schema
- Insert raw Python records into the table
1. Create a custom table
Use a CUSTOM schema when you want to control the column definitions and
upload data directly:
table_config = {
    "schema_type": "CUSTOM",
    "name": "pixar_movies",
    "column_schema": {
        "item_id": "Int64",
        "movie_title": "String",
        "poster_url": "String",
        "description": "String",
        "release_date": "String",
        "cast": "Array(String)",
    },
}
client.create_table(table_config)
2. Insert raw records
You can insert Python dicts directly using insert_table_rows:
records = [
    {
        "item_id": 187541,
        "movie_title": "Incredibles 2 (2018)",
        "poster_url": "https://m.media-amazon.com/images/M/MV5BMTEzNzY0OTg0NTdeQTJeQWpwZ15BbWU4MDU3OTg3MjUz._V1_QL75_UX380_CR0,0,380,562_.jpg",
        "description": "The Incredibles family takes on a new mission which involves a change in family roles.",
        "release_date": "2018-06-15",
        "cast": ["Craig T. Nelson", "Holly Hunter"],
    },
    {
        "item_id": 177765,
        "movie_title": "Coco (2017)",
        "poster_url": "https://m.media-amazon.com/images/M/MV5BMDIyM2E2NTAtMzlhNy00ZGUxLWI1NjgtZDY5MzhiMDc5NGU3XkEyXkFqcGc@._V1_QL75_UY562_CR7,0,380,562_.jpg",
        "description": "Aspiring musician Miguel enters the Land of the Dead to find his great‑great‑grandfather.",
        "release_date": "2017-11-22",
        "cast": ["Anthony Gonzalez", "Gael García Bernal"],
    },
]
client.insert_table_rows("pixar_movies", records)
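Because records are plain dicts, a typo in a column name or a wrong value type is easy to miss until upload. The following is a minimal sketch of a client-side check against the `column_schema` defined earlier. The mapping from type names such as `Int64` to Python types is an assumption made for illustration, and `validate_records` is a hypothetical helper rather than an SDK function.

```python
from typing import Any


def _matches(type_name: str, value: Any) -> bool:
    """Check a value against a column type name (assumed, illustrative mapping)."""
    if type_name == "Int64":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if type_name == "String":
        return isinstance(value, str)
    if type_name == "Array(String)":
        return isinstance(value, list) and all(isinstance(v, str) for v in value)
    # Type names this sketch does not know about are not checked.
    return True


def validate_records(
    column_schema: dict[str, str], records: list[dict[str, Any]]
) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for i, record in enumerate(records):
        # Every schema column should be present with a matching type.
        for column, type_name in column_schema.items():
            if column not in record:
                problems.append(f"record {i}: missing column '{column}'")
            elif not _matches(type_name, record[column]):
                problems.append(f"record {i}: column '{column}' is not {type_name}")
        # Columns that the schema does not declare are reported too.
        for column in record:
            if column not in column_schema:
                problems.append(f"record {i}: unexpected column '{column}'")
    return problems
```

Calling `validate_records(table_config["column_schema"], records)` before `insert_table_rows` and refusing to upload when the list is non-empty keeps malformed rows out of the table. Whether the service itself rejects such rows is not covered here.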
Create engines with vector embeddings
Engines define how data is indexed and scored. A common pattern is to:
- Point the engine at a table
- Configure embeddings for semantic / vector search
- Create the engine to start encoding
1. Define the engine
from shaped.autogen.models.engine_config_v2 import EngineConfigV2
from shaped.autogen.models.data_config import DataConfig
semantic_search_engine = EngineConfigV2(
    name="semantic_search",
    data=DataConfig(),
)
2. Connect to an item table
from shaped.autogen.models.data_config_interaction_table import DataConfigInteractionTable
from shaped.autogen.models.reference_table_config import ReferenceTableConfig
semantic_search_engine.data = DataConfig(
    item_table=DataConfigInteractionTable(
        ReferenceTableConfig(name="pixar_movies")
    )
)
3. Configure an embedding index
Choose the text fields to encode and the embedding model:
from shaped.autogen.models.index_config import IndexConfig
from shaped.autogen.models.embedding_config import EmbeddingConfig
from shaped.autogen.models.encoder import Encoder
from shaped.autogen.models.hugging_face_encoder import HuggingFaceEncoder
fields_to_encode = ["movie_title", "description"]
embedding_model = "sentence-transformers/all-MiniLM-L6-v2"
semantic_search_engine.index = IndexConfig(
    embeddings=[
        EmbeddingConfig(
            name="movie_text_embedding",
            encoder=Encoder(
                HuggingFaceEncoder(
                    model_name=embedding_model,
                    item_fields=fields_to_encode,
                )
            ),
        )
    ]
)
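A misspelled field name in `item_fields` may only surface once encoding starts. As a sketch, the fields can be checked against the table's column schema before the engine is created. `check_fields_to_encode` is a hypothetical helper, and the rule that only `String` columns are suitable for a text encoder is an assumption of this example, not a documented SDK constraint.

```python
def check_fields_to_encode(column_schema: dict[str, str], fields: list[str]) -> None:
    """Raise ValueError if a field is absent from the schema or is not a String column.

    Assumption: only columns declared as "String" are meant for text encoding.
    """
    missing = [f for f in fields if f not in column_schema]
    non_text = [
        f for f in fields if f in column_schema and column_schema[f] != "String"
    ]
    if missing or non_text:
        raise ValueError(
            f"missing columns: {missing}; non-String columns: {non_text}"
        )
```

With the `pixar_movies` schema, `check_fields_to_encode(table_config["column_schema"], ["movie_title", "description"])` returns without raising.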
4. Create the engine (start encoding)
client.create_engine(engine_config=semantic_search_engine)
Once the engine reaches the ACTIVE status, it is ready to serve queries.
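Waiting for the ACTIVE status can be sketched as a generic polling loop. The SDK call that returns an engine's status is not shown in this guide, so the example takes it as a caller-supplied `get_status` function. `wait_for_status`, its defaults, and the `sleep` hook are all hypothetical choices for illustration.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    target: str = "ACTIVE",
    timeout_s: float = 1800.0,
    poll_interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns `target` or the timeout elapses.

    get_status is supplied by the caller; this sketch does not assume
    which SDK method provides the engine status.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == target:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"status is still {status!r} after {timeout_s} seconds"
            )
        # Pause between polls so the API is not queried in a tight loop.
        sleep(poll_interval_s)
```

The loop treats every non-target status as "keep waiting". If the service reports a terminal failure status, checking for it inside `get_status` and raising there avoids waiting for the full timeout.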
Load data from connectors (MongoDB example)
Instead of pushing records directly from your application, you can configure a MongoDB connector table and let Shaped pull data on a schedule.
The connector can also be configured via the Tables API or CLI; either way, you typically query the synced table via the SDK.
1. Define a MongoDB table
Use the create_table method with table options to supply your MongoDB credentials.
table_config = {
    "name": "mongodb_dataset",
    "schema_type": "MONGODB",
    "collection": "movies",
    "database": "movielens",
    "mongodb_connection_string": "mongodb://user:password@host:port/database",
    "start_date": "2024-01-01",
}
client.create_table(table_config)
After the connector finishes syncing, the table mongodb_dataset is available
to engines and queries, just like a custom table.
See the Connector Reference for a full list of external data sources you can sync with.
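The connection string in the table config embeds credentials, so it is worth assembling it at runtime rather than committing it. The sketch below builds the URI from environment variables and percent-encodes the username and password, which the MongoDB connection string format requires when they contain reserved characters such as `@`, `:`, or `/`. The variable names (`MONGODB_USER`, `MONGODB_PASSWORD`, `MONGODB_HOST`, `MONGODB_PORT`) and both helper functions are hypothetical.

```python
import os
from urllib.parse import quote


def build_mongodb_uri(
    user: str, password: str, host: str, port: int, database: str
) -> str:
    """Assemble a mongodb:// URI with percent-encoded credentials."""
    # safe="" makes quote() escape every reserved character, including "/".
    return (
        f"mongodb://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


def mongodb_uri_from_env(database: str) -> str:
    """Read connection details from (hypothetical) environment variables."""
    return build_mongodb_uri(
        user=os.environ["MONGODB_USER"],
        password=os.environ["MONGODB_PASSWORD"],
        host=os.environ["MONGODB_HOST"],
        # 27017 is MongoDB's default port.
        port=int(os.environ.get("MONGODB_PORT", "27017")),
        database=database,
    )
```

The result can be assigned to the `mongodb_connection_string` key in place of the literal string, for example `mongodb_uri_from_env("movielens")`.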
2. Use the connector table from Python
You typically reference the connector table when configuring an engine or writing ShapedQL. For example, a semantic search engine on MongoDB data:
from shaped.autogen.models.engine_config_v2 import EngineConfigV2
from shaped.autogen.models.data_config import DataConfig
from shaped.autogen.models.data_config_interaction_table import DataConfigInteractionTable
from shaped.autogen.models.reference_table_config import ReferenceTableConfig
from shaped.autogen.models.index_config import IndexConfig
from shaped.autogen.models.embedding_config import EmbeddingConfig
from shaped.autogen.models.encoder import Encoder
from shaped.autogen.models.hugging_face_encoder import HuggingFaceEncoder
mongo_engine = EngineConfigV2(
    name="mongodb_semantic_search",
    data=DataConfig(
        item_table=DataConfigInteractionTable(
            ReferenceTableConfig(name="mongodb_dataset")
        )
    ),
)
mongo_engine.index = IndexConfig(
    embeddings=[
        EmbeddingConfig(
            name="mongo_text_embedding",
            encoder=Encoder(
                HuggingFaceEncoder(
                    model_name="sentence-transformers/all-MiniLM-L6-v2",
                    item_fields=["document"],  # JSON document column
                )
            ),
        )
    ]
)
client.create_engine(engine_config=mongo_engine)
For more details on table options, see the MongoDB connector docs.
Query data with the fluent builder
The Python SDK exposes a fluent query builder that compiles to ShapedQL. This is the recommended way to build most ranking queries in application code.
1. Basic lexical search
from shaped import RankQueryBuilder, TextSearch
query = (
    RankQueryBuilder()
    .from_entity("item")
    .retrieve(
        TextSearch(
            input_text_query="$query",
            mode={"type": "lexical"},
            limit=50,
        )
    )
    .limit(20)
    .build()
)
results = client.execute_query(
    engine_name="text_search",
    query=query,
    parameters={"query": "Incredibles"},
    return_metadata=True,
)
2. Vector / semantic search
from shaped import RankQueryBuilder, TextSearch
query = (
    RankQueryBuilder()
    .from_entity("item")
    .retrieve(
        TextSearch(
            input_text_query="$query",
            mode={"type": "vector", "text_embedding_ref": "movie_text_embedding"},
            limit=50,
        )
    )
    .limit(20)
    .build()
)
results = client.execute_query(
    engine_name="semantic_search",
    query=query,
    parameters={"query": "animated superhero family"},
    return_metadata=True,
)
3. Hybrid search with multiple retrievers
from shaped import RankQueryBuilder, TextSearch
query = (
    RankQueryBuilder()
    .from_entity("item")
    .retrieve(
        [
            TextSearch(
                input_text_query="$query",
                mode={"type": "lexical"},
                limit=50,
                name="lexical_search",
            ),
            TextSearch(
                input_text_query="$query",
                mode={"type": "vector", "text_embedding_ref": "movie_text_embedding"},
                limit=50,
                name="vector_search",
            ),
        ]
    )
    .limit(20)
    .build()
)
results = client.execute_query(
    engine_name="hybrid_search",
    query=query,
    parameters={"query": "Pixar movies about family"},
    return_metadata=True,
)
4. Personalized search with a value model
from shaped import RankQueryBuilder, TextSearch
query = (
    RankQueryBuilder()
    .from_entity("item")
    .retrieve(
        [
            TextSearch(
                input_text_query="$query",
                mode={"type": "lexical"},
                limit=50,
                name="lexical_search",
            ),
            TextSearch(
                input_text_query="$query",
                mode={"type": "vector", "text_embedding_ref": "movie_text_embedding"},
                limit=50,
                name="vector_search",
            ),
        ]
    )
    .score(
        value_model="click_through_rate",
        input_user_id="$user_id",
        input_interactions_item_ids="$interaction_item_ids",
    )
    .limit(20)
    .build()
)
results = client.execute_query(
    engine_name="personalized_search",
    query=query,
    parameters={
        "query": "Pixar",
        "user_id": "user1",
        "interaction_item_ids": ["187541", "177765", "1"],
    },
    return_metadata=True,
)
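In the queries above, each `$name` placeholder is filled from the matching key in `parameters`. One way to catch a forgotten or misspelled key before calling `execute_query` is to keep the builder arguments in a plain dict and compare its placeholders with the parameters. The sketch below does that with the standard library only. It does not inspect the object returned by `build()`, whose structure is not described here, and `find_placeholders` and `check_parameters` are hypothetical helpers.

```python
import re
from typing import Any

# A placeholder is a whole string of the form "$name".
_PLACEHOLDER = re.compile(r"^\$([A-Za-z_][A-Za-z0-9_]*)$")


def find_placeholders(obj: Any) -> set[str]:
    """Recursively collect placeholder names from nested dicts, lists, and strings."""
    names: set[str] = set()
    if isinstance(obj, str):
        match = _PLACEHOLDER.match(obj)
        if match:
            names.add(match.group(1))
    elif isinstance(obj, dict):
        for value in obj.values():
            names |= find_placeholders(value)
    elif isinstance(obj, (list, tuple)):
        for value in obj:
            names |= find_placeholders(value)
    return names


def check_parameters(
    template: Any, parameters: dict[str, Any]
) -> tuple[list[str], list[str]]:
    """Return (missing, unused) parameter names, each sorted alphabetically."""
    used = find_placeholders(template)
    provided = set(parameters)
    return sorted(used - provided), sorted(provided - used)
```

For the personalized query, the template would hold `"$query"`, `"$user_id"`, and `"$interaction_item_ids"`; a non-empty `missing` list signals a parameter that still needs to be supplied.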
Next steps
- Use case guides: See Text search, Semantic search, Hybrid search, and Personalized search for end‑to‑end flows.
- Connectors: See the MongoDB connector and other connectors in the Connectors overview.