Quickstart
In this tutorial, you'll learn how to upload your first data table and train your first engine. You can do it via the CLI or in the console.
- CLI
- Console
Install the CLI
The command-line interface is the fastest way to update your config in Shaped:
pip install shaped
Initialize the client with your API key
Add your API key to authenticate into your account. If you don't have an API key yet, see Get your API key.
shaped init --api-key <YOUR_API_KEY>
Download sample data and engine config
To get started quickly, use our sample data, based on MovieLens, to build a semantic search model.
Download the sample data and engine config from S3:
curl -L \
-O "https://shaped-onboarding-demo-datasets.s3.us-east-2.amazonaws.com/movielens/movielens_items.jsonl" \
-O "https://shaped-onboarding-demo-datasets.s3.us-east-2.amazonaws.com/movielens/movielens_semantic_search.yaml"
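Before uploading, it can help to confirm that the downloaded file is valid JSON Lines. The following is a minimal sketch using only the Python standard library; `summarize_jsonl` is a hypothetical helper, not part of the Shaped CLI, and it assumes each non-blank line holds one JSON object.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_jsonl(path):
    """Return (record_count, Counter of field names) for a JSON Lines file."""
    field_counts = Counter()
    record_count = 0
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({exc.msg})"
                ) from exc
            if not isinstance(record, dict):
                raise ValueError(f"{path}:{line_number}: expected a JSON object")
            record_count += 1
            field_counts.update(record.keys())
    return record_count, field_counts


if __name__ == "__main__":
    sample = Path("movielens_items.jsonl")
    if sample.exists():
        count, fields = summarize_jsonl(sample)
        print(f"{count} records")
        for name, seen in fields.most_common():
            print(f"  {name}: present in {seen} records")
    else:
        print(f"{sample} not found; run the curl command first")
```

Running it in the download directory prints the record count and how often each field appears, which makes truncated downloads or unexpected field names easy to spot.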
Upload your table
Upload the table that you downloaded locally:
shaped create-table-from-uri --name movielens_items --path movielens_items.jsonl --type jsonl
Create your engine
Every relevance engine is configured with YAML. Upload the config in movielens_semantic_search.yaml using the CLI to create a semantic search engine:
shaped create-engine --file movielens_semantic_search.yaml
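If you want to repeat these steps from a script, the three CLI commands used so far can be wrapped in Python. This is only a sketch: the commands and flags are the ones shown in this tutorial, while the helper names and the `SHAPED_API_KEY` environment variable are assumptions made for illustration, not something the CLI defines. It defaults to a dry run that prints the commands with the key masked.

```python
import os
import shlex
import subprocess


def build_commands(api_key):
    """Return the quickstart CLI commands, in order, as argument lists."""
    return [
        ["shaped", "init", "--api-key", api_key],
        ["shaped", "create-table-from-uri", "--name", "movielens_items",
         "--path", "movielens_items.jsonl", "--type", "jsonl"],
        ["shaped", "create-engine", "--file", "movielens_semantic_search.yaml"],
    ]


def run_quickstart(api_key, dry_run=True):
    """Print the commands (dry run) or execute them, stopping on the first failure."""
    for command in build_commands(api_key):
        if dry_run:
            # Mask the key so it does not end up in terminal history or logs.
            printable = ["***" if part == api_key else part for part in command]
            print(shlex.join(printable))
        else:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # SHAPED_API_KEY is a hypothetical variable name chosen for this sketch.
    key = os.environ.get("SHAPED_API_KEY", "<YOUR_API_KEY>")
    run_quickstart(key, dry_run=True)
```

Call `run_quickstart(key, dry_run=False)` once the printed commands look right; `check=True` makes the script stop if any command exits with an error.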
Full engine config
version: v2
name: movielens_semantic_search
data:
  item_table:
    name: movielens_items
    type: table
index:
  lexical_search: # enables BM25 lexical search on item fields
    item_fields:
      - movie_title
      - description
      - cast
      - genres
      - writers
      - directors
      - interests
  embeddings:
    - name: content_embedding # enables vector search
      encoder:
        type: hugging_face
        model_name: sentence-transformers/all-MiniLM-L6-v2
        batch_size: 256
        item_fields:
          - movie_title
          - description
          - cast
          - genres
          - writers
          - directors
          - interests
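The config refers to seven item fields by name, so a quick local check that the sample records actually carry those fields can catch naming mismatches early. The sketch below assumes that every field listed under `item_fields` should be present in the uploaded records; that expectation, and the helper `missing_field_counts`, are assumptions for illustration rather than documented Shaped behavior.

```python
import json
from pathlib import Path

# Field names copied from the item_fields lists in the config above.
CONFIG_ITEM_FIELDS = [
    "movie_title", "description", "cast", "genres",
    "writers", "directors", "interests",
]


def missing_field_counts(lines, required_fields):
    """Count, per required field, how many records lack it or set it to null."""
    missing = {field: 0 for field in required_fields}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        for field in required_fields:
            if record.get(field) is None:
                missing[field] += 1
    return missing


if __name__ == "__main__":
    data_file = Path("movielens_items.jsonl")
    if data_file.exists():
        with data_file.open(encoding="utf-8") as handle:
            report = missing_field_counts(handle, CONFIG_ITEM_FIELDS)
        for field, count in report.items():
            print(f"{field}: missing or null in {count} records")
    else:
        print(f"{data_file} not found")
```

A field that is missing from every record usually points to a typo in the config or a different column name in the data.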
Step 1: Get the data from GitHub
We'll use the MovieLens table, enriched with data from IMDb, to build a semantic search model.
Download the table data: movies_2018_enriched.jsonl
Step 2: Upload the data as a new table
- Open your Shaped dashboard
- Click Tables in the left pane
- Click Add new table in the top right
Step 3: Configure a new engine
- In the left pane, go to Engines
- Click Upload a new engine in the top right
- Paste the following YAML into the engine config editor:
version: v2
name: movielens_movie_recommendation_v2
data:
  # specify a Shaped table to represent the items in your catalog
  item_table:
    name: movies_2018
    type: table
# define how to index your data
index:
  # specify a model to use to generate embeddings
  embeddings:
    - name: text_embedding
      encoder:
        type: hugging_face
        model_name: sentence-transformers/all-MiniLM-L6-v2
        batch_size: 256
        item_fields:
          - movie_title
Step 4: Query your engine
- In the left pane, go to the Query section
- Copy the query below into the query editor on the left
- Press Run
SELECT *
FROM similarity(embedding_ref='text_embedding', limit=50, encoder='item_attribute_pooling', input_item_id=$item_id)
LIMIT 20
With parameters (replace db1234 with an actual item ID from your table):
{
  "query": "SELECT * FROM similarity(embedding_ref='text_embedding', limit=50, encoder='item_attribute_pooling', input_item_id=$item_id) LIMIT 20",
  "parameters": {
    "item_id": "db1234"
  }
}
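When you move from the query editor to application code, the same payload can be built programmatically so the item ID is passed as a parameter rather than spliced into the query string. The sketch below only constructs and serializes the JSON body shown above; `build_query_payload` is a hypothetical helper, and how the payload is sent to Shaped is left to the Queries guide.

```python
import json

# The query text from this step, with $item_id left as a named parameter.
SIMILARITY_QUERY = (
    "SELECT * FROM similarity(embedding_ref='text_embedding', limit=50, "
    "encoder='item_attribute_pooling', input_item_id=$item_id) LIMIT 20"
)


def build_query_payload(item_id):
    """Return the request body as a dict, with item_id bound as a parameter."""
    if not isinstance(item_id, str) or not item_id:
        raise ValueError("item_id must be a non-empty string")
    return {"query": SIMILARITY_QUERY, "parameters": {"item_id": item_id}}


if __name__ == "__main__":
    # "db1234" is the placeholder ID from the tutorial; use a real ID from your table.
    print(json.dumps(build_query_payload("db1234"), indent=2))
```

Keeping the ID in `parameters` means the query string stays constant across requests and item IDs never need quoting or escaping by hand.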
Once your engine is up and running, you’re ready to integrate it into your application. Read the Queries guide to learn how to get search results using your engine.