Quickstart
In this tutorial, you'll learn how to upload your first data table and train your first engine. You can do it via the CLI or in the console.
- CLI
- Console
Install the CLI
The command-line interface is the fastest way to update your config in Shaped:
pip install shaped
Initialize the client with your API key
Add your API key to authenticate into your account:
shaped init --api-key <YOUR_API_KEY>
Download sample dataset and engine config
To start quickly, use our sample dataset, based on MovieLens, to build a semantic search model.
Download the sample dataset and engine config from S3:
curl -L \
-O "https://shaped-onboarding-demo-datasets.s3.us-east-2.amazonaws.com/movielens/movielens_items.jsonl" \
-O "https://shaped-onboarding-demo-datasets.s3.us-east-2.amazonaws.com/movielens/movielens_semantic_search.yaml"
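Before uploading, it can help to confirm that the download is valid JSON Lines (one JSON object per line). The following is a minimal sketch using only the Python standard library; `check_jsonl` is a hypothetical helper written for this tutorial, not part of the Shaped CLI, and it assumes the file sits in the current directory.

```python
import json
from pathlib import Path


def check_jsonl(path):
    """Return (record_count, sorted_field_names) for a JSON Lines file.

    Raises ValueError if any non-blank line is not a JSON object.
    """
    fields = set()
    count = 0
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"line {line_number}: invalid JSON ({exc.msg})"
                ) from exc
            if not isinstance(record, dict):
                raise ValueError(f"line {line_number}: expected a JSON object")
            fields.update(record)  # collect the keys seen so far
            count += 1
    return count, sorted(fields)


if __name__ == "__main__":
    # File name taken from the curl command above.
    sample = Path("movielens_items.jsonl")
    if sample.exists():
        count, fields = check_jsonl(sample)
        print(f"{count} records; fields: {', '.join(fields)}")
```

If the script raises a `ValueError`, the download was probably truncated or an error page was saved in place of the dataset; re-running the curl command is a reasonable first step.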
Upload your dataset
Upload the dataset that you downloaded locally:
shaped create-table-from-uri --name movielens_items --path movielens_items.jsonl --type jsonl
Create your engine
Every relevance engine is configured with YAML. Upload the config in movielens_semantic_search.yaml using the CLI to create a semantic search engine:
shaped create-engine --file movielens_semantic_search.yaml
Full engine config
version: v2
name: movielens_semantic_search
data:
  item_table:
    name: movielens_items
index:
  lexical_search: # enables BM25 lexical search on item fields
    item_fields:
      - movie_title
      - description
      - cast
      - genres
      - writers
      - directors
      - interests
  embeddings:
    - name: content_embedding # enables vector search
      encoder:
        type: hugging_face
        model_name: sentence-transformers/all-MiniLM-L6-v2
        item_fields:
          - movie_title
          - description
          - cast
          - genres
          - writers
          - directors
          - interests
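The config indexes seven item fields, so it may be worth checking that the uploaded records actually contain them. Here is a hedged sketch, again standard library only: `count_missing` is a hypothetical helper, and the field list is copied by hand from the config rather than parsed from the YAML file.

```python
import json
from pathlib import Path

# Field names copied from the lexical_search and embeddings sections
# of movielens_semantic_search.yaml.
CONFIG_FIELDS = [
    "movie_title",
    "description",
    "cast",
    "genres",
    "writers",
    "directors",
    "interests",
]


def count_missing(path, fields):
    """Count, per field, how many records lack it or hold an empty value."""
    missing = {field: 0 for field in fields}
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            for field in fields:
                # Treat absent keys, null, empty strings and empty lists as missing.
                if record.get(field) in (None, "", []):
                    missing[field] += 1
    return missing


if __name__ == "__main__":
    sample = Path("movielens_items.jsonl")
    if sample.exists():
        for field, gaps in count_missing(sample, CONFIG_FIELDS).items():
            print(f"{field}: {gaps} records without a value")
```

A field with many gaps is not necessarily an error, but it contributes little to either lexical or vector search for those records.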
Step 1: Get the data from GitHub
We'll use the MovieLens dataset, enriched with data from IMDb, to build a semantic search model.
Download the dataset: movielens.jsonl
Step 2: Upload the data as a new table
- Open your Shaped dashboard
- Click Tables in the left pane
- Click Add new table in the top right
Step 3: Configure a new engine
- In the left pane, go to Engines
- Click Upload a new engine in the top right
- Paste the following YAML into the query editor:
version: v2
name: movielens_movie_recommendation_v2
data:
  # specify a Shaped table to represent the items in your catalog
  item_dataset:
    name: movies_2018
# define how to index your data
index:
  # specify a model to use to generate embeddings
  embeddings:
    - name: text_embedding
      encoder:
        type: hugging_face
        model_name: sentence-transformers/all-MiniLM-L6-v2
        item_fields:
          - movie_title
Step 4: Query your engine
- In the left pane, go to the Query section
- Copy the query below into the left-hand side
- Press Run
query:
  type: rank_items
  retrieve:
    name: similar_items
    embedding_ref: als
    type: item_similarity
    query_encoder:
      input_item_id: $param.item_id
      type: item_attribute_pooling
params:
  item_id: db1234
Once your engine is up and running, you’re ready to integrate it into your application. Read the Queries guide to learn how to get search results using your engine.