Example configurations

This article goes through some common use cases for Shaped engines and how to set up your engine to achieve them. Use these as a reference to build your first engine config or to get more out of the ones you have.

Every engine config is split into five blocks: data, index, training, deployment, and queries.

In most cases, these blocks can be configured independently of each other. The sections below go through each block and some common ways to configure them.
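As a rough sketch of how the five blocks sit together in one file, the following config stitches together fragments from the sections below. It assumes the five blocks are top-level keys of the same YAML document and that the catalog table has title and url_captions columns; neither is confirmed in this article. The query refers to the embedding by its name (text_embedding), which is one place where the blocks depend on each other.

```yaml
# Sketch only: each block is copied from a section of this article.
data:
  item_table:
    name: apparel_catalog
    type: table
  user_table:
    name: customers
    type: table
  interaction_table:
    name: transactions
    type: table

index:
  embeddings:
    - name: text_embedding
      encoder:
        type: hugging_face
        model_name: sentence-transformers/all-MiniLM-L6-v2
        batch_size: 256
        item_fields:
          - title          # assumed column in apparel_catalog
          - url_captions   # assumed column in apparel_catalog

training:
  models:
    - name: elsa_score
      policy_type: elsa
      strategy: early_stopping

deployment:
  data_tier: cold_tier

queries:
  search_products:
    query:
      type: rank
      from: item
      retrieve:
        - type: text_search
          input_text_query: $parameters.query_text
          mode:
            type: vector
            text_embedding_ref: text_embedding  # matches the embedding name in the index block
          limit: 20
    parameters:
      query_text:
        default: ""
```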

Data block

The data block defines what tables and connectors the engine will use.

Basic table connection

You can connect your engine to tables in your Shaped account by name. The following example connects to three tables with no additional options.

data:
  item_table:
    name: apparel_catalog
    type: table
  user_table:
    name: customers
    type: table
  interaction_table: 
    name: transactions
    type: table

Change ID columns to `item_id` and `user_id`

You can use the query option to run an SQL query when the engine fetches rows from a table. Use it to rename columns, add a rating column, and more. For more complex SQL operations such as joins, use an SQL view.

data:
  item_table:
    type: query
    query: |
      select product_id as item_id, color, name, department, tags, updated_at, created_at
      from apparel_catalog
  user_table:
    type: query
    query: |
      select customer_id as user_id, name, gender, most_recent_interaction, joined_at
      from customers
  interaction_table:
    type: query
    query: |
      select article_id as item_id, customer_id as user_id, transaction_type, price as label, created_at
      from transactions
      where created_at < NOW() - INTERVAL '6 months'

Note

Shaped looks up items and users by the item_id and user_id columns, respectively. Your engine will weigh interactions using the label column. The interaction_table also requires a created_at column. Use the query option or an SQL view to include these columns.
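If your interactions table has no natural label column, one option is to derive it in the query. The sketch below assumes transaction_type holds values such as 'purchase' and 'add_to_cart' (hypothetical values, as are the weights) and that the query option accepts a standard SQL CASE expression, which this article does not confirm.

```yaml
data:
  interaction_table:
    type: query
    query: |
      -- Weights and transaction_type values are illustrative assumptions.
      select
        article_id as item_id,
        customer_id as user_id,
        case
          when transaction_type = 'purchase' then 1.0
          when transaction_type = 'add_to_cart' then 0.5
          else 0.1
        end as label,
        created_at
      from transactions
```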

Specify how column features should be handled

The schema_override option allows you to specify how specific features in your tables should be handled. For example, you can specify that movie genres should be interpreted as a sequence of categories rather than a string literal. If you don't include a column in the schema override, its type will be inferred.

data:
  item_table:
    name: movies
    type: table
  schema_override: 
    item:
      id: item_id
      features: # list of columns to use as features
        - name: genre
          type: Sequence[TextCategory]
        - name: poster_url
          type: Image
        - name: movie_title
          type: Text
        - name: movie_age
          type: Numerical
        - name: primary_genre
          type: TextCategory
      created_at: created_at # which column to use as the created_at feature/deduplication

Index block

The index block defines how the engine will use search and embeddings. You can use a pre-trained embedding model from Hugging Face, or train your own embedding model.

Create embeddings for text and image features

The embeddings field is a list, so an engine can define several embeddings for different purposes, such as separate embeddings for text and images. Each entry is specified in the same way, as shown in the example below.

The batch_size parameter controls how many items are processed in each batch during embedding generation. It's recommended to use larger batch sizes (e.g., 256) for text embeddings and smaller batch sizes (e.g., 32) for image embeddings.

index:
  embeddings:
    - name: name_embedding
      encoder:
        type: hugging_face
        model_name: sentence-transformers/modernbert
        batch_size: 256
        item_fields:
          - title
          - url_captions
    - name: image_embedding
      encoder:
        type: hugging_face
        model_name: openai/clip-vit-base-patch32
        batch_size: 32
        item_fields:
          - image_url

Use a pre-trained model

The hugging_face encoder type supports any pre-trained encoder from Hugging Face's Sentence Transformers or CLIP model library. The following example uses a sentence transformer embedding model.

index:
  embeddings:
    - name: text_embedding
      encoder:
        type: hugging_face
        model_name: sentence-transformers/all-MiniLM-L6-v2
        batch_size: 256
        item_fields:
          - title
          - url_captions

Use a trained model as an encoder

If you have trained an embedding model in the training block (e.g., ALS, Two-Tower, beeFormer), you can reference it as an encoder in index.embeddings. This produces embeddings that are trained or fine-tuned on your data, and retrained based on your training schedule.

index:
  embeddings:
    - name: trained_embedding
      encoder:
        type: trained_model
        model_ref: als_score
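The model_ref value presumably has to match a model defined in the training block. The sketch below shows one way the names might line up across blocks. It assumes that ALS is selected with policy_type: als (only elsa and lightgbm appear in this article) and that model_ref takes the model's name; treat both as assumptions to check against the reference.

```yaml
training:
  models:
    - name: als_score        # referenced below by model_ref
      policy_type: als       # assumed value; not shown elsewhere in this article

index:
  embeddings:
    - name: als_embedding    # queries refer to this name via embedding_ref
      encoder:
        type: trained_model
        model_ref: als_score
```

With these names, the similar items query later in this article could retrieve with embedding_ref: als_embedding.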

Run BM25 lexical search

The lexical_search field configures BM25 keyword search. Use lexical_search for exact term matching, such as looking up product names, SKUs, or IDs.

index:
  lexical_search:
    item_fields:
      - product_name
      - product_department
      - product_id
      - sku_id
      - alternate_sku

Training block

The training block defines how to train scoring models on your data.

An engine can base scoring on a single model or run multiple model policies at the same time. You can combine scores from multiple models using a value model expression.

Each scoring policy trains on your data on a given schedule.

Train a model to score item similarity

training: 
  models: 
    - name: my_elsa_score_model
      policy_type: elsa
      strategy: early_stopping

Train multiple models

You can train multiple models in a single engine. Each model will be trained independently and can be used in queries.

training: 
  models: 
    - name: elsa_score
      policy_type: elsa
      strategy: early_stopping
    - name: lgbm_score
      policy_type: lightgbm

To combine scores from multiple models, you can use value model expressions in your queries.
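As an illustration, a score step that blends the two models above might look like the following. The weighted-sum syntax and the weights are assumptions; this article only shows value_model with a single name, so check the Query Reference for the expression grammar your engine supports.

```yaml
queries:
  blended_feed:              # hypothetical query name
    query:
      type: rank
      from: item
      retrieve:
        - type: column_order
          columns:
            - name: _derived_popular_rank
              ascending: true
          limit: 50
      score:
        type: score_ensemble
        # Assumed expression syntax: weighted sum of the model names from the training block.
        value_model: 0.7 * elsa_score + 0.3 * lgbm_score
        input_user_id: $parameters.user_id
      limit: 20
    parameters:
      user_id:
        default: null
```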

Deployment block

The deployment block allows you to configure how the engine is deployed and served. This includes settings for compute resources, scaling, and runtime parameters.

Basic deployment

Minimal configuration using default settings (cold_tier storage, no autoscaling):

deployment:
  data_tier: cold_tier

Deployment with autoscaling

Enable autoscaling to handle variable traffic loads. This example scales based on requests per second:

deployment:
  data_tier: cold_tier
  autoscaling:
    min_replicas: 2
    max_replicas: 10
    policy:
      type: requests_per_second
      target_requests: 15.0

You can also scale based on latency:

deployment:
  data_tier: cold_tier
  autoscaling:
    min_replicas: 1
    max_replicas: 20
    policy:
      type: latency
      target_seconds: 0.5

High-performance deployment

Use fast_tier for Redis-based storage and configure online store caching for low-latency requirements:

deployment:
  data_tier: fast_tier
  server:
    worker_count: 4
  online_store:
    interaction_max_per_user: 50
    interaction_expiration_days: 30
  pagination:
    page_expiration_in_seconds: 300

Deployment with canary rollout

Use the canary rollout strategy for safe, gradual deployments with automatic evaluation:

deployment:
  data_tier: cold_tier
  rollout:
    strategy:
      type: canary
      evaluation_period_minutes: 15

For immediate deployments without canary evaluation, use the recreate strategy:

deployment:
  data_tier: cold_tier
  rollout:
    strategy:
      type: recreate
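The options in this section are shown separately, but a production deployment would likely set several of them at once. The following sketch simply merges the earlier fragments and assumes the keys can be combined freely under deployment, which this article does not state explicitly.

```yaml
deployment:
  data_tier: fast_tier
  server:
    worker_count: 4
  autoscaling:
    min_replicas: 2
    max_replicas: 10
    policy:
      type: latency
      target_seconds: 0.5
  online_store:
    interaction_max_per_user: 50
    interaction_expiration_days: 30
  rollout:
    strategy:
      type: canary
      evaluation_period_minutes: 15
```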

Queries block

The queries block contains a list of saved queries. See the Query Reference to learn how to make queries.

Similar items query

Find items similar to a given item using collaborative filtering:

queries: 
  get_similar_items: 
    query: 
      type: rank
      from: item
      retrieve:
        - type: similarity
          embedding_ref: als_embedding
          query_encoder:
            type: precomputed_item
            input_item_id: $parameters.item_id
          limit: 20
    parameters: 
      item_id:
        default: null

Search query

Search for items using semantic vector search:

queries:
  search_products:
    query:
      type: rank
      from: item
      retrieve:
        - type: text_search
          input_text_query: $parameters.query_text
          mode:
            type: vector
            text_embedding_ref: text_embedding
          limit: 20
    parameters:
      query_text:
        default: ""

Personalized feed query

Generate a personalized feed combining multiple retrieval strategies and scoring:

queries:
  personalized_feed:
    query:
      type: rank
      from: item
      retrieve:
        - type: similarity
          embedding_ref: item_content_embedding
          query_encoder:
            type: interaction_round_robin
            input_user_id: $parameters.user_id
          limit: 50
        - type: similarity
          embedding_ref: als_embedding
          query_encoder:
            type: precomputed_user
            input_user_id: $parameters.user_id
          limit: 50
        - type: column_order
          columns:
            - name: _derived_popular_rank
              ascending: true
          limit: 50
      filter:
        predicate: prebuilt('exclude_seen', input_user_id='$parameters.user_id')
      score:
        type: score_ensemble
        value_model: click_through_rate
        input_user_id: $parameters.user_id
        input_interactions_item_ids: $parameters.interaction_item_ids
      reorder:
        diversity: 0.3
        exploration: 0.2
      limit: 20
    parameters:
      user_id:
        default: null
      interaction_item_ids:
        default: []

Reranking query

Rerank a list of candidate items using a scoring model:

queries:
  rerank_candidates:
    query:
      type: rank
      from: item
      retrieve:
        - type: candidate_ids
          item_ids: $parameters.candidate_item_ids
      score:
        type: score_ensemble
        value_model: click_through_rate
        input_user_id: $parameters.user_id
        input_interactions_item_ids: $parameters.interaction_item_ids
      limit: 10
    parameters:
      candidate_item_ids:
        default: []
      user_id:
        default: null
      interaction_item_ids:
        default: []
