Engine basics

An engine controls the indexing logic of your retrieval systems.

Creating an engine

You can create an engine in two ways:

  1. Uploading a YAML file via CLI or the Shaped console (recommended)
  2. Making a POST request to the /engines endpoint with your engine config

We recommend starting with a simple engine, containing only the data and index config, and then building up complexity from there.
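
A minimal config of that shape might look like the sketch below. The top-level data and index blocks are described in the next section; the engine name, table name, and embedding fields shown here are illustrative assumptions rather than a verified schema.

  # Minimal engine sketch; field names are illustrative assumptions
  name: my_first_engine
  data:
    item_table: products          # table of candidate items; must include item_id
  index:
    embeddings:
      - name: title_embedding     # embedding built from an item text column
        column: title             # assumed field name

You can save this as a YAML file and upload it via the CLI, or POST it to the /engines endpoint.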

Configuration

Shaped provides an interface to declare your retrieval engines. There are five components to each engine:

  • data - Defines what data the engine should ingest and how it is indexed
  • index - Defines how to create indexes on your data: which columns to use and which embeddings to generate
  • training - Defines additional model training and evaluation
  • deployment - Configures additional parameters for how the engine is deployed and served
  • queries - Defines a set of 'saved queries' that can be used at runtime
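
Put together, an engine config is a single YAML document with one top-level block per component. The skeleton below is a sketch only; the name field and the comments are illustrative, and the contents of each block are covered in the sections that follow.

  # Engine config skeleton: one top-level block per component
  name: my_engine     # assumed field: the engine's identifier
  data:               # what data the engine ingests
  index:              # which columns to index and which embeddings to create
  training:           # additional model training and evaluation
  deployment:         # how the engine is deployed and served
  queries:            # saved queries available at runtime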

Required tables and columns

Every engine must have an item_table to index on.

Required tables:

  • item_table: Required. Holds the candidate items that search and retrieval queries look up. Must include an item_id column.

Optional tables:

  • user_table: Contains user attributes and profiles. Must include a user_id column if provided. Used for personalization based on user features.
  • interaction_table: Contains clicks, transactions, or other user-item interactions. Must include item_id, user_id, and label columns. Used for training collaborative filtering models and computing popularity signals.

Your engine can use these profiles and interaction history to perform personalized retrieval.
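
For example, a data block that wires up all three tables might look like the following sketch. The table names are examples, and the exact layout of the block is an assumption; only the required columns listed above come from this page.

  data:
    item_table: products          # required; must include item_id
    user_table: customers         # optional; must include user_id
    interaction_table: events     # optional; must include item_id, user_id, label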

Patching models and CI/CD

Shaped engines are designed to work with version control and CI/CD pipelines. Updates are applied incrementally with zero-downtime deployment. For detailed information on how patching works and what triggers updates, see GitOps flow and patching.

Vector index configuration and performance

The index block configures which vector indexes are created for your engine to support fast similarity search. Vector indexes use approximate nearest neighbor (ANN) algorithms for fast query performance and update automatically when items change.

You can create multiple embeddings and query them independently or together. Each embedding has a name, which serves as the unique ID and can be passed as the embedding_ref parameter at query time.
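
For example, an index block that defines two named embeddings might look like this sketch. The nested fields are assumptions; the key point is that each embedding's name is the ID you can pass as embedding_ref at query time.

  index:
    embeddings:
      - name: title_embedding     # pass embedding_ref: title_embedding at query time
        column: title             # assumed field: source column for this embedding
      - name: image_embedding     # pass embedding_ref: image_embedding at query time
        column: image_url         # assumed field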

Derived columns

When you provide an interaction_table, Shaped automatically generates two convenience columns:

  • _derived_popular_rank: Popularity score from interaction frequency and recency
  • _derived_chronological_rank: Chronological ranking based on item timestamps

Use these columns in column_order() for retrieval, in ORDER BY score() expressions for scoring, or in WHERE clauses for filtering. See Retrieve popular items for examples.
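
As a rough sketch, a saved query that surfaces popular items could order results by the derived popularity column. The SQL-style syntax, table name, and query layout below are placeholders rather than the exact query language; see Retrieve popular items for real examples.

  queries:
    - name: popular_items           # assumed layout for a saved query
      query: |
        SELECT item_id
        FROM items                  -- placeholder table reference
        ORDER BY _derived_popular_rank
        LIMIT 10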

Evaluation metrics

After training completes, the following metrics are available. These are calculated on a held-out test set:

  • Recall@k: Measures the proportion of relevant items that appear within the top k recommendations. A higher recall indicates the model's ability to retrieve a larger proportion of relevant items.
  • Precision@k: Measures the proportion of recommendations within the top k that are actually relevant. A higher precision indicates the model's ability to surface relevant items early in the ranking.
  • MAP@k (Mean Average Precision@k): Calculates the average precision across multiple queries, considering the order of relevant items within the top k. MAP@k provides a more comprehensive view of ranking quality than precision alone.
  • NDCG@k (Normalized Discounted Cumulative Gain@k): Similar to MAP@k, NDCG@k accounts for the position of relevant items but also considers the relevance scores themselves, giving higher weight to more relevant items appearing at the top.
  • Hit Ratio@k: Represents the percentage of users for whom at least one relevant item is present within the top k recommendations. A high hit ratio signifies the model's effectiveness in satisfying a broad range of user preferences.
  • Coverage@k: Measures the diversity of recommendations by calculating the percentage of unique items recommended across all users within the top k. Higher coverage indicates a wider exploration of your item catalog.
  • Personalization@k: Quantifies the degree of personalization by measuring the dissimilarity of recommendations across different users. Higher personalization suggests that the model tailors recommendations to individual user preferences rather than providing generic results.
  • Average Popularity@k: Provides insights into the model's tendency to recommend popular items by averaging the popularity scores of items within the top k recommendations.

Info: What is k? The k parameter is the number of recommendations considered (e.g., the top 10, 20, etc.). We calculate these metrics across various values of k to assess model performance across different recommendation list sizes.
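
As a quick worked example with hypothetical numbers: if a user has 5 relevant items in the held-out test set and 3 of them appear in the top 10 recommendations, then:

  Recall@10    = 3 / 5  = 0.6
  Precision@10 = 3 / 10 = 0.3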

We calculate these metrics for various user and item segments, including:

  • New users: Evaluates how effectively the model recommends to users with limited interaction history.
  • New items: Assesses the model's ability to surface new or less popular items.
  • Power users: Examines performance for users who engage heavily with your platform.
  • Power items: Analyzes how well the model handles highly popular or trending items.

Segmented analysis reveals model strengths and weaknesses across different user groups and item types.

For information on updating engines with version control and CI/CD, see GitOps flow and patching.