Policy Configuration

Shaped allows users to configure the internal models that make up a ranking query. These internal models are referred to as policies, and they enable you to customize how candidate items are retrieved from the internal vector store and how those items are reranked and ordered in the final results. This gives you greater flexibility and control over how your results are ranked.

This guide explains how to specify the scoring_policy and embedding_policy configurations (the two currently configurable policies). It also provides a list of the supported policy types, along with their descriptions and configuration options.

What is an embedding policy?

An embedding policy governs how items (or users) are represented as dense vectors in the system. These vector embeddings capture the inherent characteristics of the items, users, or interactions, enabling the system to retrieve and rank items efficiently. The embeddings are generated based on the chosen embedding_policy which can represent anything from a zero-shot deep learning model, to a fine-tuned collaborative filtering method.

Embedding policies are typically used in the retrieval phase of the ranking process, where the goal is to narrow down a large pool of items to a smaller candidate set that can be scored later. We use approximate nearest neighbor (ANN) search techniques to efficiently find items that are most similar to the query.

Key Concepts:

Embeddings: Low-dimensional representations of items or users
Vector Store: A store that holds precomputed embeddings and allows fast retrieval based on similarity.
Approximate Nearest Neighbor (ANN): A technique used to quickly find items in the vector store that are most similar to the query.

For example, in a Two-Tower model, one tower represents user features and the other tower represents item features, and both are embedded into the same vector space for similarity-based retrieval.

What is a scoring policy?

A scoring policy governs how candidate items (retrieved by the embedding policy) are ranked based on their relevance to the query or user. Once a candidate set is generated during the retrieval phase, the scoring policy assigns a relevance score to each item based on predefined criteria or objectives, such as maximizing user engagement, optimizing click-through rates (CTR), or improving other business metrics.

Scoring policies often use machine learning models (such as gradient-boosted trees or neural networks) to predict these relevance scores, and the items are then ordered according to their predicted scores before being returned to the user.

Key Concepts:

Relevance Scoring: Predicting how relevant or important a candidate item is to a specific query or user.
Machine Learning Models: Used to train the scoring function, often leveraging features from user profiles, item metadata, and historical interactions.
Objective Functions: Define what the model optimizes for, such as maximizing engagement or conversion rates.

For instance, a LightGBM scoring policy might use a gradient-boosting algorithm to score and rank items based on features like user behavior, item attributes, and interaction history.

Configuring Your Model Policies

Basic Structure

A typical model configuration follows the structure below, allowing you to specify both the scoring_policy and embedding_policy that govern the model's behavior.

model:
    name: <your-model-name>
    policy_configs:
        scoring_policy:
            policy_type: <scoring-policy-type>
            <policy-kwargs>
        embedding_policy:
            policy_type: <embedding-policy-type>
            <policy-kwargs>

info

By default, both the scoring_policy and embedding_policy are set to auto-tune. This means the system will automatically select the optimal policy and hyperparameters for your model using cross-validation and other tuning techniques. We will explain how to customize this auto-tune behavior later in the guide.

Configuration Example

Here is an example of a model configuration that uses a LightGBM scoring policy and a Two-Tower embedding policy, below that we also show an example using a score ensemble.

model:
    name: <your-model-name>
    policy_configs:
        scoring_policy:
            policy_type: lightgbm
            max_depth: 6
            objective: binary
            num_leaves: 26
            n_estimators: 140
            learning_rate: 0.0019
        embedding_policy:
            policy_type: two-tower
            text_encoding: true
            pool_fn: max

model:
    name: <your-model-name>
    policy_configs:
        scoring_policy:
            policy_type: score-ensemble
            policies:
               - policy_type: lightgbm
                 max_depth: 6
                 objective: binary
               - policy_type: bert4rec
                 hidden_size: 128
        embedding_policy:
            policy_type: two-tower
            pool_fn: max

Policy Catalog

Below we provide a list of supported policy types, along with descriptions and configuration options.

Gradient Boosted Trees

Gradient Boosted Trees are models based on the gradient boosting framework, which builds an ensemble of decision trees to make predictions. These models are widely used in recommendation systems due to their ability to capture complex patterns and interactions in data. A notable ranking algorithm, LambdaMART, can be implemented using LightGBM or XGBoost engines.

LightGBM

Policy Type: lightgbm
Supports: scoring_policy
Description: LightGBM is a popular gradient boosting framework that is optimized for ranking tasks in recommendation systems. It efficiently handles large datasets and is highly customizable, making it ideal for tasks that require fine-tuned control over decision trees.
Key Configurations:
- objective: The loss function to optimize. Supported values include "binary", "regression", "poisson", "lambdarank", and "rank_xendcg".
- n_estimators: Number of boosting iterations. Defaults to 1000.
- max_depth: Maximum depth for each tree. Defaults to -1 (no limit).
- num_leaves: Maximum number of leaves in each tree. Defaults to 31.
- min_child_weight: Minimum sum of hessian in one leaf. Defaults to 1e-3.
- learning_rate: Shrinkage rate. Defaults to 0.1.
- colsample_bytree: Fraction of columns to sample on each iteration. Defaults to 0.9.
- subsample: Fraction of data to sample on each iteration. Defaults to 0.8.
- subsample_freq: Frequency for bagging. Defaults to 5.
- calibrate: Whether to calibrate the model for output probability. Defaults to False.
Reference: LightGBM: A Highly Efficient Gradient Boosting Decision Tree

policy_configs:
  scoring_policy:
    policy_type: lightgbm
    objective: lambdarank
    n_estimators: 1000
    max_depth: -1
    num_leaves: 31
    min_child_weight: 1e-3
    learning_rate: 0.1
    colsample_bytree: 0.9
    subsample: 0.8
    subsample_freq: 5

XGBoost

Policy Type: xgboost
Supports: scoring_policy
Description: XGBoost is an efficient and scalable implementation of gradient-boosted decision trees, optimized for structured data and ranking tasks. Known for its speed and performance, XGBoost is highly customizable and often yields state-of-the-art results in recommendation systems.
Key Configurations:
- mode: Determines whether the model is used as a "classifier" or "regressor". Defaults to "classifier".
- n_estimators: Number of boosting rounds. Defaults to 100.
- max_depth: Maximum depth of each tree. Defaults to 16.
- max_leaves: Maximum number of leaves per tree. Defaults to 0 (no limit).
- n_jobs: Number of parallel jobs for training. Defaults to -1 (use all available CPUs).
- learning_rate: Step size used for each update. Defaults to 0.2.
- min_child_weight: Minimum sum of instance weights needed in a child. Defaults to 1.
Reference: XGBoost: A Scalable Tree Boosting System

policy_configs:
  scoring_policy:
    policy_type: xgboost
    mode: classifier
    n_estimators: 100
    max_depth: 16
    max_leaves: 0
    n_jobs: -1
    learning_rate: 0.2
    min_child_weight: 1

Neural Scoring Models

Neural scoring models use deep learning architectures to learn complex patterns and interactions in data. They are widely used in recommendation systems for their ability to capture non-linear relationships and high-dimensional feature interactions. These models can process a mix of structured and unstructured data, providing flexibility for a variety of use cases.

Wide & Deep Model

Policy Type: wide-deep
Supports: scoring_policy
Description: The Wide & Deep model combines a wide linear model and a deep neural network to capture both memorization (wide part) and generalization (deep part) capabilities. The wide part of the model is responsible for learning simple feature interactions (e.g., cross-product transformations), while the deep part captures more complex and non-linear interactions. This hybrid approach works well for tasks with rich feature interactions and complex patterns, such as user-item recommendations or click-through rate (CTR) prediction.
Key Configurations:
- wide_features: Features to include in the wide part of the model.
- deep_hidden_units: Number of hidden units in the deep neural network.
- activation_fn: Activation function used in the deep network layers (e.g., ReLU, Sigmoid).
- val_split: Proportion of the dataset to be used for validation during training. Defaults to 0.1.
- n_epochs: Number of training epochs. Defaults to 1.
- num_workers: Number of worker threads used during data loading. Defaults to 0.
Reference: Wide & Deep Learning for Recommender Systems

policy_configs:
  scoring_policy:
    policy_type: wide-deep
    wide_features: [age, occupation, user_interactions]
    deep_hidden_units: [128, 64, 32]
    activation_fn: relu
    val_split: 0.1
    n_epochs: 10
    num_workers: 4

DeepFM (Deep Factorization Machine)

Policy Type: deepfm
Supports: scoring_policy
Description: DeepFM is a neural network-based model that combines the power of Factorization Machines (FM) for capturing second-order feature interactions and a deep neural network (DNN) for modeling higher-order feature interactions. This model excels in tasks where both memorization (through FM) and generalization (through DNN) are important, such as in ad ranking, user recommendations, and product suggestions. DeepFM effectively models both low-order (linear) and high-order (non-linear) interactions between features.
Key Configurations:
- embedding_dim: Dimensionality of the embeddings for the factorization machine component.
- deep_hidden_units: Number of hidden units in the deep neural network.
- activation_fn: Activation function for the deep network layers (e.g., ReLU, Tanh).
- dropout: Dropout rate for regularization to prevent overfitting.
- learning_rate: Learning rate used during optimization.
Reference: DeepFM: A Factorization-Machine based Neural Network for CTR Prediction

policy_configs:
  scoring_policy:
    policy_type: deepfm
    embedding_dim: 10
    deep_hidden_units: [128, 64, 32]
    activation_fn: relu
    dropout: 0.5
    learning_rate: 0.001

Matrix Factorization Models

Matrix factorization techniques are powerful tools for collaborative filtering, as they allow for the decomposition of a large interaction matrix (e.g., user-item interactions) into latent factors. These latent factors represent underlying features that help in predicting unknown user-item interactions, especially when implicit feedback data (such as clicks or views) is used.

ALS (Alternating Least Squares)

Policy Type: als
Supports: embedding_policy, scoring_policy
Description: ALS is a matrix factorization technique that solves for latent factors by alternating between fixing user factors and item factors. This iterative method allows for handling implicit feedback (such as clicks or views) by learning latent factors for both users and items. It is especially suited for recommendation systems where implicit feedback data is abundant. ALS optimizes for these latent factors using regularized least squares, making it scalable and efficient for large-scale datasets.
Key Configurations:
- factors: Number of latent factors to learn (also referred to as rank). Defaults to 10.
- regularization: Regularization parameter (lambda) to avoid overfitting. Defaults to 0.1.
- bm25: Whether to use the BM25 weighting scheme for implicit feedback. Defaults to True.
- bm25_k1: Tuning parameter for BM25 weighting. Defaults to 1.2.
- bm25_b: Another BM25 tuning parameter. Defaults to 0.75.
- use_features: Whether to include additional features in the factorization process. Defaults to False.
Reference: Collaborative Filtering for implicit Feedback Datasets

policy_configs:
  scoring_policy:
    policy_type: als
    factors: 10
    regularization: 0.1
    bm25: true
    bm25_k1: 1.2
    bm25_b: 0.75
    use_features: false

SVD (Singular Value Decomposition)

Policy Type: svd
Supports: embedding_policy, scoring_policy
Description: SVD is a matrix factorization technique that decomposes the interaction matrix into latent factors, similar to ALS. It is commonly used for collaborative filtering to learn hidden patterns within user-item interactions. Unlike ALS, SVD uses singular value decomposition to directly compute the factorization, making it effective for explicit and implicit feedback scenarios. It reduces the dimensionality of user-item interaction matrices, representing users and items in a shared latent space.
Key Configurations:
- factors: Number of latent factors to compute. Defaults to 100.
- num_epochs: Number of training epochs for iterative optimization. Defaults to 10.
Reference: Factorization Meets the Neighborhood: A Multifaceted Collaborative Filtering Model

policy_configs:
  scoring_policy:
    policy_type: svd
    factors: 100
    num_epochs: 10

Sequential Models

Sequential models capture the temporal order of user interactions and are used in sequential recommendation tasks. They are designed to model user behavior sequences and predict the next item a user might engage with, making them ideal for applications like video streaming, e-commerce, and content recommendation.

Item2Vec

Policy Type: item2vec
Supports: embedding_policy, scoring_policy
Description: Item2Vec adapts the popular Word2Vec model for recommendation systems by learning vector representations of items such that similar items are close together in the vector space. It captures item co-occurrence patterns, enabling efficient retrieval of similar items. This technique is commonly used for embedding items based on interaction sequences.
Key Configurations:
- embedding_size: The dimensionality of the learned embeddings.
- window_size: The context window size for learning embeddings.
- min_count: Minimum count of item occurrences to consider during training.
- algorithm: The training algorithm to use (e.g., "cbow" for continuous bag of words).
- max_window_size: Maximum window size for context learning.
Reference: Efficient Estimation of Word Representations in Vector Space

policy_configs:
  embedding_policy:
    policy_type: item2vec
    embedding_size: 512
    window_size: 20
    min_count: 1
    algorithm: cbow
    max_window_size: 50
    workers: 1

Behavioral Ngram

Policy Type: ngram
Supports: embedding_policy, scoring_policy
Description: Ngram models capture sequential patterns in interactions by modeling fixed-length sequences of interactions (or n-grams). They are especially effective at capturing co-occurrence and short-term patterns in user behavior sequences, making them useful for session-based recommendations.
Key Configurations:
- n: The number of interactions considered in the n-gram sequence.
- laplace_smoothing: Smoothing factor applied to avoid zero probabilities.
Reference: Word n-gram language model

policy_configs:
  scoring_policy:
    policy_type: ngram
    n: 3
    laplace_smoothing: 0.05

BERT4Rec

Policy Type: bert4rec
Supports: embedding_policy, scoring_policy
Description: BERT4Rec is a bidirectional Transformer-based model designed for sequential recommendation. It predicts future user interactions by modeling entire sequences of past behaviors, allowing for a deeper understanding of the user’s preferences by considering both past and future context. BERT4Rec’s attention-based structure enables it to capture complex temporal dynamics.
Key Configurations:
- hidden_size: Size of hidden layers in the Transformer.
- n_heads: Number of attention heads in each Transformer layer.
- n_layers: Number of Transformer layers.
- learning_rate: Learning rate for optimization.
- dropout_rate: Dropout rate for regularization.
- max_seq_length: Maximum length of interaction sequences.
Reference: BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations

policy_configs:
  scoring_policy:
    policy_type: bert4rec
    batch_size: 1000
    n_epochs: 1
    negative_samples_count: 2
    hidden_size: 64
    n_heads: 2
    n_layers: 2
    learning_rate: 0.001
    dropout_rate: 0.2
    max_seq_length: 50

SASRec (Self-Attentive Sequential Recommendation)

Policy Type: sasrec
Supports: embedding_policy, scoring_policy
Description: SASRec leverages the Transformer architecture to model user behavior sequences using self-attention. This allows it to capture both short-term and long-term dependencies in interaction patterns. SASRec predicts the next item a user will engage with by focusing on previous interactions, making it suitable for sequential recommendation tasks.
Key Configurations:
- batch_size: Batch size for training. Defaults to 1000.
- n_epochs: Number of epochs for training.
- negative_samples_count: Number of negative samples for contrastive learning.
- hidden_size: Size of hidden layers.
- n_heads: Number of attention heads in the self-attention mechanism.
- n_layers: Number of Transformer layers.
- learning_rate: Learning rate for optimization.
- attn_dropout_prob: Dropout rate for the attention mechanism.
- hidden_act: Activation function for hidden layers.
- max_seq_length: Maximum sequence length for the model.
Reference: SASRec: Self-Attentive Sequential Recommendation

policy_configs:
  scoring_policy:
    policy_type: sasrec
    batch_size: 1000
    n_epochs: 1
    negative_samples_count: 2
    hidden_size: 64
    n_heads: 2
    n_layers: 2
    learning_rate: 0.001
    attn_dropout_prob: 0.2
    hidden_act: gelu
    max_seq_length: 50

GSASRec (Generalized SASRec)

Policy Type: gsasrec
Supports: embedding_policy, scoring_policy
Description: GSASRec is an improved version of the SASRec model, designed to mitigate the problem of overconfidence in recommendation models trained with negative sampling.
Key Configurations:
- batch_size: Batch size for training.
- n_epochs: Number of training epochs.
- num_heads: Number of attention heads in the self-attention mechanism.
- sequence_length: Maximum length of interaction sequences.
- learning_rate: Learning rate for optimization.
- dropout_rate: Dropout rate for regularization.
- embedding_dim: Dimensionality of item/user embeddings.
Reference: gSASRec: Reducing Overconfidence in Sequential Recommendation Trained with Negative Sampling

policy_configs:
  scoring_policy:
    policy_type: gsasrec
    batch_size: 32
    n_epochs: 1
    num_heads: 1
    sequence_length: 200
    learning_rate: 0.001
    embedding_dim: 128
    dropout_rate: 0.5

Neural Retrieval

Neural retrieval models are designed to efficiently retrieve items from a large pool of candidates. They are used in recommendation systems to find the most relevant items for a given user or query. Neural retrieval techniques are used during the first stage of the ranking pipeline, where scalability and the ability to handle high-dimensional data are critical.

Two-Tower Model

Policy Type: two-tower
Supports: embedding_policy
Description: The Two-Tower model consists of two separate neural networks (or "towers")—one for users and one for items. Both the user and item embeddings are projected into a shared embedding space, where their similarity can be computed using metrics such as cosine similarity or dot product. This enables efficient nearest neighbor search for large-scale retrieval tasks, allowing the system to quickly identify relevant items for a given user. The Two-Tower model is commonly used in real-time retrieval systems, where both scalability and fast retrieval are essential.
Key Configurations:
- batch_size: Number of samples processed in one training batch. Defaults to 32.
- n_epochs: Number of training epochs.
- negative_samples_count: Number of negative samples to use for contrastive learning.
- embedding_dims: Dimensionality of the embedding space for both the user and item representations. Defaults to 8.
- activation_fn: Activation function used in the hidden layers (e.g., ReLU, Sigmoid).
- dropout: Dropout rate for regularization to prevent overfitting.
- num_workers: Number of data loading workers for training. Defaults to 0.
- lr: Learning rate for model optimization. Defaults to 0.001.
- weight_decay: Regularization term to avoid overfitting. Defaults to 0.0005.
- use_item_ids_as_features: Whether to use item IDs as input features. Defaults to True.
- strategy: Training strategy used for optimization (e.g., early stopping). Defaults to TrainingStrategy.EARLY_STOPPING.
- patience: Number of epochs with no improvement before stopping early. Defaults to 5.
Reference: Deep Neural Networks for YouTube Recommendations

policy_configs:
  embedding_policy:
    policy_type: two-tower
    batch_size: 32
    n_epochs: 1
    negative_samples_count: 2
    embedding_dims: 8
    activation_fn: relu
    dropout: 0.2
    num_workers: 0
    lr: 0.001
    weight_decay: 0.0005
    patience: 5

beeFormer Model

Policy Type: beeformer
Supports: embedding_policy, scoring_policy
Description: BeeFormer is a sentence Transformer-based recommendation model that integrates semantic text understanding with interaction-aware learning. Unlike traditional sentence Transformers that only model semantic similarity, BeeFormer leverages user-item interaction data during training, enabling it to learn patterns specific to recommender systems. This dual optimization allows the model to excel in cold-start, zero-shot, and cross-domain recommendation scenarios.

BeeFormer is particularly suited for applications where textual item metadata (like descriptions, titles, or reviews) is abundant, but interaction history is sparse. It can also generalize across datasets, offering the potential to build universal text-based recommender models.

Key Configurations:
- batch_size: Number of samples per training batch. Defaults to 1.
- epochs: Total number of training epochs. Defaults to 1.
- lr: Learning rate for optimization. Defaults to 1e-5.
- max_output: Number of negative samples per training instance (hyperparameter). Defaults to 1.
- top_k: Restrict optimization to the top-k predicted outputs. Defaults to 0 (no restriction).
- embedder_batch_size: Batch size for bert or other embedder. Defaults to 100.
- max_seq_length: Maximum sequence length for tokenized input. Defaults to 384.
- use_images: Whether to incorporate image features. Defaults to False.
- train_distributed: Train on multiple devices. Defaults to False.
- rating_threshold: Threshold to binarize ratings. Defaults to 4.0.
- store_embeddings_at_train: Whether to cache embeddings at training. Defaults to True.
Reference: beeFormer: Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems

policy_configs:
  embedding_policy:
    policy_type: beeformer
    batch_size: 1
    epochs: 1
    lr: 1e-5
    max_output: 1
    top_k: 0
    embedder_batch_size: 100
    max_seq_length: 384
    use_images: false
    train_distributed: false
    rating_threshold: 4.0
    store_embeddings_at_train: true

Autoencoder Models

Autoencoder models are widely used for dimensionality reduction and feature learning. These models consist of an encoder and a decoder, where the encoder transforms the input data into a lower-dimensional representation (embedding), and the decoder attempts to reconstruct the original data from this embedding. Autoencoders are valuable in recommendation systems for identifying latent features, capturing complex relationships, and reducing the dimensionality of large datasets.

ELSA

Policy Type: elsa
Supports: embedding_policy, scoring_policy
Description: ELSA is a scalable shallow autoencoder specifically designed for implicit feedback recommendation systems. Unlike earlier models like EASE, which struggle to scale with large interaction matrices, ELSA introduces a novel approach by factorizing its hidden layer into a low-rank plus sparse structure
Key Configurations:
- batch_size: Number of samples processed in one batch of training. Defaults to 512.
- n_epochs: Number of epochs for training. Defaults to 20.
- factors: Number of latent factors or embeddings. Defaults to 10.
- lr: Learning rate for the model’s optimization process. Defaults to 0.1.
Reference: There isn't a specific paper dedicated to ELSA, but it extends the ideas from Latent Semantic Analysis (LSA):
- Scalable Linear Shallow Autoencoder for Collaborative Filtering

policy_configs:
  embedding_policy:
    policy_type: elsa
    batch_size: 512
    n_epochs: 20
    factors: 10
    lr: 0.1

EASE (Embarrassingly Shallow Autoencoder)

Policy Type: ease
Supports: embedding_policy, scoring_policy
Description: EASE is a collaborative filtering model that bypasses traditional neural autoencoders by using a closed-form solution to learn item-item interactions. Unlike standard autoencoders, EASE does not require iterative optimization or non-linear layers. EASE is particularly suited for tasks where item similarity is key, such as product recommendations and personalized content feeds.
Key Configurations:
- batch_size: Number of samples processed in one batch of training. Defaults to 512.
- n_epochs: Number of epochs for training. Defaults to 20.
- factors: Number of latent factors or embeddings. Defaults to 10.
- lr: Learning rate for the model’s optimization process. Defaults to 0.1.
Reference: Embarrassingly Shallow Autoencoders for Sparse Data

policy_configs:
  embedding_policy:
    policy_type: ease
    batch_size: 512
    n_epochs: 20
    factors: 10
    lr: 0.1

Content-Based Similarity Models

Content-based similarity models focus on recommending items based on their attributes (for items) or the alignment of user preferences with item content (for users). These models are particularly effective when rich metadata or feature sets are available for both users and items.

Item-Content Similarity

Policy Type: item-content-similarity
Supports: embedding_policy, scoring_policy
Description: The Item Content Similarity model computes item embeddings based on their item attributes, e.g. text is encoded as a text embedding, categories are encoded as one-hot vectors. The user embedding is computed by pooling the item embeddings of items the user has interacted with.
Key Configurations:
- pool_fn: Strategy for pooling item embeddings (e.g., MAX, MEAN).
- distance_fn: Distance function used to compute similarity (e.g., dot, cosine).

policy_configs:
  scoring_policy:
    policy_type: item-content-similarity
    pool_fn: max
    distance_fn: dot

User-Content Similarity

Policy Type: user-content-similarity
Supports: embedding_policy, scoring_policy
Description: The User Content Similarity model computes user embeddings based on user attributes (e.g., text encoded as text embeddings, categories encoded as one-hot vectors). The item embeddings are computed by aggregating or pooling the user embeddings from users who have interacted with the item.
Key Configurations:
- pool_fn: Strategy for pooling user embeddings (e.g., MAX, MEAN).
- distance_fn: Distance function used to compute similarity (e.g., dot, cosine).

policy_configs:
  scoring_policy:
    policy_type: user-content-similarity
    pool_fn: max
    distance_fn: dot

User-Item Content Similarity

Policy Type: user-item-content-similarity
Supports: scoring_policy
Description: The User-Item Content Similarity policy computes similarities by considering both user attributes and item attributes simultaneously. This policy computes user embeddings directly from user attributes and item embeddings directly from item attributes. It is useful when both the user and item attributes have an aligned context, e.g. an interests attribute for the user and a tags attribute for the item.
Key Configurations:
- pool_fn: Strategy for pooling item embeddings (e.g., MAX, MEAN).
- distance_fn: Distance function used to compute similarity (e.g., dot, cosine).

policy_configs:
  scoring_policy:
    policy_type: user-item-content-similarity
    pool_fn: max
    distance_fn: dot

Rule-Based Models

Rule-based policies provide non-machine-learning-based methods for recommendation and retrieval. These models rely on predefined rules, such as random selection, popularity, or recency, and are often used as baselines or in specific use cases where simple heuristics suffice.

Random

Policy Type: random
Supports: scoring_policy
Description: The Random model selects items randomly from the pool of available candidates. It is commonly used for testing or introducing exploration into the recommendation process, ensuring diverse and unpredictable recommendations.

policy_configs:
  scoring_policy:
    policy_type: random

Chronological

Policy Type: chronological
Supports: scoring_policy
Description: The Chronological model ranks items based on their timestamp, prioritizing either the most recent or the oldest items. This model is especially useful in situations where recency is important, such as in news or event-based recommendation systems.
Key Configurations:
- time_col: Specifies the column in the data that contains the timestamp or date to be used for chronological ordering. Defaults to None, meaning the system will use a default time column if available.
- ascending: Determines the sort order. If True, items are sorted in ascending order (oldest to newest). Defaults to False (newest to oldest).

policy_configs:
  scoring_policy:
    policy_type: chronological
    time_col: "interaction_time"
    ascending: false

Policy Type: popular
Supports: scoring_policy
Description: The Popular model, often referred to as a Toplist, ranks items based on their overall popularity. Popularity is by pooling the label column (e.g., clicks, views, or purchases). This model is especially useful in cold-start scenarios or when popularity is a strong indicator of user interest. You can also apply time-based filters to focus on recent interactions.
Key Configurations:
- mode: Defines pooling method used for popularity (e.g., "sum").
- time_window_in_days: (Optional) Filters interactions to only consider those within a specified time window (e.g., last 30 days).

policy_configs:
  scoring_policy:
    policy_type: popular
    mode: sum
    time_window_in_days: 30

Policy Type: rising-popularity
Supports: scoring_policy
Description: This trending policy ranks items based on their recent trend in popularity, focusing on items that are rapidly gaining engagement. Unlike the Popular model, which ranks items based on overall interaction counts, the Trending model emphasizes momentum over a specific time period, making it ideal for surfacing content like breaking news or viral videos.
Key Configurations:
- time_window: The time window (e.g., last 7 days) over which the trending score is calculated.
- time_frequency: Frequency of time window updates (e.g., daily, hourly).

policy_configs:
  scoring_policy:
    policy_type: rising-popularity
    time_window: 7
    time_frequency: 1D

Policy Type: recently-popular
Supports: scoring_policy
Description: This trending policy ranks items based on popularity but applies a decay factor to older interactions. The policy focuses on items that are both trending and relatively new, making it useful in fast-moving environments where freshness is key.
Key Configurations:
- mode: Defines the trending mode to use, such as HACKERNEWS or REDDIT.

policy_configs:
  scoring_policy:
    policy_type: recently-popular
    mode: hackernews

Special

This category includes special policies that provide unique functionality or combine multiple models to enhance performance. These policies often perform higher-level tasks such as auto-tuning or score aggregation, allowing for more sophisticated recommendation strategies.

Auto-Tune

Policy Type: auto-tune
Supports: embedding_policy, scoring_policy
Description: The Auto-Tune policy automatically selects the best model policy and optimizes hyperparameters for your data using cross-validation techniques. This policy can be used to fine-tune a single model policy or to choose between multiple policies based on performance metrics, making it an excellent choice for users who want to optimize their recommendation models without manually selecting the best configuration.
Key Configurations:
- policies: A list of model policies to consider during the auto-tuning process. The system will automatically evaluate these policies and select the best one based on predefined performance criteria (e.g., NDCG, precision).

policy_configs:
  scoring_policy:
    policy_type: auto-tune
    policies:
     - lightgbm
     - wide-deep
     - bert4rec

Score Ensemble

Policy Type: score-ensemble
Supports: scoring_policy
Description: The Score Ensemble policy combines the outputs of multiple model policies interleaving the results. This allows for more robust recommendations by leveraging the diverse advantages of different models. For example, a combination of a content-based model and a collaborative filtering model might offer more balanced recommendations.
Key Configurations:
- policies: A list of scoring policies to include in the ensemble. The system will combine their scores to produce a final recommendation score.
Reference: Ensemble methods are widely used in machine learning and have been studied extensively. For more information, see:
- Ensemble Learning Techniques

policy_configs:
  scoring_policy:
    policy_type: score-ensemble
    policies:
       - policy_type: lightgbm
         max_depth: 6
         objective: binary
       - policy_type: bert4rec
         hidden_size: 128

No-Operation

Policy Type: no-op
Description: The No-Operation (No-op) policy is a placeholder that does not perform any computation, transformation, or prediction. It is typically used for testing, as a way to validate pipelines without applying any scoring or retrieval logic.
Key Configurations:
- None: The No-op policy does not require any specific configurations, as it is simply a placeholder.

policy_configs:
  scoring_policy:
    policy_type: no-op

Policy Configuration

What is an embedding policy?​

What is a scoring policy?​

Configuring Your Model Policies​

Basic Structure​

Configuration Example​

Policy Catalog​

Gradient Boosted Trees​

LightGBM​

XGBoost​

Neural Scoring Models​

Wide & Deep Model​

DeepFM (Deep Factorization Machine)​

Matrix Factorization Models​

ALS (Alternating Least Squares)​

SVD (Singular Value Decomposition)​

Sequential Models​

Item2Vec​

Behavioral Ngram​

BERT4Rec​

SASRec (Self-Attentive Sequential Recommendation)​

GSASRec (Generalized SASRec)​

Neural Retrieval​

Two-Tower Model​

beeFormer Model​

Autoencoder Models​

ELSA​

EASE (Embarrassingly Shallow Autoencoder)​

Content-Based Similarity Models​

Item-Content Similarity​

User-Content Similarity​

User-Item Content Similarity​

Rule-Based Models​

Random​

Chronological​

Popular​

Trending (Rising Popularity)​

Trending (Recently Popular)​

Special​

Auto-Tune​

Score Ensemble​

No-Operation​