Choosing a model policy

This guide provides an overview of embedding models available for training in Shaped, helping you choose the right model for vector search and item similarity tasks.


SVD (Singular Value Decomposition)

Model Summary

SVD implements matrix factorization using iterative algorithms inspired by singular value decomposition (like Funk SVD or SVD++). It can incorporate user/item biases and is typically trained using Stochastic Gradient Descent (SGD).
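
A minimal sketch of this style of biased matrix factorization, assuming the standard Funk-SVD prediction r̂(u,i) = μ + b_u + b_i + p_u·q_i trained with SGD. The toy ratings and hyperparameters below are illustrative, not Shaped's implementation.

```python
import numpy as np

# (user, item, rating) triples -- explicit feedback toy data
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0)]
n_users, n_items, k = 3, 3, 8
lr, reg, epochs = 0.01, 0.02, 100

rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(n_users, k))     # user latent factors
Q = rng.normal(scale=0.1, size=(n_items, k))     # item latent factors
b_u, b_i = np.zeros(n_users), np.zeros(n_items)  # user / item biases
mu = np.mean([r for _, _, r in ratings])         # global mean rating

for _ in range(epochs):
    for u, i, r in ratings:
        err = r - (mu + b_u[u] + b_i[i] + P[u] @ Q[i])
        b_u[u] += lr * (err - reg * b_u[u])
        b_i[i] += lr * (err - reg * b_i[i])
        P[u], Q[i] = (P[u] + lr * (err * Q[i] - reg * P[u]),
                      Q[i] + lr * (err * P[u] - reg * Q[i]))

print(mu + b_u[0] + b_i[2] + P[0] @ Q[2])   # predicted rating for user 0 on unseen item 2
```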

When to Use This Model

  • You have explicit feedback data (ratings, reviews with scores)
  • You want to model user and item biases (systematic tendencies to rate above or below the global average)
  • You prefer SGD-based optimization over ALS
  • You need a straightforward matrix factorization approach
  • You're working with medium-sized datasets

When Not to Use This Model

  • You have only implicit feedback (ALS is typically better for this)
  • You need to leverage item content features (use Two-Tower or BeeFormer)
  • You want the most scalable solution (ELSA may be better)
  • You need to model sequential patterns (use sequential models)

Sample Use Cases / Item Types

  • Movie recommendations with explicit ratings (e.g., MovieLens)
  • Product recommendations with review scores
  • Restaurant or service recommendations with ratings
  • Any domain with explicit user feedback

ELSA (Efficient Latent Sparse Autoencoder)

Model Summary

ELSA is a scalable shallow linear autoencoder for implicit feedback collaborative filtering that learns item-item relationships by reconstructing user interaction vectors. Unlike EASE, it uses a factorized hidden layer structure (low-rank plus sparse) to improve scalability.
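
A minimal sketch of the scoring side of this idea, keeping only the low-rank factor: a learned item-embedding matrix A with unit-norm rows reconstructs a user's interaction vector via x(AAᵀ − I). How A is trained is omitted, and the matrix and interaction vector below are made up, not Shaped's implementation.

```python
import numpy as np

n_items, k = 6, 3
rng = np.random.default_rng(0)

# Stand-in for a learned low-rank item embedding matrix (rows L2-normalized)
A = rng.normal(size=(n_items, k))
A /= np.linalg.norm(A, axis=1, keepdims=True)

x = np.array([1.0, 0, 1.0, 0, 0, 0])   # one user's implicit interaction vector
scores = x @ A @ A.T - x                # x (A Aᵀ - I): item-item reconstruction of the history
scores[x > 0] = -np.inf                 # mask items the user already interacted with
print(np.argsort(-scores)[:3])          # top-3 recommended item indices
```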

When to Use This Model

  • You have large-scale datasets and need scalability
  • You're working with implicit feedback data
  • You want better scalability than EASE while maintaining similar performance
  • You need efficient item-item similarity computation
  • You want a modern autoencoder approach for collaborative filtering

When Not to Use This Model

  • You have very small datasets (EASE might be simpler)
  • You need to incorporate item content features (use Two-Tower or BeeFormer)
  • You want to model sequential patterns (use sequential models)
  • You have explicit feedback only (SVD might be more appropriate)

Sample Use Cases / Item Types

  • Large e-commerce platforms with millions of products
  • Content streaming services with extensive catalogs
  • Social media feed recommendations
  • Any large-scale implicit feedback recommendation system

EASE (Embarrassingly Shallow Autoencoder)

Model Summary

EASE is a simple linear autoencoder for item-based collaborative filtering that uses a closed-form solution, avoiding iterative optimization for computational efficiency.
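
A minimal sketch of the closed-form fit (as in Steck, 2019) on a toy implicit-feedback matrix; the regularization strength here is arbitrary, not a recommended setting.

```python
import numpy as np

# Toy user-item implicit interaction matrix (rows = users, columns = items)
X = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)
lam = 10.0   # L2 regularization strength

# Closed-form EASE solution: B has zero diagonal so items cannot "explain" themselves
G = X.T @ X + lam * np.eye(X.shape[1])
P = np.linalg.inv(G)
B = np.eye(X.shape[1]) - P / np.diag(P)   # item-item weight matrix

scores = X @ B   # reconstructed scores for every user in one matrix product
print(np.round(scores, 2))
```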

When to Use This Model

  • You want the simplest possible autoencoder approach
  • You need fast training with a closed-form solution
  • You have implicit feedback data
  • You're working with medium-sized datasets
  • You want a quick baseline or prototype

When Not to Use This Model

  • You have very large-scale datasets (ELSA scales better)
  • You need to leverage item content features (use Two-Tower or BeeFormer)
  • You want to model sequential patterns (use sequential models)
  • You need the highest accuracy (more complex models may perform better)

Sample Use Cases / Item Types

  • Medium-sized e-commerce sites
  • Content recommendation systems
  • Product similarity for "customers who bought this also bought"
  • Quick prototypes and baselines

Two-Tower

Model Summary

The Two-Tower architecture separates user and item computation into two distinct neural networks that output embeddings in the same vector space. Item embeddings can be pre-computed offline, enabling efficient Approximate Nearest Neighbor (ANN) search at inference time.
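
A minimal PyTorch sketch of the architecture, assuming made-up feature dimensions: two independent MLP towers map user and item features into the same unit-length embedding space, so item vectors can be computed offline and indexed for ANN search while user vectors are computed at request time.

```python
import torch
import torch.nn as nn

class Tower(nn.Module):
    def __init__(self, in_dim: int, emb_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, emb_dim))

    def forward(self, x):
        return nn.functional.normalize(self.net(x), dim=-1)  # unit-length embeddings

user_tower, item_tower = Tower(in_dim=32), Tower(in_dim=48)

users = torch.randn(4, 32)     # batch of user feature vectors (request time)
items = torch.randn(100, 48)   # item feature vectors (pre-computed offline, stored in an ANN index)

scores = user_tower(users) @ item_tower(items).T   # dot-product relevance, shape (4, 100)
print(scores.shape)
```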

When to Use This Model

  • You have rich item metadata (text descriptions, categories, images)
  • You need efficient large-scale vector search and retrieval
  • You want to leverage both collaborative and content signals
  • You need production-ready embeddings for real-time recommendations
  • You want the best performance for general item similarity tasks

When Not to Use This Model

  • You have only interaction data without item features (use ALS/ELSA)
  • You need to model strict sequential patterns (use SASRec/BERT4Rec)
  • You have very limited compute resources
  • You need a simple baseline model

Sample Use Cases / Item Types

  • E-commerce with product descriptions, categories, images (e.g., clothing, electronics)
  • Content platforms with rich metadata (articles, videos with descriptions)
  • Job recommendations with job descriptions and requirements
  • Real estate with property descriptions and features
  • Any domain where items have rich textual or categorical attributes

BeeFormer

Model Summary

BeeFormer fine-tunes pre-trained sentence Transformers using user-item interaction data to bridge semantic similarity (from language models) and interaction-based similarity (from collaborative patterns).
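
A rough sketch of the training idea, assuming a stand-in MLP in place of a real pre-trained sentence Transformer and random tensors in place of tokenized item descriptions: item embeddings produced from text are optimized so that an EASE/ELSA-style item-item reconstruction of the interaction matrix improves, pulling the semantic space toward behavioral similarity. Illustrative only, not the exact BeeFormer objective.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n_users, n_items, text_dim, emb_dim = 20, 12, 16, 8

item_text = torch.randn(n_items, text_dim)            # stand-in for encoded item descriptions
X = (torch.rand(n_users, n_items) < 0.2).float()      # implicit user-item interaction matrix

# Stand-in for a pre-trained sentence Transformer being fine-tuned
encoder = nn.Sequential(nn.Linear(text_dim, 32), nn.ReLU(), nn.Linear(32, emb_dim))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

for step in range(200):
    A = nn.functional.normalize(encoder(item_text), dim=-1)  # item embeddings from "text"
    W = A @ A.T
    W = W - torch.diag(torch.diag(W))                         # zero diagonal: no self-recommendation
    loss = ((X @ W - X) ** 2).mean()                          # reconstruct behavior from semantic space
    opt.zero_grad()
    loss.backward()
    opt.step()

print(float(loss))   # reconstruction error after fine-tuning the text encoder
```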

When to Use This Model

  • You have rich text content (product descriptions, titles, reviews)
  • You want to combine semantic understanding with behavioral signals
  • You need good cold-start performance for new items
  • You want to leverage pre-trained language model knowledge
  • You have items with detailed textual descriptions

When Not to Use This Model

  • You have minimal or no text content for items
  • You only have interaction data without item descriptions
  • You need the fastest training possible
  • You have very limited text data quality
  • You want a pure collaborative filtering approach

Sample Use Cases / Item Types

  • E-commerce with detailed product descriptions (fashion, furniture, electronics)
  • Content platforms (articles, blog posts, research papers)
  • Job boards with detailed job descriptions
  • Real estate with property descriptions
  • Books, movies, or media with synopses and reviews
  • Any domain where semantic understanding of text content matters

Item2Vec

Model Summary

Item2Vec adapts the Word2Vec algorithm (CBOW or Skip-gram) to learn item embeddings from user interaction sequences, capturing co-occurrence patterns within a context window.
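
A minimal sketch using gensim's Word2Vec directly, treating each user session as a "sentence" and each item ID as a "word"; the sessions and item IDs below are made up.

```python
from gensim.models import Word2Vec

# Each user session is a "sentence"; each item ID is a "word"
sessions = [
    ["item_1", "item_7", "item_3"],
    ["item_7", "item_3", "item_9"],
    ["item_2", "item_1", "item_7"],
    ["item_9", "item_2", "item_3"],
]

model = Word2Vec(sentences=sessions, vector_size=32, window=3,
                 min_count=1, sg=1, epochs=50)       # sg=1 -> Skip-gram
print(model.wv.most_similar("item_7", topn=3))       # items that co-occur with item_7
```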

When to Use This Model

  • You have sequential interaction data (user sessions, browsing history)
  • You want to capture item co-occurrence patterns
  • You need a simple, efficient sequential embedding approach
  • You're working with session-based recommendations
  • You want to model "items frequently viewed together"

When Not to Use This Model

  • You need to model strict temporal order and dependencies (use SASRec/BERT4Rec)
  • You want to leverage item content features (use Two-Tower or BeeFormer)
  • You only have non-sequential interaction data
  • You need the highest accuracy for sequential recommendations

Sample Use Cases / Item Types

  • E-commerce browsing sessions (items viewed in same session)
  • Music playlists and song sequences
  • Video watching sequences
  • Shopping cart co-occurrence patterns
  • Session-based web recommendations

SASRec (Self-Attentive Sequential Recommendation)

Model Summary

SASRec utilizes the Transformer architecture's self-attention mechanism to model user interaction sequences, capturing both short-term and long-range dependencies to predict the next item. It's unidirectional, processing sequences from past to future.
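
A minimal PyTorch sketch of the core mechanism, assuming illustrative sizes: item and position embeddings pass through a causally-masked Transformer encoder, and each position's output scores the next item against the item embedding table.

```python
import torch
import torch.nn as nn

n_items, d, max_len = 1000, 64, 50
item_emb = nn.Embedding(n_items + 1, d, padding_idx=0)   # ID 0 reserved for padding
pos_emb = nn.Embedding(max_len, d)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True), num_layers=2)

seq = torch.randint(1, n_items + 1, (8, max_len))         # batch of item-ID sequences
h = item_emb(seq) + pos_emb(torch.arange(max_len))
mask = nn.Transformer.generate_square_subsequent_mask(max_len)  # causal: attend to the past only
h = encoder(h, mask=mask)

logits = h @ item_emb.weight.T   # (batch, position, n_items + 1): next-item scores at every step
print(logits.shape)
```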

When to Use This Model

  • You have sequential interaction data with clear temporal order
  • You need to predict the "next item" in a sequence
  • You want to capture both short-term and long-term patterns
  • You need state-of-the-art sequential recommendation performance
  • You're building session-based or sequential recommendation systems

When Not to Use This Model

  • You don't have sequential or temporal data
  • You want general item similarity rather than next-item prediction
  • You need bidirectional context understanding (use BERT4Rec)
  • You have very short sequences (Item2Vec might be simpler)
  • You want to leverage item content features primarily (use Two-Tower)

Sample Use Cases / Item Types

  • E-commerce next-item prediction (what to buy next)
  • Video streaming next-video recommendations
  • Music playlist continuation
  • News feed article sequences
  • Gaming item progression recommendations
  • Any domain with clear sequential user behavior

BERT4Rec

Model Summary

BERT4Rec adapts the bidirectional Transformer architecture (BERT) for sequential recommendation, using bidirectional self-attention to learn context-aware item representations by considering both past and future context.
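
A minimal PyTorch sketch of the training setup, reusing the same embedding/encoder pieces as the SASRec sketch above: roughly 15% of positions are replaced with a [MASK] ID and the bidirectional (unmasked) encoder predicts them from both left and right context. Sizes and the masking rate are illustrative.

```python
import torch
import torch.nn as nn

n_items, d, max_len = 1000, 64, 50
MASK = n_items + 1                                         # extra ID acting as the [MASK] token
item_emb = nn.Embedding(n_items + 2, d, padding_idx=0)
pos_emb = nn.Embedding(max_len, d)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True), num_layers=2)

seq = torch.randint(1, n_items + 1, (8, max_len))
targets = seq.clone()
masked = torch.rand(seq.shape) < 0.15                      # mask ~15% of positions
seq = torch.where(masked, torch.full_like(seq, MASK), seq)

h = encoder(item_emb(seq) + pos_emb(torch.arange(max_len)))  # no causal mask: bidirectional attention
logits = h @ item_emb.weight.T
loss = nn.functional.cross_entropy(logits[masked], targets[masked])
print(float(loss))
```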

When to Use This Model

  • You have sequential data and want bidirectional context understanding
  • You need richer item representations than unidirectional models
  • You want to leverage both past and future context in sequences
  • You're working with sequences where context matters in both directions
  • You need state-of-the-art sequential recommendation with bidirectional attention

When Not to Use This Model

  • You need real-time next-item prediction (unidirectional is more natural)
  • You have very long sequences (computational complexity)
  • You want the simplest sequential model (Item2Vec or SASRec)
  • You don't have sequential data
  • You primarily need general item similarity (use Two-Tower or ALS)

Sample Use Cases / Item Types

  • Context-aware sequential recommendations
  • Playlist generation with full context
  • Reading sequences where future context matters
  • Educational content sequences
  • Any sequential recommendation where bidirectional understanding helps

GSASRec (Generalized Self-Attentive Sequential Recommendation)

Model Summary

GSASRec is an enhancement of SASRec designed to mitigate overconfidence issues from negative sampling by improving prediction calibration while retaining the core self-attention mechanism.
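
A rough sketch of the calibration idea: with only a few sampled negatives, plain binary cross-entropy pushes positive probabilities toward 1 and the model becomes overconfident, so the positive term is down-weighted by an exponent beta (derived from the sampling rate in the gSASRec paper). Here beta is fixed and the loss simplified for illustration; this is not the exact GSASRec implementation.

```python
import torch

def gbce_style_loss(pos_scores, neg_scores, beta=0.7):
    # Down-weighted positive term counteracts overconfidence from sampled negatives
    pos_term = -beta * torch.log(torch.sigmoid(pos_scores) + 1e-9)
    neg_term = -torch.log(1 - torch.sigmoid(neg_scores) + 1e-9).sum(dim=-1)
    return (pos_term + neg_term).mean()

pos = torch.randn(32)       # score of the true next item for each sequence in the batch
neg = torch.randn(32, 4)    # scores of 4 sampled negatives per sequence
print(float(gbce_style_loss(pos, neg)))
```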

When to Use This Model

  • You're experiencing overconfidence issues with SASRec
  • You need better calibrated predictions for sequential recommendations
  • You want improved negative sampling strategies
  • You need the benefits of SASRec with better training stability
  • You're working on production systems where calibration matters

When Not to Use This Model

  • You don't have sequential data
  • You want the simplest sequential model (use SASRec or Item2Vec)
  • You need general item similarity (use Two-Tower or ALS)
  • You're just starting with sequential recommendations (SASRec is simpler)
  • You want to leverage item content primarily (use Two-Tower or BeeFormer)

Sample Use Cases / Item Types

  • Production sequential recommendation systems
  • E-commerce next-item prediction with calibrated scores
  • Video streaming with improved prediction confidence
  • Any sequential recommendation where prediction calibration is critical

Quick Decision Guide

For general item similarity and vector search:

  • Best choice: Two-Tower (if you have item features) or ALS/ELSA (if you only have interactions)

For text-rich items:

  • Best choice: BeeFormer (combines semantic + behavioral) or Two-Tower (if you have other features too)

For sequential recommendations:

  • Best choice: SASRec (unidirectional) or BERT4Rec (bidirectional); Item2Vec is a simpler co-occurrence-based option

For large-scale implicit feedback:

  • Best choice: ELSA (scalable) or ALS (simple and effective)

For explicit feedback:

  • Best choice: SVD (with biases) or ALS (also handles explicit feedback)