Two-Tower (Neural Retrieval)

Description

The Two-Tower model architecture separates the computation for users and items into two distinct neural networks ("towers") to efficiently generate embeddings for large-scale retrieval.

  • User Tower: Processes user-related features (ID, demographics, interaction history, context) to output a user embedding u.
  • Item Tower: Processes item-related features (ID, metadata, content features) to output an item embedding v in the same vector space.

Affinity is typically calculated using a simple similarity function (dot product or cosine similarity) between u and v.
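
As a concrete illustration, the minimal sketch below builds the two towers as small MLPs in PyTorch and scores user/item pairs with a dot product. The layer sizes, feature dimensions, and names are illustrative assumptions, not this policy's actual implementation.

```python
# Minimal two-tower sketch (illustrative; layer sizes and feature
# dimensions are assumptions, not this policy's implementation).
import torch
import torch.nn as nn

class Tower(nn.Module):
    def __init__(self, input_dim: int, embedding_dim: int = 128, dropout: float = 0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 256),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(256, embedding_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

user_tower = Tower(input_dim=64)   # consumes user features
item_tower = Tower(input_dim=48)   # consumes item features

user_feats = torch.randn(8, 64)    # batch of 8 users
item_feats = torch.randn(8, 48)    # batch of 8 items

u = user_tower(user_feats)         # user embeddings, shape (8, 128)
v = item_tower(item_feats)         # item embeddings, shape (8, 128)

scores = (u * v).sum(dim=-1)       # dot-product affinity per (user, item) pair
```

Because the towers never see each other's inputs, the only point where user and item information meet is the final dot product; this is exactly what enables the decoupled serving pattern described next.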

Its primary strength lies in decoupling computation for serving: item embeddings can be pre-computed offline and indexed, and at inference time Approximate Nearest Neighbor (ANN) search efficiently retrieves the candidate items whose embeddings are most similar to a user embedding computed in real time. This makes the architecture well suited to the candidate generation (retrieval) stage of multi-stage recommendation systems.
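
The sketch below shows this serving pattern using FAISS as an example ANN library (an assumption; any ANN index works): item embeddings are built and indexed offline, and only the user embedding is computed at request time.

```python
# Serving-pattern sketch: pre-compute item embeddings offline, index them,
# then retrieve candidates for a freshly computed user embedding.
# FAISS here is an example choice, not a requirement of this policy.
import numpy as np
import faiss

embedding_dim = 128
item_embeddings = np.random.rand(10_000, embedding_dim).astype("float32")

# Offline: build an inner-product index over all item embeddings.
# IndexFlatIP is exhaustive; a large deployment would swap in an
# approximate index (e.g., IVF or HNSW) for sublinear search.
index = faiss.IndexFlatIP(embedding_dim)
index.add(item_embeddings)

# Online: embed the user in real time, then fetch the top-k candidates.
user_embedding = np.random.rand(1, embedding_dim).astype("float32")
scores, item_ids = index.search(user_embedding, 100)  # top-100 item indices
```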

While powerful for retrieval, the standard Two-Tower design defers all interaction between user and item features to the final similarity calculation, so explicit user-item cross features cannot be modeled inside the towers; richer interaction modeling is typically left to a downstream ranking stage.

Policy Type: two-tower
Supports: embedding_policy

Configuration Example

embedding_policy_two_tower.yaml

```yaml
policy_configs:
  embedding_policy:
    policy_type: two-tower

    # Training Hyperparameters
    batch_size: 32              # Samples per training batch
    n_epochs: 5                 # Number of training epochs
    negative_samples_count: 5   # Negative samples per positive for contrastive loss
    lr: 0.001                   # Learning rate
    weight_decay: 0.0005        # L2 regularization strength
    patience: 5                 # Epochs without improvement before early stopping

    # Architecture Hyperparameters
    embedding_dims: 128         # Dimensionality of the shared embedding space (u and v)
    activation_fn: "relu"       # Activation function in hidden layers (e.g., "relu", "gelu")
    dropout: 0.2                # Dropout rate for regularization
```
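
To make negative_samples_count and the contrastive loss concrete, here is a hedged sketch of one training step: each positive (user, item) pair is scored against K uniformly sampled item embeddings, and a cross-entropy loss pushes the positive score above the negatives. The sampling scheme, function name, and tensor shapes are illustrative assumptions, not the policy's exact loss.

```python
# Illustrative contrastive step with sampled negatives (assumed scheme;
# the policy's actual loss and sampling strategy may differ).
import torch
import torch.nn.functional as F

def contrastive_loss(u, v_pos, all_item_embeddings, negative_samples_count=5):
    """u: (B, D) user embeddings; v_pos: (B, D) matching item embeddings."""
    B, _ = u.shape
    # Sample K negative items uniformly at random for each positive pair.
    neg_idx = torch.randint(0, all_item_embeddings.size(0), (B, negative_samples_count))
    v_neg = all_item_embeddings[neg_idx]                 # (B, K, D)

    pos_logits = (u * v_pos).sum(-1, keepdim=True)       # (B, 1)
    neg_logits = torch.einsum("bd,bkd->bk", u, v_neg)    # (B, K)
    logits = torch.cat([pos_logits, neg_logits], dim=1)  # (B, 1+K)

    # The positive item sits at index 0 of every row.
    labels = torch.zeros(B, dtype=torch.long)
    return F.cross_entropy(logits, labels)

# Usage with the configured hyperparameters (batch_size=32, embedding_dims=128):
u = torch.randn(32, 128)            # user embeddings from the user tower
v_pos = torch.randn(32, 128)        # matching item embeddings from the item tower
catalog = torch.randn(10_000, 128)  # embeddings for the full item catalog
loss = contrastive_loss(u, v_pos, catalog, negative_samples_count=5)
```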
