Two-Tower (Neural Retrieval)

Description

The Two-Tower model architecture separates the computation for users and items into two distinct neural networks ("towers") to efficiently generate embeddings for large-scale retrieval.

  • User Tower: Processes user-related features (ID, demographics, interaction history, context) to output a user embedding u.
  • Item Tower: Processes item-related features (ID, metadata, content features) to output an item embedding v in the same vector space.

Affinity is typically calculated using a simple similarity function (dot product or cosine similarity) between u and v.
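
As a concrete illustration, the minimal sketch below builds the two towers as small MLPs in PyTorch and scores user/item pairs with a dot product. The layer sizes, feature dimensions, and names are illustrative assumptions, not this policy's actual implementation.

```python
# Minimal two-tower sketch (illustrative; layer sizes and feature
# dimensions are assumptions, not this policy's implementation).
import torch
import torch.nn as nn

class Tower(nn.Module):
    def __init__(self, input_dim: int, embedding_dim: int = 128, dropout: float = 0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 256),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(256, embedding_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

user_tower = Tower(input_dim=64)   # consumes user features
item_tower = Tower(input_dim=48)   # consumes item features

user_feats = torch.randn(8, 64)    # batch of 8 users
item_feats = torch.randn(8, 48)    # batch of 8 items

u = user_tower(user_feats)         # user embeddings, shape (8, 128)
v = item_tower(item_feats)         # item embeddings, shape (8, 128)

scores = (u * v).sum(dim=-1)       # dot-product affinity per (user, item) pair
```

Because the towers never see each other's inputs, the only point where user and item information meet is the final dot product; this is exactly what enables the decoupled serving pattern described next.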

Its primary strength lies in decoupling computation for serving: item embeddings can be pre-computed offline and indexed, and at inference time Approximate Nearest Neighbor (ANN) search efficiently retrieves the candidate items whose embeddings are most similar to a user embedding computed in real time. This makes the architecture well suited to the candidate generation (retrieval) stage of multi-stage recommendation systems.
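
The sketch below shows this serving pattern using FAISS as an example ANN library (an assumption; any ANN index works): item embeddings are built and indexed offline, and only the user embedding is computed at request time.

```python
# Serving-pattern sketch: pre-compute item embeddings offline, index them,
# then retrieve candidates for a freshly computed user embedding.
# FAISS here is an example choice, not a requirement of this policy.
import numpy as np
import faiss

embedding_dim = 128
item_embeddings = np.random.rand(10_000, embedding_dim).astype("float32")

# Offline: build an inner-product index over all item embeddings.
# IndexFlatIP is exhaustive; a large deployment would swap in an
# approximate index (e.g., IVF or HNSW) for sublinear search.
index = faiss.IndexFlatIP(embedding_dim)
index.add(item_embeddings)

# Online: embed the user in real time, then fetch the top-k candidates.
user_embedding = np.random.rand(1, embedding_dim).astype("float32")
scores, item_ids = index.search(user_embedding, 100)  # top-100 item indices
```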

While powerful for retrieval, the standard Two-Tower design defers all interaction between user and item features to the final similarity calculation, so explicit user-item cross features cannot be modeled inside the towers; richer interaction modeling is typically left to a downstream ranking stage.

Policy Type: two-tower
Supports: embedding_policy

Configuration Example

embedding_policy_two_tower.yaml

```yaml
policy_configs:
  embedding_policy:
    policy_type: two-tower

    # Training Hyperparameters
    batch_size: 32              # Samples per training batch
    n_epochs: 5                 # Number of training epochs
    negative_samples_count: 5   # Negative samples per positive for contrastive loss
    lr: 0.001                   # Learning rate
    weight_decay: 0.0005        # L2 regularization strength
    patience: 5                 # Epochs without improvement before early stopping

    # Architecture Hyperparameters
    embedding_dims: 128         # Dimensionality of the shared embedding space (u and v)
    activation_fn: "relu"       # Activation function in hidden layers (e.g., "relu", "gelu")
    dropout: 0.2                # Dropout rate for regularization
```
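
To make negative_samples_count and the contrastive loss concrete, here is a hedged sketch of one training step: each positive (user, item) pair is scored against K uniformly sampled item embeddings, and a cross-entropy loss pushes the positive score above the negatives. The sampling scheme, function name, and tensor shapes are illustrative assumptions, not the policy's exact loss.

```python
# Illustrative contrastive step with sampled negatives (assumed scheme;
# the policy's actual loss and sampling strategy may differ).
import torch
import torch.nn.functional as F

def contrastive_loss(u, v_pos, all_item_embeddings, negative_samples_count=5):
    """u: (B, D) user embeddings; v_pos: (B, D) matching item embeddings."""
    B, _ = u.shape
    # Sample K negative items uniformly at random for each positive pair.
    neg_idx = torch.randint(0, all_item_embeddings.size(0), (B, negative_samples_count))
    v_neg = all_item_embeddings[neg_idx]                 # (B, K, D)

    pos_logits = (u * v_pos).sum(-1, keepdim=True)       # (B, 1)
    neg_logits = torch.einsum("bd,bkd->bk", u, v_neg)    # (B, K)
    logits = torch.cat([pos_logits, neg_logits], dim=1)  # (B, 1+K)

    # The positive item sits at index 0 of every row.
    labels = torch.zeros(B, dtype=torch.long)
    return F.cross_entropy(logits, labels)

# Usage with the configured hyperparameters (batch_size=32, embedding_dims=128):
u = torch.randn(32, 128)            # user embeddings from the user tower
v_pos = torch.randn(32, 128)        # matching item embeddings from the item tower
catalog = torch.randn(10_000, 128)  # embeddings for the full item catalog
loss = contrastive_loss(u, v_pos, catalog, negative_samples_count=5)
```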
