BeeFormer (Neural Retrieval)
Description
The beeFormer policy implements a recommendation model that fine-tunes a pre-trained sentence Transformer on user-item interaction data. It aims to bridge the gap between purely semantic similarity (from the language model) and interaction-based similarity (from collaborative patterns). The base sentence Transformer serves as the item encoder, and its weights are updated with an interaction-based loss (conceptually similar to ELSA's reconstruction loss). This produces embeddings that reflect both content meaning and behavioral patterns, potentially improving cold-start performance and enabling knowledge transfer.
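To make the idea concrete, here is a minimal sketch of this kind of fine-tuning in PyTorch with the sentence-transformers library. It is illustrative only, not Shaped's implementation: the base model name (all-MiniLM-L6-v2), the toy item texts, and the dense interaction matrix are placeholder assumptions, and the loss is an ELSA-style reconstruction of the interaction matrix through the item embedding matrix.

# Minimal sketch of the beeFormer idea (illustrative, not Shaped's internal code).
# Assumes: torch, sentence-transformers; toy item texts and a tiny dense 0/1
# user-item interaction matrix stand in for real data.
import torch
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

device = "cuda" if torch.cuda.is_available() else "cpu"
encoder = SentenceTransformer("all-MiniLM-L6-v2", device=device)  # placeholder base model
encoder.max_seq_length = 384                                      # mirrors max_seq_length
encoder.train()

item_texts = ["red running shoes", "wireless headphones", "trail running shoes"]
interactions = torch.tensor([[1., 0., 1.],   # user 0 interacted with items 0 and 2
                             [0., 1., 0.]],  # user 1 interacted with item 1
                            device=device)

optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-5)        # mirrors lr

for epoch in range(1):                                             # mirrors epochs
    # Encode item text *with* gradients so the loss can update the transformer.
    features = encoder.tokenize(item_texts)
    features = {k: v.to(device) for k, v in features.items()}
    A = encoder(features)["sentence_embedding"]                    # (num_items, dim)
    A = F.normalize(A, dim=-1)

    # ELSA-style reconstruction: predict interactions via item-item similarity,
    # using W = A A^T - I so an item does not trivially predict itself.
    x = interactions
    x_hat = x @ (A @ A.T) - x
    loss = F.mse_loss(F.normalize(x_hat, dim=-1), F.normalize(x, dim=-1))

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")

Because the gradient flows through the sentence encoder rather than a free item-embedding table, items that are never interacted with still receive useful embeddings from their text alone, which is where the cold-start benefit comes from.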
Policy Type: beeformer
Supports: embedding_policy, scoring_policy
Configuration Example
policy_configs:
  embedding_policy:  # Or scoring_policy
    policy_type: beeformer
    max_seq_length: 384        # Max sequence length for the transformer input
    # Training Hyperparameters
    batch_size: 1              # Samples per training batch (may use gradient accumulation internally)
    epochs: 1                  # Total training epochs
    lr: 1e-5                   # Learning rate (typically small for fine-tuning)
    max_output: 1              # Related to negative sampling / loss calculation
    top_k: 0                   # Restrict optimization (0 = no restriction)
    embedder_batch_size: 100   # Internal batch size for base transformer inference
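For orientation only (this is not Shaped's API), the sketch below shows how embeddings produced by a policy like this are typically consumed once training is done: encode item text in batches (mirroring embedder_batch_size), then rank items against a query or user embedding by dot product, which corresponds to the embedding_policy and scoring_policy roles listed above. The model name and texts are placeholder assumptions.

# Illustrative usage sketch (not Shaped's API): retrieval/scoring with the
# learned item embeddings. Assumes sentence-transformers and numpy.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in for the fine-tuned encoder
item_texts = ["red running shoes", "wireless headphones", "trail running shoes"]
item_emb = encoder.encode(
    item_texts,
    batch_size=100,             # mirrors embedder_batch_size
    normalize_embeddings=True,  # cosine similarity reduces to a dot product
)

query_emb = encoder.encode(["shoes for trail running"], normalize_embeddings=True)[0]
scores = item_emb @ query_emb   # one relevance score per item
print(np.argsort(-scores))      # item indices ranked best-first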
References
- Vančura, V., Kordík, P., & Straka, M. (2024). beeFormer: Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems. RecSys '24 / arXiv.
- Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. EMNLP.
- Vančura, V., et al. (2022). Scalable Linear Shallow Autoencoder for Collaborative Filtering. RecSys '22. (ELSA paper, relevant to beeFormer's loss.)