BERT4Rec (Sequential)

Description

The BERT4Rec policy adapts the bidirectional Transformer architecture (BERT) for sequential recommendation. Unlike unidirectional models (like SASRec), it uses bidirectional self-attention and is typically trained using a masked item prediction objective (predicting masked items based on both past and future context within the sequence). This allows it to learn rich, context-aware item representations.
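To make the training objective concrete, below is a minimal, hypothetical PyTorch sketch of the Cloze-style masking step described above. This is not Shaped's internal code: a fraction of item IDs in each interaction sequence is replaced with a reserved [MASK] token, and the model is trained to recover the originals from context on both sides. All names and constants here (`NUM_ITEMS`, `MASK_ID`, `MASK_PROB`, `mask_sequence`) are illustrative assumptions.

```python
import torch

NUM_ITEMS = 1000          # assumed size of the item vocabulary
PAD_ID = 0                # padding token
MASK_ID = NUM_ITEMS + 1   # reserved [MASK] token id (assumption)
MASK_PROB = 0.15          # masking ratio, a common choice in the BERT4Rec paper

def mask_sequence(seq: torch.Tensor, mask_prob: float = MASK_PROB):
    """Randomly mask non-padding positions; return (inputs, labels).

    Labels are -100 at unmasked positions so that a cross-entropy loss
    with ignore_index=-100 only scores the masked predictions.
    """
    labels = torch.full_like(seq, -100)
    candidates = seq != PAD_ID
    mask = candidates & (torch.rand(seq.shape) < mask_prob)
    labels[mask] = seq[mask]   # target: the original item id
    inputs = seq.clone()
    inputs[mask] = MASK_ID     # hide the item from the model
    return inputs, labels

batch = torch.tensor([[12, 7, 431, 88, 5, 0, 0, 0]])  # right-padded history
inputs, labels = mask_sequence(batch)
print(inputs)   # masked positions show MASK_ID
print(labels)   # -100 everywhere except the masked positions
```

Because the masked position is predicted from both its left and right neighbors, no causal (left-to-right) attention mask is needed, which is the key difference from unidirectional models like SASRec.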

Policy Type: `bert4rec`
Supports: `embedding_policy`, `scoring_policy`

Configuration Example

scoring_policy_bert4rec.yaml
```yaml
policy_configs:
  scoring_policy: # Can also be used under embedding_policy
    policy_type: bert4rec

    # Training Hyperparameters
    batch_size: 1000           # Samples per training batch
    n_epochs: 1                # Number of training epochs
    negative_samples_count: 2  # Negative samples used in the loss calculation
    learning_rate: 0.001       # Optimizer learning rate
    dropout_rate: 0.2          # General dropout rate for regularization

    # Architecture Hyperparameters
    hidden_size: 64            # Dimensionality of hidden layers/embeddings
    n_heads: 2                 # Number of self-attention heads
    n_layers: 2                # Number of Transformer layers
    max_seq_length: 50         # Maximum input sequence length
```
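For intuition on how the architecture hyperparameters fit together, the sketch below wires them into a standard PyTorch `nn.TransformerEncoder`. This is an illustrative approximation, not Shaped's implementation; the item-vocabulary size (`NUM_ITEMS`) and the 4x feed-forward width are assumptions that are not part of the config above.

```python
import torch
import torch.nn as nn

HIDDEN_SIZE = 64      # hidden_size
N_HEADS = 2           # n_heads (must divide hidden_size evenly)
N_LAYERS = 2          # n_layers
MAX_SEQ_LENGTH = 50   # max_seq_length
DROPOUT = 0.2         # dropout_rate
NUM_ITEMS = 1000      # assumed item-vocabulary size, not part of the config

# Item + learned positional embeddings (+2 slots for [PAD] and [MASK] tokens)
item_emb = nn.Embedding(NUM_ITEMS + 2, HIDDEN_SIZE, padding_idx=0)
pos_emb = nn.Embedding(MAX_SEQ_LENGTH, HIDDEN_SIZE)

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(
        d_model=HIDDEN_SIZE,
        nhead=N_HEADS,
        dim_feedforward=4 * HIDDEN_SIZE,  # conventional 4x width (assumption)
        dropout=DROPOUT,
        batch_first=True,
    ),
    num_layers=N_LAYERS,
)
output_head = nn.Linear(HIDDEN_SIZE, NUM_ITEMS + 2)  # scores over the vocabulary

ids = torch.randint(1, NUM_ITEMS, (4, MAX_SEQ_LENGTH))  # batch of 4 sequences
positions = torch.arange(MAX_SEQ_LENGTH).unsqueeze(0)
hidden = encoder(item_emb(ids) + pos_emb(positions))    # no causal mask passed
logits = output_head(hidden)                            # (4, 50, NUM_ITEMS + 2)
```

Note that no causal attention mask is supplied to the encoder, so every position attends to the full sequence in both directions; this is precisely what distinguishes BERT4Rec from SASRec's left-to-right attention.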

Reference

Sun, F., Liu, J., Wu, J., Pei, C., Lin, X., Ou, W., & Jiang, P. (2019). BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer. In Proceedings of CIKM 2019. arXiv:1904.06690.