GSASRec (Sequential)
Description
The GSASRec (Generalized Self-Attentive Sequential Recommendation) policy is an enhancement of SASRec designed to mitigate the overconfidence that arises when training with negative sampling. It replaces the standard binary cross-entropy loss with a generalized BCE (gBCE) loss and supports multiple negative samples per positive, improving prediction calibration while retaining SASRec's core self-attention mechanism for modeling sequences.
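The gBCE idea can be sketched in plain Python. This is a minimal illustration assuming the formulation from the gSASRec paper, where the exponent β is derived from the negative sampling rate α and the calibration parameter t; the function names here are illustrative, not part of this policy's API:

```python
import math

def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gbce_loss(pos_score, neg_scores, t, num_items):
    """Generalized BCE loss (sketch) for one positive and k sampled negatives."""
    k = len(neg_scores)
    # alpha: fraction of the catalogue sampled as negatives
    alpha = k / (num_items - 1)
    # beta interpolates between plain BCE (t = 0) and full calibration (t = 1)
    beta = alpha * (t * (1.0 - 1.0 / alpha) + 1.0 / alpha)
    # -log(sigma(s+)^beta) = -beta * log(sigma(s+))
    pos_term = -beta * math.log(_sigmoid(pos_score))
    neg_term = -sum(math.log(1.0 - _sigmoid(s)) for s in neg_scores)
    return pos_term + neg_term
```

Setting `t = 0` makes β = 1 and recovers standard sampled BCE; raising `t` toward 1 shrinks the positive term's weight to counteract the overconfidence induced by seeing only a small fraction of negatives.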
Policy Type: gsasrec
Supports: embedding_policy, scoring_policy
Hyperparameter tuning
- batch_size: Number of samples processed before updating model weights.
- eval_batch_size: Batch size used during model evaluation.
- n_epochs: Number of complete passes through the training dataset.
- device: Compute device used for training (e.g. cpu or cuda).
- learning_rate: Learning rate for gradient descent optimization.
- weight_decay: L2 regularization term to prevent overfitting.
- patience: Number of epochs to wait without improvement before early stopping.
- sequence_length: Maximum length of input sequences.
- embedding_dim: Dimensionality of item embeddings.
- num_heads: Number of attention heads in the transformer.
- num_blocks: Number of transformer blocks.
- dropout_rate: Dropout probability to prevent overfitting.
- reuse_item_embeddings: Whether to share item embeddings between the input and output layers.
- max_batches_per_epoch: Cap on the number of training batches per epoch.
- gbce_t: Calibration parameter t for the gBCE loss (0 recovers standard BCE).
- filter_rated: Whether to filter already-interacted items from recommendations.
- neg_per_positive: Negative samples per positive sample.
- eval_after_epochs: Number of epochs between evaluations.
- split_ratio: Train/validation split ratio.
- eps: Small epsilon for numerical stability.
V1 API
```yaml
policy_configs:
  scoring_policy: # Can also be used under embedding_policy
    policy_type: gsasrec
    # Training Hyperparameters
    batch_size: 32 # Samples per training batch
    n_epochs: 1 # Number of training epochs
    learning_rate: 0.001 # Optimizer learning rate
    dropout_rate: 0.5 # General dropout rate
    neg_per_positive: 1 # Negative samples per positive sample
    # Architecture Hyperparameters
    num_heads: 1 # Number of self-attention heads
    sequence_length: 200 # Maximum input sequence length
    embedding_dim: 128 # Dimensionality of embeddings/hidden layers
```
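The config above covers the shared SASRec-style settings; the gSASRec-specific knobs from the hyperparameter list can be added alongside them. A hedged fragment (values are illustrative only, and the comments reflect assumed semantics rather than documented behavior):

```yaml
policy_configs:
  scoring_policy:
    policy_type: gsasrec
    gbce_t: 0.75 # gBCE calibration parameter t (0 = plain BCE)
    neg_per_positive: 16 # More negatives per positive sharpens the gBCE signal
    filter_rated: true # Assumed: exclude already-interacted items from results
```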
Usage
Use this model when:
- You're experiencing overconfidence issues with SASRec
- You need better calibrated predictions for sequential recommendations
- You want improved negative sampling strategies
- You need the benefits of SASRec with better training stability
Choose a different model when:
- You don't have sequential data
- You want the simplest sequential model (use SASRec or Item2Vec)
- You need general item similarity (use Two-Tower or ALS)
- You want to leverage item content primarily (use Two-Tower or BeeFormer)
Use cases
- E-commerce next-item prediction with calibrated scores
- Video streaming with improved prediction confidence
- Any sequential recommendation where calibration is critical
Reference
- Petrov, A., & Macdonald, C. (2023). gSASRec: Reducing Overconfidence in Sequential Recommendation Trained with Negative Sampling. RecSys '23.