GSASRec (Sequential)

Description

The GSASRec (Generalized Self-Attentive Sequential Recommendation) policy is an enhancement of SASRec designed to mitigate the overconfidence that arises from training with negative sampling. It replaces the standard binary cross-entropy objective with a generalized BCE (gBCE) loss, controlled by a calibration parameter, and supports multiple negative samples per positive. This improves prediction calibration while retaining the core self-attention mechanism of SASRec for modeling sequences.
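The calibration mechanism can be sketched as follows. This is a minimal NumPy illustration of the gBCE loss from the gSASRec paper: the sigmoid of the positive score is raised to a power β derived from the negative sampling rate α and the calibration parameter t. The function name and signature are illustrative, not this policy's internal API:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gbce_loss(pos_score, neg_scores, num_items, t=0.75):
    """Generalized BCE (gBCE) loss sketch.

    t=0 recovers plain sampled BCE; t=1 targets full calibration.
    """
    k = len(neg_scores)
    alpha = k / (num_items - 1)  # negative sampling rate
    beta = alpha * (t * (1 - 1 / alpha) + 1 / alpha)
    # log(sigmoid(s)^beta) = beta * log(sigmoid(s))
    pos_term = beta * np.log(sigmoid(pos_score))
    neg_term = np.sum(np.log(1.0 - sigmoid(neg_scores)))
    return -(pos_term + neg_term)
```

With t=0, β collapses to 1 and the loss is ordinary sampled BCE; raising t increases the weight on the positive term's calibration correction.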

Policy Type: gsasrec
Supports: embedding_policy, scoring_policy

Hyperparameter tuning

  • batch_size: Number of samples processed before updating model weights.
  • eval_batch_size: Batch size used during model evaluation.
  • n_epochs: Number of complete passes through the training dataset.
  • device: Hardware device used for training (e.g., cpu or cuda).
  • learning_rate: Learning rate for gradient descent optimization.
  • weight_decay: L2 regularization term to prevent overfitting.
  • patience: Number of epochs to wait without improvement before early stopping.
  • sequence_length: Maximum length of input sequences.
  • embedding_dim: Dimensionality of item embeddings.
  • num_heads: Number of attention heads in the transformer.
  • num_blocks: Number of transformer blocks.
  • dropout_rate: Dropout probability to prevent overfitting.
  • reuse_item_embeddings: Whether to reuse the input item embeddings in the output scoring layer (tied weights).
  • max_batches_per_epoch: Upper limit on the number of training batches per epoch.
  • gbce_t: Calibration parameter t of the generalized binary cross-entropy (gBCE) loss; higher values produce better-calibrated scores.
  • filter_rated: Whether to exclude items the user has already interacted with from recommendations.
  • neg_per_positive: Negative samples per positive sample.
  • eval_after_epochs: Number of training epochs between evaluation runs.
  • split_ratio: Fraction of interactions assigned to the training set in the train/validation split.
  • eps: Small constant added for numerical stability.
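As a concrete illustration of how sequence_length is typically applied, the sketch below keeps only a user's most recent items and left-pads shorter histories. The helper name and the padding convention (pad_id=0, left padding) are assumptions for illustration, not part of the policy's API:

```python
def prepare_sequence(item_ids, sequence_length, pad_id=0):
    """Truncate to the `sequence_length` most recent items; left-pad short histories."""
    recent = list(item_ids)[-sequence_length:]
    return [pad_id] * (sequence_length - len(recent)) + recent
```

Left padding keeps the most recent interaction in the last position, which is where a next-item model reads its prediction.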

V1 API

policy_configs:
  scoring_policy: # Can also be used under embedding_policy
    policy_type: gsasrec
    # Training Hyperparameters
    batch_size: 32 # Samples per training batch
    n_epochs: 1 # Number of training epochs
    learning_rate: 0.001 # Optimizer learning rate
    dropout_rate: 0.5 # General dropout rate
    neg_per_positive: 1 # Negative samples per positive
    # Architecture Hyperparameters
    num_heads: 1 # Number of self-attention heads
    sequence_length: 200 # Maximum input sequence length
    embedding_dim: 128 # Dimensionality of embeddings/hidden layers
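Before training, it can help to check that the architecture hyperparameters are mutually consistent, for example that embedding_dim divides evenly across attention heads. The validation helper below is hypothetical, not part of the V1 API; it uses the example values from the config above:

```python
config = {
    "policy_type": "gsasrec",
    "batch_size": 32,
    "n_epochs": 1,
    "learning_rate": 0.001,
    "dropout_rate": 0.5,
    "neg_per_positive": 1,
    "num_heads": 1,
    "sequence_length": 200,
    "embedding_dim": 128,
}

def validate(cfg):
    """Return a list of human-readable problems; empty means the config looks sane."""
    errors = []
    if cfg["embedding_dim"] % cfg["num_heads"] != 0:
        errors.append("embedding_dim must be divisible by num_heads")
    if not 0.0 <= cfg["dropout_rate"] < 1.0:
        errors.append("dropout_rate must be in [0, 1)")
    if cfg["neg_per_positive"] < 1:
        errors.append("neg_per_positive must be >= 1")
    return errors
```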

Usage

Use this model when:

  • You're experiencing overconfidence issues with SASRec
  • You need better calibrated predictions for sequential recommendations
  • You want improved negative sampling strategies
  • You need the benefits of SASRec with better training stability

Choose a different model when:

  • You don't have sequential data
  • You want the simplest sequential model (use SASRec or Item2Vec)
  • You need general item similarity (use Two-Tower or ALS)
  • You want to leverage item content primarily (use Two-Tower or BeeFormer)

Use cases

  • E-commerce next-item prediction with calibrated scores
  • Video streaming with improved prediction confidence
  • Any sequential recommendation where calibration is critical
