Ngram (Sequential)

warning

This is an article from the Shaped 1.0 documentation. The APIs have changed and information may be outdated. Go to Shaped 2.0 docs

Description

The Ngram policy implements a simple, frequency-based sequential model. It predicts the next item from the conditional probability of each candidate given the immediately preceding n-1 items (the n-gram context), estimated from sequence counts. It is effective at capturing short-term co-occurrence patterns in user behavior sequences.
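As an illustration of the idea described above, here is a minimal sketch of a count-based n-gram next-item model in Python. It is not Shaped's implementation: the function names `train_ngram` and `next_item_probs`, the toy sequences, and the use of additive smoothing with a parameter `alpha` are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_ngram(sequences, n=3):
    """Count how often each item follows each (n-1)-item context."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            context = tuple(seq[i:i + n - 1])
            counts[context][seq[i + n - 1]] += 1
    return counts


def next_item_probs(counts, history, vocab, n=3, alpha=0.05):
    """Return additive-smoothed P(item | last n-1 items) for every item in vocab."""
    # For n == 1 the context is empty; otherwise take the last n-1 items.
    context = tuple(history[-(n - 1):]) if n > 1 else ()
    ctx_counts = counts.get(context, Counter())
    total = sum(ctx_counts.values())
    denom = total + alpha * len(vocab)
    return {item: (ctx_counts[item] + alpha) / denom for item in vocab}


# Hypothetical interaction sequences (item IDs in chronological order).
sequences = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["b", "c", "a"],
]
vocab = sorted({item for seq in sequences for item in seq})

counts = train_ngram(sequences, n=3)
probs = next_item_probs(counts, ["a", "b"], vocab, n=3, alpha=0.05)

# Rank candidate items by their estimated probability of coming next.
ranked = sorted(probs, key=probs.get, reverse=True)
```

With these toy sequences, the context `("a", "b")` was followed by `"c"` twice and `"d"` once, so `"c"` ranks first. Items never seen after that context still receive a small non-zero probability because of the smoothing term.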

Policy Type: ngram
Supports: scoring_policy

Configuration Example

policy_configs:
  scoring_policy:
    policy_type: ngram
    n: 3                      # Sequence length (e.g., 3 for trigrams - uses last 2 items)
    laplace_smoothing: 0.05   # Smoothing factor to handle unseen sequences (avoids zero probability)
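To make the two parameters concrete, the following shows the standard additive (Laplace) smoothing estimate for a trigram model. The documentation does not state the exact formula Shaped uses, so treat this as an assumption: here $\alpha$ stands for `laplace_smoothing`, $C(\cdot)$ is a sequence count, and $|V|$ is the number of distinct items.

```latex
P(x_t \mid x_{t-2}, x_{t-1})
  = \frac{C(x_{t-2}, x_{t-1}, x_t) + \alpha}
         {C(x_{t-2}, x_{t-1}) + \alpha \, |V|}

% Worked example with \alpha = 0.05 and |V| = 4:
% a context observed 3 times, followed by item x_t in 2 of them
P_{\text{seen}}   = \frac{2 + 0.05}{3 + 0.05 \cdot 4} = \frac{2.05}{3.2} \approx 0.641
% an item never observed after that context
P_{\text{unseen}} = \frac{0 + 0.05}{3 + 0.05 \cdot 4} = \frac{0.05}{3.2} \approx 0.016
```

Under this form, a larger smoothing factor moves probability mass from observed continuations toward unseen ones, while a value close to zero approaches the raw frequency estimate.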

References

  • Concept based on N-gram language models. See: Jurafsky, D., & Martin, J. H. (2009). Speech and Language Processing. Prentice Hall.
  • Wikipedia: Word n-gram language model