
LightGBM (GBT)

Description

The LightGBM policy uses the LightGBM framework, a highly efficient gradient boosting implementation. It builds an ensemble of decision trees sequentially, with each new tree correcting the errors of the previous ones. It is optimized for speed and memory usage, handles large datasets well, and supports a range of objectives, including classification, regression, and learning-to-rank objectives such as LambdaRank (LightGBM's implementation of LambdaMART).
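The policy itself is configured through YAML (see the V1 API section below). As a rough illustration of what a LambdaRank training loop looks like, here is a minimal sketch using the open-source lightgbm package directly; the feature matrix, labels, and query grouping are synthetic placeholders, not the policy's actual inputs.

# Minimal sketch of LambdaRank training with the open-source `lightgbm` package.
# The data is synthetic; the managed policy builds its own feature set from
# interactions, so treat this purely as an illustration.
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))       # feature matrix (e.g. user/item/affinity features)
y = rng.integers(0, 4, size=1000)     # graded relevance label per (query, item) row
group = [50] * 20                     # 20 queries, 50 candidate items each

ranker = lgb.LGBMRanker(
    objective="lambdarank",           # LightGBM's LambdaMART-style LTR objective
    n_estimators=100,                 # kept small for the sketch
    learning_rate=0.05,
    num_leaves=31,
)
ranker.fit(X, y, group=group)
scores = ranker.predict(X[:50])       # scores used to order one query's candidates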

Policy Type: lightgbm
Supports: scoring_policy

Premium Model

This model requires the Standard Plan or higher.

Hyperparameter tuning

  • event_values: List of event value strings to filter interactions by.
  • objective: Objective function (regression, binary, lambdarank, rank_xendcg).
  • n_estimators: Number of boosting iterations.
  • max_depth: Maximum depth of the tree. -1 means no limit.
  • num_leaves: Maximum number of leaves in one tree.
  • min_child_weight: Minimum sum Hessian in one leaf.
  • learning_rate: Learning rate (shrinkage) for gradient boosting.
  • colsample_bytree: Fraction of feature columns sampled when building each tree.
  • subsample: Fraction of training rows sampled for each bagging iteration.
  • subsample_freq: Bagging frequency; rows are resampled every subsample_freq iterations (0 disables bagging).
  • zero_as_missing: Treat zero values as missing.
  • bin_construct_sample_cnt: Number of samples used to construct bins.
  • verbose: Verbosity level of LightGBM logging.
  • verbose_eval: How often evaluation results are logged during training.
  • num_threads: Number of threads used for training.
  • enable_resume: Whether to enable resume functionality.
  • lambdarank_truncation_level: Number of top pairs used in the pairwise loss. Should be set slightly higher than the k used for NDCG@k (see the sketch after this list).
  • calibrate: Whether to calibrate output probabilities.
  • event_value_user_affinity_features: Whether to use event value user affinity features.
  • event_value_affinity_features_value_filter
  • rolling_window_hours
  • negative_affinity_features: Whether to use negative affinity features.
  • content_affinity_features: Whether to use content affinity features.
  • content_affinity_features_batch_size: Batch size for content affinity features.
  • content_affinity_max_num_latest_items
  • container_categorical_to_multi_hot: Whether to convert container categorical features to multi-hot encodings.
  • container_to_container_affinities: Whether to use container-to-container affinities.
  • point_in_time_item_feature: Whether to use point-in-time item features.
  • drop_user_id: Whether to drop user ID.
  • drop_item_id: Whether to drop item ID.
  • early_stopping_rounds: Number of rounds for early stopping.
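Two of these knobs interact in ways that are easy to get wrong: in LightGBM, num_leaves is the primary complexity control and is effectively capped at 2^max_depth when a depth limit is set, and lambdarank_truncation_level should sit slightly above the k at which you evaluate NDCG@k. The small sketch below is hedged: the depth, leaf count, and k value are illustrative assumptions, not recommendations.

# Hedged sanity checks for tree-shape and LTR truncation settings.
max_depth = 8
num_leaves = 31
k = 10                                  # suppose offline evaluation uses NDCG@10

assert num_leaves <= 2 ** max_depth, "num_leaves is capped by 2**max_depth when max_depth > 0"
lambdarank_truncation_level = k + 5     # slightly higher than k, per the note above
print(num_leaves, lambdarank_truncation_level)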

V1 API

policy_configs:
  scoring_policy:
    policy_type: lightgbm
    # Core Parameters
    objective: "lambdarank"   # Key for LTR: "binary", "regression", "lambdarank", "rank_xendcg"
    n_estimators: 1000        # Number of trees (boosting rounds)
    learning_rate: 0.05       # Step size shrinkage
    # Tree Structure Parameters
    max_depth: 8              # Max tree depth (-1 for no limit)
    num_leaves: 31            # Max leaves per tree (consider tuning based on data)
    min_child_weight: 1e-3    # Minimum sum of instance weight (hessian) needed in a child
    # Regularization & Sampling Parameters
    colsample_bytree: 0.8     # Fraction of features considered per tree
    subsample: 0.8            # Fraction of data sampled per boosting iteration (bagging)
    subsample_freq: 5         # Frequency for bagging (0 means disabled)
    # Other Parameters
    calibrate: false          # Calibrate output probabilities (usually for binary objective)
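If you prototype these values against a local LightGBM model before setting them in the policy config, the sklearn-style names above map onto native LightGBM parameters as shown below. This is a hedged sketch assuming the usual parameter aliases (subsample → bagging_fraction, colsample_bytree → feature_fraction, subsample_freq → bagging_freq, min_child_weight → min_sum_hessian_in_leaf); the dataset construction and validation split are placeholders.

# Hedged sketch: the example config expressed as native LightGBM parameters,
# trained with early stopping against a held-out validation set.
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(800, 20)), rng.integers(0, 2, size=800)
X_valid, y_valid = rng.normal(size=(200, 20)), rng.integers(0, 2, size=200)

params = {
    "objective": "binary",            # "lambdarank" would also require query groups
    "learning_rate": 0.05,
    "max_depth": 8,
    "num_leaves": 31,
    "min_sum_hessian_in_leaf": 1e-3,  # min_child_weight
    "feature_fraction": 0.8,          # colsample_bytree
    "bagging_fraction": 0.8,          # subsample
    "bagging_freq": 5,                # subsample_freq
}
train_set = lgb.Dataset(X_train, label=y_train)
valid_set = lgb.Dataset(X_valid, label=y_valid, reference=train_set)
booster = lgb.train(
    params,
    train_set,
    num_boost_round=1000,             # n_estimators
    valid_sets=[valid_set],
    callbacks=[lgb.early_stopping(stopping_rounds=50)],  # early_stopping_rounds
)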

Usage

Use this model when:

  • You have large-scale datasets with millions of rows and many categorical features
  • You need fast training and inference for ranking or CTR prediction workloads
  • You want a learning-to-rank objective such as LambdaRank/LambdaMART for slate ranking
  • You need a production-ready, memory-efficient GBDT implementation with strong baseline performance
  • You want to combine rich feature sets (behavioral, content, and affinity features) into a single scoring model

Choose a different model when:

  • You primarily work with medium-sized datasets and prefer stronger regularization defaults (use XGBoost)
  • You want to explicitly learn high-order feature interactions via deep networks (use DeepFM or Wide & Deep)
  • You have extremely sparse interaction-only data and no rich feature set (use ALS/ELSA or other embedding models)
  • You need sequence-aware modeling of user behavior (use SASRec, BERT4Rec, or other sequential models)
  • You only need a simple trending or heuristic baseline (use Rising Popularity or value-model expressions)

Use cases

  • E-commerce CTR prediction and ranking for product search and browse results
  • Feed ranking and homepage personalization using mixed behavioral and content features
  • Ad ranking and sponsored content placement where learning-to-rank objectives are important
  • Re-ranking of retrieved candidates in multi-stage recommendation architectures
  • Any tabular recommendation or ranking problem with a wide variety of engineered features

References