XGBoost (GBT)
Description
The XGBoost policy is built on the XGBoost library, another popular and powerful gradient boosting framework known for its performance, regularization options, and scalability. Like LightGBM, it builds an ensemble of decision trees, and it can be configured for classification, regression, or ranking tasks.
Policy Type: xgboost
Supports: scoring_policy
Premium Model
This model requires the Standard Plan or higher.
Hyperparameter tuning
- event_values: List of event value strings to filter interactions by.
- mode: Model mode (regressor or classifier).
- n_estimators: Number of boosting iterations (trees).
- max_depth: Maximum depth of a tree. -1 means no limit.
- max_leaves: Maximum number of leaves in one tree.
- n_jobs: Number of parallel threads used during training.
- learning_rate: Learning rate (shrinkage) for gradient boosting.
- min_child_weight: Minimum sum of instance weight needed in a child node.
- use_user_ids_as_features: Whether to use user IDs as features.
- use_item_ids_as_features: Whether to use item IDs as features.
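Most of these map directly onto the open-source xgboost package's scikit-learn-style estimators, which can help build intuition for what each setting controls. Below is a minimal standalone sketch of that correspondence, not this policy's internal training code: the synthetic data and thread count are placeholders, and event_values plus the use_*_ids_as_features flags are handled by the policy's feature pipeline rather than by xgboost itself.

import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.random((500, 8))                   # placeholder feature matrix
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)  # synthetic binary labels

# mode: "classifier" -> xgb.XGBClassifier; mode: "regressor" -> xgb.XGBRegressor
model = xgb.XGBClassifier(
    n_estimators=100,    # number of boosting iterations (trees)
    max_depth=16,        # maximum depth per tree
    max_leaves=0,        # maximum leaves per tree (0 = no limit)
    learning_rate=0.2,   # shrinkage applied to each tree's contribution
    min_child_weight=1,  # minimum sum of instance weight in a child node
    n_jobs=4,            # parallel training threads (placeholder value)
)
model.fit(X, y)
print(model.predict_proba(X[:3])[:, 1])    # predicted positive-class probabilities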
V1 API
policy_configs:
  scoring_policy:
    policy_type: xgboost

    # Core Parameters
    mode: "classifier"            # Model mode: "classifier" or "regressor"
    objective: "binary:logistic"  # Example objective (verify available ranking objectives if needed)
    n_estimators: 100             # Number of boosting rounds
    learning_rate: 0.2            # Step size shrinkage (eta)

    # Tree Structure Parameters
    max_depth: 16                 # Max depth per tree
    max_leaves: 0                 # Max leaves per tree (0 = no limit)
    min_child_weight: 1           # Minimum sum of instance weight needed in a child
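On the ranking note in the config comment above: the underlying xgboost library itself ships rank:pairwise, rank:ndcg, and rank:map objectives, though whether this policy exposes them should be verified for your deployment. A standalone illustrative sketch with the library's XGBRanker, using made-up query groups:

import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.random((12, 4))           # 12 candidate items, 4 features each
y = rng.integers(0, 3, size=12)   # graded relevance labels (0-2)
group = [4, 4, 4]                 # three queries with 4 candidates apiece

ranker = xgb.XGBRanker(objective="rank:ndcg", n_estimators=50)
ranker.fit(X, y, group=group)     # group sizes partition rows into queries
print(ranker.predict(X[:4]))      # relevance scores for the first query's candidates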
Usage
Use this model when:
- You work with structured/tabular data and need a strong, regularized baseline model
- You have small to medium-sized datasets where XGBoost’s conservative defaults are beneficial
- You need robust handling of noisy data, missing values, and mixed feature types
- You want a well-understood, battle-tested GBDT framework for CTR prediction or ranking
- You care about interpretability via tree-based feature importance and decision paths (see the sketch after this list)
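For instance, a fitted xgboost estimator reports per-feature importances out of the box. This is a standalone library sketch on synthetic data, separate from this policy's own API:

import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.random((200, 5))           # synthetic features; only column 0 is informative
y = (X[:, 0] > 0.5).astype(int)

clf = xgb.XGBClassifier(n_estimators=50).fit(X, y)

# Normalized importance per input feature (order matches the training columns)
print(clf.feature_importances_)

# Raw importances by type: "weight" (split counts), "gain", or "cover"
print(clf.get_booster().get_score(importance_type="gain"))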
Choose a different model when:
- You need maximum training and inference speed on very large datasets (use LightGBM)
- You want to explicitly model high-order feature interactions via deep networks (use DeepFM or Wide & Deep)
- You primarily have sparse interaction-only signals without rich features (use ALS/ELSA or other embedding models)
- You need sequence-aware modeling of user behavior (use SASRec, BERT4Rec, or other sequential models)
- You only need a simple trending or heuristic baseline (use Rising Popularity or value-model expressions)
Use cases
- CTR prediction for recommendation or advertising with moderate data sizes
- Ranking search and browse results using handcrafted and learned features
- Scoring candidates in multi-stage recommendation pipelines where robustness matters
- A/B testing and prototyping of feature sets for downstream ranking systems
- Any structured prediction task where tree ensembles are a good fit
Reference
- Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. KDD.