DeepFM (Neural Scoring)
Description
The DeepFM (Deep Factorization Machine) policy combines a Factorization Machine (FM) component and a deep neural network (DNN) component, sharing input feature embeddings.
- FM Component: Models low-order interactions, i.e. a linear (first-order) term plus pairwise (second-order) interactions, computed efficiently from dot products of the shared feature embeddings (see the sketch below).
- Deep Component: An MLP that learns high-order, non-linear feature interactions implicitly from the shared embeddings.
Unlike the wide part of Wide & Deep, this design requires no manual feature crossing while still capturing a full spectrum of interaction orders. The outputs of the two components are summed and passed through a sigmoid to produce the final prediction.
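The pairwise FM term is cheap because dot products over all field embeddings reduce to a "square of sum minus sum of squares" identity, making the cost linear rather than quadratic in the number of fields. Below is a minimal PyTorch sketch of just this computation; the function name and tensor shapes are illustrative, not this library's internals.

import torch

def fm_pairwise(emb: torch.Tensor) -> torch.Tensor:
    # emb: (batch, num_fields, embedding_dim), one vector per active field.
    # Identity: sum_{i<j} <v_i, v_j> = 0.5 * (||sum_i v_i||^2 - sum_i ||v_i||^2),
    # so all pairwise interactions cost O(fields * dim), not O(fields^2 * dim).
    square_of_sum = emb.sum(dim=1).pow(2)    # (batch, embedding_dim)
    sum_of_squares = emb.pow(2).sum(dim=1)   # (batch, embedding_dim)
    return 0.5 * (square_of_sum - sum_of_squares).sum(dim=1)  # (batch,)

pairwise = fm_pairwise(torch.randn(4, 3, 16))  # 4 examples, 3 fields, dim 16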
Policy Type: deepfm
Supports: scoring_policy
Premium Model
This model requires the Standard Plan or higher.
Hyperparameter tuning
- embedding_dim: Dimensionality of the shared feature embeddings.
- deep_hidden_units: Layer sizes for the deep MLP component.
- activation_fn: Activation function for the deep layers.
- dropout: Dropout rate for the deep layers.
- learning_rate: Optimizer learning rate.
V1 API
policy_configs:
  scoring_policy:
    policy_type: deepfm
    # Architecture
    embedding_dim: 16                 # Dimensionality of feature embeddings (shared)
    deep_hidden_units: [128, 64, 32]  # Layer sizes for the deep MLP component
    activation_fn: "relu"             # Activation for deep layers
    dropout: 0.2                      # Dropout rate for deep layers
    # Training Control
    learning_rate: 0.001              # Optimizer learning rate
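To make the mapping from these settings to the architecture concrete, the sketch below wires embedding_dim, deep_hidden_units, activation_fn (relu), dropout, and learning_rate into a minimal PyTorch DeepFM. It is a sketch under stated assumptions: the class name DeepFMSketch, the field_dims argument, and the toy inputs are illustrative, not this library's implementation.

import torch
import torch.nn as nn

class DeepFMSketch(nn.Module):
    # Shared embeddings feed both the FM term and the deep MLP.
    def __init__(self, field_dims, embedding_dim=16,
                 deep_hidden_units=(128, 64, 32), dropout=0.2):
        super().__init__()
        num_features = sum(field_dims)
        self.embedding = nn.Embedding(num_features, embedding_dim)  # shared table
        self.linear = nn.Embedding(num_features, 1)  # first-order weights
        self.bias = nn.Parameter(torch.zeros(1))
        # Offsets map per-field ids into the shared embedding table.
        offsets = torch.tensor([0] + list(field_dims[:-1])).cumsum(0)
        self.register_buffer("offsets", offsets)
        # Deep component: MLP over the concatenated field embeddings.
        layers, in_dim = [], len(field_dims) * embedding_dim
        for units in deep_hidden_units:
            layers += [nn.Linear(in_dim, units), nn.ReLU(), nn.Dropout(dropout)]
            in_dim = units
        layers.append(nn.Linear(in_dim, 1))
        self.mlp = nn.Sequential(*layers)

    def forward(self, x):  # x: (batch, num_fields) integer feature ids
        x = x + self.offsets
        emb = self.embedding(x)  # (batch, fields, embedding_dim)
        first_order = self.linear(x).sum(dim=(1, 2)) + self.bias
        # Same square-of-sum minus sum-of-squares trick as above.
        fm = 0.5 * (emb.sum(dim=1).pow(2) - emb.pow(2).sum(dim=1)).sum(dim=1)
        deep = self.mlp(emb.flatten(start_dim=1)).squeeze(-1)
        return torch.sigmoid(first_order + fm + deep)  # CTR-style probability

model = DeepFMSketch(field_dims=[1000, 500, 20])  # e.g. user, item, context ids
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # learning_rate
scores = model(torch.randint(0, 20, (4, 3)))  # 4 toy examples, 3 fields each

Here nn.ReLU() stands in for activation_fn and the Adam lr for learning_rate; changing activation_fn in the config would correspond to swapping the activation modules in the MLP stack.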
Usage
Use this model when:
- You need a strong CTR or conversion prediction model that can learn both low- and high-order feature interactions
- You have rich categorical and continuous features and want to avoid manual feature crossing
- You care about modeling interaction terms between many sparse features (e.g., user, item, and context features)
- You want a neural scoring model that generalizes beyond purely linear or FM-only approaches
Choose a different model when:
- You have relatively simple feature sets and want a fast, tree-based baseline (use LightGBM or XGBoost)
- You do not have the infrastructure or appetite to train and serve neural models in production
- You mainly need retrieval-stage embeddings rather than point-wise scoring (use Two-Tower, ALS, or BeeFormer)
- You primarily need to capture sequential patterns in user behavior (use SASRec, BERT4Rec, or other sequential models)
Use cases
- CTR prediction for feeds, recommendations, and advertising with many categorical IDs
- Ranking items in e-commerce or content platforms using rich user, item, and context features
- Scoring candidates in a ranking stage after retrieval from embeddings or search
- Personalization tasks where complex feature interactions drive performance
References
- Guo, H., et al. (2017). DeepFM: A Factorization-Machine based Neural Network for CTR Prediction. IJCAI.
- Naumov, M., et al. (2019). Deep Learning Recommendation Model for Personalization and Recommendation Systems. arXiv. (Context on DLRM-style models.)