DeepFM (Neural Scoring)

Description

The DeepFM (Deep Factorization Machine) policy combines a Factorization Machine (FM) component and a deep neural network (DNN) component that share the same input feature embeddings.

  • FM Component: Models low-order interactions (linear terms and pairwise, second-order interactions) efficiently using dot products of the feature embeddings; see the formula after this list.
  • Deep Component: An MLP that learns high-order, non-linear feature interactions implicitly from the shared embeddings.
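
For reference, a minimal sketch of the FM prediction in its standard form (following Rendle's factorization machines; the exact parameterization inside this policy may differ), where w_0 is a global bias, w_i the first-order weights, and v_i the shared embedding of feature i:

    \hat{y}_{\mathrm{FM}} = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle \, x_i x_j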

This avoids the need for manual feature crossing (unlike the wide part of Wide & Deep) and captures both low- and high-order interactions in one model. The outputs of the two components are combined for the final prediction.
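
To make the shared-embedding design concrete, here is a minimal PyTorch sketch of a DeepFM-style forward pass. It is illustrative only: the class name, field handling, and sigmoid output head are assumptions rather than this policy's internal implementation.

```python
import torch
import torch.nn as nn

class DeepFMSketch(nn.Module):
    """Minimal DeepFM sketch: an FM term and an MLP over shared embeddings."""

    def __init__(self, num_features: int, num_fields: int,
                 embedding_dim: int = 16,
                 deep_hidden_units=(128, 64, 32),
                 dropout: float = 0.2):
        super().__init__()
        # Shared embedding table used by both the FM and deep components.
        self.embedding = nn.Embedding(num_features, embedding_dim)
        # First-order (linear) weights plus a global bias for the FM part.
        self.linear = nn.Embedding(num_features, 1)
        self.bias = nn.Parameter(torch.zeros(1))
        # Deep component: an MLP over the concatenated field embeddings.
        layers, in_dim = [], num_fields * embedding_dim
        for units in deep_hidden_units:
            layers += [nn.Linear(in_dim, units), nn.ReLU(), nn.Dropout(dropout)]
            in_dim = units
        layers.append(nn.Linear(in_dim, 1))
        self.mlp = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_fields) of feature indices, already offset per field.
        emb = self.embedding(x)                         # (batch, fields, dim)
        # FM 2nd-order term via the square-of-sum minus sum-of-squares identity.
        square_of_sum = emb.sum(dim=1).pow(2)           # (batch, dim)
        sum_of_squares = emb.pow(2).sum(dim=1)          # (batch, dim)
        fm_2nd = 0.5 * (square_of_sum - sum_of_squares).sum(dim=1, keepdim=True)
        fm_1st = self.linear(x).sum(dim=1) + self.bias  # (batch, 1)
        deep = self.mlp(emb.flatten(start_dim=1))       # (batch, 1)
        # Combine the FM and deep outputs for the final prediction.
        return torch.sigmoid(fm_1st + fm_2nd + deep)

# Example: score a batch of 4 examples with 3 categorical fields.
model = DeepFMSketch(num_features=1000, num_fields=3)
scores = model(torch.randint(0, 1000, (4, 3)))
print(scores.shape)  # torch.Size([4, 1])
```

The square-of-sum identity lets the FM term score all pairwise embedding interactions in O(fields × dim) time rather than enumerating every feature pair explicitly.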

Policy Type: deepfm
Supports: scoring_policy

Premium Model

This model requires the Standard Plan or higher.

Hyperparameter tuning

  • embedding_dim: Dimensionality of feature embeddings (shared).
  • deep_hidden_units: Layer sizes for the deep MLP component.
  • activation_fn: Activation for deep layers.
  • dropout: Dropout rate for deep layers.
  • learning_rate: Optimizer learning rate.

V1 API

policy_configs:
  scoring_policy:
    policy_type: deepfm
    # Architecture
    embedding_dim: 16                 # Dimensionality of feature embeddings (shared)
    deep_hidden_units: [128, 64, 32]  # Layer sizes for the deep MLP component
    activation_fn: "relu"             # Activation for deep layers
    dropout: 0.2                      # Dropout rate for deep layers
    # Training Control
    learning_rate: 0.001              # Optimizer learning rate
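
As a hedged illustration of how these fields could shape the network (assuming PyYAML and PyTorch; the platform's actual config loader and model builder are not documented here, so all names below are illustrative), the sketch below parses a config like the one above, stacks one Linear → activation → Dropout block per deep_hidden_units entry, and wires learning_rate into an Adam optimizer:

```python
import yaml
import torch.nn as nn
import torch.optim as optim

config_text = """
policy_configs:
  scoring_policy:
    policy_type: deepfm
    embedding_dim: 16
    deep_hidden_units: [128, 64, 32]
    activation_fn: relu
    dropout: 0.2
    learning_rate: 0.001
"""
cfg = yaml.safe_load(config_text)["policy_configs"]["scoring_policy"]

# Hypothetical mapping from activation_fn strings to PyTorch modules.
ACTIVATIONS = {"relu": nn.ReLU, "tanh": nn.Tanh, "sigmoid": nn.Sigmoid}

def build_deep_component(num_fields: int, cfg: dict) -> nn.Sequential:
    """Stack Linear -> activation -> Dropout once per deep_hidden_units entry."""
    layers = []
    in_dim = num_fields * cfg["embedding_dim"]  # concatenated shared embeddings
    for units in cfg["deep_hidden_units"]:
        layers += [nn.Linear(in_dim, units),
                   ACTIVATIONS[cfg["activation_fn"]](),
                   nn.Dropout(cfg["dropout"])]
        in_dim = units
    layers.append(nn.Linear(in_dim, 1))  # scalar logit from the deep component
    return nn.Sequential(*layers)

deep = build_deep_component(num_fields=10, cfg=cfg)
optimizer = optim.Adam(deep.parameters(), lr=cfg["learning_rate"])
print(deep)
```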

Usage

Use this model when:

  • You need a strong CTR or conversion prediction model that can learn both low- and high-order feature interactions
  • You have rich categorical and continuous features and want to avoid manual feature crossing
  • You care about modeling interaction terms between many sparse features (e.g., user, item, and context features)
  • You want a neural scoring model that generalizes better than purely linear or FM-only approaches

Choose a different model when:

  • You have relatively simple feature sets and want a fast, tree-based baseline (use LightGBM or XGBoost)
  • You do not have the infrastructure or appetite to train and serve neural models in production
  • You mainly need retrieval-stage embeddings rather than point-wise scoring (use Two-Tower, ALS, or BeeFormer)
  • You primarily need to capture sequential patterns in user behavior (use SASRec, BERT4Rec, or other sequential models)

Use cases

  • CTR prediction for feeds, recommendations, and advertising with many categorical IDs
  • Ranking items in e-commerce or content platforms using rich user, item, and context features
  • Scoring candidates in a ranking stage after retrieval from embeddings or search
  • Personalization tasks where complex feature interactions drive performance
