Value Modeling
Shaped provides a value modeling interface that enables fine-grained control and flexibility over your scoring objective. The value model allows you to tap directly into your scoring function and introduce custom logic, significantly enhancing the modeling capabilities that you can achieve with Shaped.
When to Value Model
The value model can be used to enhance all modeling use cases, and can be especially useful when optimizing complex business objectives.
Some common use cases that can greatly benefit from using the value model:
- Combining multiple scoring objectives.
- Balancing personalization with custom business logic.
- Switching between multiple models at score time.
How it Works
The value model allows you to combine scores from multiple scoring policies, as well as user, item, and interaction feature values, into a single mathematical expression. This defines a second-order model that represents your final scoring objective.
The value model uses the Jinja framework to evaluate expressions, so your value model can be written conveniently with native Python syntax.
Supported Expressions
In addition to Python expressions, any Jinja expression can be used in the value model. Jinja filters are the most useful for common value model use cases.
Alongside native Python and Jinja, the value model supports any math function from Python’s math library. These do not need to be prefixed with math and can be accessed directly by their function name. The value model also supports the len, sum, max, min, and abs Python functions.
The value model supports numerical, binary, numerical list, and timestamp feature types when using user, item, and/or interaction features. User features and item features are accessed with the user and item prefixes. Interaction features can be accessed with the user.recent_interactions prefix, which contains a list of each interaction feature for the user’s most recent interactions.
Timestamp features are always evaluated as POSIX timestamps in seconds. Conveniently, Shaped provides a now_seconds() function that can be used in the value model to return the current POSIX timestamp in seconds.
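To experiment with expressions before deploying them, the rules above can be approximated locally. The sketch below is not Shaped's evaluator: it uses Python's eval rather than Jinja (so Jinja filters are not covered), and the helper names build_namespace and evaluate, the feature values, and the scores are all hypothetical. It only mirrors the documented behavior that math functions are available without a prefix, that len, sum, max, min, and abs are supported, and that now_seconds() returns the current POSIX time in seconds.

```python
import math
import time
from types import SimpleNamespace


def build_namespace(user, item, scores):
    """Collect the names an expression may reference in this local approximation."""
    # Every public name from the math module, usable without the "math." prefix.
    names = {name: getattr(math, name) for name in dir(math) if not name.startswith("_")}
    # The extra built-in functions the documentation lists.
    names.update({"len": len, "sum": sum, "max": max, "min": min, "abs": abs})
    # Current POSIX timestamp in seconds.
    names["now_seconds"] = time.time
    # Policy scores are referenced by policy name; features by the user/item prefixes.
    names.update(scores)
    names["user"] = user
    names["item"] = item
    return names


def evaluate(expression, user, item, scores):
    """Evaluate one expression. Local experiments only; never pass untrusted input to eval."""
    return eval(expression, {"__builtins__": {}}, build_namespace(user, item, scores))


# Hypothetical features and scores for a single (user, item) pair.
item = SimpleNamespace(quality_score=math.e ** 2, created_at=time.time() - 7200)
user = SimpleNamespace(recent_interactions=SimpleNamespace(label=[1, 0, 1]))
scores = {"click_model": 0.4, "purchase_model": 0.1}

score = evaluate(
    "2 * click_model + 6 * purchase_model + 3 * log(item.quality_score)",
    user, item, scores,
)
age_hours = evaluate("(now_seconds() - item.created_at) / 3600", user, item, scores)
interaction_count = evaluate("len(user.recent_interactions.label)", user, item, scores)

print(round(score, 2))       # 0.8 + 0.6 + 3 * 2 = 7.4
print(round(age_hours, 2))   # about 2.0
print(interaction_count)     # 3
```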
Accessing the Value Model
In order to access the value model, your model must be defined as a Score Ensemble Policy. The example below defines a score ensemble policy with two model policies, lightgbm and bert4rec, and equally weights their results:
policy_configs:
  scoring_policy:
    policy_type: score-ensemble
    policies:
      - policy_type: lightgbm
        max_depth: 6
        objective: binary
      - policy_type: bert4rec
        hidden_size: 128
    value_model: 0.5 * lightgbm + 0.5 * bert4rec
The value model can also be specified at inference time. The following rank request provides an example, where the value model for the example above is overridden:
curl https://api.shaped.ai/v1/models/{model_name}/rank \
  -X POST \
  -H "x-api-key: <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "83NSLX",
    "config": {
      "value_model": "0.3 * lightgbm + 0.7 * bert4rec"
    }
  }'
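The same request can be assembled from Python. This is a minimal sketch using only the standard library: the endpoint, headers, and body fields are taken from the curl example above, while the helper name build_rank_request and the model name my_model are hypothetical. The point it illustrates is that the value model expression travels as a JSON string inside config.

```python
import json
import urllib.request


def build_rank_request(model_name, api_key, user_id, value_model):
    """Build (but do not send) a rank request that overrides the value model."""
    body = {
        "user_id": user_id,
        # The expression is sent as a plain string; json.dumps quotes it.
        "config": {"value_model": value_model},
    }
    return urllib.request.Request(
        url=f"https://api.shaped.ai/v1/models/{model_name}/rank",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_rank_request(
    model_name="my_model",          # hypothetical model name
    api_key="<API_KEY>",
    user_id="83NSLX",
    value_model="0.3 * lightgbm + 0.7 * bert4rec",
)
payload = json.loads(request.data)
print(payload["config"]["value_model"])

# To actually send it (requires a real API key and model):
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)
```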
Multi-Objective Optimization
An especially powerful use case for the value model is multi-objective optimization: train multiple models to predict different events, then combine their scores in the value model to produce a final multi-objective score.
To do this via Shaped, models can be trained on a subset of events by using the event_values field. If the event_value field exists on the events in the fetch config, the following score ensemble definition will weigh the results between two separate models trained to predict clicks and purchases:
policy_configs:
  scoring_policy:
    policy_type: score-ensemble
    policies:
      - policy_type: lightgbm
        max_depth: 6
        objective: binary
        name: click_model
        event_values:
          - click
          - impression
      - policy_type: lightgbm
        max_depth: 6
        objective: binary
        name: purchase_model
        event_values:
          - purchase
          - impression
    value_model: 0.3 * click_model + 0.7 * purchase_model
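To see how this weighting changes a ranking, here is a small worked example in Python. The per-item scores are made up for illustration; only the weights come from the value model above.

```python
# Weights from the value model: 0.3 * click_model + 0.7 * purchase_model
WEIGHTS = {"click_model": 0.3, "purchase_model": 0.7}

# Hypothetical per-item scores from the two policies.
candidates = {
    "item_a": {"click_model": 0.90, "purchase_model": 0.10},
    "item_b": {"click_model": 0.40, "purchase_model": 0.60},
    "item_c": {"click_model": 0.20, "purchase_model": 0.30},
}


def combined_score(scores):
    """Weighted sum of the policy scores for one item."""
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


ranking = sorted(
    candidates,
    key=lambda item_id: combined_score(candidates[item_id]),
    reverse=True,
)

for item_id in ranking:
    print(item_id, round(combined_score(candidates[item_id]), 2))
# item_b 0.54
# item_a 0.34
# item_c 0.27
```

Although item_a has the highest click score, item_b ranks first because the purchase model carries more weight.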
The fetch config required for this model is defined in the example below.
Common Examples
Below are some value model examples for common use cases. These examples assume the following model definition:
model:
  policy_configs:
    scoring_policy:
      policy_type: score-ensemble
      policies:
        - policy_type: lightgbm
          max_depth: 6
          objective: binary
          name: click_model
          event_values:
            - click
            - impression
        - policy_type: lightgbm
          max_depth: 6
          objective: binary
          name: purchase_model
          event_values:
            - purchase
            - impression
        - policy_type: lightgbm
          max_depth: 6
          objective: binary
          name: all_events_model
        - policy_type: toplist
      value_model: # Fill this in with examples below!
fetch:
  events: |
    SELECT
      user_id,
      item_id,
      created_at,
      CASE
        WHEN event_value IN ('click', 'purchase') THEN 1
        WHEN event_value = 'impression' THEN 0
        ELSE NULL
      END AS label,
      event_value
    FROM interactions
  users: |-
    SELECT
      user_id
    FROM users
  items: |
    SELECT
      item_id,
      quality_score -- Can represent a custom quality score defined by business logic.
    FROM items
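The label logic in the events query, and the way event_values narrows each policy's training data, can be mirrored in a few lines of Python. This is only an illustration: label_for restates the CASE expression, while training_rows assumes that event_values acts as a simple filter on the event_value column, which is an interpretation of the description above rather than a statement of Shaped's internals.

```python
def label_for(event_value):
    """Restates the CASE expression from the events query."""
    if event_value in ("click", "purchase"):
        return 1
    if event_value == "impression":
        return 0
    return None  # corresponds to ELSE NULL


# event_values per policy, copied from the model definition.
POLICY_EVENT_VALUES = {
    "click_model": ["click", "impression"],
    "purchase_model": ["purchase", "impression"],
    "all_events_model": None,  # no event_values field: all events
}


def training_rows(policy_name, events):
    """Assumed filtering behavior: keep only events listed in event_values."""
    allowed = POLICY_EVENT_VALUES[policy_name]
    return [
        (event_value, label_for(event_value))
        for event_value in events
        if allowed is None or event_value in allowed
    ]


events = ["click", "impression", "purchase", "impression"]
print(training_rows("purchase_model", events))
# [('impression', 0), ('purchase', 1), ('impression', 0)]
```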
Example of a multi-objective value model that combines model scores with a log-transformed item feature:
value_model: 2 * click_model + 6 * purchase_model + 3 * log(item.quality_score)
Example of a value model that implements an exponential decay function based on the number of hours since item creation:
value_model: exp(-0.5 * (now_seconds() - item.created_at) / 3600) * all_events_model
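To get a feel for how quickly this decay acts, the multiplier can be computed directly. The sketch below restates the expression in Python with a fixed, made-up current time; the function name freshness_multiplier is illustrative only.

```python
import math

DECAY_RATE = 0.5  # per hour, as in the value model above


def freshness_multiplier(now_seconds, created_at):
    """exp(-0.5 * hours since creation), with both timestamps in POSIX seconds."""
    age_hours = (now_seconds - created_at) / 3600
    return math.exp(-DECAY_RATE * age_hours)


now = 1_700_000_000  # fixed example timestamp
for hours in (0, 1, 6, 24):
    multiplier = freshness_multiplier(now, now - hours * 3600)
    print(f"{hours:>2} h old -> multiplier {multiplier:.4f}")

# Time after which the multiplier has halved: ln(2) / 0.5, about 1.39 hours.
half_life_hours = math.log(2) / DECAY_RATE
print(round(half_life_hours, 2))
```

With a rate of 0.5 per hour, an item's score is halved roughly every 1.4 hours and is close to zero after a day, so a smaller rate is a better fit for catalogs that turn over slowly.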
Example of a value model that falls back to toplist when a user has fewer than 5 interactions, using jinja filters:
value_model: ((user.recent_interactions.label|select('gt', 0)|list|count) >= 5) * all_events_model + ((user.recent_interactions.label|select('gt', 0)|list|count) < 5) * toplist
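The gating trick here relies on comparisons evaluating to booleans that multiply as 1 and 0. The Python sketch below restates the same logic with hypothetical helper names and made-up scores. Note that the filter select('gt', 0) counts only recent interactions whose label is greater than 0, so impressions (label 0) do not count toward the threshold.

```python
def positive_count(labels):
    """Mirrors label|select('gt', 0)|list|count: interactions with label > 0."""
    return len([label for label in labels if label > 0])


def gated_score(labels, all_events_model, toplist, threshold=5):
    """Use the personalized score at or above the threshold, otherwise toplist."""
    enough_history = positive_count(labels) >= threshold
    # True and False behave as 1 and 0 when multiplied, so exactly one term survives.
    return enough_history * all_events_model + (not enough_history) * toplist


# A new user with two positive interactions falls back to the toplist score.
print(gated_score([1, 0, 1], all_events_model=0.8, toplist=0.3))            # 0.3
# A user with five positive interactions gets the personalized score.
print(gated_score([1, 1, 0, 1, 1, 1], all_events_model=0.8, toplist=0.3))   # 0.8
```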
Example of a value model that implements an exponential decay function based on the distance (given lat, long coordinates) between the user and item using the Haversine Formula:
value_model: exp(-0.015 * (2 * 6371 * asin(sqrt(0.5 - cos(radians(item.lat - user.lat))/2
  + cos(radians(user.lat)) * cos(radians(item.lat))
  * (1 - cos(radians(item.long - user.long)))/2)))) * all_events_model
This last example showcases how a more complex formula can be built directly into the value model. For your convenience, Shaped provides the haversine_distance function so you don't need to write it out again:
value_model: exp(-0.015 * haversine_distance(item.lat, item.long, user.lat, user.long)) * all_events_model
Note also that this will use the user's location from the users table. For use cases that want to compare with the user's live location, override the value model at inference time and replace user.lat and user.long with live coordinate values.
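The hand-written Haversine expression can be checked locally, and the live-location substitution can be done with simple string replacement before sending the override. Both helpers below are hypothetical sketches: haversine_km restates the formula from the long example (Earth radius 6371 km), and with_live_location assumes the expression refers to the user's position only as user.lat and user.long. The coordinates are example values.

```python
import math

EARTH_RADIUS_KM = 6371


def haversine_km(lat1, long1, lat2, long2):
    """Great-circle distance in km, term for term as in the value model expression."""
    d_lat = math.radians(lat2 - lat1)
    d_long = math.radians(long2 - long1)
    h = (
        0.5 - math.cos(d_lat) / 2
        + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
        * (1 - math.cos(d_long)) / 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def with_live_location(expression, live_lat, live_long):
    """Swap the stored user coordinates for live ones in a value model expression."""
    return (
        expression
        .replace("user.lat", str(live_lat))
        .replace("user.long", str(live_long))
    )


# One degree of latitude is about 111.19 km, which gives a decay multiplier of about 0.19.
distance = haversine_km(0.0, 0.0, 1.0, 0.0)
print(round(distance, 2), round(math.exp(-0.015 * distance), 2))

# Build an expression fragment for an inference-time override.
override = with_live_location(
    "haversine_distance(item.lat, item.long, user.lat, user.long)",
    live_lat=40.7128,
    live_long=-74.006,
)
print(override)
# haversine_distance(item.lat, item.long, 40.7128, -74.006)
```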
Final Considerations: Choosing Parameters
Choosing the parameters for your value model can be hard. Since the business objectives captured by the value model can be complex and the parameter space is uncountably large, the best results are typically achieved by leveraging a manual component alongside model tuning. We at Shaped recommend the following steps to ease the process and achieve the best results for your use case:
- Define your objectives: Start by listing the business objectives you want to optimize for. Decide which objectives should be defined by scoring policies and which should be defined by user/item features. These can always change and don't need to be perfect the first time.
- Start with a heuristic: Think of a general structure for your value model expression. The easiest one to start with is linear (e.g. a * model_a + b * model_b + c * model_c), but more complex use cases may be better modeled by multiplicative or logarithmic terms.
- A/B test: The best way to tune the parameters of a value model is by looking at online results. Try out the value model on a subset of users, modify the coefficients and/or value model structure, track the results, then modify again until you're satisfied. The objective-tuned model policies will be doing the heavy lifting while you're at the control panel.
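One way to try a value model on a subset of users is to assign each user to a variant deterministically on your side and pass that variant's expression at inference time. The sketch below is one possible approach, not a Shaped feature: the variant expressions, the experiment name, and the 10% share are all assumptions for illustration. Hashing the user ID keeps each user in the same variant across requests.

```python
import hashlib

# Hypothetical variants: the current weights and a candidate to test.
VARIANTS = {
    "control": "0.3 * click_model + 0.7 * purchase_model",
    "treatment": "0.5 * click_model + 0.5 * purchase_model",
}


def assign_variant(user_id, experiment="value_model_weights", treatment_share=0.1):
    """Deterministically map a user to a variant using a hash of experiment and user ID."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # The first 8 hex digits give an integer in [0, 2**32); scale it to [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_share else "control"


def value_model_for(user_id):
    """Expression to send as the value model override for this user."""
    return VARIANTS[assign_variant(user_id)]


variant = assign_variant("83NSLX")
print(variant, "->", value_model_for("83NSLX"))
```

Changing the experiment name reshuffles the assignment, which is useful when starting a new round of tuning.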