Value Modeling
Shaped provides a value modeling interface that enables fine-grained control and flexibility over your scoring objective. The value model allows you to tap directly into your scoring function and introduce custom logic, significantly enhancing the modeling capabilities that you can achieve with Shaped.
When to Value Model
The value model can be used to enhance all modeling use cases, and can be especially useful when optimizing complex business objectives.
Some common use cases that can greatly benefit from using the value model:
- Combining multiple scoring objectives.
- Balancing personalization with custom business logic.
- Switching between multiple models at score time.
How it Works
The value model allows you to combine scores from multiple scoring policies, as well as user, item, and interaction feature values, into a single mathematical expression. This defines a second-order model that represents your final scoring objective.
The value model uses the Jinja templating engine to evaluate expressions, so your value model can be written with familiar Python-style expression syntax.
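To make the idea of a second-order model concrete, here is a minimal Python sketch of what combining policy scores amounts to. It is an illustration only, not Shaped's implementation: the candidate items, the score values, and the helper name rank_candidates are all hypothetical.

```python
# Hypothetical per-item scores from two scoring policies (made-up values).
candidates = [
    {"item_id": "a", "lightgbm": 0.9, "bert4rec": 0.2},
    {"item_id": "b", "lightgbm": 0.4, "bert4rec": 0.8},
    {"item_id": "c", "lightgbm": 0.1, "bert4rec": 0.3},
]


def value_model(scores):
    """Second-order model: an equal-weight blend of the two policy scores."""
    return 0.5 * scores["lightgbm"] + 0.5 * scores["bert4rec"]


def rank_candidates(items, model):
    """Return item ids ordered by the combined score, highest first."""
    return [c["item_id"] for c in sorted(items, key=model, reverse=True)]


ranking = rank_candidates(candidates, value_model)
```

Changing the weights inside value_model changes the final ordering without retraining either underlying policy, which is the kind of control the value model is meant to provide.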
Supported Expressions
In addition to Python-style expressions, any Jinja expression can be used in the value model. Jinja filters are particularly useful for common value model use cases.
Alongside native Python and Jinja, the value model supports any math function from Python's math library. These do not need to be prefixed with math and can be accessed directly by their function name. The value model also supports the len, sum, max, min, and abs Python functions.
The value model supports numerical, binary, numerical list, and timestamp feature types when using user, item, and/or interaction features. User features and item features are accessed with the user and item prefixes. Interaction features can be accessed with the user.recent_interactions prefix, which contains a list of each interaction feature for the user's most recent interactions.
Timestamp features are always evaluated as POSIX timestamps in seconds. Shaped provides a now_seconds() function that can be used in the value model to return the current POSIX timestamp in seconds.
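For prototyping an expression locally before deploying it, the rules above can be approximated in plain Python. The sketch below is an assumption-laden stand-in, not Shaped's evaluator: it uses Python's eval rather than Jinja (so Jinja filter syntax will not work), and the helpers build_namespace and evaluate are hypothetical names. It only mirrors the documented name lookup: bare math functions, the five listed built-ins, now_seconds(), and the user and item prefixes.

```python
import math
import time
from types import SimpleNamespace


def build_namespace(scores, user=None, item=None):
    """Collect the names an expression may reference (simplified assumption)."""
    # Expose public math functions and constants by bare name (log, exp, sqrt, ...).
    names = {k: v for k, v in vars(math).items() if not k.startswith("_")}
    # The five extra built-ins the text lists.
    names.update({"len": len, "sum": sum, "max": max, "min": min, "abs": abs})
    # Stand-in for now_seconds(): current POSIX time in seconds.
    names["now_seconds"] = time.time
    # Feature access through the user / item prefixes.
    names["user"] = SimpleNamespace(**(user or {}))
    names["item"] = SimpleNamespace(**(item or {}))
    # Policy scores are referenced by policy name, e.g. lightgbm.
    names.update(scores)
    return names


def evaluate(expression, scores, user=None, item=None):
    """Evaluate a plain-Python value model expression locally."""
    # eval is used only because the expression is your own trusted input.
    namespace = build_namespace(scores, user, item)
    return eval(expression, {"__builtins__": {}}, namespace)


score = evaluate(
    "2 * lightgbm + 3 * log(item.quality_score)",
    scores={"lightgbm": 0.5},
    item={"quality_score": math.e},  # made-up feature value
)
```

Because log is the natural logarithm, a quality_score of e contributes exactly 3 here, so it is easy to check by hand that the feature term and the policy score are on comparable scales.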
Accessing the Value Model
In order to access the value model, your model must be defined as a Score Ensemble Policy. The example below defines a score ensemble policy with two model policies, lightgbm and bert4rec, and equally weights their results:
policy_configs:
  scoring_policy:
    policy_type: score-ensemble
    policies:
      - policy_type: lightgbm
        max_depth: 6
        objective: binary
      - policy_type: bert4rec
        hidden_size: 128
    value_model: 0.5 * lightgbm + 0.5 * bert4rec
The value model can also be specified at inference time. The following rank request provides an example, where the value model for the example above is overridden:
curl https://api.shaped.ai/v1/models/{model_name}/rank \
  -X POST \
  -H "x-api-key: <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "83NSLX",
    "config": {
      "value_model": "0.3 * lightgbm + 0.7 * bert4rec"
    }
  }'
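The same override can be issued from Python. The sketch below uses only the standard library and builds the request without sending it; the endpoint, headers, and body fields come from the curl example, while the helper name build_rank_request and the model name my_model are hypothetical.

```python
import json
import urllib.request


def build_rank_request(model_name, api_key, user_id, value_model):
    """Build (but do not send) a rank request that overrides the value model."""
    payload = {
        "user_id": user_id,
        # The expression travels as a JSON string, so it must be quoted.
        "config": {"value_model": value_model},
    }
    return urllib.request.Request(
        url=f"https://api.shaped.ai/v1/models/{model_name}/rank",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_rank_request(
    model_name="my_model",  # hypothetical model name
    api_key="<API_KEY>",
    user_id="83NSLX",
    value_model="0.3 * lightgbm + 0.7 * bert4rec",
)
# To send it: urllib.request.urlopen(request) returns the HTTP response.
```

Serializing the body with json.dumps avoids hand-written JSON mistakes such as an unquoted expression or an unbalanced brace.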
Common Examples
Below are some value model examples for common use cases. These examples assume the following model definition:
model:
  policy_configs:
    scoring_policy:
      policy_type: score-ensemble
      policies:
        - policy_type: lightgbm
          max_depth: 6
          objective: binary
        - policy_type: toplist
      value_model: # Fill this in with examples below!
fetch:
  events: |
    SELECT
      item_id,
      user_id,
      created_at,
      label
    FROM interactions
  users: |-
    SELECT
      user_id
    FROM users
  items: |
    SELECT
      item_id,
      quality_score -- Can represent a custom quality score defined by business logic.
    FROM items
Example of a value model that uses math expressions and an item feature:
value_model: 2 * lightgbm + 3 * log(item.quality_score)
Example of a value model that implements an exponential decay function based on the number of hours since item creation (this assumes created_at is also available as an item feature):
value_model: exp(-0.5 * (now_seconds() - item.created_at) / 3600) * lightgbm
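To get a feel for how quickly this decay acts, the multiplier can be reproduced in plain Python. This is a sketch for choosing the rate, not part of the value model itself; the function names and the fixed timestamp are hypothetical.

```python
import math

SECONDS_PER_HOUR = 3600


def freshness_multiplier(created_at, now, rate=0.5):
    """exp(-rate * age_in_hours), the factor applied to the lightgbm score."""
    age_hours = (now - created_at) / SECONDS_PER_HOUR
    return math.exp(-rate * age_hours)


def half_life_hours(rate=0.5):
    """Item age at which the multiplier drops to 0.5."""
    return math.log(2) / rate


now = 1_700_000_000  # arbitrary POSIX timestamp, for illustration only
for hours in (0, 1, 2, 6, 24):
    created_at = now - hours * SECONDS_PER_HOUR
    print(hours, round(freshness_multiplier(created_at, now), 4))
```

With a rate of 0.5 per hour the half-life is ln(2) / 0.5, roughly 1.39 hours, so an item's score is halved about every hour and a half. A smaller rate gives a gentler decay.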
Example of a value model that uses lightgbm for users with at least 5 positive recent interactions and toplist otherwise (using Jinja filters):
value_model: ((user.recent_interactions.label|select('gt', 0)|list|count) >= 5) *
  lightgbm + ((user.recent_interactions.label|select('gt', 0)|list|count) < 5) * toplist
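The switching trick relies on comparisons acting as 0/1 multipliers. The Python sketch below mirrors that logic under the assumption that booleans behave in the expression as they do in Python arithmetic; the function names and sample labels are hypothetical.

```python
def count_positive(labels):
    """Mirror of label|select('gt', 0)|list|count: number of labels above 0."""
    return sum(1 for label in labels if label > 0)


def switched_score(recent_labels, lightgbm, toplist, threshold=5):
    """Use lightgbm for users with enough positive interactions, else toplist."""
    n = count_positive(recent_labels)
    # In arithmetic, True counts as 1 and False as 0, so exactly one term survives.
    return (n >= threshold) * lightgbm + (n < threshold) * toplist


cold_user = switched_score([1, 0, 1], lightgbm=0.8, toplist=0.3)
warm_user = switched_score([1, 1, 1, 0, 1, 1], lightgbm=0.8, toplist=0.3)
```

Because the two conditions are complementary, every user gets exactly one of the two scores, which keeps the switch free of gaps or double counting at the threshold.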
Example of a value model that implements an exponential decay function based on the distance (given lat, long coordinates) between the user and item using the Haversine Formula:
value_model: exp(-0.015 * (2 * 6371 * asin(sqrt(0.5 - cos(radians(item.lat - user.lat))/2
+ cos(radians(user.lat)) * cos(radians(item.lat))
* (1 - cos(radians(item.long - user.long)))/2)))) * lightgbm
Note that this will use the user's location from the users table. For use cases that want to compare with the user's live location, override the value model at inference time and replace user.lat and user.long with live coordinate values.
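The distance term can be checked offline with a small Python sketch that follows the same form of the Haversine formula as the expression above. The function names are hypothetical, and the clamp is an extra safety step that is not present in the value model expression.

```python
import math

EARTH_RADIUS_KM = 6371


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, in the same form as the expression above."""
    a = (
        0.5
        - math.cos(math.radians(lat2 - lat1)) / 2
        + math.cos(math.radians(lat1))
        * math.cos(math.radians(lat2))
        * (1 - math.cos(math.radians(lon2 - lon1)))
        / 2
    )
    # Clamp guards against rounding pushing a slightly outside [0, 1].
    a = min(1.0, max(0.0, a))
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_multiplier(user_lat, user_lon, item_lat, item_lon, rate=0.015):
    """exp(-rate * distance_km), the factor applied to the lightgbm score."""
    distance = haversine_km(user_lat, user_lon, item_lat, item_lon)
    return math.exp(-rate * distance)


# One degree of longitude along the equator, as a sanity check.
one_degree_km = haversine_km(0.0, 0.0, 0.0, 1.0)
```

With a rate of 0.015 per kilometre, the multiplier halves roughly every ln(2) / 0.015, about 46 km, which is a useful reference point when deciding how local your recommendations should be.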
Final Considerations: Choosing Parameters
Choosing the parameters for your value model can be hard. Since the business objectives captured by the value model can be complex and the parameter space is uncountably large, the best results are typically achieved by leveraging a manual component alongside model tuning. We at Shaped recommend the following steps to ease the process and achieve the best results for your use case:
- Define your objectives: Start by listing the business objectives you want to optimize for. Decide which objectives should be defined by scoring policies and which should be defined by user/item features. These can always change and don't need to be perfect the first time.
- Start with a heuristic: Think of a general structure for your value model expression. The easiest one to start with is linear (e.g. a * model_a + b * model_b + c * model_c), but more complex use cases may be better modeled by multiplicative or logarithmic terms.
- A/B test: The best way to tune the parameters of a value model is to look at online results. Try out the value model on a subset of users, modify the coefficients and/or value model structure, track the results, then repeat until you're satisfied. The objective-tuned model policies will be doing the heavy lifting while you're at the control panel.
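One way to organize such an experiment is to generate a handful of candidate expressions and deterministically assign each user to one of them, passing the chosen expression as the value model override at inference time. The sketch below is a hypothetical helper on the client side, not a Shaped feature; the function names, the weight grid, and the hashing scheme are all assumptions.

```python
import hashlib


def linear_candidates(names, steps=4):
    """Linear two-policy value models whose weights sum to 1."""
    first, second = names
    expressions = []
    for i in range(steps + 1):
        w_first = i / steps
        w_second = (steps - i) / steps
        expressions.append(f"{w_first:g} * {first} + {w_second:g} * {second}")
    return expressions


def assign_variant(user_id, n_variants):
    """Deterministically map a user to a variant index via a stable hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_variants


variants = linear_candidates(("lightgbm", "toplist"), steps=4)
chosen = variants[assign_variant("83NSLX", len(variants))]
```

Hashing the user id keeps each user in the same variant across requests, so the tracked results for a variant are not mixed with other expressions.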