Building your Model
With your data connected to Shaped, it’s time to configure your model. Follow this guide to populate the model configuration template:
model:
  name: for_you_feed_v1
  description: "Recommend items to show in the for you feed"
  pagination_store_ttl: 0 # Seconds.
connectors:
fetch:
  events: |
  users: |
  items: |
  global_filters: |
  personal_filters: |
Write the queries in the fetch section of the model config using DuckDB SQL syntax. For the full syntax, check out the DuckDB SQL Docs.
Model Config
| Field | Description |
|---|---|
| name | Assigns a name to your model. It's common to describe the use case and append a version to help with development. |
| description | Describes your model. |
| pagination_store_ttl | Shaped handles the pagination of results for you: it adds every served item ID to a 'pagination store' and filters those items out of the candidate item set of subsequent requests. pagination_store_ttl sets how many seconds items are kept in the pagination store. Speak to your Solution Engineer to find the optimal value for your use case. |
Connectors
Add the datasets created in the previous step to the connectors section of your model config.
connectors:
  - id: likes
    name: likes
    type: Dataset
  - id: impressions
    name: impressions
    type: Dataset
  - id: reported_posts
    name: reported_posts
    type: Dataset
  - id: users
    name: users
    type: Dataset
  - id: user_favourite_categories
    name: user_favourite_categories
    type: Dataset
  - id: items
    name: items
    type: Dataset
  - id: items_categories
    name: items_categories
    type: Dataset
Events
Events must include user_id, item_id, created_at, event_value, and label.
- event_value: Describes the event and is useful for analysis.
- label: Numerical values where anything greater than 0 is positive, and 0 or less is negative. This helps in weighting the events.
For simplicity, you might start with a binary label system.
events: |
  SELECT
    user_id,
    item_id,
    created_at,
    1 as label,
    'like' as event_value
  FROM likes
  UNION ALL
  SELECT
    user_id,
    item_id,
    created_at,
    0 as label,
    'impression' as event_value
  FROM impressions
  UNION ALL
  SELECT
    user_id,
    item_id,
    created_at,
    0 as label,
    'reported_post' as event_value
  FROM reported_posts
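Once the binary version works, the same query can carry graded labels so that different events are weighted differently. The sketch below is one possible scheme rather than a recommended one: the values 1, 0, and -1 are illustrative assumptions, chosen only to respect the rule that labels greater than 0 are positive and labels of 0 or less are negative.

```yaml
events: |
  SELECT
    user_id,
    item_id,
    created_at,
    -- Positive signal (illustrative weight).
    1 as label,
    'like' as event_value
  FROM likes
  UNION ALL
  SELECT
    user_id,
    item_id,
    created_at,
    -- Seen but not acted on: treated as negative.
    0 as label,
    'impression' as event_value
  FROM impressions
  UNION ALL
  SELECT
    user_id,
    item_id,
    created_at,
    -- Assumed stronger negative than a plain impression.
    -1 as label,
    'reported_post' as event_value
  FROM reported_posts
```

Because event_value still names the event type, the weights can be changed later without losing the ability to analyze events by type.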
Users and Items
In the users and items sections, include all the users and items on your platform, along with attributes that will help the model draw connections between them.
users: |
  SELECT
    user_id,
    created_at,
    country,
    occupation,
    gender
  FROM users
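items: |
  SELECT
    i.item_id,
    i.created_at,
    i.price,
    i.deleted,
    i.public,
    ARRAY_AGG(DISTINCT ic.category) AS categories
  FROM items i
  LEFT JOIN items_categories ic on ic.item_id = i.item_id
  WHERE
    i.deleted = false
    AND i.public = true
  -- ARRAY_AGG is an aggregate, so every other selected column must be grouped.
  GROUP BY
    i.item_id,
    i.created_at,
    i.price,
    i.deleted,
    i.public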
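The user_favourite_categories dataset is listed in the connectors but not used by the queries in this guide. As a sketch of how it could enrich the users section, the variant below aggregates each user's favourite categories into a list, mirroring the ARRAY_AGG pattern in the items query. The column names user_id and category on user_favourite_categories are assumptions made by analogy with items_categories, and favourite_categories is a hypothetical output name.

```yaml
users: |
  SELECT
    u.user_id,
    u.created_at,
    u.country,
    u.occupation,
    u.gender,
    -- Assumed schema: user_favourite_categories(user_id, category).
    ARRAY_AGG(DISTINCT ufc.category) AS favourite_categories
  FROM users u
  LEFT JOIN user_favourite_categories ufc on ufc.user_id = u.user_id
  -- Group by every non-aggregated column so the aggregate is valid.
  GROUP BY
    u.user_id,
    u.created_at,
    u.country,
    u.occupation,
    u.gender
```

The LEFT JOIN keeps users who have no favourite categories in the result.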
Global Filters (Optional)
The global filters define the items that should be filtered out for all users. For example, you may want your rankings to guarantee that all of the following items are filtered out:
- Products that are out of stock
- Certain categories
- Older items
The global filter requires only an item_id column. Say you wanted to exclude a certain category for all users:
global_filters: |
  SELECT
    item_id
  FROM items_categories
  WHERE
    category = 'category_to_exclude'
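Several global rules can be combined in one query, since the filter only needs to return item_id values. The sketch below adds an "older items" rule to the category rule. It assumes items.created_at is a TIMESTAMP, and the 90-day cutoff is an arbitrary illustrative value, not a recommendation.

```yaml
global_filters: |
  SELECT
    item_id
  FROM items_categories
  WHERE
    category = 'category_to_exclude'
  UNION
  -- Assumes created_at is a TIMESTAMP; 90 days is an illustrative cutoff.
  SELECT
    item_id
  FROM items
  WHERE
    created_at < current_date - INTERVAL 90 DAY
```

UNION rather than UNION ALL is used so that an item matching both rules appears only once.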
Personal Filters (Optional)
The personal filters define items that should be filtered out for a specific user. For example, you may want your model to:
- Filter out all videos a user has watched before
- Remove posts from blocked and muted users
- Remove items which are not available in the user's country
The personal filter must output a view containing user_id and item_id pairs to be filtered out. Specifically, if there is a user_id, item_id row, then that user_id will never be shown that item_id.
If you wanted to filter out any items a user has seen before:
personal_filters: |
  SELECT
    user_id,
    item_id
  FROM impressions
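Personal rules can be combined in the same way, as long as the result is a set of user_id and item_id pairs. As a minimal sketch using only the datasets already connected in this guide, the variant below also hides posts from the user who reported them. Whether reported posts should be hidden is a product decision assumed here for illustration.

```yaml
personal_filters: |
  SELECT
    user_id,
    item_id
  FROM impressions
  UNION
  -- Also hide any post from the user who reported it.
  SELECT
    user_id,
    item_id
  FROM reported_posts
```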
Creating Your Model With The Shaped CLI
Putting it all together gives the following complete model config, which you can then pass to the Shaped CLI to create your model.
model:
  name: for_you_feed_v1
  description: "Recommend items to show in the for you feed"
  pagination_store_ttl: 0 # Seconds.
connectors:
  - id: likes
    name: likes
    type: Dataset
  - id: impressions
    name: impressions
    type: Dataset
  - id: reported_posts
    name: reported_posts
    type: Dataset
  - id: users
    name: users
    type: Dataset
  - id: user_favourite_categories
    name: user_favourite_categories
    type: Dataset
  - id: items
    name: items
    type: Dataset
  - id: items_categories
    name: items_categories
    type: Dataset
fetch:
  events: |
    SELECT
      user_id,
      item_id,
      created_at,
      1 as label,
      'like' as event_value
    FROM likes
    UNION ALL
    SELECT
      user_id,
      item_id,
      created_at,
      0 as label,
      'impression' as event_value
    FROM impressions
    UNION ALL
    SELECT
      user_id,
      item_id,
      created_at,
      0 as label,
      'reported_post' as event_value
    FROM reported_posts
  users: |
    SELECT
      user_id,
      created_at,
      country,
      occupation,
      gender
    FROM users
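  items: |
    SELECT
      i.item_id,
      i.created_at,
      i.price,
      i.deleted,
      i.public,
      ARRAY_AGG(DISTINCT ic.category) AS categories
    FROM items i
    LEFT JOIN items_categories ic on ic.item_id = i.item_id
    WHERE
      i.deleted = false
      AND i.public = true
    -- ARRAY_AGG is an aggregate, so every other selected column must be grouped.
    GROUP BY
      i.item_id,
      i.created_at,
      i.price,
      i.deleted,
      i.public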
  global_filters: |
    SELECT
      item_id
    FROM items_categories
    WHERE
      category = 'category_to_exclude'
  personal_filters: |
    SELECT
      user_id,
      item_id
    FROM impressions
Use the Shaped CLI to create your model:
shaped create-model --file model_config.yaml
You'll be able to see your model build on the Shaped Dashboard.