Building your Model

With your data connected to Shaped, it’s time to configure your model. Follow this guide to populate the model configuration template:

model_config_template.yaml
model:
  name: for_you_feed_v1
  description: "Recommend items to show in the for you feed"
connectors:
fetch:
  events: |
  users: |
  items: |
  global_filters: |
  personal_filters: |

Shaped uses DuckDB SQL syntax in the fetch section of the model config. See the DuckDB SQL Docs for details.

Connectors

Add the datasets created in the previous step to the connectors section of your model config.

connectors:
  - id: likes
    name: likes
    type: Dataset
  - id: impressions
    name: impressions
    type: Dataset
  - id: reported_posts
    name: reported_posts
    type: Dataset
  - id: users
    name: users
    type: Dataset
  - id: user_favourite_categories
    name: user_favourite_categories
    type: Dataset
  - id: items
    name: items
    type: Dataset
  - id: items_categories
    name: items_categories
    type: Dataset

Events

Events must include user_id, item_id, created_at, event_value, and label.

  • event_value: A name for the type of event (for example, 'like'). It is useful for analysis.
  • label: A numeric value. Anything greater than 0 is positive, and 0 or less is negative. This helps in weighting the events.

For simplicity, you might start with a binary label system.

events: |
  SELECT
    user_id,
    item_id,
    created_at,
    1 AS label,
    'like' AS event_value
  FROM likes

  UNION ALL

  SELECT
    user_id,
    item_id,
    created_at,
    0 AS label,
    'impression' AS event_value
  FROM impressions

  UNION ALL

  SELECT
    user_id,
    item_id,
    created_at,
    0 AS label,
    'reported_post' AS event_value
  FROM reported_posts
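
A graded scheme is one possible next step once the binary setup works, since labels above 0 count as positive and labels of 0 or less count as negative. The sketch below reuses the three datasets from the example above. The specific weights (1, 0, -1) are illustrative assumptions, not values recommended by Shaped.

```yaml
events: |
  SELECT user_id, item_id, created_at, 1 AS label, 'like' AS event_value
  FROM likes

  UNION ALL

  SELECT user_id, item_id, created_at, 0 AS label, 'impression' AS event_value
  FROM impressions

  UNION ALL

  -- Assumed weight: a report is treated as a stronger negative than an impression
  SELECT user_id, item_id, created_at, -1 AS label, 'reported_post' AS event_value
  FROM reported_posts
```

Whether a stronger negative for reports helps depends on your data, so it is worth comparing against the binary version before adopting it.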

Users and Items

In the users and items sections, include all the users and items on your platform, along with attributes that will help the model draw connections between them.

users: |
  SELECT
    user_id,
    created_at,
    country,
    occupation,
    gender
  FROM users
items: |
  SELECT
    i.item_id,
    i.created_at,
    i.price,
    i.deleted,
    i.public,
    ARRAY_AGG(DISTINCT ic.category) AS categories
  FROM items i
  LEFT JOIN items_categories ic ON ic.item_id = i.item_id
  WHERE
    i.deleted = false
    AND i.public = true
  -- Group by every non-aggregated column so ARRAY_AGG collects categories per item
  GROUP BY
    i.item_id,
    i.created_at,
    i.price,
    i.deleted,
    i.public
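
The connectors list declares a user_favourite_categories dataset that the users query above does not use. One way to bring it in as a user attribute is sketched below. It assumes that dataset has user_id and category columns, which the guide does not confirm, and the favourite_categories output name is hypothetical.

```yaml
users: |
  SELECT
    u.user_id,
    u.created_at,
    u.country,
    u.occupation,
    u.gender,
    -- Assumed columns: user_favourite_categories(user_id, category)
    -- FILTER skips the NULL produced by the LEFT JOIN for users with no favourites
    ARRAY_AGG(DISTINCT ufc.category)
      FILTER (WHERE ufc.category IS NOT NULL) AS favourite_categories
  FROM users u
  LEFT JOIN user_favourite_categories ufc ON ufc.user_id = u.user_id
  GROUP BY
    u.user_id,
    u.created_at,
    u.country,
    u.occupation,
    u.gender
```

The LEFT JOIN keeps users who have no favourite categories. Their favourite_categories value is NULL rather than an empty list.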

Global Filters (Optional)

The global filters define the items that should be filtered out for all users. For example, you may want your rankings to guarantee that all of the following items are filtered out:

  • Products that are out of stock
  • Certain categories
  • Older items

The global filter requires only an item_id column. Say you wanted to exclude a certain category for all users:

global_filters: |
  SELECT
    item_id
  FROM items_categories
  WHERE
    category = 'category_to_exclude'
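
Several global rules can be combined in a single query, since the filter only needs an item_id column. The sketch below adds an "older items" rule to the category rule. It assumes items.created_at is a timestamp or date, and the 90-day cutoff is an arbitrary illustrative value.

```yaml
global_filters: |
  SELECT item_id
  FROM items_categories
  WHERE category = 'category_to_exclude'

  UNION

  -- Assumed cutoff: items created more than 90 days ago
  SELECT item_id
  FROM items
  WHERE created_at < current_date - INTERVAL 90 DAY
```

UNION (rather than UNION ALL) removes duplicates, so an item matched by both rules appears only once.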

Personal Filters (Optional)

The personal filters define items that should be filtered out for a specific user. For example, you may want your model to:

  • Filter out all videos a user has watched before
  • Remove posts from blocked and muted users
  • Remove items which are not available in the user's country

The personal filter must output a view containing user_id and item_id pairs to be filtered out. Specifically, if there is a user_id, item_id row, then that user_id will never be shown that item_id.

To filter out any items a user has seen before:

personal_filters: |
  SELECT
    user_id,
    item_id
  FROM impressions
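
Personal rules can be stacked the same way, because every rule produces user_id, item_id pairs. As a sketch, the query below also hides posts a user has reported. It relies only on the reported_posts dataset already used in the events query. Treating a report as a reason to hide the item is an assumption about the desired behaviour.

```yaml
personal_filters: |
  -- Items the user has already seen
  SELECT user_id, item_id
  FROM impressions

  UNION

  -- Items the user has reported (assumed to be unwanted for that user)
  SELECT user_id, item_id
  FROM reported_posts
```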

Creating Your Model With The Shaped CLI

Putting it all together gives the following complete model config:

model_config.yaml
model:
  name: for_you_feed_v1
  description: "Recommend items to show in the for you feed"
connectors:
  - id: likes
    name: likes
    type: Dataset
  - id: impressions
    name: impressions
    type: Dataset
  - id: reported_posts
    name: reported_posts
    type: Dataset
  - id: users
    name: users
    type: Dataset
  - id: user_favourite_categories
    name: user_favourite_categories
    type: Dataset
  - id: items
    name: items
    type: Dataset
  - id: items_categories
    name: items_categories
    type: Dataset
fetch:
  events: |
    SELECT
      user_id,
      item_id,
      created_at,
      1 AS label,
      'like' AS event_value
    FROM likes

    UNION ALL

    SELECT
      user_id,
      item_id,
      created_at,
      0 AS label,
      'impression' AS event_value
    FROM impressions

    UNION ALL

    SELECT
      user_id,
      item_id,
      created_at,
      0 AS label,
      'reported_post' AS event_value
    FROM reported_posts
  users: |
    SELECT
      user_id,
      created_at,
      country,
      occupation,
      gender
    FROM users
  items: |
    SELECT
      i.item_id,
      i.created_at,
      i.price,
      i.deleted,
      i.public,
      ARRAY_AGG(DISTINCT ic.category) AS categories
    FROM items i
    LEFT JOIN items_categories ic ON ic.item_id = i.item_id
    WHERE
      i.deleted = false
      AND i.public = true
    -- Group by every non-aggregated column so ARRAY_AGG collects categories per item
    GROUP BY
      i.item_id,
      i.created_at,
      i.price,
      i.deleted,
      i.public
  global_filters: |
    SELECT
      item_id
    FROM items_categories
    WHERE
      category = 'category_to_exclude'
  personal_filters: |
    SELECT
      user_id,
      item_id
    FROM impressions

Use the Shaped CLI to create your model:

shaped create-model --file model_config.yaml

You can then watch your model build on the Shaped Dashboard.