Metadata Filtering

Shaped lets you filter the items returned by the Rank API based on their metadata (i.e. the columns selected in the items fetch query). This works for both personalized and non-personalized ranking queries. There are two primary use cases:

  1. Personalized search, e.g. filtering items by user-defined keyword matching or a metadata-specific query.
  2. Category pages, e.g. filtering items for a specific recommendation UI element (such as a carousel or feed). This means you can create one Shaped model and use it for a variety of carousels, each built on prior domain knowledge about what resonates with your customers.

In this guide, we'll show you how to use the metadata filtering feature to power these use cases.

Enabling Metadata Filtering

Metadata filtering is disabled by default. To enable it, set metadata_retrieval_enabled to true in the model config of the Create Model request. Here's an example model config, followed by the CLI command that submits it:

model:
    name: personalized_video_search
    metadata_retrieval_enabled: true
connectors:
    - type: BigQuery
      id: bigquery_connector
      location: us-west1
      project_id: rocket-ship-234123
      dataset: video_db
fetch:
    events: |
        SELECT user_id, item_id, created_at, (CASE WHEN event = 'click' THEN 1 ELSE 0 END) AS label
        FROM bigquery_connector.click_events
    items: |
        SELECT item_id, YEAR(created_at) AS publish_year, description, category, creator_id, region
        FROM bigquery_connector.videos

shaped create-model --file personalized_video_search

Metadata Filtering in the Rank API

Our Rank API endpoints (rank and similar) accept an optional retrieval argument that defines the metadata filter predicate for your ranking request. The predicate language is a subset of MongoDB's query and projection operators. It includes the $or and $and operators for combining multiple predicates, and a predicate can apply to one or many metadata columns. Here are some examples using our CLI; they map directly to our REST endpoint when you integrate into your application:

Filtering by category:

shaped rank --model-name personalized_video_search --user_id 3 --limit 5 \
--retrieve '{"category": {"$in": ["sports", "news"]}}'

Filtering by year:

shaped rank --model-name personalized_video_search --user_id 3 --limit 5 \
--retrieve '{"publish_year": {"$gte": 2023}}'

Note that if user_id is provided, the filtered results are personalized to that user. If user_id is omitted, the filtered results default to trending, non-personalized items.
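
When filters are assembled in application code, it can help to build the predicate as a plain dictionary and serialize it once, rather than hand-writing JSON inside shell quotes. The following is a minimal Python sketch, not part of the Shaped tooling: the helper name build_rank_command is hypothetical, it uses only the CLI flags shown in the examples above, and the $and form assumes the MongoDB convention of a list of sub-predicates.

```python
import json
import shlex


def build_rank_command(model_name, predicate, user_id=None, limit=5):
    """Build the argv list for a `shaped rank` call with a metadata filter.

    Hypothetical helper: it only assembles the flags shown in the CLI
    examples above and does not call the Shaped API itself.
    """
    argv = ["shaped", "rank", "--model-name", model_name]
    if user_id is not None:
        # Omitting user_id requests non-personalized results.
        argv += ["--user_id", str(user_id)]
    # json.dumps guarantees the predicate is valid JSON.
    argv += ["--limit", str(limit), "--retrieve", json.dumps(predicate)]
    return argv


# Combine the two example filters with $and (assumed MongoDB-style syntax).
predicate = {
    "$and": [
        {"category": {"$in": ["sports", "news"]}},
        {"publish_year": {"$gte": 2023}},
    ]
}

argv = build_rank_command("personalized_video_search", predicate, user_id=3)
# shlex.join quotes the JSON so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
```

The returned list can also be passed directly to subprocess.run, which avoids shell quoting entirely.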

Metadata query language

Besides $and and $or, the metadata filter predicate language provides the following selectors:

$eq     Matches values that are equal to a specified value.
$gt     Matches values that are greater than a specified value.
$gte    Matches values that are greater than or equal to a specified value.
$in     Matches any of the values specified in an array.
$lt     Matches values that are less than a specified value.
$lte    Matches values that are less than or equal to a specified value.
$ne     Matches all values that are not equal to a specified value.
$nin    Matches none of the values specified in an array.
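
To make the semantics of these selectors concrete, here is a small Python sketch that evaluates a predicate against in-memory item metadata. It is an illustration only and not Shaped's implementation: the matches function and the sample videos are hypothetical, every item is assumed to contain every filtered column, and $and / $or are assumed to take a list of sub-predicates as in MongoDB.

```python
import operator

# Map each selector to a two-argument test: (column value, operand).
COMPARATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}


def matches(item, predicate):
    """Return True if the item's metadata satisfies the predicate."""
    for key, condition in predicate.items():
        if key == "$and":
            if not all(matches(item, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(item, sub) for sub in condition):
                return False
        else:
            # key is a column name; condition maps selectors to operands.
            value = item[key]  # assumes the column is present on every item
            for selector, operand in condition.items():
                if not COMPARATORS[selector](value, operand):
                    return False
    return True


# Hypothetical sample data using columns from the items query above.
videos = [
    {"item_id": 1, "category": "sports", "publish_year": 2023},
    {"item_id": 2, "category": "news", "publish_year": 2021},
    {"item_id": 3, "category": "music", "publish_year": 2024},
]

recent_sports_or_news = {
    "$and": [
        {"category": {"$in": ["sports", "news"]}},
        {"publish_year": {"$gte": 2023}},
    ]
}

kept = [v["item_id"] for v in videos if matches(v, recent_sports_or_news)]
print(kept)
```

Multiple selectors on one column, such as {"publish_year": {"$gte": 2021, "$lt": 2024}}, are treated here as an implicit conjunction, which mirrors MongoDB's behavior.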


Metadata filtering is a powerful feature that can be used for many discovery and search use-cases. We hope you find it useful!