
Filtering Results

Shaped lets you filter the items returned by the Rank API based on their metadata (i.e. the columns found in the items fetch queries). You can do this for either a personalized ranking query or a non-personalized one. There are two primary use cases for this:

  1. Personalized search: Filtering items by user-defined keyword matching or a metadata-specific query.
  2. Category pages: Filtering items for a specific recommendation UI element (e.g. a carousel or feed). This means you can create one Shaped model and use it for a variety of carousels, each filtered using domain knowledge about what will resonate with your customers.

In this guide we'll show you how to use the filter predicate feature to power some of these use-cases.

Supported Operations

The filter predicate language is a standard SQL expression, e.g. "category = 'sports'" or "publish_year >= 2023". Here are the currently supported operators:

* >, >=, <, <=, =
* AND, OR, NOT
* IS NULL, IS NOT NULL
* IS TRUE, IS NOT TRUE, IS FALSE, IS NOT FALSE
* IN
* LIKE, NOT LIKE
* regexp_match(column, pattern)
* CAST
* array_has(sequential_column, value)
* array_has_any(sequential_column, values)
* array_has_all(sequential_column, values)

For example, the following predicate string is acceptable:

((label IN [10, 20]) AND (note.email IS NOT NULL))
OR NOT note.created

Filter Predicate Examples

Filtering by Category

shaped rank --model-name personalized_video_search --user_id 3 --limit 5 \
--filter_predicate 'category IN ["sports", "news"]'

Filtering a Sequence Category Column

shaped rank --model-name personalized_video_search --user_id 3 --limit 5 \
--filter_predicate 'array_has_any(category_sequence, ["sports", "news"])'
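The three array functions listed under Supported Operations differ in how many of the given values must appear in the sequence column. The sketch below is illustrative only: it reuses the category_sequence column from the example above, follows the same quoting style, and prints the predicate strings rather than calling the CLI. The variable names are hypothetical.

```shell
# array_has: the sequence contains one specific value.
HAS_ONE='array_has(category_sequence, "sports")'

# array_has_any: the sequence contains at least one of the listed values.
HAS_ANY='array_has_any(category_sequence, ["sports", "news"])'

# array_has_all: the sequence contains every one of the listed values.
HAS_ALL='array_has_all(category_sequence, ["sports", "news"])'

# Print each predicate; any of them can be passed to --filter_predicate.
for predicate in "$HAS_ONE" "$HAS_ANY" "$HAS_ALL"; do
  printf '%s\n' "$predicate"
done
```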

Filtering by Year

shaped rank --model-name personalized_video_search --user_id 3 --limit 5 \
--filter_predicate 'publish_year >= 2023'
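Since AND is among the supported operators, the category and year filters can be combined into a single predicate. The following is a minimal sketch that assumes the same model and columns as the examples in this guide. It builds the predicate in shell variables to keep the quoting readable and only prints the result; the variable names are hypothetical.

```shell
# Build each condition separately, using the columns from the examples above.
CATEGORY_FILTER='category IN ["sports", "news"]'
YEAR_FILTER='publish_year >= 2023'

# Combine the conditions with AND, parenthesizing each side explicitly.
PREDICATE="($CATEGORY_FILTER) AND ($YEAR_FILTER)"
printf '%s\n' "$PREDICATE"

# With the shaped CLI installed, the predicate could then be passed like this:
# shaped rank --model-name personalized_video_search --user_id 3 --limit 5 \
#   --filter_predicate "$PREDICATE"
```

Parenthesizing each condition keeps the grouping unambiguous when more conditions are added later with OR or NOT.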

Note that if the user_id is provided, the filtered results are personalized to that user. If the user_id is not provided, the query returns trending, non-personalized results by default.
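To make the difference concrete, the sketch below prints the same filtered query with and without a user ID. build_rank_cmd is a hypothetical helper written for this illustration, not part of the shaped CLI; it only assembles and prints the command so the two variants can be compared side by side.

```shell
# Hypothetical helper: prints a shaped rank command for the given predicate,
# adding --user_id only when a user ID is supplied as the second argument.
build_rank_cmd() {
  predicate=$1
  user_id=${2:-}
  cmd="shaped rank --model-name personalized_video_search"
  if [ -n "$user_id" ]; then
    cmd="$cmd --user_id $user_id"
  fi
  printf "%s --limit 5 --filter_predicate '%s'\n" "$cmd" "$predicate"
}

# Personalized: results are ranked for user 3.
build_rank_cmd 'publish_year >= 2023' 3

# Non-personalized: no user ID, so trending results are returned by default.
build_rank_cmd 'publish_year >= 2023'
```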

info

Internally, we use the Lance data format to support the filter predicate. Take a look at the Lance docs for more information.