Avoiding filter bubbles
Without diversity controls, feeds and search results can become repetitive. Users see the same types of content repeatedly, which limits discovery and engagement. Shaped provides three rerankers to introduce variety: diversity, exploration, and boosting.
Use diversity when you want to mix less popular (but still relevant) items into the feed. For example, Spotify uses diversity to ensure less popular songs in a genre get discovered by users.
Use exploration when you want to mix more dissimilar results into the feed. For example, YouTube uses exploration to mix popular videos from genres you may not have seen before into your feed.
Use boosting when you want to put promoted items at the top of the list. For example, marketplaces can promote items with a higher margin to boost overall profitability.
Diversity
How candidates are selected: Diversity reorders items from your existing result set. It uses Maximal Marginal Relevance (MMR) to balance relevance with dissimilarity. Items are selected iteratively so that each new item is both relevant and different from those already in the list.
When to use it: Use diversity when you want to reduce redundancy within the same candidate pool. For example, a product feed that would otherwise show five similar blue shirts in a row can be reordered to intersperse different categories or styles.
SELECT score(expression='m1', input_user_id='$user_id') AS s,
       diversity(score=s, strength=0.3) AS r, *
FROM retrieve(similarity(embedding_ref='als', limit=50,
                         encoder='precomputed_user', input_user_id='$user_id'))
ORDER BY r LIMIT 20
The strength parameter (0.0 to 1.0) controls the trade-off between relevance and diversity. At 0, results follow relevance only. At 1, diversity is maximized, which may reduce relevance.
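The iterative selection that MMR performs can be sketched in Python. This is an illustration of the general technique, not Shaped's implementation: the function name mmr_rerank, the use of cosine similarity over item embeddings, and the way strength weights relevance against similarity are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_rerank(items, strength=0.3, k=20):
    """Greedy MMR over (item_id, relevance, embedding) tuples.

    Assumption: strength plays the role of the MMR trade-off weight, so
    strength=0 means relevance only and strength=1 means dissimilarity only.
    """
    remaining = list(items)
    selected = []
    while remaining and len(selected) < k:

        def mmr_score(candidate):
            _, relevance, embedding = candidate
            # Penalize a candidate by its similarity to the closest
            # item that has already been picked.
            max_sim = max(
                (cosine(embedding, chosen[2]) for chosen in selected),
                default=0.0,
            )
            return (1 - strength) * relevance - strength * max_sim

        best = max(remaining, key=mmr_score)
        remaining.remove(best)
        selected.append(best)
    return [item_id for item_id, _, _ in selected]


# Hypothetical toy data: two near-identical shirts and one different item.
items = [
    ("blue_shirt_1", 0.95, [1.0, 0.0]),
    ("blue_shirt_2", 0.94, [1.0, 0.0]),
    ("red_hat", 0.80, [0.0, 1.0]),
]
by_relevance = mmr_rerank(items, strength=0.0)
diversified = mmr_rerank(items, strength=0.3)
```

In this toy data, strength=0.0 keeps the pure relevance order, while strength=0.3 moves red_hat ahead of the second blue shirt because the second shirt is penalized for duplicating the first.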
Exploration
How candidates are selected: Exploration injects items from a separate retriever into the main results. All items from the exploration retriever are treated with equal weight. The strength parameter controls how often exploration items are chosen over top-scored items from the main retriever.
When to use it: Use exploration when you want to surface content that might not rank highly by relevance alone. Examples include new releases, items from underrepresented categories, or content the user has not interacted with before.
SELECT score(expression='click_through_rate', input_user_id='$user_id',
             input_interactions_item_ids='$interaction_item_ids') AS s,
       exploration(score=s, retriever='explore', strength=0.3) AS r, *
FROM retrieve(
    similarity(embedding_ref='als', input_user_id='$user_id', limit=100, name='main'),
    filter(where='category = "new_releases"', name='explore')
)
ORDER BY r LIMIT 20
You can also specify the exploration retriever inline:
SELECT score(expression='click_through_rate', input_user_id='$user_id',
             input_interactions_item_ids='$interaction_item_ids') AS s,
       exploration(score=s, retriever=filter(where='category = "new_releases"'),
                   strength=0.3) AS r, *
FROM retrieve(similarity(embedding_ref='als', input_user_id='$user_id', limit=100))
ORDER BY r LIMIT 20
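One way to picture how strength governs exploration is as a per-slot coin flip. The Python sketch below is a simplified model under stated assumptions, not Shaped's implementation: the function name explore_rerank, the slot-by-slot sampling, and the deduplication step are all hypothetical. It treats strength as the probability that a slot is filled from the exploration retriever, and draws exploration items uniformly to reflect their equal weight.

```python
import random


def explore_rerank(main_ranked, explore_pool, strength=0.3, k=20, rng=None):
    """Fill k slots from a ranked main list and an unranked exploration pool.

    Assumption: each slot comes from the exploration pool with probability
    `strength`, otherwise from the next best main item.
    """
    rng = rng or random.Random()
    main = list(main_ranked)    # best first
    pool = list(explore_pool)   # every item has equal weight
    result, seen = [], set()
    while len(result) < k and (main or pool):
        use_explore = bool(pool) and (not main or rng.random() < strength)
        if use_explore:
            # Uniform draw without replacement from the exploration pool.
            item = pool.pop(rng.randrange(len(pool)))
        else:
            item = main.pop(0)
        if item not in seen:    # an item may appear in both retrievers
            seen.add(item)
            result.append(item)
    return result


# Hypothetical item IDs for illustration.
feed = explore_rerank(["a", "b", "c"], ["x", "y"], strength=0.3, k=3,
                      rng=random.Random(0))
```

Under this model, strength=0.0 returns the main ranking unchanged and strength=1.0 exhausts the exploration pool before any main item is shown.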
Boosting
How candidates are selected: Boosting injects items from a retriever where each item has a boost value in its metadata. Unlike exploration, items are weighted by their boost value: higher boost values increase the chance of being selected.
When to use it: Use boosting when you want to promote specific items based on your own logic. Examples include sponsored products, editorial picks, or items you want to surface for business reasons.
SELECT score(expression='click_through_rate', input_user_id='$user_id',
             input_interactions_item_ids='$interaction_item_ids') AS s,
       boosted(score=s, retriever='boosted_items', strength=0.3) AS r, *
FROM retrieve(
    similarity(embedding_ref='als', input_user_id='$user_id', limit=100, name='main'),
    filter(where='boost > 0', name='boosted_items')
)
ORDER BY r LIMIT 20
Items must have a boost column in their metadata. Set boost = 1 for items you want to promote, or use higher values to prioritize some promoted items over others.
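The difference from exploration, weighting by boost value instead of drawing uniformly, can be sketched in Python. This is a simplified model and not Shaped's implementation: the function name boosted_rerank and the per-slot sampling scheme are assumptions, and boost values are assumed to be positive numbers.

```python
import random


def boosted_rerank(main_ranked, boosted, strength=0.3, k=20, rng=None):
    """Fill k slots from a ranked main list and a dict of item_id -> boost.

    Assumption: each slot comes from the boosted set with probability
    `strength`, and within that set items are drawn in proportion to boost.
    """
    rng = rng or random.Random()
    main = list(main_ranked)    # best first
    pool = dict(boosted)        # boost values must be > 0
    result = []
    while len(result) < k and (main or pool):
        if pool and (not main or rng.random() < strength):
            ids = list(pool)
            # Weighted draw: a boost of 5 is five times as likely as a boost of 1.
            item = rng.choices(ids, weights=[pool[i] for i in ids], k=1)[0]
            del pool[item]
        else:
            item = main.pop(0)
        if item not in result:  # skip items returned by both retrievers
            result.append(item)
    return result


# Hypothetical promoted items: p2 carries five times the boost of p1.
promoted = {"p1": 1.0, "p2": 5.0}
feed = boosted_rerank(["a", "b"], promoted, strength=1.0, k=4,
                      rng=random.Random(0))
```

With these weights, p2 should be drawn ahead of p1 in roughly five out of six runs, which mirrors the advice to use higher boost values for the promoted items you care about most.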
Combining rerankers
You can apply multiple rerankers in a single query. List them in the SELECT clause with aliases, then order by the final one:
SELECT score(expression='m1', input_user_id='$user_id') AS s,
       diversity(score=s, strength=0.3) AS d,
       exploration(score=s, retriever='promoted', strength=0.2) AS e, *
FROM retrieve(
    similarity(embedding_ref='als', input_user_id='$user_id', limit=100, name='main'),
    filter(where='is_promoted = true', name='promoted')
)
ORDER BY e LIMIT 20