Item (Content Similarity)
Description
The Item-Content Similarity policy implements content-based filtering: an item's relevance to a user is determined by how closely its content matches the content of items the user has already liked. It requires an embedding for each item derived from its attributes (e.g., text embeddings of descriptions produced by LLMs, categorical embeddings). A user's preference profile is computed dynamically by pooling (e.g., averaging, max-pooling) the embeddings of the items they have interacted with positively. Each candidate item is then scored by the similarity (e.g., cosine, dot product) between its content embedding and the user's pooled profile embedding.
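As a rough illustration of the pooling and scoring steps described above, the following sketch mean-pools the embeddings of liked items into a profile and ranks candidates by cosine similarity. It is not the policy's actual implementation: the function names (`mean_pool`, `cosine`, `score_candidates`) and the toy 3-dimensional embeddings are made up for this example.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def mean_pool(vectors: List[Vector]) -> List[float]:
    """Average a non-empty list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score_candidates(
    liked: List[Vector], candidates: Dict[str, Vector]
) -> Dict[str, float]:
    """Score each candidate against the pooled profile of liked items."""
    profile = mean_pool(liked)
    return {item_id: cosine(profile, emb) for item_id, emb in candidates.items()}


# Hypothetical item embeddings, invented for illustration only.
item_embeddings = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.8, 0.2, 0.0],
    "c": [0.0, 0.0, 1.0],
    "d": [0.9, 0.1, 0.0],
}

liked = [item_embeddings["a"], item_embeddings["b"]]      # positively interacted items
candidates = {k: item_embeddings[k] for k in ("c", "d")}  # items to score

scores = score_candidates(liked, candidates)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Here item `d` points in nearly the same direction as the pooled profile and ranks first, while `c` is orthogonal to everything the user liked and scores zero.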
Policy Type: item-content-similarity
Supports: embedding_policy, scoring_policy
Configuration Example
scoring_policy_item_content_similarity.yaml
policy_configs:
  scoring_policy: # Can also be used under embedding_policy
    policy_type: item-content-similarity
    pool_fn: "mean" # Strategy for pooling liked item embeddings ('mean', 'max', etc.)
    distance_fn: "cosine" # Similarity metric ('cosine', 'dot')
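To make the effect of `pool_fn` and `distance_fn` concrete, here is a hedged sketch of how those two string options could select the pooling and similarity functions. The lookup tables and the `build_scorer` helper are hypothetical and only cover the values named in the comments above ('mean', 'max', 'cosine', 'dot'); the real policy may accept others and is likely implemented differently.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def _dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def _cosine(a: Vector, b: Vector) -> float:
    norm = math.sqrt(_dot(a, a)) * math.sqrt(_dot(b, b))
    return _dot(a, b) / norm if norm else 0.0


# Hypothetical mapping from config strings to element-wise pooling functions.
POOL_FNS: Dict[str, Callable[[List[Vector]], List[float]]] = {
    "mean": lambda vs: [sum(col) / len(vs) for col in zip(*vs)],
    "max": lambda vs: [max(col) for col in zip(*vs)],
}

# Hypothetical mapping from config strings to similarity functions.
DISTANCE_FNS: Dict[str, Callable[[Vector, Vector], float]] = {
    "cosine": _cosine,
    "dot": _dot,
}


def build_scorer(config: Dict[str, str]) -> Callable[[List[Vector], Vector], float]:
    """Return a function scoring one candidate against pooled liked items."""
    pool = POOL_FNS[config["pool_fn"]]
    similarity = DISTANCE_FNS[config["distance_fn"]]

    def score(liked: List[Vector], candidate: Vector) -> float:
        return similarity(pool(liked), candidate)

    return score


# Same keys as the YAML example, with different values to show the switch.
config = {
    "policy_type": "item-content-similarity",
    "pool_fn": "max",
    "distance_fn": "dot",
}
score = build_scorer(config)

liked = [[1.0, 0.0], [0.0, 2.0]]
result = score(liked, [1.0, 1.0])  # max-pool gives [1.0, 2.0]; dot with [1, 1]
print(result)
```

With 'max' pooling the profile keeps the strongest signal per dimension, whereas 'mean' would dilute a feature that appears in only one liked item; 'dot' is also sensitive to embedding magnitude, while 'cosine' compares direction only.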
References
- Lops, P., de Gemmis, M., & Semeraro, G. (2011). Content-based recommender systems: State of the art and trends. In Recommender Systems Handbook (pp. 73-105). Springer. (General overview.)