Item (Content Similarity)

Description

The Item-Content Similarity policy implements content-based filtering: an item's relevance to a user is determined by how closely its content matches the content of items the user already likes. It requires item embeddings derived from item attributes (e.g., LLM text embeddings of descriptions, categorical embeddings). A user's preference profile is computed dynamically by pooling (e.g., averaging or max-pooling) the embeddings of the items they have interacted with positively. Each candidate item is then scored by the similarity (e.g., cosine or dot product) between its content embedding and the user's pooled profile embedding.
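The pooling and scoring steps described above can be illustrated with a minimal, dependency-free Python sketch. This is not the policy's actual implementation: the function names (`pool`, `similarity`, `rank_candidates`) and the toy two-dimensional embeddings are hypothetical, and only the 'mean'/'max' pooling and 'cosine'/'dot' similarity options named on this page are assumed.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def pool(embeddings: List[Vector], pool_fn: str = "mean") -> List[float]:
    """Collapse the embeddings of liked items into one profile vector."""
    if not embeddings:
        raise ValueError("need at least one positively interacted item")
    columns = list(zip(*embeddings))  # one tuple per embedding dimension
    if pool_fn == "mean":
        return [sum(col) / len(col) for col in columns]
    if pool_fn == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unsupported pool_fn: {pool_fn!r}")


def similarity(a: Vector, b: Vector, distance_fn: str = "cosine") -> float:
    """Score two vectors; a higher value means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    if distance_fn == "dot":
        return dot
    if distance_fn == "cosine":
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0  # treat zero vectors as dissimilar
    raise ValueError(f"unsupported distance_fn: {distance_fn!r}")


def rank_candidates(
    liked: List[Vector],
    candidates: Dict[str, Vector],
    pool_fn: str = "mean",
    distance_fn: str = "cosine",
) -> List[Tuple[str, float]]:
    """Return (item_id, score) pairs sorted from most to least similar."""
    profile = pool(liked, pool_fn)
    scores = {
        item_id: similarity(profile, emb, distance_fn)
        for item_id, emb in candidates.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (hypothetical): two liked items and three candidates.
liked_items = [[1.0, 0.0], [0.0, 1.0]]
candidate_items = {"a": [1.0, 1.0], "b": [1.0, 0.0], "c": [-1.0, -1.0]}
for item_id, score in rank_candidates(liked_items, candidate_items):
    print(item_id, round(score, 3))
```

Note that cosine similarity ignores embedding magnitude while dot product does not, so the two settings can rank the same candidates differently when item embeddings are not normalized.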

Policy Type: item-content-similarity
Supports: embedding_policy, scoring_policy

Configuration Example

scoring_policy_item_content_similarity.yaml
policy_configs:
  scoring_policy: # Can also be used under embedding_policy
    policy_type: item-content-similarity
    pool_fn: "mean" # Strategy for pooling liked item embeddings ('mean', 'max', etc.)
    distance_fn: "cosine" # Similarity metric ('cosine', 'dot')

References

  • Lops, P., de Gemmis, M., & Semeraro, G. (2011). Content-based recommender systems: State of the art and trends. Recommender systems handbook, 73-105. (General overview).