Building a Personalized HackerNews "For You" Feed
In this tutorial, you will build a complete, personalized "For You" feed for HackerNews. You will learn how to:
- Build a modern web client for HackerNews using an AI assistant.
- Set up a backend with Supabase to cache data and handle user authentication.
- Ingest post and event data into Shaped.
- Define a custom ranking algorithm that combines popularity, recency, and content-based personalization.
- Integrate the personalized feed back into your web client.
The final result will be a configurable, real-time "For You" feed, which you can see in action at hn.shaped.ai.
This tutorial is broken down into two main parts: building the client and building the feed.
Part 1: Building the HackerNews Client
Our first goal is to create a functional HackerNews client that we can later enhance with a personalized feed.
Step 1: Generate the Base Client with an AI Assistant
In the age of AI tools, bootstrapping a standard application is faster than ever. We will use lovable.dev to generate the initial code.
- Navigate to lovable.dev.
- Enter the following prompt:
"Can you build me a clone of HackerNews with a top and new feed?"
The AI assistant will generate a minimal but functional web client that pulls data from the public HackerNews API.
What's Missing? Testing the generated client reveals a few missing features:
- Pagination to scroll beyond the first page.
- Full mobile support.
- The ability for users to log in and upvote, which requires authentication.
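As a rough illustration of the first gap, here is a minimal Python sketch of paging through a feed from the public HackerNews API. The endpoint paths (topstories, newstories, item/<id>) are the public API's; the page size of 30 and the helper names are assumptions made for this example, and the generated client would do the equivalent in its own language.

```python
import json
from urllib.request import urlopen

# Public, read-only HackerNews API.
HN_API = "https://hacker-news.firebaseio.com/v0"
PAGE_SIZE = 30  # assumed page size, chosen for illustration


def page_of_ids(story_ids, page, page_size=PAGE_SIZE):
    """Return the slice of story IDs that belongs to a 1-indexed page."""
    if page < 1:
        raise ValueError("page must be >= 1")
    start = (page - 1) * page_size
    return story_ids[start:start + page_size]


def fetch_json(path):
    """GET one JSON document from the HackerNews API."""
    with urlopen(f"{HN_API}/{path}.json", timeout=10) as response:
        return json.load(response)


def fetch_page(feed="topstories", page=1):
    """Fetch the item records for one page of a feed ("topstories" or "newstories")."""
    story_ids = fetch_json(feed)
    return [fetch_json(f"item/{item_id}") for item_id in page_of_ids(story_ids, page)]
```

Because the feed endpoint returns only a list of IDs, pagination reduces to slicing that list and fetching the items for the requested slice.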
Step 2: Add Authentication and User Actions with a Backend
The official HackerNews API is read-only. To handle user actions like logging in and upvoting, we need to use the unofficial REST API and a lightweight backend to handle potential CORS issues. We'll use Supabase for this.
- Prompt for a Backend: Use your AI assistant to add a backend. It will likely suggest using Supabase edge-functions.
- Set up Supabase: Create a Supabase account and link it to your project. The assistant will generate an hn-proxy edge function to handle authentication and POST requests (like upvotes and favorites).
- Create a Database: Within Supabase, set up a serverless Postgres database. Create two tables:
  - posts: To cache post data for performance and for use as features in our ranking model.
  - events: To store user actions like favorites, which will be our personalization signal.
The posts table schema includes fields like url, score, and published_at, which are perfect features for ranking.
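To make the schema concrete, the following Python sketch maps one item from the public HackerNews API onto a posts row. The column names are taken from the fetch query used later in this tutorial; storing timestamps as ISO 8601 strings and deriving host from the URL are assumptions of this example, not something the tutorial specifies.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse


def to_post_row(item, now=None):
    """Map a HackerNews API item to a row for the posts table."""
    now = now or datetime.now(timezone.utc)
    url = item.get("url")  # text posts such as "Ask HN" have no URL
    return {
        "id": item["id"],
        "title": item.get("title"),
        "url": url,
        "score": item.get("score", 0),
        "by_author": item.get("by"),
        # The API reports creation time as Unix seconds.
        "published_at": datetime.fromtimestamp(item["time"], tz=timezone.utc).isoformat(),
        "descendants": item.get("descendants", 0),
        # Assumption: host is the hostname part of the URL.
        "host": urlparse(url).hostname if url else None,
        "updated_at": now.isoformat(),
    }
```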
Step 3: Polish the Client
With the core functionality in place, spend any remaining time polishing the user experience. You can use your AI assistant to iterate quickly on UI/UX issues:
- Ensure upvotes are colored and persist across pages.
- Add skeleton UIs for a smoother loading experience.
- Prevent duplicate posts from being saved to the database.
At the end of this part, you should have a polished, functional HackerNews client with authentication that is ready for the next step.
Part 2: Building the "For You" Feed with Shaped
Now, we'll replace the static "top" feed with a dynamic, personalized "For You" feed powered by Shaped.
Step 1: Understand the Ranking Algorithm
The classic HackerNews algorithm is a great starting point. It balances two key concepts:
- Popularity: popularity = (upvotes - 1)
- Recency: recency_decay = (hours_since_post + 2) ^ 1.8
The formula is:
rank = popularity / recency_decay
To add personalization, we will introduce a content-filtering term. We want to boost the rank of items that are textually similar to items a user has previously favorited.
Our new formula will be:
rank = (popularity * (1 + content_similarity)) / recency_decay
Why Content-Filtering First? We start with content-filtering because it is effective even with very little data: it needs only one user's own favorites, so it works with a single user. Collaborative-filtering, which finds similar users, requires a much larger dataset to be effective.
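The two formulas in this step can be compared numerically. The Python sketch below implements them as written; the two example posts and their similarity values are made up for illustration.

```python
def recency_decay(hours_since_post):
    """Denominator of the classic formula: older posts are penalized more."""
    return (hours_since_post + 2) ** 1.8


def classic_rank(upvotes, hours_since_post):
    """rank = popularity / recency_decay"""
    return (upvotes - 1) / recency_decay(hours_since_post)


def personalized_rank(upvotes, hours_since_post, content_similarity):
    """rank = (popularity * (1 + content_similarity)) / recency_decay"""
    return (upvotes - 1) * (1 + content_similarity) / recency_decay(hours_since_post)


# Two hypothetical posts of the same age (3 hours).
post_a = {"upvotes": 101, "similarity": 0.1}  # more popular, less relevant
post_b = {"upvotes": 81, "similarity": 0.8}   # less popular, more relevant

for name, post in (("A", post_a), ("B", post_b)):
    print(
        name,
        round(classic_rank(post["upvotes"], 3), 3),
        round(personalized_rank(post["upvotes"], 3, post["similarity"]), 3),
    )
```

With these inputs the classic formula ranks post A first, while the personalized formula ranks post B first, because B's similarity boost outweighs A's extra upvotes. With a similarity of 0, the two formulas coincide.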
Step 2: Connect Your Data to Shaped
To build the model, Shaped needs access to your post and event data.
- Create a Postgres Connector: In the Shaped dashboard, create a Postgres Connector that syncs with your Supabase database every 10 minutes. This will keep your posts data fresh.
- Create a Custom API Connector: Create a Custom API Connector for your events. This provides an HTTP endpoint.
- Update Your Client: Modify your client to send a real-time event to this API endpoint every time a user favorites an item. This enables in-session personalization, allowing the feed to react instantly to user actions.
Step 3: Define the Shaped Model
In Shaped, a model is defined by a fetch block (how to get the data) and a model block (how to rank it).
The fetch Configuration
This SQL block tells Shaped how to select and prepare the data from your connectors.
# Select all items features and dedupe
items: |
  SELECT
    item_id, title, url, score, by_author, published_at,
    descendants, host, updated_at
  FROM (
    SELECT
      id AS item_id, title, url, score, by_author, published_at,
      descendants, host, updated_at,
      ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) AS rn
    FROM shaped_hackernews_posts
  ) AS ranked_items
  WHERE rn = 1
# Select all favorite events unless they've been subsequently unfavorited
events: |
  SELECT
    user_id, item_id, published_at, event_type AS event_value,
    (CASE WHEN event_type = 'FAVORITE' THEN 1 ELSE 0 END) AS label
  FROM (
    SELECT
      user_id, item_id, published_at, event_type,
      ROW_NUMBER() OVER (PARTITION BY user_id, item_id ORDER BY published_at DESC) AS rn
    FROM shaped_hackernews_events
    WHERE event_type IN ('FAVORITE', 'UNFAVORITE')
  ) AS ranked_events
  WHERE rn = 1 AND event_type = 'FAVORITE'
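The logic of the events query can be easier to follow outside SQL. The Python sketch below mirrors it on an in-memory list of dictionaries: keep the most recent event per (user_id, item_id) pair, then keep the pair only if that event is a favorite. It illustrates the query's intent and is not code that Shaped runs.

```python
def active_favorites(events):
    """Return favorites that have not been subsequently unfavorited."""
    latest = {}
    for event in events:
        if event["event_type"] not in ("FAVORITE", "UNFAVORITE"):
            continue
        key = (event["user_id"], event["item_id"])
        # Equivalent of ROW_NUMBER() ... ORDER BY published_at DESC, rn = 1.
        if key not in latest or event["published_at"] > latest[key]["published_at"]:
            latest[key] = event
    return [
        {
            "user_id": event["user_id"],
            "item_id": event["item_id"],
            "published_at": event["published_at"],
            "event_value": event["event_type"],
            "label": 1,
        }
        for event in latest.values()
        if event["event_type"] == "FAVORITE"
    ]
```

A user who favorites and then unfavorites an item contributes no row, while a user who unfavorites and later favorites it again contributes one.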
The model Configuration
This block defines the ranking pipeline, including the value_model, which contains our custom ranking formula.
name: hackernews_for_you
language_model_name: Alibaba-NLP/gte-modernbert-base
pagination_store_ttl: 600
inference_config:
  query:
    retrieve:
      - name: new
        limit: 300
        order_by:
          order_type: COLUMN
          columns:
            - name: published_at
policy_configs:
  scoring_policy:
    policy_type: score-ensemble
    value_model: |
      # Popularity
      (item.score - 1)
      # Personalization
      * (1 + cosine_similarity(
        text_encoding(item),
        pooled_text_encoding(user.recent_interactions)
      ))
      # Time Decay
      / (((now_seconds() - item.published_at) / 3600) + 2) ** 1.8
How does content_similarity work here? The cosine_similarity(...) function calculates the similarity between each candidate item's text embedding and a user's taste profile. The pooled_text_encoding(user.recent_interactions) function automatically creates this taste profile by averaging the text embeddings of the last 30 items the user has favorited.
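The averaging and similarity steps described above can be sketched in a few lines of plain Python. This is an illustration of the idea and not Shaped's implementation; the function names are hypothetical, and the embeddings are assumed to be equal-length lists of floats ordered from oldest to newest favorite.

```python
import math


def mean_pool(embeddings, last_n=30):
    """Average the most recent embeddings into a single taste-profile vector."""
    recent = embeddings[-last_n:]
    if not recent:
        raise ValueError("need at least one embedding")
    dim = len(recent[0])
    return [sum(vec[i] for vec in recent) / len(recent) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy 2-dimensional embeddings of two favorited items.
profile = mean_pool([[1.0, 0.0], [0.0, 1.0]])
print(profile)
print(cosine_similarity([1.0, 1.0], profile))
```

A candidate whose embedding points in the same direction as the profile scores close to 1, and an orthogonal one scores close to 0, which is what feeds the (1 + cosine_similarity(...)) boost in the value_model.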
After you register this model with Shaped, it will build the ranking pipeline. You can test it from the command line:
curl -X POST "https://api.shaped.ai/v1/models/hackernews_for_you/rank" \
  -H "x-api-key: <YOUR-API-KEY>" \
  -H "Content-Type: application/json" \
  --data '{
    "user_id": "tullie",
    "limit": 30,
    "return_metadata": true
  }'
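The same request can be made from code. The Python sketch below reproduces the curl call above using only the standard library; the URL, headers, and body fields are copied from that command, while the function names are hypothetical. The response is returned as parsed JSON without assuming anything about its fields.

```python
import json
from urllib.request import Request, urlopen

RANK_URL = "https://api.shaped.ai/v1/models/hackernews_for_you/rank"


def build_rank_request(api_key, user_id, limit=30):
    """Build (but do not send) the rank request for one user."""
    body = {"user_id": user_id, "limit": limit, "return_metadata": True}
    return Request(
        RANK_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def fetch_for_you(api_key, user_id, limit=30):
    """Send the rank request and return the decoded JSON response."""
    with urlopen(build_rank_request(api_key, user_id, limit), timeout=10) as response:
        return json.load(response)
```

In a browser client, the API key should stay on the server side, for example behind the same kind of edge function used for the hn-proxy earlier.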
Step 4: Integrate the "For You" Feed into the Client
The final step is to add a "For You" tab to your web client that calls this new Shaped Rank API endpoint.
The value_model in Shaped is configurable in real-time, meaning you can experiment with different formulas without redeploying. For example, you can create a more balanced formula by normalizing the popularity and personalization scores.
Top Feed (HackerNews Classic):
(item.score - 1) / ((((now_seconds() - item.published_at) / 3600) + 2) ** 1.8)
Personalized Feed (Balanced):
((item.score / 1000) + cosine_similarity(text_encoding(item), pooled_text_encoding(user.recent_interactions))) / ((((now_seconds() - item.published_at) / 3600) + 2) ** 1.8)
The / 1000 term normalizes the large item.score values to be on a similar scale to the cosine_similarity score, preventing popularity from completely dominating the personalization signal.
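A small numeric check shows the effect of this scaling. The Python sketch below compares the multiplicative formula from Step 3 with the balanced formula on two made-up posts of the same age; the scores and similarity values are illustrative assumptions.

```python
def time_decay(hours_since_post):
    return (hours_since_post + 2) ** 1.8


def multiplicative_rank(score, similarity, hours_since_post):
    """Step 3 formula: popularity scaled by (1 + similarity)."""
    return (score - 1) * (1 + similarity) / time_decay(hours_since_post)


def balanced_rank(score, similarity, hours_since_post):
    """Balanced formula: score / 1000 added to similarity."""
    return (score / 1000 + similarity) / time_decay(hours_since_post)


# Hypothetical posts, both 3 hours old.
popular = {"score": 500, "similarity": 0.1}
relevant = {"score": 120, "similarity": 0.7}

for label, rank in (("multiplicative", multiplicative_rank), ("balanced", balanced_rank)):
    winner = max(
        (popular, relevant),
        key=lambda post: rank(post["score"], post["similarity"], 3),
    )
    print(label, "prefers the", "popular" if winner is popular else "relevant", "post")
```

Under the multiplicative formula the popular post stays on top, because its score is several times larger and similarity can at most double it. Under the balanced formula both terms live on a roughly 0 to 1 scale, so the more relevant post can overtake it.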
The top screenshot shows the top feed, and the bottom screenshot shows the personalized feed. Notice how the personalized feed boosts content about AI higher in the ranking.
You have now successfully built a real-time, personalized "For You" feed for HackerNews!
Next Steps and Future Improvements
This tutorial creates a simple but powerful content-based feed. Here are some ways you could extend it:
- Add Collaborative Signals: Once you have enough user data, you can introduce collaborative filtering to find items popular among similar users.
- Improve Exploration: Use a multi-armed bandit algorithm to more intelligently explore new content instead of relying solely on recency.
- Add Semantic Search: Replace the standard keyword search with a powerful vector search using Shaped's Search API.
- Enable Personal Configurability: Allow end-users to tune their own ranking formulas directly in the UI.