Music Recommendations (LastFM)

This tutorial demonstrates how to configure a recommendation engine using the LastFM-360k dataset. The dataset contains listening data (play counts) from approximately 360,000 users across 160,000 artists. This example uses the local table connector; the same approach applies to other supported connectors.

Accompanying notebook

CLI Setup

Install the CLI

pip install shaped

Shaped supports Python 3.8 to 3.11. See installation instructions if you need to install pip.

Initialize the CLI

shaped init --api-key <YOUR_API_KEY>

If you don't have an API key, see How to get an API key.

Data Preparation

Download the dataset

Download the dataset (this step may take approximately 10 minutes):

curl http://mtg.upf.edu/static/datasets/last.fm/lastfm-dataset-360K.tar.gz -o lastfm-dataset-360K.tar.gz
tar -xzf lastfm-dataset-360K.tar.gz

The dataset contains two tab-separated files:

  • lastfm-dataset-360K/usersha1-artmbid-artname-plays.tsv: Play counts
  • lastfm-dataset-360K/usersha1-profile.tsv: User profiles

This tutorial uses only the interaction data (plays). The files have no header row, and one is required, so add a header and keep the first 100,000 rows. The command uses printf rather than echo because echo does not expand \t into a tab in every shell:

(printf 'user_id\tartist_id\tartist_name\tplays\n'; head -n 100000 lastfm-dataset-360K/usersha1-artmbid-artname-plays.tsv) > lastfm-dataset-360K/user-artist-plays-100k.tsv
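The same header-and-trim step can also be done in Python, which may be convenient on systems without a POSIX shell. This is a minimal sketch: the function name `add_header_and_trim` is hypothetical, and the file is copied as raw bytes so no assumption is made about its text encoding.

```python
from itertools import islice
from pathlib import Path

# Column names taken from the header used in this tutorial.
HEADER = b"user_id\tartist_id\tartist_name\tplays\n"


def add_header_and_trim(src: Path, dst: Path, max_rows: int = 100_000) -> int:
    """Write HEADER followed by the first max_rows lines of src to dst.

    Returns the number of data rows copied (the header is not counted).
    """
    copied = 0
    with src.open("rb") as fin, dst.open("wb") as fout:
        fout.write(HEADER)
        # islice stops after max_rows lines without reading the rest of the file.
        for line in islice(fin, max_rows):
            fout.write(line)
            copied += 1
    return copied
```

For example, calling `add_header_and_trim(Path("lastfm-dataset-360K/usersha1-artmbid-artname-plays.tsv"), Path("lastfm-dataset-360K/user-artist-plays-100k.tsv"))` would produce the trimmed file used in the rest of this tutorial.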

The full LastFM dataset contains approximately 17 million events. This example uses 100k events to reduce processing time. Verify the full dataset size:

wc -l lastfm-dataset-360K/usersha1-artmbid-artname-plays.tsv

Create the table

Create a table and insert play records using create-table-from-uri:

shaped create-table-from-uri --name lastfm_plays --path lastfm-dataset-360K/user-artist-plays-100k.tsv --type tsv

Records upload in batches of 1000. Wait until all 100k records are uploaded.

Create the engine

This example uses play counts to build a collaborative filtering engine. Higher play counts indicate stronger user preference.
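To illustrate how a play count can express preference strength, the sketch below shows two confidence weightings commonly used in implicit-feedback ALS (linear and logarithmic). It is an illustration of the general technique only: whether Shaped's als policy applies either form is an assumption, and the function names and the default values of `alpha` and `eps` are illustrative.

```python
import math


def linear_confidence(plays: float, alpha: float = 40.0) -> float:
    """Confidence grows linearly with plays: c = 1 + alpha * plays."""
    return 1.0 + alpha * plays


def log_confidence(plays: float, alpha: float = 40.0, eps: float = 1.0) -> float:
    """Confidence grows logarithmically: c = 1 + alpha * log(1 + plays / eps).

    The log form dampens very heavy listeners so a few users with
    thousands of plays do not dominate the fit.
    """
    return 1.0 + alpha * math.log(1.0 + plays / eps)
```

In both forms an unplayed artist gets the baseline confidence of 1, and each additional play increases it, which matches the intuition that more plays signal stronger preference.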

Engine configuration:

lastfm_artist_recommendations.yaml
data:
  interaction_table:
    type: query
    query: |
      SELECT user_id, artist_id AS item_id, 0 AS created_at, plays AS label
      FROM lastfm_plays
training:
  models:
    - name: als
      policy_type: als

Create the engine:

shaped create-engine --file lastfm_artist_recommendations.yaml

For details on engine configuration, see Engines documentation.

Monitor engine status

Engine creation and training can take several hours, depending on data volume and attributes. Check status:

shaped list-engines

Response:

{
  "engines": [
    {
      "created_at": "2024-05-15T08:55:23 UTC",
      "engine_name": "lastfm_artist_recommendations",
      "engine_uri": "https://api.shaped.ai/v2/engines/lastfm_artist_recommendations",
      "status": "FETCHING"
    }
  ]
}

The engine progresses through these stages:

  1. SCHEDULING
  2. FETCHING
  3. TRAINING
  4. DEPLOYING
  5. ACTIVE

Once the status is ACTIVE, the engine is ready for queries.
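Because training can take hours, it may help to poll for the ACTIVE status instead of checking by hand. The sketch below is deliberately decoupled from the CLI: `get_status` is a hypothetical callable you supply (for example, one that parses the output of shaped list-engines), and the stage names are the five listed above. Any status outside that list is treated as an error, which is an assumption about how failures are reported.

```python
import time
from typing import Callable

# Stages as listed in this tutorial, in order.
STAGES = ["SCHEDULING", "FETCHING", "TRAINING", "DEPLOYING", "ACTIVE"]


def wait_until_active(
    get_status: Callable[[], str],
    poll_seconds: float = 60.0,
    max_polls: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status repeatedly until it returns "ACTIVE".

    Raises RuntimeError on a status not in STAGES and TimeoutError
    if max_polls is exhausted first.
    """
    for _ in range(max_polls):
        status = get_status()
        if status == "ACTIVE":
            return status
        if status not in STAGES:
            raise RuntimeError(f"unexpected engine status: {status}")
        sleep(poll_seconds)
    raise TimeoutError(f"engine not ACTIVE after {max_polls} polls")
```

The `sleep` parameter is injectable so the loop can be exercised quickly in tests without real waiting.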

Query recommendations

Query recommendations using the Query endpoint. Provide a user_id and the number of results to return.

Using the CLI:

shaped query --engine-name lastfm_artist_recommendations \
--query "SELECT * FROM similarity(embedding_ref='als', limit=50, encoder='precomputed_user', input_user_id='\$user_id') LIMIT 5" \
--parameters '{"user_id": "00000c289a1829a808ac09c00daf10bc3c4e223b"}'

Response:

{
  "results": [
    {
      "id": "67e344da-ec54-4e26-b2a4-8351d744a14c",
      "score": 1.0
    },
    {
      "id": "b7ffd2af-418f-4be2-bdd1-22f8b48613da",
      "score": 0.43973369
    },
    {
      "id": "a74b1b7f-71a5-4011-9441-d0b5e4122711",
      "score": 0.37249291
    },
    {
      "id": "e7c2d42e-b045-41b6-a391-88f4ea545185",
      "score": 0.3511156
    },
    {
      "id": "f2fddf9f-02fd-421a-b5e8-75a3988309ab",
      "score": 0.33543342
    }
  ]
}

The response contains an array of result objects with artist IDs and scores.
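Since the response carries only artist IDs, a small lookup built from the uploaded TSV can turn them back into names. This is a sketch under a few assumptions: the file has the four-column header added earlier in this tutorial, it decodes as UTF-8, and the helper names `load_artist_names` and `label_results` are hypothetical. Rows with an empty artist_id, if any, are skipped.

```python
import csv
from typing import Dict, List, Tuple


def load_artist_names(tsv_path: str) -> Dict[str, str]:
    """Map artist_id to artist_name using the headered plays TSV."""
    names: Dict[str, str] = {}
    with open(tsv_path, newline="", encoding="utf-8") as f:
        # QUOTE_NONE keeps stray quote characters in artist names literal.
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            artist_id = row["artist_id"]
            if artist_id:
                # Keep the first name seen for each ID.
                names.setdefault(artist_id, row["artist_name"])
    return names


def label_results(response: dict, names: Dict[str, str]) -> List[Tuple[str, float]]:
    """Return (artist_name, score) pairs in the order the engine ranked them."""
    return [
        (names.get(item["id"], "<unknown>"), item["score"])
        for item in response["results"]
    ]
```

IDs that do not appear in the trimmed 100k file are labeled `<unknown>` rather than raising an error.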

Using the REST API:

curl https://api.shaped.ai/v2/engines/lastfm_artist_recommendations/query \
  -H "x-api-key: <YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT * FROM similarity(embedding_ref='\''als'\'', limit=50, encoder='\''precomputed_user'\'', input_user_id='\''$user_id'\'') LIMIT 5",
    "parameters": {
      "user_id": "00000c289a1829a808ac09c00daf10bc3c4e223b"
    }
  }'

Each '\'' sequence places a literal single quote inside the single-quoted request body; a doubled '' would be removed by the shell before the request is sent.
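The same request can be built from Python with only the standard library, which avoids shell quoting around the SQL string. This is a sketch: the URL, header names, query text, and parameter shape are copied from the curl example, while the function names `build_query_request` and `run_query` are hypothetical. Building and sending are separated so the request can be inspected before any network call is made.

```python
import json
import urllib.request

# Query text as used in this tutorial; $user_id is filled in server-side
# from the "parameters" object.
QUERY = (
    "SELECT * FROM similarity(embedding_ref='als', limit=50, "
    "encoder='precomputed_user', input_user_id='$user_id') LIMIT 5"
)


def build_query_request(
    api_key: str,
    user_id: str,
    engine: str = "lastfm_artist_recommendations",
) -> urllib.request.Request:
    """Build (but do not send) the POST request for the query endpoint."""
    url = f"https://api.shaped.ai/v2/engines/{engine}/query"
    body = json.dumps({"query": QUERY, "parameters": {"user_id": user_id}})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )


def run_query(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

A call such as `run_query(build_query_request("<YOUR_API_KEY>", "00000c289a1829a808ac09c00daf10bc3c4e223b"))` would be expected to return a dictionary with a `results` list like the one shown for the CLI.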

Clean up

Delete the table and engine when finished:

shaped delete-engine --engine-name lastfm_artist_recommendations
shaped delete-table --table-name lastfm_plays