Quickstart

Shaped provides three APIs that you should be familiar with to get started:

  1. Dataset API
  2. Model Management API
  3. Rank API

The Dataset and Model Management APIs are used to manage your data sources and models. Both provide endpoints to create, list, and delete their respective dataset or model entities.

The Rank API provides a set of high-performance, real-time endpoints to consume results from your model. These results can include personalized recommendations, semantic search, and latent embedding analytics.

All endpoints can be accessed through Shaped's CLI or REST APIs. In this guide we'll run through an example that puts them all together.

Installing the CLI

To get started using the CLI to create a model, first install the Shaped CLI from PyPI as follows:

pip install shaped

Shaped supports Python 3.8+. Take a look at the installation instructions if you need to install pip.

If you are having trouble installing due to package conflicts, use a Python virtual environment, especially if you are using the system default Python installation.

Initialize the client

You can then initialize the Shaped client with your API key. If you don't have an API key yet, check out the page on how to get one.

shaped init --api-key <YOUR_API_KEY>

Creating your first Dataset

Let's say you want to build a video recommendation model and you have your event data (containing click and impression events) stored in a CSV file called video_events.csv. The CSV has the following columns:

  1. user_id: the user triggering the interaction event.
  2. item_id: the video id that was interacted with.
  3. created_at: the timestamp of the event.
  4. event: the event type, with values 'click' and 'impression'.
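
Before uploading, it can help to confirm that the file has the columns listed above. The following is a minimal sketch using only Python's standard library. The helper name check_event_columns and the sample rows are hypothetical and are not part of Shaped; the timestamp format shown is also just an assumption.

```python
import csv
import io

# Columns described in the list above.
REQUIRED_COLUMNS = ["user_id", "item_id", "created_at", "event"]
ALLOWED_EVENTS = {"click", "impression"}


def check_event_columns(csv_file):
    """Return a list of problems found in an events CSV (empty if none).

    Hypothetical helper for illustration; it is not part of the Shaped CLI.
    """
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    problems = [f"missing column: {c}" for c in REQUIRED_COLUMNS if c not in header]
    if problems:
        return problems
    # Line 1 is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        if row["event"] not in ALLOWED_EVENTS:
            problems.append(f"line {line_no}: unexpected event {row['event']!r}")
    return problems


# Made-up sample data standing in for video_events.csv.
sample = io.StringIO(
    "user_id,item_id,created_at,event\n"
    "3,427010,2023-01-01T10:00:00Z,click\n"
    "3,182094,2023-01-01T10:00:05Z,impression\n"
)
print(check_event_columns(sample))
```

To check a real file, pass an open file handle instead, for example `open("video_events.csv", newline="")`.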

The first step in using Shaped is ingesting this data. To do this from a local file, create a dataset with the create-dataset-from-uri CLI command:

shaped create-dataset-from-uri --name click_events --uri ./video_events.csv --type csv

Creating your first Model

To create a model from the ingested events, we need to write a model definition that includes the SQL transform needed to select the user_id, item_id, created_at and label columns. Shaped's event label column expects numerical values, where anything greater than 0 is positive and anything less than or equal to 0 is negative. For this use case, we'll make clicks positive (i.e. label=1) and impressions negative (i.e. label=0). Putting it together, here's the model definition:

interaction_video_recommendations.yaml

    model:
      name: interaction_video_recommendations
    connectors:
      - type: Dataset
        id: click_events
        name: click_events
    fetch:
      events: |
        SELECT
          user_id,
          item_id,
          created_at,
          CASE
            WHEN event = 'click' THEN 1
            ELSE 0
          END as label
        FROM click_events

shaped create-model --file interaction_video_recommendations.yaml
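
To see what the label transform produces, the same CASE expression can be tried locally. The sketch below runs it against an in-memory SQLite table using Python's standard library. SQLite is only a stand-in here: the SQL dialect Shaped uses may differ, the rows are made up, and an ORDER BY has been added purely to make the output order deterministic.

```python
import sqlite3

# In-memory table standing in for the ingested dataset (rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE click_events (user_id TEXT, item_id TEXT, created_at TEXT, event TEXT)"
)
conn.executemany(
    "INSERT INTO click_events VALUES (?, ?, ?, ?)",
    [
        ("3", "427010", "2023-01-01T10:00:00Z", "click"),
        ("3", "182094", "2023-01-01T10:00:05Z", "impression"),
    ],
)

# The SELECT from the model definition, plus an ORDER BY (an addition for
# this sketch only) so the rows come back in a predictable order.
LABEL_QUERY = """
SELECT
    user_id,
    item_id,
    created_at,
    CASE
        WHEN event = 'click' THEN 1
        ELSE 0
    END AS label
FROM click_events
ORDER BY created_at
"""

rows = conn.execute(LABEL_QUERY).fetchall()
for row in rows:
    print(row)  # the last element of each tuple is the label
```

The click row gets label 1 (positive) and the impression row gets label 0 (negative), matching the rule that only values greater than 0 count as positive.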

Rank API

The Rank API is used to retrieve your recommendation results. It's a real-time, high-performance endpoint designed to be integrated directly into your application, and it supports thousands of requests per second. Much like the Model Management API, it is available through both the CLI and the REST API. The Rank API supports several discovery endpoints and argument combinations to handle your use case, whether it's a recommendation feed, a similar-items carousel, or personalized search.

Ranking

Here's how you can use the rank endpoint to retrieve personalized results for the user with user_id=3.

shaped rank --model_name interaction_video_recommendations --user_id 3 --limit 5

Example Response:

{
  "ids": [
    "427010",
    "182094",
    "332874",
    "827918",
    "403528"
  ],
  "scores": [
    0.9,
    0.8,
    0.7,
    0.3,
    0.2
  ]
}

The response contains a list of "ids", which in this case are the unique identifiers of the videos (item_ids) most relevant to this user. Because we trained the model with 'click' as the positive interaction, these are the videos we predict the user is most likely to click next.

The response also contains a parallel list of "scores", giving our confidence that each corresponding item is relevant to the query user. You can use these to understand how the relevance estimates change through the ranking.
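
Because "ids" and "scores" are parallel lists, a client will usually want to pair them up. Here is a minimal sketch of doing so with Python's standard library, using the example response values from this guide. The helper name pair_results and the 0.5 cut-off are hypothetical choices for illustration, not Shaped recommendations.

```python
import json

# The example response from this guide, as a JSON string.
response_text = """
{
  "ids": ["427010", "182094", "332874", "827918", "403528"],
  "scores": [0.9, 0.8, 0.7, 0.3, 0.2]
}
"""


def pair_results(response, min_score=0.0):
    """Zip the parallel "ids" and "scores" lists into (id, score) tuples.

    Items scoring below min_score are dropped. Hypothetical helper for
    illustration; it is not part of the Shaped client.
    """
    return [
        (item_id, score)
        for item_id, score in zip(response["ids"], response["scores"])
        if score >= min_score
    ]


response = json.loads(response_text)
ranked = pair_results(response, min_score=0.5)
print(ranked)  # [('427010', 0.9), ('182094', 0.8), ('332874', 0.7)]
```

The ranking order is preserved, so the first tuple is still the most relevant item.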

To learn more about using and evaluating results from the rank endpoint, take a look at the guides specific to your use case.

Next steps

Although this video recommendation model is a good starting point, there are many ways to improve it with Shaped, notably:

  1. Enriching your model with user and item attribute features to improve cold-start performance.
  2. Creating personalized item filters that better represent your business logic.
  3. Using real-time connectors and session-based ranking.

We recommend going through the rest of our guides to find out more!