Overview

Modern recommendation and ranking methods perform best when they can draw on all the data available. As well as traditional tabular data types such as categorical (enum) and numerical (scalar) variables, Shaped understands complex data types such as images, audio, language and video. It does this by converting these data types into embeddings using pre-trained models. These embeddings are then fed into the ranking models to improve their understanding of the input and the performance of the final ranking.
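
As a rough illustration of how tabular features and an embedding can end up in a single input for a ranking model, here is a minimal sketch. It is not Shaped's implementation: `toy_text_embedding` is a hypothetical stand-in for a pre-trained language model (it only hashes words into a small vector), and the field names `category`, `like_count` and `text` are assumptions made for the example.

```python
import hashlib
import math


def toy_text_embedding(text: str, dim: int = 8) -> list[float]:
    """Hypothetical stand-in for a pre-trained model: hashed bag of words, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0  # bucket each word by its hash
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def one_hot(value: str, vocabulary: list[str]) -> list[float]:
    """Encode a categorical value as a one-hot vector over a fixed vocabulary."""
    return [1.0 if value == v else 0.0 for v in vocabulary]


def build_feature_vector(post: dict, categories: list[str]) -> list[float]:
    """Concatenate categorical, numerical and embedded text features."""
    return (
        one_hot(post["category"], categories)
        + [float(post["like_count"])]
        + toy_text_embedding(post["text"])
    )


# Assumed example data, for illustration only.
CATEGORIES = ["sports", "music", "news"]
post = {"category": "music", "like_count": 12, "text": "New album drops this Friday"}

features = build_feature_vector(post, CATEGORIES)
print(len(features))  # 3 one-hot values + 1 numerical value + 8 embedding dimensions
```

In a real system the embedding would come from a pre-trained model and would have far more dimensions, but the idea is the same: every data type is turned into numbers the ranking model can consume.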

For example, if you are building a social post recommendation model, the content of a post is crucial to understanding its relevance to a user, and these embedding models can understand that content. Content matters even more when you lack interaction data for a post (e.g. a newly created post on the platform). This is called the cold-start problem and will be discussed later.
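
To make the cold-start case concrete, the sketch below ranks two brand-new posts for a user using content embeddings alone. It assumes a simple content-based approach (cosine similarity against the average embedding of posts the user liked), which is one common technique and not necessarily what Shaped does internally. The three-dimensional vectors and post names are made up for the example.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_vector(vectors: list[list[float]]) -> list[float]:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Hypothetical content embeddings of posts the user has already liked.
liked_posts = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]

# Hypothetical embeddings of new posts that have no interactions yet.
new_posts = {
    "new_post_a": [0.85, 0.15, 0.05],
    "new_post_b": [0.0, 0.1, 0.95],
}

# Summarise the user's taste, then rank new posts by similarity to it.
user_profile = mean_vector(liked_posts)
ranked = sorted(
    new_posts,
    key=lambda post_id: cosine(user_profile, new_posts[post_id]),
    reverse=True,
)
print(ranked)  # ['new_post_a', 'new_post_b']
```

Neither new post has any interaction history, yet the one whose content resembles what the user already liked is ranked first.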

Below are the data types that Shaped supports:

🔠 Categoricals

🔢 Numericals

🕒 Timestamps

🖼️ Images

🔤 Language

📍 Locations

🔗 Cross Features