Key concepts

Shaped is a retrieval database that enables you to build search and recommendation systems. Any Shaped retrieval database consists of four interconnected layers:

  • Tables: Act as a data warehouse and connect to your production data sources
  • Views: Transform and consolidate your tables into materialized views
  • Engines: Train retrieval and ranking models on your data using a declarative interface
  • Queries: Retrieve records using your ranking models through a REST API

Entity overview

Shaped consists of four main entities that work together to build retrieval systems:

  1. Tables are the data warehouse entities that connect to your existing data sources (databases, data warehouses, blob storage, or analytics platforms) via connectors. Each table stores raw data with a defined schema and can be populated through built-in connectors or custom connectors using the API.

  2. Views are transformed, materialized representations of one or more tables. SQL views allow you to join, filter, and aggregate data from multiple tables using SQL queries. AI views use LLM prompts to enrich data by adding new columns or semantic information, which is particularly useful for semantic search applications.

  3. Engines define how data is indexed and ranked. An engine configuration specifies:

    • Data: Which tables to use (item tables for candidates, user tables for personalization, interaction tables for training signals)
    • Index: How to create search indexes (embeddings for vector search, lexical search for BM25)
    • Training: Which scoring models to train (popularity-based, chronological, or machine learning models like ELSA, LightGBM, BERT4Rec)
    • Deployment: How the engine is deployed and scaled
    • Queries: Saved query templates for consistent retrieval

  4. Queries are database queries run via a REST API. Queries can be written in SQL or with a JSON query interface. A query is built from the following stages:

    • Retrieve: Fetch candidates using similarity search, lexical search, or ranking columns
    • Filter: Remove candidates based on business rules
    • Score: Rank the candidates using trained models or embedding similarity
    • Reorder: Apply diversity or exploration to improve result quality
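
The four query stages above can be sketched as a small pipeline. This is a minimal illustration in plain Python, not Shaped's query interface: the `Item` class, the toy catalog, the stage functions, and the use of a `popularity` field as a stand-in for a trained model's score are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    category: str
    in_stock: bool
    popularity: float  # hypothetical stand-in for a trained model's score


# Toy catalog; in Shaped this data would come from tables and views.
CATALOG = [
    Item("a", "shoes", True, 0.9),
    Item("b", "shoes", True, 0.8),
    Item("c", "hats", True, 0.7),
    Item("d", "shoes", False, 0.95),
    Item("e", "bags", True, 0.6),
]


def retrieve(catalog, limit):
    """Retrieve: fetch candidates, here ordered by a ranking column."""
    return sorted(catalog, key=lambda item: item.popularity, reverse=True)[:limit]


def filter_candidates(candidates):
    """Filter: apply a business rule (drop out-of-stock items)."""
    return [item for item in candidates if item.in_stock]


def score(candidates):
    """Score: rank candidates; popularity stands in for a model score."""
    return sorted(candidates, key=lambda item: item.popularity, reverse=True)


def reorder(ranked):
    """Reorder: simple diversity pass that defers repeated categories."""
    seen, first_of_category, repeats = set(), [], []
    for item in ranked:
        if item.category in seen:
            repeats.append(item)
        else:
            seen.add(item.category)
            first_of_category.append(item)
    return first_of_category + repeats


def run_query(catalog, limit=5):
    """Run the four stages in order and return the final item IDs."""
    candidates = retrieve(catalog, limit)
    candidates = filter_candidates(candidates)
    ranked = score(candidates)
    return [item.item_id for item in reorder(ranked)]


print(run_query(CATALOG))  # ['a', 'c', 'e', 'b']
```

The out-of-stock item `d` is dropped by the filter stage, and the reorder stage moves the second shoe (`b`) behind the first item of each other category.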

How the layers work together

Shaped operates as a retrieval database where each layer builds upon the stored data:

  • Tables store your data from production sources via connectors, acting as the foundational data warehouse.
  • Views provide materialized representations of your tables that you can query and transform.
  • Engines create indexes and ranking models on the data stored in your tables and views, similar to how a traditional database creates indexes for efficient retrieval.
  • Queries retrieve and rank records from your engines in real time, just as you would query a database.
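
To make the engine layer more concrete, the five configuration sections from the entity overview (data, index, training, deployment, queries) could be laid out as follows. This is only a sketch: every key, value, and table name below is hypothetical and chosen to mirror those section names, and it should not be read as Shaped's actual configuration schema.

```python
# Hypothetical layout: these keys mirror the five sections named in the
# entity overview, not Shaped's actual configuration schema.
REQUIRED_SECTIONS = ("data", "index", "training", "deployment", "queries")

engine_config = {
    "data": {
        "item_table": "products_view",   # candidates
        "user_table": "users",           # personalization
        "interaction_table": "events",   # training signals
    },
    "index": {"embeddings": True, "lexical": True},
    "training": {"models": ["popularity", "lightgbm"]},
    "deployment": {"replicas": 1},
    "queries": {"homepage_feed": "saved query template goes here"},
}


def missing_sections(config):
    """Return the required sections that the config does not define."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


print(missing_sections(engine_config))  # []
```

A check like `missing_sections` is one simple way to catch an incomplete configuration before it is submitted.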

For example, you might store a product catalog in a Table, create a View that joins product data with enriched descriptions, configure an Engine that builds indexes and trains a recommendation model on user interactions stored in your tables, and then use Queries to retrieve personalized product recommendations for each user.
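
The product-catalog example can be imitated locally with Python's built-in `sqlite3` module. This is a rough analogy under stated assumptions, not Shaped code: the table names, columns, and sample rows are invented, SQLite views are not materialized, and a plain interaction count stands in for a trained recommendation model. Personalization here is limited to excluding products the user has already interacted with.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Tables: raw product, description, and interaction data.
    CREATE TABLE products (product_id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE descriptions (product_id TEXT, enriched TEXT);
    CREATE TABLE interactions (user_id TEXT, product_id TEXT);

    INSERT INTO products VALUES
        ('p1', 'Trail shoe'), ('p2', 'Rain hat'), ('p3', 'Day pack');
    INSERT INTO descriptions VALUES
        ('p1', 'Lightweight shoe for rocky trails'),
        ('p2', 'Waterproof hat with a wide brim'),
        ('p3', 'Compact pack for day hikes');
    INSERT INTO interactions VALUES
        ('u1', 'p1'), ('u2', 'p1'), ('u2', 'p3'), ('u3', 'p3'), ('u3', 'p1');

    -- View: join product data with enriched descriptions.
    CREATE VIEW product_view AS
    SELECT p.product_id, p.title, d.enriched
    FROM products AS p
    JOIN descriptions AS d ON d.product_id = p.product_id;
    """
)


def recommend(user_id, limit=2):
    """Rank products the user has not interacted with by interaction count."""
    rows = conn.execute(
        """
        SELECT v.product_id
        FROM product_view AS v
        LEFT JOIN interactions AS i ON i.product_id = v.product_id
        WHERE v.product_id NOT IN (
            SELECT product_id FROM interactions WHERE user_id = ?
        )
        GROUP BY v.product_id
        ORDER BY COUNT(i.user_id) DESC, v.product_id
        LIMIT ?
        """,
        (user_id, limit),
    ).fetchall()
    return [product_id for (product_id,) in rows]


print(recommend("u1"))  # ['p3', 'p2']
```

User `u1` has only interacted with `p1`, so the query returns the remaining products, with `p3` (two interactions) ahead of `p2` (none).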