Table & View Basics

Tables and views form the foundation of the data layer in Shaped. Tables are the foundation—they store your raw data from production sources via connectors. Views build on tables by transforming and enriching that data through SQL queries or AI-powered enrichment.

A typical workflow: connect your data source → create a table → optionally create views to transform, join, or enrich the data → use tables and views in engines for training and retrieval.

Tables

Tables are the core component of the data layer in Shaped.

Each table is populated via a connector to a source, such as a MySQL database or Amplitude account.

Connectors

Connectors establish a connection to where your data already lives. You can use a custom connector to stream data directly into Shaped using the Shaped API or Shaped CLI.

Built-in Connectors

Shaped supports out-of-the-box connectors to a range of systems, including:

Data Warehouses: Such as BigQuery, Snowflake, and Redshift.
Databases: Such as PostgreSQL and MySQL.
Blob Storage: Such as S3 and Google Cloud Storage (GCS).
Analytics Applications: Such as Segment and Amplitude.

Custom connectors

If your datasource is not supported, you can declare a custom table.

Use the "Create table" endpoint to declare a new table and its schema.

Then use a /table/insert endpoint to add records to the table.

Connector Types

Shaped supports both real-time (streaming) and batch connectors. Design your data architecture with the difference in mind:

Real-time connectors: Stream data continuously (e.g., Segment, Kafka, Kinesis). Data appears in Shaped within 30 seconds.
Batch connectors: Sync data on a schedule, typically every 15 minutes (e.g., BigQuery, PostgreSQL, Snowflake).

For more details on connector types, their use cases, and limitations, see the Connector Types guide.

Table schemas

Every table has a schema that describes the structure of your data.

Built-in connectors infer the column structure from the datasource you connect.

Custom tables require you to upload a schema. It should include:

columns: The names of the fields in your data.
unique_keys: A list of columns that uniquely identify each row in the table (e.g., ["id"]).

Append-only tables

Some tables are created from event streams and are append-only. When you add rows to such tables, they are added to the bottom.

To ensure deduplication happens correctly, make sure that each record contains the primary_key and replication_key columns defined in your schema.

Views

A view is a transformed view of one or more tables. Views are created by SQL queries or LLM prompts.

SQL view

An SQL view is created by running an SQL query against one or more tables.

SQL Dialect: Views use ClickHouse SQL syntax. All SQL queries in views are executed using ClickHouse SQL, so ClickHouse functions, operators, and syntax are supported. For example:

Use now() (lowercase) instead of NOW()
Use toYear(), toMonth(), etc. instead of EXTRACT(YEAR FROM ...)
Use quantile() for percentile calculations
Use INTERVAL 1 HOUR (no quotes, uppercase unit) instead of INTERVAL '1 hour'

See the ClickHouse SQL documentation for full function reference.

You can use them to:

Filter or aggregate source tables
Join multiple tables
Create derived features from existing columns
Reshape data structures such as JSON

Materialized vs On-Demand Views

Shaped supports two types of SQL views:

Materialized views (default): The view query is executed and the results are stored as a physical table. Materialized views are updated automatically when source tables change.

Pros: Fast query performance, pre-computed results
Cons: Takes storage space, slight delay in updates
Use when: You need fast queries, the view is used frequently, or you're joining large tables

On-demand views: The view query is executed each time the view is accessed. Results are computed on-the-fly.

Pros: Always up-to-date, no storage overhead
Cons: Slower query performance, recomputes on every access
Use when: Data changes frequently, you need real-time results, or storage is a concern

To create an on-demand view, set sql_view_type: "VIEW":

SELECT *
FROM items
WHERE updated_at > now() - INTERVAL 1 HOUR

POST /v2/views
{
  "name": "realtime_items",
  "view_type": "SQL",
  "sql_view_type": "VIEW",
  "sql_query": "SELECT * FROM items WHERE updated_at > now() - INTERVAL 1 HOUR"
}

By default, views use sql_view_type: "MATERIALIZED_VIEW" for optimal performance.

Example: Join two tables

Python
HTTP

import requests
import json

url = "https://api.shaped.ai/v2/views"

payload = json.dumps({
  "name": "hm_items_with_descriptions",
  "view_type": "SQL",
  "sql_query": "select * from hm_items items left join hm_items_with_descriptions ai_items on ai_items.item_id = items.item_id"
})
headers = {
  'Content-Type': 'application/json',
  'x-api-key': 'LmBjMxgOtV1js4Zf3jiqw1VhnWfa7I1u9jBLSJeg'
}

response = requests.request("POST", url, headers=headers, data=payload)

POST /v2/views HTTP/1.1
Host: api.shaped.ai
Content-Type: application/json
x-api-key: LmBjMxgOtV1js4Zf3jiqw1VhnWfa7I1u9jBLSJeg
Content-Length: 236

{
    "name": "hm_items_all_columns",
    "view_type": "SQL",
    "sql_query" : "select * from hm_articles items left join hm_items_with_descriptions ai_items on ai_items.item_id = items.item_id"
}

AI view

An AI view is created by running an LLM prompt against the records in a table. They take columns from a source table, apply an LLM with a prompt, and output new columns with enriched information.

Use them specifically to enrich data that will power semantic search.

Learn more in our AI enrichment guide - Enrich your tables with AI

Tables​

Connectors​

Built-in Connectors​

Custom connectors​

Connector Types​

Table schemas​

Append-only tables​

Views​

SQL view​

Materialized vs On-Demand Views​

Example: Join two tables​

AI view​