Scale requirements

We're continuously raising our scale limits to support larger traffic loads.

Our published limits are intentionally conservative. In practice, we routinely support 2,000+ queries per second (QPS), up to roughly 1 billion items, and billions of events. If you have higher requirements, please reach out.

How we ingest and process your data at scale

Shaped has a real-time, durable feature store that ingests and transforms your data for training and serving.

Figure: Data Ingestion Architecture

How we train and encode data continuously

Shaped continuously encodes your data with fresh multi-modal embeddings and trains your retrieval and ranking models with best-in-class MLOps.

Figure: Training Pipeline Architecture

How we serve query results fast

Shaped has a serverless, real-time serving system that scales with your requests based on latency, request volume, and pod memory.

Figure: Inference Architecture

Scale limits

Shaped’s cloud API supports the following scale limits for each tenant:

| Dimension | Limit |
| --- | --- |
| Unique users | 50 million |
| Unique items | 1 billion |
| Unique events | 1+ billion |
| Personal filters | 50 million |
| Requests per second | 2,000+ |
| Train frequency | 2 hours |
| Event and filter ingestion | < 30 seconds |
| User and item catalog ingestion | 10 minutes |
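
As a quick way to sanity-check a planned workload against the per-tenant limits in the table above, here is a minimal Python sketch. It is not part of Shaped's API: the `PUBLISHED_LIMITS` dictionary, its key names, and the `check_workload` helper are all hypothetical names introduced for illustration. The sketch also assumes that limits shown with a "+" (events and requests per second) can be treated as plain thresholds, even though actual capacity may be higher.

```python
# Hypothetical helper, not part of any Shaped SDK.
# Values are copied from the published per-tenant limits table.
# Assumption: limits written with a "+" are treated as plain thresholds.
PUBLISHED_LIMITS = {
    "unique_users": 50_000_000,
    "unique_items": 1_000_000_000,
    "unique_events": 1_000_000_000,
    "personal_filters": 50_000_000,
    "requests_per_second": 2_000,
}


def check_workload(workload: dict[str, int]) -> list[str]:
    """Return the dimensions where the workload exceeds a published limit.

    Dimensions missing from `workload` are treated as zero.
    """
    exceeded = []
    for dimension, limit in PUBLISHED_LIMITS.items():
        if workload.get(dimension, 0) > limit:
            exceeded.append(dimension)
    return exceeded


# Example: a planned workload (illustrative numbers only).
planned = {
    "unique_users": 10_000_000,
    "unique_items": 200_000_000,
    "requests_per_second": 3_500,
}

for dimension in check_workload(planned):
    print(f"{dimension} exceeds the published limit; consider reaching out.")
```

Any dimension reported by this check is a prompt to get in touch rather than a hard failure, since the published limits are described as conservative.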

Please get in touch if you have specific performance or scale constraints that you want us to meet: Schedule a call