Semantic search

This guide shows how to implement semantic search using the Shaped engine and query APIs.

First, instantiate the Shaped client:

from shaped import Client
shaped = Client(api_key="YOUR_KEY_HERE")

Upload data

To run semantic search, you first need a data table to search over. In this example, you'll declare a table and upload a few rows of data manually.

Alternatively, you can sync data from an external data source using a connector, or simply import a JSONL or CSV file.
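If your rows already live in a JSONL file, one way to get them into the list-of-dictionaries shape used in this guide is to read the file with the standard library. This is only a local sketch: `load_jsonl` is a hypothetical helper, not part of the Shaped SDK, and the file path in the usage note is a placeholder.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines.

    Hypothetical local helper; it does not call any Shaped API.
    """
    rows = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate empty lines between records
            try:
                rows.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(f"Invalid JSON on line {line_number}: {exc}") from exc
    return rows
```

For example, `records = load_jsonl("pixar_movies.jsonl")` would produce a list you could pass to the table insert method in place of a hand-written one.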

The first step is to declare your data table. We'll use a CUSTOM schema so that we can define the table's input columns ourselves and write records manually.

table_config = {
    "schema_type": "CUSTOM",
    "name": "pixar_movies",
    "column_schema": {
        "item_id": "Int64",
        "movie_title": "String",
        "poster_url": "String",
        "description": "String",
        "release_date": "String",
        "cast": "Array(String)",
    },
}

shaped.create_table(table_config)

Once the table is created, you can upload your data directly. Declare the rows as a list of dictionaries, one per record, and then use the table insert method to write them to the table.

records = [
{"item_id": 187541, "movie_title": "Incredibles 2 (2018)", "poster_url": "https://m.media-amazon.com/images/M/MV5BMTEzNzY0OTg0NTdeQTJeQWpwZ15BbWU4MDU3OTg3MjUz._V1_QL75_UX380_CR0,0,380,562_.jpg", "description": "The Incredibles family takes on a new mission which involves a change in family roles: Bob Parr (Mr. Incredible) must manage the house while his wife Helen (Elastigirl) goes out to save the world.", "release_date": "2018-06-15", "cast": ["Craig T. Nelson", "Holly Hunter", "Sarah Vowell", "Huck Milner", "Catherine Keener", "Eli Fucile", "Bob Odenkirk", "Samuel L. Jackson", "Michael Bird", "Sophia Bush", "Brad Bird", "Brad Bird", "Nicole Paradis Grindle", "John Walker", "Michael Giacchino", "Stephen Schaffer", "Natalie Lyon", "Kevin Reher", "Ralph Eggleston"]},
{"item_id": 177765, "movie_title": "Coco (2017)", "poster_url": "https://m.media-amazon.com/images/M/MV5BMDIyM2E2NTAtMzlhNy00ZGUxLWI1NjgtZDY5MzhiMDc5NGU3XkEyXkFqcGc@._V1_QL75_UY562_CR7,0,380,562_.jpg", "description": "Aspiring musician Miguel, confronted with his family's ancestral ban on music, enters the Land of the Dead to find his great-great-grandfather, a legendary singer.", "release_date": "2017-11-22", "cast": ["Anthony Gonzalez", "Gael García Bernal", "Benjamin Bratt", "Alanna Ubach", "Renee Victor", "Jaime Camil", "Alfonso Arau", "Herbert Siguenza", "Gabriel Iglesias", "Lombardo Boyar", "Lee Unkrich", "Lee Unkrich", "Jason Katz", "Matthew Aldrich", "Adrian Molina", "Darla K. Anderson", "Michael Giacchino", "Steve Bloom", "Lee Unkrich", "Carla Hool", "Natalie Lyon", "Kevin Reher", "Harley Jessup"]},
{"item_id": 170957, "movie_title": "Cars 3 (2017)", "poster_url": "https://m.media-amazon.com/images/M/MV5BMTc0NzU2OTYyN15BMl5BanBnXkFtZTgwMTkwOTg2MTI@._V1_QL75_UX380_CR0,0,380,562_.jpg", "description": "Lightning McQueen sets out to prove to a new generation of racers that he's still the best race car in the world.", "release_date": "2017-06-16", "cast": ["Owen Wilson", "Cristela Alonzo", "Chris Cooper", "Nathan Fillion", "Larry the Cable Guy", "Armie Hammer", "Ray Magliozzi", "Tony Shalhoub", "Bonnie Hunt", "Lea DeLaria", "Brian Fee", "Brian Fee", "Ben Queen", "Eyal Podell", "Jonathon E. Stewart", "Kevin Reher", "Randy Newman", "Jason Hudak", "Natalie Lyon", "Kevin Reher", "William Cone", "Jay Shuster"]},
{"item_id": 157296, "movie_title": "Finding Dory (2016)", "poster_url": "https://m.media-amazon.com/images/M/MV5BY2VlYWJjMGMtYjcwZC00MDE2LThmMDItYjVlMzNhYzBhYTk5XkEyXkFqcGc@._V1_QL75_UY562_CR7,0,380,562_.jpg", "description": "Friendly but forgetful blue tang Dory begins a search for her long-lost parents and everyone learns a few things about the real meaning of family along the way.", "release_date": "2016-06-17", "cast": ["Ellen DeGeneres", "Albert Brooks", "Ed O'Neill", "Kaitlin Olson", "Hayden Rolence", "Ty Burrell", "Diane Keaton", "Eugene Levy", "Sloane Murray", "Idris Elba", "Andrew Stanton", "Andrew Stanton", "Victoria Strouse", "Lindsey Collins", "Thomas Newman", "Axel Geddes", "Natalie Lyon", "Kevin Reher", "Steve Pilcher"]},
{"item_id": 136016, "movie_title": "The Good Dinosaur (2015)", "poster_url": "https://m.media-amazon.com/images/M/MV5BMTc5MTg2NjQ4MV5BMl5BanBnXkFtZTgwNzcxOTY5NjE@._V1_QL75_UX380_CR0,0,380,562_.jpg", "description": "In a world where dinosaurs and humans live side-by-side, an Apatosaurus named Arlo makes an unlikely human friend.", "release_date": "2015-11-25", "cast": ["Jeffrey Wright", "Frances McDormand", "Maleah Nipay-Padilla", "Ryan Teeple", "Jack McGraw", "Marcus Scribner", "Raymond Ochoa", "Jack Bright", "Peter Sohn", "Steve Zahn", "Peter Sohn", "Bob Peterson", "Peter Sohn", "Erik Benson", "Meg LeFauve", "Denise Ream", "Jeff Danna", "Mychael Danna", "Stephen Schaffer", "Natalie Lyon", "Kevin Reher", "Harley Jessup", "Daniel Lopez Muñoz"]},
{"item_id": 134853, "movie_title": "Inside Out (2015)", "poster_url": "https://m.media-amazon.com/images/M/MV5BOTgxMDQwMDk0OF5BMl5BanBnXkFtZTgwNjU5OTg2NDE@._V1_QL75_UX380_CR0,0,380,562_.jpg", "description": "After young Riley is uprooted from her Midwest life and moved to San Francisco, her emotions, Joy, Fear, Anger, Disgust, and Sadness, conflict on how best to navigate a new city, house, and school.", "release_date": "2015-06-19", "cast": ["Amy Poehler", "Bill Hader", "Lewis Black", "Mindy Kaling", "Phyllis Smith", "Richard Kind", "Kaitlyn Dias", "Diane Lane", "Kyle MacLachlan", "Paula Poundstone", "Pete Docter", "Pete Docter", "Ronnie Del Carmen", "Meg LeFauve", "Josh Cooley", "Jonas Rivera", "Michael Giacchino", "Kevin Nolting", "Natalie Lyon", "Kevin Reher", "Ralph Eggleston"]},
{"item_id": 103141, "movie_title": "Monsters University (2013)", "poster_url": "https://m.media-amazon.com/images/M/MV5BMTUyODgwMDU3M15BMl5BanBnXkFtZTcwOTM4MjcxOQ@@._V1_QL75_UX380_CR0,0,380,562_.jpg", "description": "A look at the relationship between Mike Wazowski and James P. \"Sully\" Sullivan during their days at Monsters University, when they weren't necessarily the best of friends.", "release_date": "2013-06-21", "cast": ["Billy Crystal", "John Goodman", "Steve Buscemi", "Helen Mirren", "Peter Sohn", "Joel Murray", "Sean Hayes", "Dave Foley", "Charlie Day", "Alfred Molina", "Dan Scanlon", "Dan Scanlon", "Daniel Gerson", "Robert L. Baird", "Kori Rae", "Randy Newman", "Jean-Claude Kalache", "Greg Snyder", "Natalie Lyon", "Kevin Reher", "Ricky Nierva"]},
{"item_id": 95167, "movie_title": "Brave (2012)", "poster_url": "https://m.media-amazon.com/images/M/MV5BMzgwODk3ODA1NF5BMl5BanBnXkFtZTcwNjU3NjQ0Nw@@._V1_QL75_UX380_CR0,0,380,562_.jpg", "description": "Determined to make her own path in life, Princess Merida defies a custom that brings chaos to her kingdom. Granted one wish, Merida must rely on her bravery and her archery skills to undo a beastly curse.", "release_date": "2012-06-22", "cast": ["Kelly Macdonald", "Billy Connolly", "Emma Thompson", "Julie Walters", "Robbie Coltrane", "Kevin McKidd", "Craig Ferguson", "Sally Kinghorn", "Eilidh Fraser", "Peigi Barker", "Mark Andrews", "Brenda Chapman", "Brenda Chapman", "Mark Andrews", "Steve Purcell", "Irene Mecchi", "Katherine Sarafian", "Patrick Doyle", "Nicholas C. Smith", "Natalie Lyon", "Kevin Reher", "Steve Pilcher"]},
{"item_id": 87876, "movie_title": "Cars 2 (2011)", "poster_url": "https://m.media-amazon.com/images/M/MV5BMTUzNTc3MTU3M15BMl5BanBnXkFtZTcwMzIxNTc3NA@@._V1_QL75_UX380_CR0,0,380,562_.jpg", "description": "Star race car Lightning McQueen and his pal Mater head overseas to compete in the World Grand Prix race. But the road to the championship becomes rocky as Mater gets caught up in an intriguing adventure of his own: international espionage.", "release_date": "2011-06-24", "cast": ["Owen Wilson", "Larry the Cable Guy", "Michael Caine", "Emily Mortimer", "Eddie Izzard", "John Turturro", "Brent Musburger", "Joe Mantegna", "Thomas Kretschmann", "Peter Jacobson", "John Lasseter", "John Lasseter", "Bradford Lewis", "Dan Fogelman", "Ben Queen", "Denise Ream", "Michael Giacchino", "Stephen Schaffer", "Natalie Lyon", "Kevin Reher", "Harley Jessup"]},
{"item_id": 68954, "movie_title": "Up (2009)", "poster_url": "https://m.media-amazon.com/images/M/MV5BNmI1ZTc5MWMtMDYyOS00ZDc2LTkzOTAtNjQ4NWIxNjYyNDgzXkEyXkFqcGc@._V1_QL75_UX380_CR0,1,380,562_.jpg", "description": "78-year-old Carl Fredricksen travels to South America in his house equipped with balloons, inadvertently taking a young stowaway.", "release_date": "2009-05-29", "cast": ["Edward Asner", "Jordan Nagai", "John Ratzenberger", "Christopher Plummer", "Bob Peterson", "Delroy Lindo", "Jerome Ranft", "David Kaye", "Elie Docter", "Jeremy Leary", "Pete Docter", "Pete Docter", "Bob Peterson", "Tom McCarthy", "Jonas Rivera", "Michael Giacchino", "Kevin Nolting", "Natalie Lyon", "Kevin Reher", "Ricky Nierva"]},
{"item_id": 50872, "movie_title": "Ratatouille (2007)", "poster_url": "https://m.media-amazon.com/images/M/MV5BMTMzODU0NTkxMF5BMl5BanBnXkFtZTcwMjQ4MzMzMw@@._V1_QL75_UX380_CR0,0,380,562_.jpg", "description": "A rat who can cook makes an unusual alliance with a young kitchen worker at a famous Paris restaurant.", "release_date": "2007-06-29", "cast": ["Brad Garrett", "Lou Romano", "Patton Oswalt", "Ian Holm", "Brian Dennehy", "Peter Sohn", "Peter O'Toole", "Janeane Garofalo", "Will Arnett", "Julius Callahan", "Brad Bird", "Brad Bird", "Jan Pinkava", "Jim Capobianco", "Emily Cook", "Bradford Lewis", "Michael Giacchino", "Darren T. Holmes", "Natalie Lyon", "Kevin Reher", "Harley Jessup"]},
{"item_id": 45517, "movie_title": "Cars (2006)", "poster_url": "https://m.media-amazon.com/images/M/MV5BMTg5NzY0MzA2MV5BMl5BanBnXkFtZTYwNDc3NTc2._V1_QL75_UX380_CR0,0,380,562_.jpg", "description": "On the way to the biggest race of his life, a hotshot rookie race car gets stranded in a rundown town and learns that winning isn't everything in life.", "release_date": "2006-06-09", "cast": ["Owen Wilson", "Bonnie Hunt", "Paul Newman", "Larry the Cable Guy", "Cheech Marin", "Tony Shalhoub", "Guido Quaroni", "Jenifer Lewis", "Paul Dooley", "Michael Wallis", "John Lasseter", "John Lasseter", "Joe Ranft", "Jorgen Klubien", "Dan Fogelman", "Darla K. Anderson", "Randy Newman", "Jean-Claude Kalache", "Ken Schretzmann", "Kevin Reher", "William Cone", "Bob Pauley"]},
{"item_id": 8961, "movie_title": "The Incredibles (2004)", "poster_url": "https://m.media-amazon.com/images/M/MV5BMTY5OTU0OTc2NV5BMl5BanBnXkFtZTcwMzU4MDcyMQ@@._V1_QL75_UX380_CR0,0,380,562_.jpg", "description": "While trying to lead a quiet suburban life, a family of undercover superheroes are forced into action to save the world.", "release_date": "2004-11-05", "cast": ["Craig T. Nelson", "Samuel L. Jackson", "Holly Hunter", "Jason Lee", "Dominique Louis", "Theodore Newton", "Jean Sincere", "Eli Fucile", "Maeve Andrews", "Wallace Shawn", "Brad Bird", "Brad Bird", "John Walker", "Michael Giacchino", "Andrew Jimenez", "Patrick Lin", "Janet Lucroy", "Stephen Schaffer", "Matthew Jon Beck", "Mary Hidalgo", "Kevin Reher", "Jen Rudin", "Lou Romano"]},
{"item_id": 6377, "movie_title": "Finding Nemo (2003)", "poster_url": "https://m.media-amazon.com/images/M/MV5BMTc5NjExNTA5OV5BMl5BanBnXkFtZTYwMTQ0ODY2._V1_QL75_UX380_CR0,0,380,562_.jpg", "description": "After his son is captured in the Great Barrier Reef and taken to Sydney, a timid clownfish sets out on a journey to bring him home.", "release_date": "2003-05-30", "cast": ["Albert Brooks", "Ellen DeGeneres", "Alexander Gould", "Willem Dafoe", "Brad Garrett", "Allison Janney", "Austin Pendleton", "Stephen Root", "Vicki Lewis", "Joe Ranft", "Andrew Stanton", "Andrew Stanton", "Bob Peterson", "David Reynolds", "Graham Walters", "Thomas Newman", "Sharon Calahan", "Jeremy Lasky", "David Ian Salter", "Matthew Jon Beck", "Carl Davies", "Mary Hidalgo", "Kevin Reher", "Ralph Eggleston"]},
{"item_id": 4886, "movie_title": "Monsters, Inc. (2001)", "poster_url": "https://m.media-amazon.com/images/M/MV5BMTY1NTI0ODUyOF5BMl5BanBnXkFtZTgwNTEyNjQ0MDE@._V1_QL75_UX380_CR0,0,380,562_.jpg", "description": "In order to power the city, monsters have to scare children so that they scream. However, the children are toxic to the monsters, and after a child gets through, two monsters realize things may not be what they think.", "release_date": "2001-11-02", "cast": ["Billy Crystal", "John Goodman", "Mary Gibbs", "Steve Buscemi", "James Coburn", "Jennifer Tilly", "Bob Peterson", "John Ratzenberger", "Frank Oz", "Daniel Gerson", "Pete Docter", "Pete Docter", "Jill Culton", "Jeff Pidgeon", "Ralph Eggleston", "Darla K. Anderson", "Randy Newman", "Jim Stewart", "Matthew Jon Beck", "Mary Hidalgo", "Ruth Lambert", "Harley Jessup", "Bob Pauley"]},
{"item_id": 3114, "movie_title": "Toy Story 2 (1999)", "poster_url": "https://m.media-amazon.com/images/M/MV5BNzVmODlhMDEtY2YxZi00OTVjLTlkNTktN2Q2OTRlM2I4M2FhXkEyXkFqcGc@._V1_QL75_UY562_CR1,0,380,562_.jpg", "description": "When Woody is stolen by a toy collector, Buzz and his friends set out on a rescue mission to save Woody before he becomes a museum toy property with his roundup gang Jessie, Prospector, and Bullseye.", "release_date": "1999-11-24", "cast": ["Tom Hanks", "Tim Allen", "Joan Cusack", "Kelsey Grammer", "Don Rickles", "Jim Varney", "Wallace Shawn", "John Ratzenberger", "Annie Potts", "Wayne Knight", "John Lasseter", "John Lasseter", "Pete Docter", "Ash Brannon", "Andrew Stanton", "Karen Robert Jackson", "Helene Plotkin", "Randy Newman", "Sharon Calahan", "Edie Ichioka", "David Ian Salter", "Lee Unkrich", "Mary Hidalgo", "Ruth Lambert", "William Cone", "Jim Pearson"]},
{"item_id": 1, "movie_title": "Toy Story (1995)", "poster_url": "https://m.media-amazon.com/images/M/MV5BZTA3OWVjOWItNjE1NS00NzZiLWE1MjgtZDZhMWI1ZTlkNzYwXkEyXkFqcGc@._V1_QL75_UX380_CR0,2,380,562_.jpg", "description": "A cowboy doll is profoundly jealous when a new spaceman action figure supplants him as the top toy in a boy's bedroom. When circumstances separate them from their owner, the duo have to put aside their differences to return to him.", "release_date": "1995-11-22", "cast": ["Tom Hanks", "Tim Allen", "Don Rickles", "Jim Varney", "Wallace Shawn", "John Ratzenberger", "Annie Potts", "John Morris", "Erik von Detten", "Laurie Metcalf", "John Lasseter", "John Lasseter", "Pete Docter", "Andrew Stanton", "Joe Ranft", "Bonnie Arnold", "Ralph Guggenheim", "Randy Newman", "Robert Gordon", "Lee Unkrich"]},
]

shaped.insert_table_rows("pixar_movies", records)
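Before inserting, it can help to check each row against the column schema locally so that type mistakes surface early. The sketch below is an assumption-laden illustration: `check_row` is a hypothetical helper, and the mapping from the schema's type names (`Int64`, `String`, `Array(String)`) to Python types is a guess at their meaning, not something the Shaped API is documented here to enforce. The sample rows are made up for the example.

```python
def check_row(row, column_schema):
    """Return a list of problems found in one row; an empty list means none."""
    problems = []
    for column, type_name in column_schema.items():
        if column not in row:
            problems.append(f"missing column: {column}")
            continue
        value = row[column]
        # Assumed meaning of each type name; adjust if your schema differs.
        if type_name == "Int64":
            ok = isinstance(value, int) and not isinstance(value, bool)
        elif type_name == "String":
            ok = isinstance(value, str)
        elif type_name == "Array(String)":
            ok = isinstance(value, list) and all(isinstance(v, str) for v in value)
        else:
            ok = True  # type names this sketch does not know are not checked
        if not ok:
            problems.append(f"{column}: expected {type_name}, got {type(value).__name__}")
    for column in row:
        if column not in column_schema:
            problems.append(f"unexpected column: {column}")
    return problems


column_schema = {
    "item_id": "Int64",
    "movie_title": "String",
    "poster_url": "String",
    "description": "String",
    "release_date": "String",
    "cast": "Array(String)",
}

# Illustrative rows only; the poster URL is a placeholder.
good_row = {
    "item_id": 1,
    "movie_title": "Toy Story (1995)",
    "poster_url": "https://example.com/toy-story.jpg",
    "description": "A cowboy doll is jealous of a new spaceman action figure.",
    "release_date": "1995-11-22",
    "cast": ["Tom Hanks", "Tim Allen"],
}
bad_row = {"item_id": "1", "movie_title": "Toy Story (1995)"}

print(check_row(good_row, column_schema))
print(check_row(bad_row, column_schema))
```

Running `check_row` over every record and inserting only when all lists come back empty keeps malformed rows out of the table.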

Set up your engine

Next, configure the semantic search engine. The engine encodes the text columns in your dataset into vectors that are searched later.

Start by instantiating the engine configuration class:

semantic_search_engine = EngineConfigV2(
    name="semantic_search",
    data=DataConfig(),  # empty data config to populate later
)

Connect engine to data

Engines must connect to an item table, which defines the candidate items in your catalog. Connect the table you just created (pixar_movies) through the engine's data config.

Use the ReferenceTableConfig type to connect to the table with no additional filtering or query logic.

semantic_search_engine.data = DataConfig(
    item_table=DataConfigInteractionTable(
        ReferenceTableConfig(
            name="pixar_movies"
        )
    )
)

Text encoding

To encode your text, you need to choose which text columns to make searchable and which embedding model will generate the embeddings.

Semantic search works best on natural-language text features. Given your movie data, create vectors for the movie titles and descriptions.

fields_to_encode = ["movie_title", "description"]  # column names from the pixar_movies schema
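Because the fields to encode are plain strings, a misspelled or outdated column name is easy to miss. A small local check against the table's column schema, sketched below, can catch that before the engine is created. `missing_fields` is a hypothetical helper written for this guide, not an SDK function.

```python
def missing_fields(fields, column_schema):
    """Return the requested field names that are not columns in the schema."""
    return [field for field in fields if field not in column_schema]


# Column names copied from the pixar_movies table declared earlier.
column_schema = {
    "item_id": "Int64",
    "movie_title": "String",
    "poster_url": "String",
    "description": "String",
    "release_date": "String",
    "cast": "Array(String)",
}

print(missing_fields(["movie_title", "description"], column_schema))  # prints []
print(missing_fields(["title", "description"], column_schema))        # prints ['title']
```

The second call shows the failure case: `title` is not a column in this table, so it is reported.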

Model selection

Embedding models generate vectors from input features. Shaped supports generating embeddings from any sentence transformer model available on Hugging Face.

For this example, use sentence-transformers/all-MiniLM-L6-v2, a small sentence transformer model that generates embeddings quickly.

embedding_model = "sentence-transformers/all-MiniLM-L6-v2"

Now assign the embedding model and the fields to encode to the engine's index configuration:

semantic_search_engine.index = IndexConfig(
    embeddings=[
        EmbeddingConfig(
            name="movie_text_embedding",
            encoder=Encoder(
                HuggingFaceEncoder(
                    model_name=embedding_model,
                    item_fields=fields_to_encode,
                )
            ),
        )
    ]
)

Start encoding process

After configuring your engine's data and index, use the create engine method to start the encoding process:

shaped.create_engine(engine_config=semantic_search_engine)

Make a search query

After the engine is finished encoding, you can search your Pixar movies using a text query.
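This guide does not cover how to check whether encoding has finished, so the sketch below stays generic: it polls a caller-supplied function until it reports readiness or a timeout passes. `wait_until` and the `is_ready` callable are hypothetical; you would need to fill in `is_ready` with whatever status check your SDK version or dashboard provides.

```python
import time


def wait_until(is_ready, timeout_s=600.0, interval_s=5.0):
    """Poll is_ready() until it returns True or timeout_s seconds have passed.

    Returns True if is_ready() succeeded, False if the timeout was reached.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A fixed polling interval keeps the helper simple; for long-running encodes you may prefer a longer interval or a backoff.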

Use the text_search retriever with mode set to vector to run a semantic search. You do not need to supply a vector yourself: the input text is converted into a vector automatically.

query_string = """
SELECT *
FROM retrieve(
    text_search(
        query='$query',
        mode='vector',
        text_embedding_ref='movie_text_embedding',
        limit=50
    )
)
LIMIT 200
"""

results = shaped.execute_query(
    engine_name="semantic_search",
    query=query_string,
    parameters={"query": "Incredibles"},
    return_metadata=True,
)
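To build intuition for what vector mode does, the toy sketch below ranks items by the similarity between a query vector and each item's vector. It is a conceptual illustration only: the three-dimensional vectors are hand-made rather than produced by the embedding model, and cosine similarity is assumed here as the metric. The guide does not state which metric or index the engine uses internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, item_vectors, limit):
    """Return the top `limit` (item, score) pairs, most similar first."""
    scored = [
        (item, cosine_similarity(query_vector, vector))
        for item, vector in item_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Hand-made toy vectors, not real embeddings.
toy_vectors = {
    "The Incredibles (2004)": [0.9, 0.1, 0.0],
    "Finding Nemo (2003)": [0.0, 0.2, 0.9],
    "Cars (2006)": [0.1, 0.9, 0.1],
}
toy_query = [1.0, 0.0, 0.1]  # stands in for the encoded query text

for title, score in rank_by_similarity(toy_query, toy_vectors, limit=2):
    print(f"{title}: {score:.3f}")
```

With these toy values the item whose vector points in nearly the same direction as the query ranks first, which is the behaviour the limit argument then truncates.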