Building a database synced to vector search using Cloudflare Workers and Vectorize
1 Nov 2025
In this tutorial I'll show you how to build a vector search engine synced from a traditional SQL database storing music albums. This means you can search your database with natural language prompts, and results will be found even if your query contains spelling errors or uses different synonyms from your base data. It's essentially how things like Google Search work.
Meet vectorisation
A vector database is a means of storing vector representations of data, be it strings (usually) or blob data such as images and audio. A vector is a list of numbers, known as dimensions, that represents an abstraction of the "meaning" of that data.
For example, if we vectorise the phrase "I like Pink Floyd", we might get back something like
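A vector is just a long list of floats. The numbers below are purely illustrative (and heavily truncated; a real embedding model returns hundreds of dimensions):

```
[0.0132, -0.0208, 0.0977, -0.0415, 0.0611, ..., 0.0066]
```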
Vectors for specific pieces of data are known as embeddings, i.e. the above is an embedding for my phrase about Pink Floyd. We derive these from a special AI model that is trained on lots of language and exists solely to turn data into vectors.
All of this is so that we can feed the DB a vectorised search query, and it will then return us the result(s) that it deems most "similar" to that query, each with a confidence score.
Determining similarity
Skip this bit if you don't care about the internals of vector databases and the maths governing them.
How vector DBs determine similarity is complex and mostly beyond the scope of this article. If you're really interested, check out the maths, but in short most vector DBs use something called cosine similarity, which involves comparing the 'angles' between different vectors as a means to determine similarity. (There are other means, but cosine is the most common.)
I put 'angles' in inverted commas because the term here is meant in an abstract sense. Embeddings are stored in a high-dimensional space, with each embedding representing a dot in that space.
Imagine two people are both pointing north. We might conclude that they are similar as they're pointing in the same direction. In a vector DB there are hundreds or thousands of these compass points rather than just four, one per dimension. In fact, we get to choose the number of dimensions when we set up our DB, which we'll come to shortly.
Set up a Worker
Let's spin up our Worker. Open a terminal and run:
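That's the create-cloudflare scaffolding command; `vec-demo` is the project directory name used throughout this tutorial:

```shell
npm create cloudflare@latest vec-demo
```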
This will guide you through a series of questions. Answer as follows:
- Start with = Hello World example
- Template = Worker only
- Language = JavaScript
- Version control = no
- Deploy = no
Now enter the newly-created directory.
Let's install Itty Router just to make routing simple.
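Both steps together look like this:

```shell
cd vec-demo
npm install itty-router
```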
Finally, open up the wrangler.jsonc file and replace its contents with the following:
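The exact contents will vary with your setup, but with the Workers AI binding added it would look something like this (the compatibility date is illustrative):

```jsonc
{
	"$schema": "node_modules/wrangler/config-schema.json",
	"name": "vec-demo",
	"main": "src/index.js",
	"compatibility_date": "2025-11-01",
	"ai": {
		"binding": "AI"
	}
}
```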
The ai property is a binding, a means of giving our Worker access to different Cloudflare services - in this case, the Workers AI tools. We'll revisit this file shortly when we've set up our SQL and vector DBs.
Finally, let's set up a scaffold. Open up src/index.js and replace its contents with the following:
//prep
import { AutoRouter } from 'itty-router'
const router = AutoRouter()
const embeddingModel = '@cf/google/embeddinggemma-300m'
//routes...
//TODO
//export router
export default router
Set up a database
Now let's set up our source-of-truth SQL database, albums_db, which will store albums of different artists and genres. We'll use Cloudflare's D1 product, which is based on SQLite.
Back in the terminal, run:
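Using Wrangler's D1 commands, that's:

```shell
npx wrangler d1 create albums_db
```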
Once the DB has been created, Wrangler will give you a few prompts. Choose:
- Have Wrangler add DB to config file = Yes
- Binding name = "albums_db" (default)
- During local dev, connect to remote resource = Yes
With the DB set up, let's build and populate it. To keep things simple, we'll be using just one table, albums, which will contain all data associated with an album, including foreign data that we would normally store in separate tables. This makes our DB highly non-normalised, but it's fine for this tutorial.
Download this SQL file I've prepared, save it to the vec-demo project directory, then run it via the following:
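Assuming you saved the file as `albums.sql` (adjust the filename to match whatever you saved it as), you can execute it against the remote DB with:

```shell
npx wrangler d1 execute albums_db --remote --file=./albums.sql
```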
That will set up and stock an albums table with 50 albums.
Set up a vector database
Now we'll set up our Vectorize database, albums_vec, which will store our embeddings related to entries in our SQL database and which will be searched for results similar to the user's query.
Back in the terminal, run:
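Vectorize indexes are created with a dimension count and a distance metric specified up front; to match what we use in this tutorial (768 dimensions, cosine similarity), that's:

```shell
npx wrangler vectorize create albums_vec --dimensions=768 --metric=cosine
```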
You'll get the same prompts you got when setting up the SQL DB. Choose:
- Have Wrangler add index to config file = Yes
- Binding name = "albums_vec"
- During local dev, connect to remote resource = Yes
That'll create an index tied to cosine similarity and which expects vectors with 768 dimensions. I talked briefly about this earlier, but Cloudflare has a good page of info on both these factors. If they're confusing, don't worry too much; cosine + 768 is a very common combo. You don't normally need a higher number of dimensions unless your embeddings are very, very verbose.
There are two important things to bear in mind, though. Firstly, neither of these factors can be changed after index creation.
Secondly, vector indexes (DBs) and the models that generate vectors must use identical dimension sizes. In other words, if we later obtain vectors from a model that generates vectors using 1024 dimensions, we won't be able to insert them into our index.
Set up meta indexes
Now that we have a vector index (database), we'll want to add some meta indexes.
A meta index sits alongside our vector index and stores a single piece of data associated with a given vector. Meta indexes can then be used as filters when querying the vector database. Cloudflare provides a number of filter operations for this, beyond simple equality.
Meta indexes aren't required merely to store metadata against vectors; they're needed only if you want to filter by that metadata at the point of querying.
For example, if you stored albums, artists, genres, labels etc. in the same vector DB, you may want to query items of only one of those kinds.
In our case, because the query will come from a user search, it's hard to know in advance what to filter on, so meta indexes aren't actually all that useful to us. (Plus, we're using only albums in this tutorial.) Basically, the more control you have over the vector query process, the more you'll want to stock metadata and filter on it.
But just to show how it's done, let's set up a meta index to store the item type (always "albums" in our case):
Run the following:
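Wrangler's metadata-index command takes the index name plus the metadata property and its type:

```shell
npx wrangler vectorize create-metadata-index albums_vec --property-name=type --type=string
```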
Now that we're done setting up the architecture, let's run our Worker! Run:
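```shell
npx wrangler dev
```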
Your Worker will be available on http://localhost:8787, but right now there are no routes to visit. Let's change that.
Vectorising our data
Now we need to add vectors to our vector DB that correspond to the albums in our SQL DB.
You'd also want to do this for each future album added, perhaps via an "add album" endpoint that receives a form submission with the album details.
Designing our embedding
First, we need to design our embedding - the text representation of a given album that we'll be converting to a vector and storing in our vector DB.
There are various schools of thought on how best to structure an embedding; it's generally best to make heavy use of meta indexes and keep embedding text minimal. That's not an option for us, because we can't know what the user will be searching for - or if this were embedded in a chat interface, we couldn't know what the user will ask.
So with meta indexes out, that means all we have is the embedding text itself. So we'll need to make sure it contains all key data - and in a simple, syntax-light structured format (vector DBs don't really understand things like JSON). Let's go with:
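A pipe-delimited, key-value format, which for a given album would produce something like this (values illustrative):

```
type: albums | id: 17 | name: Wish You Were Here | artist: Pink Floyd | year: 1975 | genre: progressive rock
```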
Adding the vectors
Open up index.js and, in the //routes... part, add a new route, as follows:
//create (or update) vectors of albums in SQL DB
router.get('/add-vectors', async (req, env) => {
//embedding template
const embeddingTmplt = [
'type: albums',
'id: {id}',
'name: {name}',
'artist: {artist}',
'year: {year}',
'genre: {genre}'
].join(' | ')
//get albums in our SQL DB - `all()` returns all results
const albums = await env.albums_db.prepare(`SELECT * FROM albums;`).all()
//for each, generate the embedding text then upsert the vector
//the vector ID will be `album_<rowId>`. This may take ~30 secs
for (const row of albums.results) {
const text = Object.entries(row).reduce((acc, [col, val]) => acc.replace(`{${col}}`, val), embeddingTmplt)
const vector = await env.AI.run(embeddingModel, {text})
await env.albums_vec.upsert([{
id: 'album_'+row.id,
values: vector.data[0],
metadata: {id: row.id, type: 'albums', text}
}])
}
return 'Done!'
})
The comments should make it pretty clear what's going on there. We're defining an Itty route, whose handler is passed the request (req) and environment (env) objects (all of our bindings are available on env). The handler then fetches the albums in our D1 DB and, for each one, generates embedding text and a vector based on it, then upserts that vector into our vector DB.
Upsert logic is standard, i.e. based on the ID; all vectors must have a unique ID, and if you upsert and an item exists with the same ID, it'll update it.
A few things to note:
- The embedding model returns a nested array, with the outer array corresponding to the texts we passed it (we're passing just one) and the inner array the dimensions for that text's vector
- We're storing the item ID and type as meta. Only the latter corresponds to a meta index (the type index we created earlier)
- We're storing the original embedding text itself on the meta. The reason for this will become clear later, when we pass results to a re-ranker LLM
Head to:
http://localhost:8787/add-vectors
...it may take up to a minute to finish, but you should see "Done!" if all goes to plan.
Querying the vector database
Now that we've got vectors, let's query them! Let's set up a new route in our Worker that receives an incoming plain-text query via the query string parameter q, converts it to a vector and then uses that vector as the basis for a query on our vector DB.
//query the vector DB
router.get('/query', async (req, env) => {
//grab the query and convert it to a vector
const query = decodeURIComponent(req.query.q)
const vector = await env.AI.run(embeddingModel, {text: query})
//query the vector DB
const matches = await env.albums_vec.query(vector.data[0], {
topK: 5, //<-- max results
returnMetadata: 'all',
returnValues: false
})
//no matches?
if (!matches.count) return 'Sorry, nothing found!'
//filter out low-score matches (<.4)
const filtered = matches.matches.filter(match =>
match.score >= .4
)
if (!filtered.length) return 'Sorry, nothing found!'
//otherwise, resolve the matches against items in the SQL DB
const ids = filtered.map(item => item.metadata.id)
const albums = await env.albums_db.prepare(`
SELECT *
FROM albums
WHERE id IN (${ids.join()})
`).all()
//merge the vector scores into the SQL results
albums.results.forEach(row => {
const match = filtered.find(match => match.metadata.id == row.id)
row._vec_score = match.score
})
return albums.results
})
Let's run it! Head to something like:
http://localhost:8787/query?q=psychedelic+rock
...and hopefully you'll see some album suggestions. A few important notes here.
- Like when we inserted vectors, the query itself must be vectorised. This is because a vector DB compares vectors to vectors; it doesn't deal in plain text at all
- I've picked a minimum score threshold of .4; this is arbitrary, but I find it's a good starting point, particularly with small datasets.
- Bad results are to be expected, as I touched on earlier. There may be only a tiny difference in score between a good match and a bad one. Don't worry; we'll deal with bad matches in the next step.
- We're resolving the vector results against their counterparts in the source-of-truth SQL DB. For this, we need the metadata stored against the vectors (hence returnMetadata: 'all')
- We could skip the SQL look-up and just go to output straight from the vector results, but I like to resolve against the source-of-truth SQL DB, not least because, in a real-world scenario, you probably won't have stored everything you want to use in output on the vector metadata. In fact, you shouldn't store any more than necessary for successful querying, otherwise that's more data you have to keep in sync between the two DBs
- It's not currently possible to specify a minimum score at query time, hence we have to filter afterwards
Using a re-ranker LLM (optional)
One common trick with vectorisation is to run the results through an ordinary LLM to strip out or re-score the results. This is particularly useful in smaller datasets where the vector DB hasn't really got too much to go on for effective comparison.
For this we'll need something called structured output. Structured output (loosely connected to AI functions, which I covered previously) is basically the key to much of what's impressive in the world of AI. It involves passing a JSON schema to an LLM so that it responds according to that schema, rather than just conversationally.
We can use any major AI platform, but I like Mistral, so go to Mistral and get yourself a free API key. Then, stop your worker and install LangChain, an AI toolbelt that makes structured output easy.
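The exact packages depend on your setup, but for Mistral via LangChain you'd install something like:

```shell
npm install @langchain/mistralai @langchain/core
```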
Restart your Worker, then, at the top of your index.js file, import the LangChain Mistral package we just installed:
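Assuming the @langchain/mistralai package from the install step, the import looks like this:

```javascript
import { ChatMistralAI } from '@langchain/mistralai'
```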
Now let's extend the /query route. Replace the current return line with the following:
//set prompt and schema (instructing LLM to remove bad matches)
const prompt = `
Here are some results from a vector DB based on the query in the user message.
Return an array of IDs (the \`id\` property) of results which do not match, or poorly match, the user query.
Results follow:
----
${JSON.stringify(matches.matches.map(({metadata}) =>
({id: metadata.id, result: metadata.text})
))}
`
const schema = {
'$schema': 'http://json-schema.org/draft-07/schema',
type: 'object',
properties: {
ids: {
type: 'array',
minItems: 0,
items: {
description: `The ID of the result to remove because it's a bad match`,
type: 'number'
}
}
}
}
//prep request
const instance = new ChatMistralAI({
model: 'mistral-small-latest',
temperature: 0,
apiKey: 'YOUR_API_KEY',
maxRetries: 5
})
const runnable = instance.withStructuredOutput(schema)
//send request
const badIds = await runnable.invoke(prompt)
//return data, removing any bad IDs first
return albums.results.filter(row => !badIds.ids?.includes(row.id))
What's happening there is we're showing the LLM the ID and original embedding text for each match (the whole thing JSON-encoded), and asking it to give us back an array of IDs of bad matches. We're enforcing the response structure (the array) by dictating a JSON schema.
Finally, we return the matches, first filtering out any the re-ranker LLM said we should.
Next steps
And that's it. How can this be improved?
- Authentication, obviously.
- An endpoint to add more albums, and edit existing ones, each time making sure to upsert the new/edited vectors
- Possibly use namespaces to hard-separate vectors of different types in the same index
Did I help you? Feel free to be amazing and buy me a coffee on Ko-fi!
