You don't want a Vector Database
...when a Document Database is all you need. Any Document Database that supports Vector Search is likely to suit you better, especially if your use case is supporting a RAG pipeline.
Vector Databases are all the rage
You’ve likely heard of Vector Databases and may even think you need one for that next-generation generative AI application you’re building. AWS, Azure and GCP have all started to put out their own material to catch the influx of people googling “What is a vector database?”. New standalone Vector Database products have even started to pop up, such as Pinecone, Chroma or Weaviate, along with the inevitable articles and quadrants that try to rank them.
Some of the Vector Databases on the market - Source
All of this interest in Vector Databases is driven by the explosion of Generative AI that ChatGPT sparked (less than 18 months ago!). Specifically, Vector Databases enable the Retrieval Augmented Generation (RAG) paradigm, which is currently the leading approach to prompt engineering and to actually extracting business value from Large Language Models (LLMs). If you’re interested, we cover how to conceptualise RAG here.
RAG and Embeddings
In RAG, you pass relevant context to the LLM as an additional part of the prompt, which raises the question - how do you find relevant context? It turns out LLMs are only one of the models to come out of the Generative AI revolution; there is a slightly lesser-known sibling, the Embeddings Model. An Embeddings Model takes text and embeds it in a high-dimensional space, effectively creating a vector representation of that text. If done well, texts with similar meanings will have similar vectors, and unrelated texts will have dissimilar vectors. Vector similarity can be calculated numerically (for example with cosine similarity), meaning we are able to determine which texts are most relevant (or similar) to each other. When the number of vectors you are searching through becomes large enough, this is usually done with approximate k-nearest-neighbours (kNN) search, a family of algorithms that finds (approximately) the k most similar vectors without needing to calculate the distance to every vector individually.
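To make the similarity idea concrete, here is a minimal sketch of cosine similarity and an exact (brute-force) top-k search in plain Python. It is the simple baseline that approximate kNN indexes are designed to avoid at scale, not an approximate algorithm itself. The function names and the tiny two-dimensional vectors are made up for illustration; real embeddings come from an Embeddings Model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k):
    """Exact search: score every vector, return indices of the k most similar."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(reverse=True)  # highest similarity first
    return [i for _, i in scored[:k]]


# Toy "embeddings" (hypothetical values, not produced by a real model).
vectors = [[1.0, 0.1], [0.0, 1.0], [0.9, 0.2]]
print(top_k([1.0, 0.0], vectors, k=2))  # → [0, 2]
```

Every query here touches every stored vector, which is exactly the cost that an approximate kNN index trades a little accuracy to avoid.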
An example of using an LLM with and without RAG - Source
Let us return to RAG for a moment. We now have the tools to take all of our possible context documents, generate vector embeddings for them, store those vectors in a vector database, and then find the most relevant vectors. But the LLM prompt needs the context as text, not as an embedding vector, so logically you also store the documents in the vector database - and herein lies the crux of the problem. What RAG actually needs is not a Vector Database, but a Document Database. Specifically, that Document Database needs to support Vector Search (sometimes referred to as a “Vector Index”) - meaning it can run approximate kNN (or similar) across vector fields.
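As a rough sketch of what "documents with a vector field" means in practice, the snippet below keeps each document's text and its embedding in the same record, ranks by similarity, and hands the text (not the vector) to the prompt. Everything here is an assumption for illustration: the in-memory list stands in for a Document Database collection, the field names (`_id`, `text`, `embedding`) and the helper functions are hypothetical, and the embedding values are hand-written rather than model output.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Each record carries both the source text and its vector (toy values).
documents = [
    {"_id": 1, "text": "Refunds are processed within 5 days.", "embedding": [0.9, 0.1, 0.0]},
    {"_id": 2, "text": "Our office is closed on public holidays.", "embedding": [0.1, 0.9, 0.1]},
    {"_id": 3, "text": "Refund requests need an order number.", "embedding": [0.8, 0.2, 0.1]},
]


def retrieve_context(query_embedding, docs, k):
    """Rank documents by vector similarity and return their text."""
    ranked = sorted(
        docs,
        key=lambda d: cosine_similarity(query_embedding, d["embedding"]),
        reverse=True,
    )
    return [d["text"] for d in ranked[:k]]


def build_prompt(question, context):
    """Assemble a RAG-style prompt: retrieved text first, then the question."""
    context_block = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{context_block}\n\nQuestion: {question}"


# In a real pipeline the query embedding would come from the same Embeddings Model.
context = retrieve_context([1.0, 0.0, 0.0], documents, k=2)
print(build_prompt("How do refunds work?", context))
```

Because the text and the vector live in one record, updating a document and its embedding is a single write, and there is no second store to keep in sync.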
The cost of another Stateful Service
Before we jump to what you should do, let’s go over why having a Vector Database, now that you don’t really need one, is a bad idea. Firstly, there is the inherent cost of having an additional component in your solution architecture. More components are more costly during development, as developers need to understand more moving pieces, but the real cost of additional components is the ongoing work of keeping them updated and maintained. This cost rises sharply for stateful components (which a Vector Database is), as things like backups, migrations, and data governance need to be thought through as well.
To add insult to injury, in the case of Vector Databases for RAG, the Vector Database isn’t the primary store of the documents. This gives you a classic conflicting-sources-of-truth problem, resulting in things like outdated documents being used to give your users wrong answers to their queries.
The Alternatives
The way you should build RAG, then, is to store your context in a Document Database that supports Vector Search. This is great news, as Document Databases are known quantities to DevOps teams. They’ve been around for many years - and you may already have one in your existing architecture! But surely needing to support Vector Search is restrictive? Nope - all the big names fit the bill. PostgreSQL (through the pgvector extension), Cassandra and MongoDB all support Vector Search. It isn’t just me claiming this: there are research papers on whether Vector Databases are worth the cost, and even AWS points you to use their existing database services rather than offering a standalone one.
What if you don’t have a Document Database in your architecture - should you use a Vector Database then? Still no. Apart from being a much more mature technology, Document Databases enable you to mix in traditional queries and filters, in an approach known as hybrid search. You should either choose an open source or managed Document Database to use, or, if you’re looking for a more managed RAG solution (with Embeddings Models and Vector Search all managed for you), check out Redactive.
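The hybrid search idea described above can be sketched as a traditional field filter followed by a vector ranking over whatever survives the filter. This is only an illustration under stated assumptions: the in-memory list stands in for a collection, the `department` field and the `hybrid_search` helper are hypothetical, the embeddings are toy values, and a real Document Database would run both steps inside its own query engine.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Toy records: ordinary fields sit alongside the vector field.
documents = [
    {"_id": 1, "department": "billing", "embedding": [0.9, 0.1]},
    {"_id": 2, "department": "hr", "embedding": [1.0, 0.0]},
    {"_id": 3, "department": "billing", "embedding": [0.2, 0.9]},
]


def hybrid_search(query_embedding, docs, k, **filters):
    """Apply exact-match filters first, then rank the remainder by similarity."""
    candidates = [
        d for d in docs
        if all(d.get(field) == value for field, value in filters.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda d: cosine_similarity(query_embedding, d["embedding"]),
        reverse=True,
    )
    return ranked[:k]


# Unfiltered, the closest vector belongs to the "hr" document...
print([d["_id"] for d in hybrid_search([1.0, 0.0], documents, k=1)])  # → [2]
# ...but restricting to "billing" returns the best match within that subset.
print([d["_id"] for d in hybrid_search([1.0, 0.0], documents, k=1, department="billing")])  # → [1]
```

The filter matters for correctness as well as speed: without it, the most similar vector may belong to a document the user should never have been shown.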