How to build a local retrieval-augmented generation application using Postgres, the pgvector extension, Ollama, and the Llama 3 large language model.
PostgreSQL with the pgvector extension allows tables to be used as storage for vectors, each of which is saved as a row. It also allows any number of metadata columns to be added. In an enterprise application, this hybrid capability of storing both vectors and tabular data provides developers with a flexibility that is not available in pure vector databases.
While pure vector databases can be tuned for extreme high performance, pgvector may not be. However, for medium-sized retrieval-augmented generation (RAG) applications (involving around 100K documents, typically) the performance of pgvector is more than adequate. If you are building a knowledge management application for a group or small department, such an architecture is a simple way to get started. (For even smaller, single-user applications, you could use SQLite with the sqlite-vss extension.)
…