pgvector Store

April 8, 2026

Import: from selectools.rag.stores import PgVectorStore
Stability: beta
Added in: v0.21.0

PgVectorStore lets you store and search document embeddings inside a PostgreSQL database using the pgvector extension. It's the right choice when you already run Postgres and want vectors next to the rest of your application data without standing up a separate vector service.

from selectools.embeddings import OpenAIEmbeddingProvider
from selectools.rag import Document
from selectools.rag.stores import PgVectorStore

embedder = OpenAIEmbeddingProvider()
store = PgVectorStore(
    embedder=embedder,
    connection_string="postgresql://user:pass@localhost:5432/mydb",
    table_name="selectools_documents",
)

store.add_documents([
    Document(text="pgvector adds vector types to Postgres."),
    Document(text="It supports cosine, L2, and inner-product distance."),
])

# search() takes a query embedding, not a string — embed the query first
query_vec = embedder.embed_query("postgres vector search")
results = store.search(query_vec, top_k=2)

!!! tip "See Also"
    - Qdrant - Self-hosted vector database with REST + gRPC
    - FAISS - In-process vector index, no server required
    - Sessions - Postgres-backed agent sessions


Install

pip install "selectools[postgres]"

The [postgres] extra already includes psycopg2-binary>=2.9.0. You also need the pgvector extension enabled in the target database:

CREATE EXTENSION IF NOT EXISTS vector;

Constructor

PgVectorStore(
    embedder: EmbeddingProvider,
    connection_string: str,
    table_name: str = "selectools_documents",
    dimensions: int | None = None,
)

| Parameter | Description |
|---|---|
| embedder | Embedding provider used to compute vectors. |
| connection_string | Standard libpq connection string. |
| table_name | Table to store documents in. Validated as a SQL identifier (letters, digits, underscores) to prevent injection. |
| dimensions | Vector dimensions. Auto-detected from embedder.embed_query("test") on first use if not specified. |
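
The two behaviours described in the table, identifier validation and dimension auto-detection, can be sketched roughly as follows. This is not the library's implementation: validate_table_name, detect_dimensions, and FakeEmbedder are hypothetical names, and the rule that a name may not start with a digit is an added assumption.

```python
import re
from typing import List, Optional

# Rule taken from the parameter table: letters, digits, underscores.
# Assumption: the first character may not be a digit, as with unquoted
# Postgres identifiers.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_table_name(name: str) -> str:
    """Return the name unchanged if it is a plain identifier, else raise."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"invalid table name: {name!r}")
    return name


def detect_dimensions(embedder, dimensions: Optional[int] = None) -> int:
    """Prefer an explicit value; otherwise embed a probe string and measure it."""
    if dimensions is not None:
        return dimensions
    return len(embedder.embed_query("test"))


class FakeEmbedder:
    """Hypothetical stand-in so the sketch runs without an API key."""

    def embed_query(self, text: str) -> List[float]:
        return [0.0] * 8


table = validate_table_name("selectools_documents")
dims = detect_dimensions(FakeEmbedder())
```

Passing dimensions explicitly avoids the probe call, which matters when the embedder is a paid remote API.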

Schema

PgVectorStore creates the following table on first use (idempotent):

CREATE TABLE IF NOT EXISTS selectools_documents (
    id        TEXT PRIMARY KEY,
    text      TEXT NOT NULL,
    metadata  JSONB,
    embedding vector(N)
);

N is the embedding dimension: either the dimensions argument or the auto-detected value. An index on the embedding column accelerates cosine similarity queries.
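
To make the role of N concrete, here is a minimal sketch that renders the DDL for a given table name and dimension. The schema_statements helper is hypothetical. The index statement is also an assumption: this page does not say which index type the store creates, so the sketch uses pgvector's HNSW index with the vector_cosine_ops operator class and an invented index name.

```python
import re
from typing import List


def schema_statements(table_name: str, dimensions: int) -> List[str]:
    """Build DDL for the documented table plus a hypothetical cosine index."""
    # Identifiers cannot be bound as query parameters, so check before interpolating.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table_name):
        raise ValueError(f"invalid table name: {table_name!r}")
    if dimensions <= 0:
        raise ValueError("dimensions must be positive")

    create_table = (
        f"CREATE TABLE IF NOT EXISTS {table_name} (\n"
        "    id        TEXT PRIMARY KEY,\n"
        "    text      TEXT NOT NULL,\n"
        "    metadata  JSONB,\n"
        f"    embedding vector({dimensions})\n"
        ");"
    )
    # Assumption: HNSW with the cosine operator class; the index name is made up.
    create_index = (
        f"CREATE INDEX IF NOT EXISTS {table_name}_embedding_idx "
        f"ON {table_name} USING hnsw (embedding vector_cosine_ops);"
    )
    return [create_table, create_index]


statements = schema_statements("selectools_documents", 1536)
for statement in statements:
    print(statement)
```

Because the column type is fixed at vector(N), switching to an embedding model with a different dimension generally means using a new table.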


search() runs a parameterized query using pgvector's <=> cosine distance operator:

SELECT id, text, metadata, embedding <=> %s AS distance
FROM selectools_documents
ORDER BY distance ASC
LIMIT %s;

All queries are parameterized, so values supplied at query time are never interpolated into the SQL text.
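
The <=> operator returns cosine distance, which is 1 minus cosine similarity, so smaller values mean closer matches and ORDER BY distance ASC puts the best match first. The pure-Python sketch below reproduces that ranking on toy 2-dimensional vectors. The cosine_distance and rank helpers are illustrative only and are not part of selectools.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity, the quantity pgvector's <=> operator computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank(query: Sequence[float],
         docs: Dict[str, Sequence[float]],
         top_k: int) -> List[Tuple[str, float]]:
    """Mimic ORDER BY distance ASC LIMIT top_k over an in-memory dict."""
    scored = [(doc_id, cosine_distance(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]


docs = {
    "same_direction": [2.0, 0.0],   # distance 0
    "orthogonal": [0.0, 1.0],       # distance 1
    "opposite": [-1.0, 0.0],        # distance 2
}
top = rank([1.0, 0.0], docs, top_k=2)
print(top)
```

To report a similarity score rather than a distance, compute 1 - distance on each returned row.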


Connection Pooling

PgVectorStore opens a single psycopg2.connect() per instance. If you need pooling for high concurrency, manage it externally (e.g. PgBouncer) and pass the pooler URL as the connection string.
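
As an illustration of passing a pooler URL, the sketch below rewrites a direct connection URL so that it points at a pooler instead. The via_pooler helper and the host name pgbouncer.internal are hypothetical; 6432 is PgBouncer's default listen port, so adjust it if your deployment differs.

```python
from urllib.parse import urlsplit, urlunsplit


def via_pooler(connection_string: str, pooler_host: str, pooler_port: int = 6432) -> str:
    """Swap the host and port of a postgresql:// URL, keeping credentials and database."""
    parts = urlsplit(connection_string)
    # netloc looks like "user:pass@host:port"; keep everything before the last "@".
    userinfo, _, _ = parts.netloc.rpartition("@")
    host_port = f"{pooler_host}:{pooler_port}"
    netloc = f"{userinfo}@{host_port}" if userinfo else host_port
    return urlunsplit(parts._replace(netloc=netloc))


direct = "postgresql://user:pass@localhost:5432/mydb"
pooled = via_pooler(direct, "pgbouncer.internal")
print(pooled)
# The result would then be passed as connection_string= when constructing the store.
```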


API Reference

| Method | Description |
|---|---|
| add_documents(docs) | Embed and upsert documents (INSERT ... ON CONFLICT DO UPDATE) |
| search(query, top_k) | Cosine similarity search; query is an embedding vector, not a string |
| delete(ids) | Delete documents by ID |
| clear() | TRUNCATE the table |

| # | Script | Description |
|---|---|---|
| 79 | 79_pgvector_store.py | pgvector quickstart with auto-table creation |