Quick Start

May 7, 2025 · View on GitHub

pgai pgai

Power your AI applications with PostgreSQL


Use a PostgreSQL extension to make it easier to access AI features from within the database. Supports:

  • Ability to call out to leading LLMs like OpenAI, Ollama, Cohere, and more via SQL.
  • Built-in utilities for dataset loading and processing
  • Retrieval Augmented Generation (RAG) directly in SQL

Docker

See the install via docker guide for docker compose files and detailed container instructions.

Timescale Cloud

Try pgai on cloud by creating a free trial account on Timescale Cloud.

Installing pgai into an existing PostgreSQL instance (Linux / MacOS)

See the install from source guide for instructions on how to install pgai from source.

Quick Start

This section will walk you through the steps to get started with pgai and Ollama using docker and show you the major features of pgai.

Please note that using Ollama requires a large (>4GB) download of the docker image and model. If you don't want to download so much data, you may want to use the OpenAI quick start or VoyageAI quick start instead.

Setup

  1. Download the docker compose file file.

    curl -O https://raw.githubusercontent.com/timescale/pgai/main/examples/docker_compose_pgai_ollama/docker-compose.yml
    
  2. Start the docker compose file.

    docker compose up -d
    

    This will start Ollama and a PostgreSQL instance with the pgai extension installed.

  3. Download the Ollama models. We'll use the all-minilm model for embeddings and the tinyllama model for reasoning.

    docker compose exec ollama ollama pull all-minilm
    docker compose exec ollama ollama pull tinyllama
    
  1. Connect to the database in your local developer environment The easiest way connect to the database is with the following command: docker compose exec -it db psql.

    Alternatively, you can connect to the database with the following connection string: postgres://postgres:postgres@localhost:5432/postgres.

  2. Enable pgai on your database

    CREATE EXTENSION IF NOT EXISTS ai CASCADE;
    
  3. Create a table with the data you want to embed from a huggingface dataset

    We'll create a table named wiki from a few rows of the english-language wikimedia/wikipedia dataset.

    First, we'll create the table:

    CREATE TABLE wiki (
        id      TEXT PRIMARY KEY,
        url     TEXT,
        title   TEXT,
        text    TEXT
    );
    

    Then, we'll load the data from the huggingface dataset:

    SELECT ai.load_dataset('wikimedia/wikipedia', '20231101.en', table_name=>'wiki', batch_size=>5, max_batches=>1, if_table_exists=>'append');
    

    Related documentation: load dataset from huggingface.

  4. Generate a summary of the article in the database

    We'll generate a summary of the search results using the ai.ollama_generate function (this will take a few minutes).

    SELECT answer->>'response' as summary
    FROM ai.ollama_generate('tinyllama', 
    'Summarize the following and output the summary in a single sentence: '|| (SELECT text FROM wiki WHERE title like 'pgai%')) as answer;
    
    Click to see the output
    summary
    Pgai is a tool that simplifies the process of making AI applications easier by providing easy access to data in PostgreSQL and enabling semantic search on the data for the Retrieval Augmented Generation (RAG) pipeline. This allows the AI system to answer questions about unseen data without being trained on it, simplifying the entire process.

    This is just one example of model calling capabilities. Model calling can be used for a variety of tasks, including classification, summarization, moderation, and other forms of data enrichment.

Features

Leverage LLMs for data processing tasks

  • Retrieve LLM chat completions from models like Claude Sonnet 3.5, OpenAI GPT4o, Cohere Command, and Llama 3 (via Ollama). (learn more)
  • Reason over your data and facilitate use cases like classification, summarization, and data enrichment on your existing relational data in PostgreSQL (see an example).

Useful utilities

Resources

Why we built it

Tutorials about pgai model calling

The pgai extension exposes a set of functions to directly interact with the LLM models through SQL, enabling you to do semantic search directly in your database:

SELECT
   chunk,
   embedding <=> ai.ollama_embed(<embedding_model>, 'some-query') as distance
FROM <embedding_table>
ORDER BY distance
LIMIT 5;

Implement Retrieval Augmented Generation inside a single SQL statement

pgai LLM functions enable you to implement RAG directly in your database. For example, if you have a table called blog_embeddings that has chunked, embedded data for your blogs, you can use the following code to perform RAG:

  1. Create a RAG function:

    CREATE OR REPLACE FUNCTION generate_rag_response(query_text TEXT)
    RETURNS TEXT AS $$
    DECLARE
       context_chunks TEXT;
       response TEXT;
    BEGIN
       -- Perform similarity search to find relevant blog posts
       SELECT string_agg(title || ': ' || chunk, E'\n') INTO context_chunks
       FROM
       (
           SELECT title, chunk
           FROM blogs_embedding
           ORDER BY embedding <=> ai.ollama_embed('nomic-embed-text', query_text)
           LIMIT 3
       ) AS relevant_posts;
    
       -- Generate a summary using llama3
       SELECT ai.ollama_chat_complete
       ( 'llama3'
       , jsonb_build_array
         ( jsonb_build_object('role', 'system', 'content', 'you are a helpful assistant')
         , jsonb_build_object
           ('role', 'user'
           , 'content', query_text || E'\nUse the following context to respond.\n' || context_chunks
           )
         )
       )->'message'->>'content' INTO response;
    
       RETURN response;
    END;
    $$ LANGUAGE plpgsql;
    
  2. Execute your function in a SQL query:

    SELECT generate_rag_response('Give me some startup advice');
    

Model calling

Model calling is a feature of pgai that allows you to call LLM models from SQL. This lets you leverage the power of LLMs for a variety of tasks, including classification, summarization, moderation, and other forms of data enrichment.

The following models are supported (click on the model to learn more):

ModelTokenizeEmbedChat CompleteGenerateModerateClassifyRerank
Ollama✔️✔️✔️
OpenAI✔️️✔️✔️✔️
Anthropic✔️
Cohere✔️✔️✔️✔️✔️
Voyage AI✔️
Huggingface (with LiteLLM)✔️
Mistral (with LiteLLM)✔️
Azure OpenAI (with LiteLLM)✔️
AWS Bedrock (with LiteLLM)✔️
Vertex AI (with LiteLLM)✔️

Some examples:

Get involved

pgai is still at an early stage. Now is a great time to help shape the direction of this project; we are currently deciding priorities. Have a look at the list of features we're thinking of working on. Feel free to comment, expand the list, or hop on the Discussions forum.

About Timescale

Timescale is a PostgreSQL database company. To learn more visit the timescale.com.

Timescale Cloud is a high-performance, developer focused, cloud platform that provides PostgreSQL services for the most demanding AI, time-series, analytics, and event workloads. Timescale Cloud is ideal for production applications and provides high availability, streaming backups, upgrades over time, roles and permissions, and great security.