pgvector Tutorial: Turn PostgreSQL into a Vector Database
In 2026, the developer community has embraced a clear mantra: "Just Use Postgres". Snowflake acquired Crunchy Data for $250 million, Databricks invested $1 billion in Neon, and Supabase reached a $5 billion valuation. PostgreSQL is no longer just a relational database: with pgvector, it becomes a full-fledged vector database, capable of handling embeddings, similarity search, and AI workloads without adding a single service to your stack.
If you are already running PostgreSQL (and statistically, you are), you do not need Pinecone, Qdrant, or Weaviate to get started with vector search. In this article, we will build a complete vector search system from scratch: installing pgvector, creating optimized indexes, generating embeddings, and integrating with LangChain.
Series Overview
| # | Article | Focus |
|---|---|---|
| 1 | You are here - pgvector | Installation, operators, indexing |
| 2 | Embeddings Deep Dive | Models, chunking, dimensions |
| 3 | RAG with PostgreSQL | End-to-end RAG pipeline |
| 4 | Hybrid Search | Vectors + full-text search |
| 5 | Scaling pgvector | Partitioning, sharding, performance |
| 6 | pgvector in Production | Monitoring, backup, CI/CD |
What You Will Learn
- What pgvector is and why it is the most important PostgreSQL extension in 2026
- How to install pgvector on Linux, Docker, and managed services (Supabase, Neon, AWS RDS)
- How to create tables with vector columns and insert embeddings
- The three distance operators: Cosine, L2, and Inner Product
- How to run effective similarity search queries
- How to generate embeddings with Python (OpenAI and Sentence Transformers)
- HNSW vs IVFFlat: which index to choose and how to configure it
- pgvector benchmarks against dedicated vector databases
- A complete end-to-end example with Python
- LangChain integration for RAG systems
1. The "Just Use Postgres" Trend in 2026
For years, the AI ecosystem pushed toward dedicated vector databases: Pinecone, Weaviate, Qdrant, Milvus. Every RAG project seemed to require a new service to manage, monitor, and pay for. But in 2025-2026, something changed dramatically.
The community realized that vectors are not a database type but a data type. Just as PostgreSQL handles JSON, arrays, geometries, and full-text search, it can also handle high-dimensional vectors. The extension that makes all of this possible is called pgvector.
Why "Just Use Postgres" in 2026
| Factor | Dedicated Vector Database | PostgreSQL + pgvector |
|---|---|---|
| Infrastructure | New service to manage | Extension in your existing DB |
| Cost | Separate plan (Pinecone from $70/month) | Included in PostgreSQL cost |
| Data consistency | Cross-DB synchronization needed | Native ACID transactions |
| Queries | Proprietary API | Standard SQL + JOIN, WHERE, GROUP BY |
| Backup | Separate system | pg_dump includes everything |
| Team | Specialized skills required | Anyone who knows SQL |
The key advantage is operational simplicity. With pgvector, your embeddings live in the same table as your relational data. You can JOIN vectors with metadata, use WHERE clauses to filter, and have transactional consistency. No external service to synchronize, no different API to learn.
2. What Is pgvector
pgvector is an open-source extension for PostgreSQL that adds native support for high-dimensional vectors and similarity search. Created by Andrew Kane in 2021, it has quickly become the most downloaded PostgreSQL extension for AI/ML use cases.
With pgvector you can:
- Store vectors (embeddings) as a native data type in a table column
- Run similarity searches using three different distance metrics
- Create specialized indexes (HNSW and IVFFlat) for fast lookups
- Combine vector search with everything PostgreSQL offers: JOINs, transactions, triggers, materialized views
pgvector by the Numbers (2026)
| Metric | Value |
|---|---|
| Current version | 0.8.0+ (with iterative index scans) |
| Max vector dimensions | 16,000 stored (up to 2,000 indexable; 4,000 with halfvec) |
| Index types | HNSW, IVFFlat |
| Vector types | vector, halfvec, sparsevec, bit |
| Distance metrics | Cosine, L2 (Euclidean), Inner Product |
| Supported by | AWS RDS, Supabase, Neon, Google Cloud SQL, Azure |
| License | PostgreSQL License (open-source) |
Version 0.8.0, released in 2025, introduced a key feature: iterative index scans. This solves the overfiltering problem, where a vector index fails to return enough results because most candidates are discarded by WHERE filters. With iterative scans, pgvector continues exploring the index until it finds enough results that satisfy the filters, dramatically improving filtered query quality.
3. Installing pgvector
pgvector can be installed in several ways depending on your environment. Let us walk through the three main approaches: native Linux installation, Docker, and managed services.
3.1 Installation on Ubuntu/Debian
```bash
# Prerequisites: PostgreSQL 13+ already installed
sudo apt-get update
sudo apt-get install -y postgresql-server-dev-16

# Clone and build pgvector
git clone --branch v0.8.0 https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install

# Enable the extension in your database
sudo -u postgres psql -d your_database -c "CREATE EXTENSION vector;"
```
For PostgreSQL 17, replace postgresql-server-dev-16 with
postgresql-server-dev-17. pgvector supports PostgreSQL versions 13 through 17.
3.2 Installation with Docker
The fastest way to get started. The official pgvector/pgvector image includes
PostgreSQL with pgvector pre-installed.
```yaml
# docker-compose.yml
version: '3.8'
services:
  postgres:
    image: pgvector/pgvector:pg17
    environment:
      POSTGRES_DB: vectordb
      POSTGRES_USER: admin
      POSTGRES_PASSWORD: secret_password
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
volumes:
  pgdata:
```

```sql
-- init.sql: automatically executed on first startup
CREATE EXTENSION IF NOT EXISTS vector;
```
```bash
# Quick start without docker-compose
docker run -d \
  --name pgvector-db \
  -e POSTGRES_DB=vectordb \
  -e POSTGRES_PASSWORD=secret \
  -p 5432:5432 \
  pgvector/pgvector:pg17

# Connect and enable pgvector
docker exec -it pgvector-db psql -U postgres -d vectordb \
  -c "CREATE EXTENSION vector;"
```
3.3 Managed Services (Zero Setup)
If you do not want to manage infrastructure, several cloud providers offer PostgreSQL with pgvector already enabled. This is the ideal choice for production deployments.
Managed Services with pgvector
| Service | pgvector Included | Free Tier | Notes |
|---|---|---|---|
| Supabase | Pre-installed | Yes (500 MB) | Dashboard UI, automatic REST APIs |
| Neon | Pre-installed | Yes (0.5 GB) | Serverless, database branching |
| AWS RDS | Must enable | No | pgvector 0.8.0 on Aurora PostgreSQL |
| Google Cloud SQL | Must enable | No | pgvector 0.8.0 support |
| Azure Database | Must enable | No | Flexible Server with pgvector |
```sql
-- On Supabase, pgvector is already available.
-- Just enable the extension:
CREATE EXTENSION IF NOT EXISTS vector;

-- Verify the installation
SELECT extversion FROM pg_extension WHERE extname = 'vector';
-- Result: 0.8.0
```
4. Creating a Table with Vectors
With pgvector installed, you can add vector(n) columns to your tables,
where n is the vector dimensionality. The value of n depends
on the embedding model you use: OpenAI text-embedding-3-small produces
1536-dimensional vectors, while text-embedding-3-large goes up to 3072.
```sql
-- Enable the extension (if not already done)
CREATE EXTENSION IF NOT EXISTS vector;

-- Documents table for a RAG system
CREATE TABLE documents (
    id BIGSERIAL PRIMARY KEY,
    title VARCHAR(500) NOT NULL,
    content TEXT NOT NULL,
    source VARCHAR(255),
    category VARCHAR(100),
    embedding vector(1536),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

-- Add a column comment for documentation
COMMENT ON COLUMN documents.embedding IS
    'Embedding generated with OpenAI text-embedding-3-small (1536 dim)';
```
Inserting Documents with Embeddings
Insertion works like a regular INSERT, passing the vector as a comma-separated string of values within square brackets.
```sql
-- Single insert
INSERT INTO documents (title, content, source, embedding)
VALUES (
    'Introduction to PostgreSQL',
    'PostgreSQL is a powerful and reliable open-source relational database...',
    'docs/postgres-intro.md',
    '[0.023, -0.045, 0.067, 0.089, -0.012, ...]' -- 1536-dimensional vector
);

-- Batch insert
INSERT INTO documents (title, content, source, category, embedding)
VALUES
    ('Chapter 1', 'Chapter 1 content...', 'book.pdf', 'tutorial',
     '[0.11, -0.22, 0.33, ...]'),
    ('Chapter 2', 'Chapter 2 content...', 'book.pdf', 'tutorial',
     '[0.44, -0.55, 0.66, ...]'),
    ('FAQ', 'Frequently asked questions...', 'faq.md', 'support',
     '[-0.12, 0.34, -0.56, ...]');
```
Vector Dimensionality
The dimensionality specified in the column definition (vector(1536)) must
exactly match the dimension of the embeddings you insert. Attempting to insert a
768-dimensional vector into a vector(1536) column will raise an error.
Choose the dimension based on the embedding model you intend to use.
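Because a dimension mismatch only surfaces as a database error at insert time, it can be worth validating client-side first. A minimal sketch (the `validate_embedding` helper is hypothetical, not part of pgvector or psycopg2):

```python
def validate_embedding(embedding: list[float], expected_dim: int = 1536) -> str:
    """Check dimensionality and serialize to pgvector's '[x, y, ...]' text format."""
    if len(embedding) != expected_dim:
        raise ValueError(
            f"expected {expected_dim} dimensions, got {len(embedding)}"
        )
    return "[" + ", ".join(str(x) for x in embedding) + "]"

# A 1536-dimensional vector serializes cleanly...
ok = validate_embedding([0.0] * 1536)

# ...while a 768-dimensional one fails fast, before PostgreSQL rejects it
try:
    validate_embedding([0.0] * 768)
except ValueError as e:
    print(e)  # expected 1536 dimensions, got 768
```

The returned string can be passed directly as the `%s` parameter of an INSERT, exactly like the bracketed literals above.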
Popular Embedding Model Dimensions
| Model | Provider | Dimensions | Cost / 1M tokens |
|---|---|---|---|
| text-embedding-3-small | OpenAI | 1536 | $0.02 |
| text-embedding-3-large | OpenAI | 3072 | $0.13 |
| all-MiniLM-L6-v2 | HuggingFace (free) | 384 | Free (local) |
| all-mpnet-base-v2 | HuggingFace (free) | 768 | Free (local) |
| voyage-3 | Voyage AI | 1024 | $0.06 |
5. Distance Operators
pgvector provides three operators for computing distance between vectors. Each operator corresponds to a different mathematical metric and has specific use cases.
The Three pgvector Operators
| Operator | Metric | Formula | Ordering | Use Case |
|---|---|---|---|---|
| `<=>` | Cosine Distance | 1 - cosine_similarity | ASC (0 = identical) | Text, normalized embeddings |
| `<->` | L2 (Euclidean) | sqrt(sum((a-b)^2)) | ASC (0 = identical) | Images, spatial data |
| `<#>` | Inner Product (negated) | -1 * dot_product | ASC (more negative = more similar) | Normalized vectors, max performance |
```sql
-- COSINE DISTANCE <=> (most commonly used for text)
-- Result: 0 = identical, 2 = opposite
SELECT title, content,
       embedding <=> '[0.1, 0.2, ...]' AS cosine_distance
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]' ASC
LIMIT 5;

-- L2 (EUCLIDEAN) <->
-- Result: 0 = identical, increasing values = farther apart
SELECT title, content,
       embedding <-> '[0.1, 0.2, ...]' AS l2_distance
FROM documents
ORDER BY embedding <-> '[0.1, 0.2, ...]' ASC
LIMIT 5;

-- INNER PRODUCT (NEGATED) <#>
-- Result: more negative values = more similar
SELECT title, content,
       (embedding <#> '[0.1, 0.2, ...]') * -1 AS similarity
FROM documents
ORDER BY embedding <#> '[0.1, 0.2, ...]' ASC
LIMIT 5;
```
Which Operator to Choose
For 90% of text and AI use cases, use <=> (cosine distance).
Embedding models like those from OpenAI produce normalized vectors, making cosine
distance and inner product equivalent in terms of results. Use <->
(L2) only when working with spatial data or images where vector magnitude matters.
Inner product (<#>) is slightly faster than cosine because it skips
normalization, but only on already-normalized vectors.
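The claimed equivalence is easy to verify in plain Python: on unit-length vectors, cosine distance is `1 - dot` and negated inner product is `-dot`, so both are monotonic in the dot product and produce the same ranking. A stdlib-only sketch (toy 3-dimensional vectors, not real embeddings):

```python
import math

def cosine_distance(a, b):
    """What pgvector's <=> computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return 1 - dot / norm

def neg_inner_product(a, b):
    """What pgvector's <#> computes."""
    return -sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

query = normalize([0.3, 0.4, 0.5])
docs = [normalize(v) for v in ([0.3, 0.4, 0.51], [0.9, -0.1, 0.2], [-0.3, -0.4, -0.5])]

# Ranking by <=> and by <#> is identical on normalized vectors
by_cosine = sorted(range(3), key=lambda i: cosine_distance(query, docs[i]))
by_ip = sorted(range(3), key=lambda i: neg_inner_product(query, docs[i]))
print(by_cosine == by_ip)  # True
```

This is why, for normalized embeddings, switching from `<=>` to `<#>` changes the distance values but never the order of results.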
6. Similarity Search Queries
Similarity search is the core of pgvector. The basic query is straightforward: order results by distance from the query vector and take the top K. But in production, queries become more sophisticated by combining vector search with SQL filters.
6.1 Basic Query
```sql
-- Find the 5 most similar documents to a query vector
SELECT
    id,
    title,
    content,
    1 - (embedding <=> $1) AS similarity  -- $1 = query vector
FROM documents
ORDER BY embedding <=> $1
LIMIT 5;
```
6.2 Queries with Filters (pgvector's Real Power)
This is where pgvector shines compared to dedicated vector databases. You can combine vector search with any SQL clause: WHERE, JOIN, GROUP BY, subqueries.
```sql
-- Filter by category and date
SELECT
    id, title, content,
    1 - (embedding <=> $1) AS similarity
FROM documents
WHERE category = 'tutorial'
  AND created_at > NOW() - INTERVAL '30 days'
ORDER BY embedding <=> $1
LIMIT 5;

-- JOIN with other tables: find similar docs with author info
SELECT
    d.title,
    d.content,
    a.name AS author_name,
    1 - (d.embedding <=> $1) AS similarity
FROM documents d
JOIN authors a ON d.author_id = a.id
WHERE a.verified = true
ORDER BY d.embedding <=> $1
LIMIT 10;

-- Exclusion: find similar but NOT the document itself
SELECT
    id, title,
    1 - (embedding <=> (SELECT embedding FROM documents WHERE id = 42)) AS similarity
FROM documents
WHERE id != 42
ORDER BY embedding <=> (SELECT embedding FROM documents WHERE id = 42)
LIMIT 5;
```
6.3 Iterative Index Scans (pgvector 0.8.0)
One of the most important additions in version 0.8.0 is support for iterative index scans. When you combine vector search with restrictive WHERE filters, the index might not find enough results on the first pass. With iterative scans, pgvector continues exploring the index until it satisfies the requested LIMIT.
```sql
-- Enable iterative scan for HNSW
SET hnsw.iterative_scan = relaxed_order;  -- or strict_order

-- Set the maximum number of tuples to scan
SET hnsw.max_scan_tuples = 20000;

-- Now queries with restrictive filters work correctly
SELECT id, title,
       1 - (embedding <=> $1) AS similarity
FROM documents
WHERE category = 'rare'  -- category with few documents
ORDER BY embedding <=> $1
LIMIT 10;

-- Without iterative scan: might return 0-3 results
-- With iterative scan: always returns up to 10 results

-- Available modes:
--   off           = traditional behavior (default)
--   strict_order  = maintains exact distance ordering
--   relaxed_order = approximate ordering (better performance)
```
When to Use Iterative Scans
- relaxed_order: for most cases, offers the best balance between performance and quality
- strict_order: when exact distance ordering is critical (e.g., ranking results for users)
- off: when you have no WHERE filters or when filters are not restrictive
7. Generating Embeddings with Python
To insert data into pgvector, you first need to transform text into numerical vectors using an embedding model. Let us look at two approaches: the OpenAI API (paid, high quality) and Sentence Transformers (free, local).
7.1 With the OpenAI API
```python
import openai
import psycopg2

# Configure the OpenAI client
client = openai.OpenAI(api_key="sk-...")

def get_embedding(text: str, model: str = "text-embedding-3-small") -> list[float]:
    """Generate an embedding for a text using OpenAI."""
    response = client.embeddings.create(
        input=text,
        model=model,
    )
    return response.data[0].embedding  # list of 1536 floats

# Connect to PostgreSQL
conn = psycopg2.connect(
    host="localhost",
    database="vectordb",
    user="admin",
    password="secret_password",
)
cur = conn.cursor()

# Generate and insert embeddings
documents = [
    ("Introduction to Python", "Python is a versatile programming language..."),
    ("Advanced PostgreSQL", "Recursive CTEs allow hierarchical queries..."),
    ("Docker for Developers", "Docker containers isolate applications..."),
]
for title, content in documents:
    embedding = get_embedding(content)
    cur.execute(
        "INSERT INTO documents (title, content, embedding) VALUES (%s, %s, %s)",
        (title, content, str(embedding)),
    )
conn.commit()
cur.close()
conn.close()
```
7.2 With Sentence Transformers (Free and Local)
```python
from sentence_transformers import SentenceTransformer
import psycopg2

# Load the model (automatically downloaded the first time)
model = SentenceTransformer('all-MiniLM-L6-v2')  # 384 dimensions

# Generate embeddings for a batch of texts
texts = [
    "Python is a versatile programming language",
    "Recursive CTEs enable hierarchical queries in SQL",
    "Docker containers isolate applications",
]
embeddings = model.encode(texts)  # shape: (3, 384)

# Connect and insert
conn = psycopg2.connect(
    host="localhost", database="vectordb",
    user="admin", password="secret_password",
)
cur = conn.cursor()

# NOTE: the column must be vector(384), not vector(1536)
for text, embedding in zip(texts, embeddings):
    cur.execute(
        "INSERT INTO documents (title, content, embedding) VALUES (%s, %s, %s)",
        (text[:100], text, str(embedding.tolist())),
    )
conn.commit()
cur.close()
conn.close()
```
OpenAI vs Sentence Transformers
| Aspect | OpenAI API | Sentence Transformers |
|---|---|---|
| Cost | $0.02 / 1M tokens | Free |
| Quality | Excellent (state of the art) | Good (sufficient for many use cases) |
| Privacy | Data sent to OpenAI | Fully local, no data leaves your server |
| Speed | ~100ms per request (network) | ~10ms per text (with GPU) |
| Offline | No (requires internet) | Yes (after first download) |
| Dimensions | 1536 or 3072 | 384 or 768 (model-dependent) |
The Golden Rule of Embeddings
Always use the same embedding model for both indexing documents and
generating the query vector. Embeddings from different models live in different vector
spaces and are not comparable. If you index with text-embedding-3-small,
you must search with the same model.
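One lightweight way to enforce this rule in application code is to pin the model name, the dimensionality, and the encoder together in a single object, so index-time and query-time embeddings cannot come from different models. A sketch with hypothetical names (`EmbeddingSpace` and `fake_encoder` are illustrative, not a real library API):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class EmbeddingSpace:
    """Pins one embedding model to one table, so index and query vectors always match."""
    model_name: str
    dimensions: int
    encode: Callable[[str], list[float]]

    def embed(self, text: str) -> list[float]:
        vec = self.encode(text)
        if len(vec) != self.dimensions:
            raise ValueError(
                f"{self.model_name} returned {len(vec)} dims, expected {self.dimensions}"
            )
        return vec

# Toy encoder standing in for a real model (e.g. OpenAI or Sentence Transformers)
def fake_encoder(text: str) -> list[float]:
    return [float(len(text)), 0.0, 1.0]

space = EmbeddingSpace("fake-model-3d", 3, fake_encoder)

# Both document and query vectors go through the same pinned model
doc_vec = space.embed("pgvector turns PostgreSQL into a vector database")
query_vec = space.embed("what is pgvector?")
print(len(doc_vec) == len(query_vec) == 3)  # True
```

In a real system, the `model_name` would also be stored alongside the table (as in the COMMENT ON COLUMN example earlier), so a model change forces a deliberate re-embedding of the corpus.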
8. Indexing: HNSW vs IVFFlat
Without an index, pgvector performs a sequential scan (brute force), comparing the query vector with every row in the table. This is 100% accurate but becomes slow with large datasets. ANN (Approximate Nearest Neighbor) indexes sacrifice a small percentage of accuracy for searches that are orders of magnitude faster.
8.1 HNSW Index
HNSW (Hierarchical Navigable Small World) builds a multi-layered graph of connections between vectors. It is the recommended index for most use cases: fast reads, good accuracy, and it supports incremental inserts without rebuilding.
```sql
-- HNSW index with default parameters
CREATE INDEX idx_documents_embedding_hnsw
    ON documents
    USING hnsw (embedding vector_cosine_ops);

-- HNSW index with custom parameters
-- (index names must be unique: drop the default one first)
DROP INDEX IF EXISTS idx_documents_embedding_hnsw;
CREATE INDEX idx_documents_embedding_hnsw
    ON documents
    USING hnsw (embedding vector_cosine_ops)
    WITH (
        m = 24,                -- connections per node (default: 16, range: 2-100)
        ef_construction = 128  -- candidates during build (default: 64, range: 4-1000)
    );

-- Tune ef_search for queries (default: 40)
SET hnsw.ef_search = 100;  -- higher = more accurate but slower
```
HNSW Parameters: Configuration Guide
| Parameter | Default | Range | Effect | Recommendation |
|---|---|---|---|---|
| `m` | 16 | 2 - 100 | More connections = more accurate, more memory | 16-32 for most cases |
| `ef_construction` | 64 | 4 - 1000 | Higher = better index, slower build | 64-256 (one-time cost, worth increasing) |
| `ef_search` | 40 | 1 - 1000 | Higher = more accurate search, slower | 40-200 based on desired trade-off |
8.2 IVFFlat Index
IVFFlat (Inverted File with Flat compression) divides vectors into clusters and searches only the clusters closest to the query. It builds faster than HNSW but requires rebuilding when you add significant new data. Ideal for datasets that change infrequently.
```sql
-- IMPORTANT: IVFFlat requires data in the table BEFORE creating the index
-- (it needs to perform k-means clustering on existing data)

-- IVFFlat index
CREATE INDEX idx_documents_embedding_ivf
    ON documents
    USING ivfflat (embedding vector_cosine_ops)
    WITH (
        lists = 100  -- number of clusters (default: 100)
    );

-- Tune the number of clusters to explore during search
SET ivfflat.probes = 10;  -- default: 1, recommended: sqrt(lists)

-- Rule of thumb for "lists":
--   rows <  1M: lists = rows / 1000
--   rows >= 1M: lists = sqrt(rows)
```
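The rule of thumb above translates directly into a small helper for sizing an IVFFlat index (the `ivfflat_params` function is a hypothetical convenience, not part of pgvector):

```python
import math

def ivfflat_params(row_count: int) -> tuple[int, int]:
    """Suggest (lists, probes) for an IVFFlat index:
    lists = rows/1000 below 1M rows, sqrt(rows) at 1M and above;
    probes = sqrt(lists), the commonly recommended starting point."""
    if row_count < 1_000_000:
        lists = max(1, row_count // 1000)
    else:
        lists = round(math.sqrt(row_count))
    probes = max(1, round(math.sqrt(lists)))
    return lists, probes

print(ivfflat_params(500_000))    # (500, 22)
print(ivfflat_params(4_000_000))  # (2000, 45)
```

The suggested values would then be applied with `WITH (lists = ...)` at index creation and `SET ivfflat.probes = ...` at query time; treat them as a starting point to tune against your own recall measurements.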
8.3 Operator to Index Class Mapping
The operator specified when creating the index must match the operator used in your queries. Here is the complete mapping.
Operator and Index Class Mapping
| Distance | SQL Operator | Index Class (ops) |
|---|---|---|
| Cosine | `<=>` | `vector_cosine_ops` |
| L2 (Euclidean) | `<->` | `vector_l2_ops` |
| Inner Product | `<#>` | `vector_ip_ops` |
8.4 HNSW vs IVFFlat: Which to Choose
HNSW vs IVFFlat Comparison
| Aspect | HNSW | IVFFlat |
|---|---|---|
| Search speed | Faster | Fast |
| Accuracy (recall) | ~95-99% | ~90-98% |
| Build time | Slow | Fast |
| Memory | High | Medium |
| Incremental inserts | Supported | Requires REINDEX |
| Iterative scans | Yes (0.8.0) | Yes (0.8.0) |
| Ideal use case | General purpose, growing data | Static datasets, fast builds |
Practical Advice
For most projects, choose HNSW. It is more versatile, supports incremental inserts without rebuilding, and pgvector 0.8.0 has significantly improved its performance. Use IVFFlat only if index build time is critical (very large datasets loaded once) or if you have strict memory constraints.
9. Performance: pgvector vs Dedicated Vector Databases
The most common question is: "Is pgvector fast enough for production?" The answer in 2026 is a clear yes, for the vast majority of use cases. Let us look at the numbers.
9.1 Large-Scale Benchmarks
Benchmarks from pgvectorscale (a complementary extension developed by Timescale) on 50 million vectors show impressive results:
Benchmark: 50M Vectors at 99% Recall
| System | QPS (Queries/sec) | Recall | Notes |
|---|---|---|---|
| pgvectorscale | 471 | 99% | 11.4x faster than Qdrant |
| Qdrant | 41 | 99% | Optimized for pure vector search |
9.2 Mid-Scale Benchmarks (1M Vectors)
Benchmark: 1M Vectors - Throughput
| System | Queries/sec | Inserts/sec | Managed Cost |
|---|---|---|---|
| Pinecone | ~5,000 | ~50,000 | From $70/month |
| Qdrant | ~4,500 | ~45,000 | Self-hosted or cloud |
| pgvector | ~3,000 | ~30,000 | Included in PG cost |
At 1M vectors, pgvector is slightly slower in raw throughput. But considering that the cost is virtually zero (already included in your PostgreSQL), the value-for-money ratio is unbeatable. For most web applications, 3,000 queries per second is more than enough.
When pgvector Is Not Enough
Consider a dedicated vector database (Qdrant, Pinecone) when:
- You have over 50 million vectors
- You need p99 latencies under 5ms
- Your workload is exclusively vector search (no relational data)
- You need native distributed sharding
For everything else, pgvector is the right choice. A system under 10 million vectors with PostgreSQL already in the stack does not need additional complexity.
10. Complete Example: Knowledge Base with Python
Let us put everything together in a practical example. We will build a knowledge base that indexes documents with embeddings, enables similarity search, and supports filtering by category.
"""
Knowledge Base with PostgreSQL + pgvector
Requirements: pip install openai psycopg2-binary
"""
import openai
import psycopg2
from dataclasses import dataclass
# === Configuration ===
OPENAI_API_KEY = "sk-..."
DB_CONFIG = {
"host": "localhost",
"database": "vectordb",
"user": "admin",
"password": "secret_password",
}
EMBEDDING_MODEL = "text-embedding-3-small"
EMBEDDING_DIM = 1536
@dataclass(frozen=True)
class SearchResult:
id: int
title: str
content: str
similarity: float
category: str
# === Embedding functions ===
client = openai.OpenAI(api_key=OPENAI_API_KEY)
def get_embedding(text: str) -> list[float]:
"""Generate an embedding for a text."""
response = client.embeddings.create(
input=text,
model=EMBEDDING_MODEL,
)
return response.data[0].embedding
def get_embeddings_batch(texts: list[str]) -> list[list[float]]:
"""Generate embeddings for a batch of texts (more efficient)."""
response = client.embeddings.create(
input=texts,
model=EMBEDDING_MODEL,
)
return [item.embedding for item in response.data]
# === Database setup ===
def setup_database() -> None:
"""Create the table and indexes."""
conn = psycopg2.connect(**DB_CONFIG)
cur = conn.cursor()
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute(f"""
CREATE TABLE IF NOT EXISTS kb_documents (
id BIGSERIAL PRIMARY KEY,
title VARCHAR(500) NOT NULL,
content TEXT NOT NULL,
category VARCHAR(100) DEFAULT 'general',
embedding vector({EMBEDDING_DIM}),
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
""")
cur.execute("""
CREATE INDEX IF NOT EXISTS idx_kb_embedding_hnsw
ON kb_documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 24, ef_construction = 128);
""")
cur.execute("""
CREATE INDEX IF NOT EXISTS idx_kb_category
ON kb_documents (category);
""")
conn.commit()
cur.close()
conn.close()
print("Database setup complete.")
# === Document insertion ===
def insert_documents(
documents: list[dict],
batch_size: int = 50,
) -> None:
"""Insert documents with automatically generated embeddings."""
conn = psycopg2.connect(**DB_CONFIG)
cur = conn.cursor()
for i in range(0, len(documents), batch_size):
batch = documents[i : i + batch_size]
texts = [doc["content"] for doc in batch]
embeddings = get_embeddings_batch(texts)
for doc, emb in zip(batch, embeddings):
cur.execute(
"""INSERT INTO kb_documents (title, content, category, embedding)
VALUES (%s, %s, %s, %s)""",
(doc["title"], doc["content"],
doc.get("category", "general"), str(emb)),
)
print(f"Inserted {min(i + batch_size, len(documents))}/{len(documents)} documents")
conn.commit()
cur.close()
conn.close()
# === Similarity search ===
def search(
query: str,
limit: int = 5,
category: str | None = None,
min_similarity: float = 0.3,
) -> list[SearchResult]:
"""Search for documents similar to the query."""
query_embedding = get_embedding(query)
conn = psycopg2.connect(**DB_CONFIG)
cur = conn.cursor()
# Enable iterative scan for restrictive filters
cur.execute("SET hnsw.iterative_scan = relaxed_order;")
if category:
cur.execute(
"""SELECT id, title, content, category,
1 - (embedding <=> %s::vector) AS similarity
FROM kb_documents
WHERE category = %s
ORDER BY embedding <=> %s::vector
LIMIT %s""",
(str(query_embedding), category,
str(query_embedding), limit),
)
else:
cur.execute(
"""SELECT id, title, content, category,
1 - (embedding <=> %s::vector) AS similarity
FROM kb_documents
ORDER BY embedding <=> %s::vector
LIMIT %s""",
(str(query_embedding), str(query_embedding), limit),
)
results = [
SearchResult(
id=row[0], title=row[1], content=row[2],
category=row[3], similarity=row[4],
)
for row in cur.fetchall()
if row[4] >= min_similarity
]
cur.close()
conn.close()
return results
# === Main ===
if __name__ == "__main__":
# 1. Setup
setup_database()
# 2. Insert sample documents
sample_docs = [
{
"title": "Introduction to PostgreSQL",
"content": "PostgreSQL is an open-source RDBMS with full ACID support.",
"category": "database",
},
{
"title": "PostgreSQL Indexes",
"content": "B-trees are the default index type. HNSW is used for vectors.",
"category": "database",
},
{
"title": "What is RAG",
"content": "RAG combines retrieval and generation for document-based answers.",
"category": "ai",
},
{
"title": "Docker for Developers",
"content": "Docker isolates applications in lightweight, portable containers.",
"category": "devops",
},
{
"title": "Embeddings and Vector Search",
"content": "Embeddings transform text into vectors for similarity search.",
"category": "ai",
},
]
insert_documents(sample_docs)
# 3. Search documents
print("\n--- Search: 'how do vector databases work' ---")
results = search("how do vector databases work", limit=3)
for r in results:
print(f" [{r.similarity:.3f}] {r.title} ({r.category})")
print("\n--- Filtered search for category 'ai' ---")
results = search("PostgreSQL tutorial", limit=3, category="ai")
for r in results:
print(f" [{r.similarity:.3f}] {r.title} ({r.category})")
11. LangChain Integration
If you are building a RAG system, LangChain provides a direct integration with pgvector
through the langchain-postgres package. This lets you use PostgreSQL as a
vector store without writing SQL manually.
```bash
pip install langchain-postgres langchain-openai psycopg
```
```python
from langchain_postgres import PGVector
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document

# Configure the embedding model
embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",
    openai_api_key="sk-...",
)

# NOTE: langchain-postgres uses psycopg3 (not psycopg2)
CONNECTION_STRING = "postgresql+psycopg://admin:secret@localhost:5432/vectordb"

# Create the vector store
vector_store = PGVector(
    embeddings=embeddings,
    collection_name="langchain_docs",
    connection=CONNECTION_STRING,
    use_jsonb=True,  # JSONB metadata for efficient filtering
)

# Add documents
docs = [
    Document(
        page_content="pgvector turns PostgreSQL into a vector database",
        metadata={"source": "tutorial.md", "chapter": 1},
    ),
    Document(
        page_content="HNSW is the recommended index for most use cases",
        metadata={"source": "tutorial.md", "chapter": 2},
    ),
    Document(
        page_content="LangChain simplifies building RAG systems",
        metadata={"source": "rag-guide.md", "chapter": 1},
    ),
]
vector_store.add_documents(docs)

# Similarity search
results = vector_store.similarity_search(
    "how to create a vector index",
    k=3,
)
for doc in results:
    print(f"[{doc.metadata['source']}] {doc.page_content[:80]}...")

# Search with scores
results_with_scores = vector_store.similarity_search_with_score(
    "what is pgvector",
    k=3,
)
for doc, score in results_with_scores:
    print(f"[Score: {score:.4f}] {doc.page_content[:80]}...")

# Use as a retriever in a RAG chain
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 5},
)

# The retriever can be used directly in LangChain chains
relevant_docs = retriever.invoke("install pgvector docker")
```
Note on psycopg3
The langchain-postgres package requires psycopg3 (not psycopg2).
Connection strings must use the postgresql+psycopg:// format instead of
postgresql+psycopg2://. If you are migrating from langchain-community,
update your connection string accordingly.
12. Best Practices for pgvector in Production
12.1 Vector Dimensions
Smaller vectors take less space, make indexes faster, and reduce costs. OpenAI
text-embedding-3-small (1536 dim) offers the best quality-to-size ratio
for most use cases. If you can tolerate a slight quality loss,
all-MiniLM-L6-v2 (384 dim) is an excellent free alternative.
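The storage impact of the dimension choice is easy to estimate: the `vector` type stores 4 bytes per float plus a small per-value header, and `halfvec` halves the per-float cost. A back-of-the-envelope calculator, assuming ~8 bytes of per-value overhead and ignoring row headers and index size:

```python
def vector_storage_mb(rows: int, dims: int, bytes_per_float: int = 4) -> float:
    """Approximate storage for the embedding column alone: per-value overhead
    (~8 bytes, an assumption) plus bytes_per_float * dims, times the row count."""
    per_row = 8 + bytes_per_float * dims
    return rows * per_row / (1024 * 1024)

# 1M embeddings: 1536-dim float32 vs 384-dim float32 vs 1536-dim halfvec (float16)
print(round(vector_storage_mb(1_000_000, 1536)))                     # ~5867 MB
print(round(vector_storage_mb(1_000_000, 384)))                      # ~1472 MB
print(round(vector_storage_mb(1_000_000, 1536, bytes_per_float=2)))  # ~2937 MB
```

The same ratios apply to index size and cache pressure, which is why dropping from 1536 to 384 dimensions speeds up far more than just disk usage.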
12.2 Batch Inserts
```python
from psycopg2.extras import execute_values

# WRONG: one INSERT at a time (slow)
for doc in documents:
    cur.execute(
        "INSERT INTO docs (content, embedding) VALUES (%s, %s)",
        (doc.content, str(doc.embedding)),
    )

# CORRECT: batch insert with execute_values (10-50x faster)
data = [
    (doc.content, str(doc.embedding))
    for doc in documents
]
execute_values(
    cur,
    "INSERT INTO docs (content, embedding) VALUES %s",
    data,
    page_size=500,  # rows per SQL batch
)
conn.commit()
```
12.3 Connection Pooling
In production, do not open a new PostgreSQL connection for every request. Use a connection
pool such as PgBouncer or the built-in pooling from psycopg2.
```python
from psycopg2 import pool

# Create a connection pool (at application startup)
connection_pool = pool.ThreadedConnectionPool(
    minconn=5,   # minimum connections
    maxconn=20,  # maximum connections
    host="localhost",
    database="vectordb",
    user="admin",
    password="secret_password",
)

# Use a connection from the pool
conn = connection_pool.getconn()
try:
    cur = conn.cursor()
    cur.execute("SELECT ... ORDER BY embedding <=> %s LIMIT 5", ...)
    results = cur.fetchall()
    cur.close()
finally:
    connection_pool.putconn(conn)  # return to the pool
```
12.4 Maintenance
```sql
-- VACUUM to reclaim space after DELETE/UPDATE
VACUUM ANALYZE documents;

-- Rebuild IVFFlat index after massive inserts
REINDEX INDEX CONCURRENTLY idx_documents_embedding_ivf;
-- NOTE: HNSW does not require REINDEX for new inserts

-- Monitor index size
SELECT
    indexname,
    pg_size_pretty(pg_relation_size(indexname::regclass)) AS size
FROM pg_indexes
WHERE tablename = 'documents';

-- Verify that the index is being used
EXPLAIN ANALYZE
SELECT id, 1 - (embedding <=> '[0.1, 0.2, ...]') AS similarity
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'
LIMIT 5;
-- Look for "Index Scan using idx_..._hnsw" in the output
```
Production Checklist
- Create an HNSW index with `m = 24` and `ef_construction = 128`
- Set `ef_search` between 40 and 200 based on the latency/accuracy trade-off
- Enable `hnsw.iterative_scan = relaxed_order` if you use WHERE filters
- Use batch inserts with `execute_values` for bulk loads
- Configure a connection pool (PgBouncer or psycopg2.pool)
- Schedule periodic `VACUUM ANALYZE`
- Monitor queries with `EXPLAIN ANALYZE` to verify index usage
- Use `halfvec` instead of `vector` to save 50% space (if float16 precision is sufficient)
Conclusions
pgvector has transformed PostgreSQL from a purely relational database into an AI-ready platform. With version 0.8.0 and its iterative index scans, the gap with dedicated vector databases has narrowed dramatically, and for most projects under 10 million vectors, there is no reason to add a separate service to your stack.
In this article, you learned how to install pgvector, create tables with vector columns, run similarity searches with three distance operators, generate embeddings with Python, configure HNSW and IVFFlat indexes, and integrate everything with LangChain. You also saw benchmarks confirming pgvector's production-grade performance.
In the next article, we will dive deep into embeddings: how they work internally, how to choose the right model, chunking strategies for long documents, and how to optimize search quality. In the third article, we will build a complete RAG pipeline with PostgreSQL as the sole backend.
Additional Resources
- pgvector GitHub: github.com/pgvector/pgvector - Official documentation and changelog
- pgvectorscale: github.com/timescale/pgvectorscale - Extension for advanced performance
- LangChain PGVector: python.langchain.com - Official Python integration
- Supabase Vector: supabase.com/docs/guides/ai - pgvector guide on Supabase
- Neon AI: neon.tech/docs/ai - pgvector on Neon serverless