PostgreSQL's built-in full-text search provides powerful text search capabilities without requiring external tools like Elasticsearch. It supports stemming, ranking, highlighting, phrase search, and custom dictionaries. This guide covers setting up full-text search from basic queries to production-ready configurations.
Full-Text Search Fundamentals
PostgreSQL full-text search works with two key data types:
- tsvector — a sorted list of lexemes (normalized words) extracted from a document
- tsquery — a search query composed of lexemes combined with boolean operators
-- Convert text to tsvector
SELECT to_tsvector('english', 'The quick brown foxes jumped over lazy dogs');
-- Result: 'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2
-- Convert search terms to tsquery
SELECT to_tsquery('english', 'quick & foxes');
-- Result: 'quick' & 'fox'
-- Basic match
SELECT to_tsvector('english', 'The quick brown fox') @@ to_tsquery('english', 'fox');
-- Result: true
Adding Full-Text Search to a Table
Option 1: Generated Column (PostgreSQL 12+, Recommended)
-- Add a tsvector column that auto-updates
ALTER TABLE articles ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(body, '')), 'B')
) STORED;
-- Create GIN index for fast searches
CREATE INDEX idx_articles_search ON articles USING GIN (search_vector);
Option 2: Trigger-Based (More Flexible)
-- Add column
ALTER TABLE articles ADD COLUMN search_vector tsvector;
-- Create trigger function
CREATE OR REPLACE FUNCTION articles_search_trigger() RETURNS trigger AS $$
BEGIN
NEW.search_vector :=
setweight(to_tsvector('english', coalesce(NEW.title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(NEW.summary, '')), 'B') ||
setweight(to_tsvector('english', coalesce(NEW.body, '')), 'C') ||
setweight(to_tsvector('english', coalesce(NEW.tags, '')), 'A');
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trig_articles_search
BEFORE INSERT OR UPDATE ON articles
FOR EACH ROW
EXECUTE FUNCTION articles_search_trigger();
-- Populate existing rows
UPDATE articles SET search_vector =
setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(body, '')), 'B');
-- Create index
CREATE INDEX idx_articles_search ON articles USING GIN (search_vector);
Search Queries
Basic Search
SELECT title, ts_rank(search_vector, query) AS rank
FROM articles, to_tsquery('english', 'postgresql & replication') AS query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT 20;
Phrase Search
-- Exact phrase (words must be adjacent)
SELECT title FROM articles
WHERE search_vector @@ phraseto_tsquery('english', 'full text search');
-- Words within N positions of each other
SELECT title FROM articles
WHERE search_vector @@ to_tsquery('english', 'full search');
Prefix Matching (Autocomplete)
SELECT title FROM articles
WHERE search_vector @@ to_tsquery('english', 'postg:*');
-- Matches: postgresql, postgres, postgis, etc.
Websearch-Style Queries (PostgreSQL 11+)
-- Allows natural search syntax
SELECT title FROM articles
WHERE search_vector @@ websearch_to_tsquery('english', 'docker compose -swarm');
-- Equivalent to: 'docker' & 'compos' & !'swarm'
SELECT title FROM articles
WHERE search_vector @@ websearch_to_tsquery('english', '"exact phrase" OR alternative');
Ranking and Highlighting
SELECT
title,
ts_rank_cd(search_vector, query, 32) AS rank, -- 32 = normalize by document length
ts_headline('english', body, query,
'StartSel=, StopSel=, MaxFragments=3, FragmentDelimiter= ... '
) AS snippet
FROM articles, websearch_to_tsquery('english', 'kubernetes deployment') AS query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT 10;
Weight Configuration
PostgreSQL supports four weight levels (A, B, C, D) that influence ranking:
-- Default weights: A=1.0, B=0.4, C=0.2, D=0.1
-- Custom weights: {D, C, B, A} array
SELECT ts_rank(search_vector, query, 0) AS default_rank,
ts_rank('{0.1, 0.2, 0.8, 1.0}', search_vector, query) AS custom_rank
FROM articles, to_tsquery('english', 'search') AS query
WHERE search_vector @@ query;
Custom Dictionaries and Configurations
-- Create a synonym dictionary
CREATE TEXT SEARCH DICTIONARY synonyms (
TEMPLATE = synonym,
SYNONYMS = my_synonyms -- references /usr/share/postgresql/tsearch_data/my_synonyms.syn
);
-- Create custom text search configuration
CREATE TEXT SEARCH CONFIGURATION custom_english (COPY = english);
ALTER TEXT SEARCH CONFIGURATION custom_english
ALTER MAPPING FOR asciiword WITH synonyms, english_stem;
Performance Optimization
-- Use GIN index (preferred for most cases)
CREATE INDEX idx_search_gin ON articles USING GIN (search_vector);
-- GiST index (smaller, good for frequently updated tables)
CREATE INDEX idx_search_gist ON articles USING GiST (search_vector);
-- Partial index for published articles only
CREATE INDEX idx_search_published ON articles USING GIN (search_vector)
WHERE is_published = true;
-- Check index usage
EXPLAIN (ANALYZE, BUFFERS)
SELECT title FROM articles WHERE search_vector @@ to_tsquery('english', 'postgresql');
Full-Text Search vs. LIKE vs. Trigram
- Full-text search — best for natural language search with stemming and ranking; understands word forms
- LIKE/ILIKE — simple pattern matching; no indexing support for leading wildcards
- pg_trgm (trigram) — best for fuzzy matching and typo tolerance; use with GIN/GiST indexes
-- Combine full-text search with trigram similarity for typo tolerance
CREATE EXTENSION pg_trgm;
CREATE INDEX idx_title_trgm ON articles USING GIN (title gin_trgm_ops);
SELECT title, similarity(title, 'postgrsql') AS sim
FROM articles
WHERE title % 'postgrsql' -- trigram similarity > threshold
ORDER BY sim DESC;
Production Best Practices
- Always use a stored tsvector column with a GIN index — never compute tsvector on the fly in WHERE clauses
- Use weights (A-D) to boost title matches over body matches in search results
- Use
websearch_to_tsqueryfor user-facing search — it handles syntax errors gracefully - Consider
unaccentextension for accent-insensitive search - For large datasets (millions of rows), partition the table and create per-partition indexes
- Monitor index size and VACUUM frequency — GIN indexes can bloat with frequent updates