It's Friday at 6:30 PM. Your query takes 4 seconds to respond in prod. The client sent you a screenshot of their loading screen. Your phone is ringing. There's a cold beer waiting for you. That's the context in which you're going to debug a slow PostgreSQL query.
Good news: 80% of PostgreSQL performance problems have an identifiable cause in under 10 minutes with the right tools. This guide follows the logical order of a real investigation — not an exhaustive PG feature catalog, but a method that works.
EXPLAIN vs EXPLAIN ANALYZE: one predicts, the other executes
The classic first mistake: using EXPLAIN alone and wondering why
the numbers don't match what you observe in prod.
EXPLAIN displays the execution plan PostgreSQL thinks
it will use, based on the table statistics it has collected. It doesn't execute the query.
Costs are estimates, rows are predictions.
EXPLAIN ANALYZE actually executes the query and enriches the plan
with measured values: real time, actual rows processed, number of loops.
That's what you want. Warning: it actually runs the query — don't
run it on a DELETE or UPDATE without a BEGIN/ROLLBACK.
-- Safe for write queries
BEGIN;
EXPLAIN ANALYZE
UPDATE orders SET status = 'processed' WHERE created_at < NOW() - INTERVAL '30 days';
ROLLBACK;
-- For a SELECT, no special precautions needed
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.email, COUNT(o.id) AS order_count
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.created_at >= '2025-01-01'
GROUP BY u.email
ORDER BY order_count DESC
LIMIT 20;
The BUFFERS option is valuable: it shows how many blocks were read
from cache (shared hit) vs from disk (read). A query with
lots of disk reads on a table that should be cached is a strong signal.
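For illustration, here is what a Buffers line can look like on a plan node (the numbers are invented for the example):

```sql
-- Hypothetical EXPLAIN (ANALYZE, BUFFERS) node:
--   Seq Scan on orders (actual time=0.35..2210.44 rows=184000 loops=1)
--     Buffers: shared hit=1203 read=48211
--
-- shared hit  = blocks found in PostgreSQL's cache (shared_buffers)
-- shared read = blocks fetched outside it (OS cache or disk)
-- Here ~97% of blocks were read: either the table doesn't fit in
-- the cache, or something evicted it. Both are worth investigating.
```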
What to look for in the output
Three things to spot immediately:
- Seq Scan on a large table: PostgreSQL scans every row one by one. On a 50,000-row table, this is often acceptable. On 10 million, it's not.
- Large gap between rows estimated and rows actual: if PostgreSQL expected to filter 50 rows and processed 80,000, its statistics are stale or the query is poorly written.
- Node with disproportionate time: the plan is a tree. The slowest node is the culprit. The displayed time is cumulative — subtract the time of child nodes to isolate a node's own cost.
-- Example of problematic EXPLAIN ANALYZE output
Seq Scan on orders (cost=0.00..48520.00 rows=2000 width=120)
(actual time=0.042..3841.223 rows=1847291 loops=1)
Filter: (status = 'pending' AND created_at > '2024-01-01')
Rows Removed by Filter: 152709
Planning Time: 1.2 ms
Execution Time: 4102.8 ms
-- What we see:
-- 1. Seq Scan on "orders" → no index used
-- 2. rows estimated = 2,000, rows actual = 1,847,291 → catastrophic statistics
-- 3. 4 seconds → matches exactly what the client is seeing
The usual suspects
1. Missing index on a filtered column
The most common case. An orders table with 2 million rows,
a query filtering on user_id, no index. PostgreSQL does a full Seq Scan.
The fix takes 30 seconds.
-- The slow query
SELECT id, total, status
FROM orders
WHERE user_id = 42
ORDER BY created_at DESC;
-- EXPLAIN ANALYZE shows:
-- Seq Scan on orders (actual time=0.031..2341.5 rows=47 loops=1)
-- Filter: (user_id = 42)
-- Rows Removed by Filter: 1999953
-- The fix
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);
-- CONCURRENTLY: creates the index without locking the table for writes
-- After creation, the same query:
-- Index Scan using idx_orders_user_id on orders
-- (actual time=0.041..0.189 rows=47 loops=1)
-- 2.3 seconds → 0.2 ms
2. Index exists but isn't used
A trickier situation: the index exists, but PostgreSQL doesn't use it. Three main reasons.
Function in the WHERE clause: the moment you apply a function to an indexed column, the index is unusable — PostgreSQL can't scan the index on the transformed value.
-- Index on email, but unused
CREATE INDEX idx_users_email ON users(email);
-- The index does nothing here
SELECT * FROM users WHERE LOWER(email) = 'alice@example.com';
-- Expression index — solves the problem
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- Or rewrite the query if data is already lowercase
SELECT * FROM users WHERE email = 'alice@example.com';
LIKE with leading wildcard: LIKE '%dupont' can't
use a B-tree index, because it requires scanning all values to find those that
end with "dupont". LIKE 'dupont%', on the other hand, can.
-- Guaranteed Seq Scan, even with an index on last_name
SELECT * FROM customers WHERE last_name LIKE '%martin';
-- Uses the B-tree index
SELECT * FROM customers WHERE last_name LIKE 'martin%';
-- For "contains" searches, use pg_trgm + GIN
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_customers_last_name_trgm
ON customers USING GIN(last_name gin_trgm_ops);
-- Now this LIKE uses the index
SELECT * FROM customers WHERE last_name LIKE '%martin%';
Low cardinality: if a column has very few distinct values
(e.g. status with 3 possible values across 2 million rows), PostgreSQL
may estimate that a Seq Scan is faster than an Index Scan followed by 600,000 random
heap lookups. It's often right. The solution: a partial index.
-- 2 million orders, 95% have status = 'completed', 5% status = 'pending'
-- A simple index on status is not very useful for 'completed'
-- But very useful for 'pending' (100,000 rows out of 2 million)
-- Partial index: only indexes pending rows
CREATE INDEX idx_orders_pending
ON orders(created_at)
WHERE status = 'pending';
-- This query now uses the partial index
SELECT id, user_id, total
FROM orders
WHERE status = 'pending'
AND created_at > NOW() - INTERVAL '7 days'
ORDER BY created_at DESC;
3. The N+1 problem via ORM
Not specific to PostgreSQL, but PostgreSQL suffers from it as much as any other database. The ORM loads 100 articles, then for each article makes a separate query to load the author. Result: 101 queries instead of one.
-- What the ORM generates (N+1)
SELECT * FROM articles LIMIT 100;
SELECT * FROM users WHERE id = 1;
SELECT * FROM users WHERE id = 2;
-- ... 98 more times
-- What it should do
SELECT a.id, a.title, a.body, a.published_at,
u.id AS author_id, u.name AS author_name, u.avatar_url
FROM articles a
JOIN users u ON u.id = a.author_id
ORDER BY a.published_at DESC
LIMIT 100;
Detection happens in the logs or via pg_stat_statements (see below):
dozens of identical queries with just one parameter changing, executed in rapid succession.
In Go: plain SQL with a single JOIN. In Laravel: eager loading with with(). In Django: select_related or prefetch_related.
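As a starting point, a query like this against pg_stat_statements (covered below) surfaces N+1 suspects; the thresholds are arbitrary and worth tuning to your workload:

```sql
-- Statements executed very often with a tiny mean time:
-- the classic N+1 signature. Thresholds are arbitrary starting points.
SELECT LEFT(query, 100) AS query_short,
       calls,
       ROUND(mean_exec_time::numeric, 2) AS mean_ms,
       ROUND(total_exec_time::numeric, 2) AS total_ms
FROM pg_stat_statements
WHERE calls > 10000
  AND mean_exec_time < 5
ORDER BY calls DESC
LIMIT 10;
```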
4. Stale statistics
PostgreSQL maintains statistics on the data distribution of each table (collected by ANALYZE, which autovacuum runs automatically). If these statistics are outdated, say after a large import or a bulk DELETE, the planner makes bad decisions. Symptom: a large gap between rows estimated and rows actual in EXPLAIN ANALYZE.
-- Update statistics without blocking writes
ANALYZE orders;
-- After a bulk load or data purge
VACUUM ANALYZE orders;
-- Check statistics freshness (the view exposes table names as relname)
SELECT schemaname, relname, last_analyze, last_autoanalyze, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'orders';
5. Too many rows returned
Sometimes the query is correctly indexed but returns 50,000 rows when you're
displaying 20. Add LIMIT and paginate properly. For sorted lists
on large volumes, offset-based pagination itself becomes a problem beyond
a few thousand pages: OFFSET 100000 LIMIT 20 forces PG to scan
100,020 rows to return 20.
-- Offset pagination — slow on large pages
SELECT id, title, published_at
FROM articles
ORDER BY published_at DESC
LIMIT 20 OFFSET 10000;
-- Cursor pagination (keyset pagination) — fast regardless of page number
SELECT id, title, published_at
FROM articles
WHERE published_at < '2024-06-15 10:23:00' -- last value from the previous page
ORDER BY published_at DESC
LIMIT 20;
Choosing the right index type
PostgreSQL offers several index types. In practice, you use three of them.
B-tree (default)
Covers 95% of cases: equality, inequality, sorting, range queries. It's the default
when you write CREATE INDEX without specifying a type. Effective on
high-cardinality columns (identifiers, emails, dates, amounts).
CREATE INDEX idx_orders_created_at ON orders(created_at);
CREATE INDEX idx_users_email ON users(email);
-- Composite index: column order matters.
-- Useful for queries that filter on user_id AND sort on created_at.
CREATE INDEX idx_orders_user_created ON orders(user_id, created_at DESC);
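One caveat with composite indexes: a B-tree is only usable when the query constrains a leftmost prefix of its columns. A quick sketch against the index above:

```sql
-- Can use idx_orders_user_created: filters on the leading column
SELECT id, total FROM orders
WHERE user_id = 42
ORDER BY created_at DESC;

-- Cannot use it efficiently: created_at is not the leading column.
-- A range query like this needs its own index on (created_at).
SELECT id, total FROM orders
WHERE created_at > NOW() - INTERVAL '1 day';
```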
GIN — Full-text search and JSONB
For searches in text (tsvector), arrays, or
JSONB columns. A B-tree index on a JSONB field doesn't help for queries
like WHERE metadata @> '{"role": "admin"}'.
-- GIN index for JSONB queries
CREATE INDEX idx_users_metadata ON users USING GIN(metadata);
-- Now efficient
SELECT * FROM users WHERE metadata @> '{"role": "admin", "active": true}';
-- GIN index for full-text search
ALTER TABLE articles ADD COLUMN search_vector tsvector;
UPDATE articles SET search_vector =
to_tsvector('english', COALESCE(title, '') || ' ' || COALESCE(body, ''));
CREATE INDEX idx_articles_search ON articles USING GIN(search_vector);
SELECT title FROM articles
WHERE search_vector @@ plainto_tsquery('english', 'postgresql optimization index');
BRIN — Sequential timestamps
BRIN (Block Range INdex) is tiny in size and highly efficient for columns whose values grow naturally with insertion: timestamps, sequential IDs. The principle: instead of indexing each value, it stores min/max values per data block. On a log table of several hundred GB, it can drastically reduce query time with an index of just a few MB.
-- Log table with 500 million rows.
-- A B-tree index on created_at would weigh ~15 GB.
-- A BRIN index weighs a few MB.
CREATE INDEX idx_logs_created_at_brin ON access_logs USING BRIN(created_at);
-- Efficient for time range queries
SELECT path, COUNT(*) AS hits
FROM access_logs
WHERE created_at >= '2026-02-01' AND created_at < '2026-03-01'
GROUP BY path
ORDER BY hits DESC;
pg_stat_statements: finding expensive queries in production
Up to now we knew which query to analyze. In production, you often don't.
pg_stat_statements accumulates statistics on all executed queries:
total time, call count, mean time. This is where you find
queries that run 10,000 times per hour and take 200ms each — less visible
than a 4-second query, but far more impactful in aggregate.
-- Enable the extension (requires a PostgreSQL restart)
-- In postgresql.conf: shared_preload_libraries = 'pg_stat_statements'
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- The 10 most expensive queries by total time
SELECT
LEFT(query, 100) AS query_short,
calls,
ROUND(total_exec_time::numeric, 2) AS total_ms,
ROUND(mean_exec_time::numeric, 2) AS mean_ms,
ROUND(stddev_exec_time::numeric, 2) AS stddev_ms,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
-- Queries with the highest standard deviation (time instability)
-- Possible symptom: different plan depending on parameters, or random cache miss
SELECT
LEFT(query, 100) AS query_short,
calls,
ROUND(mean_exec_time::numeric, 2) AS mean_ms,
ROUND(stddev_exec_time::numeric, 2) AS stddev_ms
FROM pg_stat_statements
WHERE calls > 100
ORDER BY stddev_exec_time DESC
LIMIT 10;
-- Reset stats (useful after a deployment or optimization)
SELECT pg_stat_statements_reset();
pg_stat_statements doesn't interrupt service, and its overhead is low
in typical workloads. There are few good reasons not to enable it in production
on all your PostgreSQL clusters.
Quick tips
Force PostgreSQL to show the plan with an index
Sometimes PostgreSQL chooses a Seq Scan even though an index exists, and you want to see
what the plan would look like with the index. SET enable_seqscan = off makes
sequential scans look prohibitively expensive to the planner for the current session,
which pushes it toward index scans. Useful for diagnosis, not as a permanent fix.
-- In your session only — no global change
SET enable_seqscan = off;
EXPLAIN ANALYZE
SELECT * FROM orders WHERE user_id = 42;
-- Restore the default
RESET enable_seqscan;
-- Interpretation:
-- If the plan with index is faster → create the index or run ANALYZE (stale stats).
-- If the plan with index is slower → PostgreSQL was right to do the Seq Scan.
-- Perhaps the query returns too many rows for an index to help.
work_mem for sorts and hash joins
When you see Sort Method: external merge Disk in EXPLAIN ANALYZE,
PostgreSQL ran out of memory and had to write to disk for sorting. Increasing
work_mem for the session can fix the problem immediately.
-- Default work_mem = 4MB — insufficient for large aggregations
SET work_mem = '64MB';
EXPLAIN ANALYZE
SELECT user_id, SUM(total) AS revenue
FROM orders
WHERE created_at >= '2025-01-01'
GROUP BY user_id
ORDER BY revenue DESC;
-- With sufficient work_mem:
-- Sort Method: quicksort Memory: 51200kB ← all in RAM, fast
-- Without sufficient work_mem:
-- Sort Method: external merge Disk: 48MB ← writes to disk, slow
-- Warning: don't increase work_mem globally without calculation.
-- It applies PER sort operation AND per concurrent connection.
-- 100 connections × 5 sorts × 64MB = potentially 32 GB consumed.
-- Reserve it for sessions that need it, or adjust max_connections.
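To scope the increase even more tightly, SET LOCAL confines it to the current transaction; the setting reverts automatically at COMMIT or ROLLBACK:

```sql
BEGIN;
SET LOCAL work_mem = '64MB';  -- applies only inside this transaction
SELECT user_id, SUM(total) AS revenue
FROM orders
WHERE created_at >= '2025-01-01'
GROUP BY user_id
ORDER BY revenue DESC;
COMMIT;  -- work_mem reverts to its previous value here
```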
VACUUM ANALYZE after a bulk load
-- After an import or bulk DELETE
VACUUM ANALYZE orders;
-- VACUUM reclaims space from dead tuples (after UPDATE/DELETE)
-- ANALYZE updates planner statistics
-- Both together: essential after any large data modification
-- Check table bloat
SELECT n_dead_tup, n_live_tup,
ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct
FROM pg_stat_user_tables
WHERE relname = 'orders';
-- If dead_pct > 10-20%, a VACUUM is needed
The method in summary
When a PostgreSQL query is slow, here's the investigation order that resolves the vast majority of cases:
- EXPLAIN (ANALYZE, BUFFERS) on the offending query. Identify the slowest node. Look for Seq Scan on a large table, gap between rows estimated / actual.
- Missing index? Create it with CONCURRENTLY in production. Verify the query is written to benefit from it: no function on the indexed column, no LIKE '%...'.
- Stale statistics? VACUUM ANALYZE on the affected table.
- In production with no specific query to target? pg_stat_statements to identify offenders by total time or call count.
80% of PostgreSQL performance problems are solved with EXPLAIN ANALYZE and a well-placed index. The remaining 20% involve configuration tuning (shared_buffers, work_mem, max_connections), rewriting complex queries, or table partitioning — but that's a topic for another article.
The beer can wait another 10 minutes. The client can't.