SQL – Query Optimisation
A query that works is not the same as a query that performs. As data volumes grow, poorly written queries become the primary source of application slowness, database overload, and infrastructure cost.
Profile before optimising
Use EXPLAIN ANALYZE before and after any optimisation to verify the actual improvement. The query planner sometimes surprises you — what looks inefficient may already be optimised, and what looks efficient may not be.
Understanding Query Execution Order
Understanding how the database processes a query prevents common mistakes — filtering too late, selecting too much, or ordering before filtering:
1. FROM + JOIN → identify tables and produce the working set
2. WHERE → filter rows (run as early as possible)
3. GROUP BY → aggregate the filtered rows
4. HAVING → filter the aggregated groups
5. SELECT → choose which columns to return
6. DISTINCT → remove duplicates (if present)
7. ORDER BY → sort the result set
8. LIMIT/OFFSET → slice the result
The practical implication: filters in WHERE are applied before SELECT. You cannot use a SELECT alias in a WHERE clause. Filter on aggregate results with HAVING, not WHERE.
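A quick illustration, using the orders table that appears throughout this section:
-- ❌ Fails — the alias total_spent does not exist yet when WHERE runs
SELECT user_id, SUM(total_amount) AS total_spent
FROM orders
WHERE total_spent > 100
GROUP BY user_id;
-- ✅ Filter on the aggregate with HAVING instead
SELECT user_id, SUM(total_amount) AS total_spent
FROM orders
GROUP BY user_id
HAVING SUM(total_amount) > 100;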
Finding Slow Queries in Production
Before optimising anything, identify what is actually slow. Never guess — measure.
pg_stat_statements
Enable pg_stat_statements in your PostgreSQL config to track cumulative query statistics across all executions:
-- Enable in postgresql.conf (requires restart)
-- shared_preload_libraries = 'pg_stat_statements'
-- Then in your database
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- ✅ Find the top 10 slowest queries by total execution time
SELECT
ROUND(total_exec_time::numeric, 2) AS total_ms,
ROUND(mean_exec_time::numeric, 2) AS avg_ms,
calls,
ROUND(stddev_exec_time::numeric, 2) AS stddev_ms,
rows,
LEFT(query, 120) AS query_preview
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
-- ✅ Find queries with the highest average execution time (worst single calls)
SELECT
ROUND(mean_exec_time::numeric, 2) AS avg_ms,
calls,
ROUND(total_exec_time::numeric, 2) AS total_ms,
LEFT(query, 120) AS query_preview
FROM pg_stat_statements
WHERE calls > 10 -- ignore one-off queries
ORDER BY mean_exec_time DESC
LIMIT 10;
-- ✅ Find queries reading the most data from disk (often a sign of missing indexes)
-- pg_stat_statements records query text, not plans, so order by block reads
SELECT
ROUND(mean_exec_time::numeric, 2) AS avg_ms,
calls,
shared_blks_read,
LEFT(query, 120) AS query_preview
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 10;
-- Reset statistics after a schema change or index addition
SELECT pg_stat_statements_reset();
Identifying Missing Indexes
-- ✅ Tables with the most sequential scans — candidates for new indexes
SELECT
schemaname,
relname AS table_name,
seq_scan,
seq_tup_read,
idx_scan,
seq_tup_read / seq_scan AS avg_rows_per_scan
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 20;
Indexing
An index on the right column turns a full table scan into a targeted lookup.
Create Indexes on High-Frequency Filter Columns
-- ✅ Index on the FK column — every join on user_id benefits
CREATE INDEX orders_user_id_idx ON orders(user_id);
-- ✅ Index on a frequently filtered status column
CREATE INDEX orders_status_idx ON orders(order_status);
-- ✅ Composite index — covers queries filtering by status then sorted by date
CREATE INDEX orders_status_created_at_idx
ON orders(order_status, created_at DESC);
-- ✅ Partial index — only indexes rows matching the condition
-- Far smaller and faster than a full index when most rows are in other states
CREATE INDEX orders_pending_created_at_idx
ON orders(created_at)
WHERE order_status = 'pending';
-- ✅ Covering index — includes extra columns to enable index-only scans
-- The query SELECT id, total_amount WHERE user_id = X can be answered
-- entirely from the index without touching the table
CREATE INDEX orders_user_id_covering_idx
ON orders(user_id)
INCLUDE (id, total_amount, order_status);
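To confirm a covering index is doing its job, EXPLAIN the lookup described in the comment above — once the visibility map is up to date (after a VACUUM), the plan should show Index Only Scan:
-- Served entirely from orders_user_id_covering_idx — no heap access
SELECT id, total_amount
FROM orders
WHERE user_id = $1;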
Composite Index Column Order Matters
Column order determines which queries an index can serve: an index is used only when the query constrains a leading prefix of its columns. Put equality-filter columns first, then range or sort columns. A composite index on (order_status, created_at) helps queries that filter on order_status. It does not help queries that filter only on created_at:
-- This index helps: WHERE order_status = 'pending' ORDER BY created_at
-- This index helps: WHERE order_status = 'pending'
-- This does NOT help: WHERE created_at > '2026-01-01' alone
CREATE INDEX orders_status_created_at_idx ON orders(order_status, created_at);
-- For queries on created_at alone, a separate index is needed:
CREATE INDEX orders_created_at_idx ON orders(created_at DESC);
Building Indexes Without Locking
In production, create indexes with CONCURRENTLY — a plain CREATE INDEX blocks all writes to the table for the duration of the build:
-- ✅ Non-blocking — takes longer, but writes continue during the build
-- Note: cannot run inside a transaction block; a failed build leaves an
-- INVALID index behind that must be dropped manually
CREATE INDEX CONCURRENTLY orders_user_id_idx ON orders(user_id);
-- ❌ Without CONCURRENTLY — blocks all writes for the duration (dangerous on large tables)
CREATE INDEX orders_user_id_idx ON orders(user_id);
Don't Over-Index
Every index slows down INSERT, UPDATE, and DELETE operations — the database must maintain all indexes on every write. Review indexes after adding them and remove unused ones:
-- Find indexes that have never been used
SELECT
schemaname,
tablename,
indexname,
idx_scan AS times_used,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexname NOT LIKE '%pkey'
ORDER BY pg_relation_size(indexrelid) DESC;
Avoid SELECT *
SELECT * fetches every column — including large TEXT/JSONB columns you do not need. It increases I/O, network transfer, and memory usage, and it prevents index-only scans.
-- ❌ Fetches all columns including blob data
SELECT * FROM products WHERE category_id = $1;
-- ✅ Fetch only what you need — enables index-only scans on covering indexes
SELECT id, name, price, stock_quantity
FROM products
WHERE category_id = $1
AND is_active = TRUE;
Use SARGable Predicates
A predicate is SARGable when the database can use an index to evaluate it. Wrapping an indexed column in a function makes it non-SARGable — the index is bypassed.
-- ❌ Non-SARGable — function on the indexed column; full table scan
WHERE LOWER(email) = 'alice@example.com'
WHERE YEAR(created_at) = 2026 -- (MySQL)
WHERE EXTRACT(YEAR FROM created_at) = 2026 -- (PostgreSQL)
WHERE CAST(user_id AS TEXT) = '123'
WHERE total_amount + 10 > 100
-- ✅ SARGable equivalents — index can be used
WHERE email = LOWER('Alice@Example.com') -- normalise the input, not the column (assumes emails are stored lowercase)
WHERE created_at >= '2026-01-01' AND created_at < '2027-01-01' -- half-open range; never misses sub-second timestamps
WHERE user_id = 123 -- compare same type, no cast
WHERE total_amount > 90 -- do the arithmetic on the constant, not the column
-- ✅ For case-insensitive search — use a functional index
CREATE INDEX users_email_lower_idx ON users(LOWER(email));
-- Now SARGable:
WHERE LOWER(email) = LOWER($1);
The N+1 Query Problem
N+1 is the most common application-level database performance problem. It occurs when one query fetches a list of N records, then N additional queries fetch related data for each record.
-- ❌ Application code causing N+1 (pseudocode)
-- 1 query to get orders:
SELECT id, user_id FROM orders WHERE order_status = 'confirmed' LIMIT 20;
-- Then for each of 20 orders — 20 more queries:
SELECT * FROM users WHERE id = $order.user_id; -- runs 20 times
-- Total: 21 queries instead of 1
-- ✅ Fix with a JOIN — single query
SELECT
o.id AS order_id,
o.total_amount,
u.id AS user_id,
u.email AS user_email,
u.first_name
FROM orders o
JOIN users u ON u.id = o.user_id
WHERE o.order_status = 'confirmed'
ORDER BY o.created_at DESC
LIMIT 20;
-- ✅ Fix with IN clause for ORM-friendly eager loading
-- First query — get the order IDs
SELECT id, user_id, total_amount FROM orders WHERE order_status = 'confirmed' LIMIT 20;
-- Second query — load all related users at once
SELECT id, email, first_name FROM users WHERE id IN (1, 2, 3, ..., 20);
-- Total: 2 queries regardless of the number of orders
When using an ORM, always use eager loading (include/with/JOIN FETCH) for relationships you know you will access. Never load relationships lazily in a loop.
Avoid SELECT DISTINCT as a Bug Fix
DISTINCT is often used to hide a bug — duplicate rows caused by a missing JOIN condition or incorrect aggregation:
-- ❌ DISTINCT masking a JOIN problem
SELECT DISTINCT o.id, o.total_amount
FROM orders o
JOIN order_items oi ON oi.order_id = o.id;
-- Returns duplicates because each order has multiple items
-- DISTINCT hides this instead of fixing the underlying JOIN
-- ✅ Fix the query — use EXISTS if you only need the order row
SELECT o.id, o.total_amount
FROM orders o
WHERE EXISTS (
SELECT 1 FROM order_items oi WHERE oi.order_id = o.id
);
Filter Early — Push WHERE Clauses Down
Apply the most selective filters as early as possible to shrink the working set before joining, grouping, or sorting. The PostgreSQL planner pushes simple predicates down automatically, so the two forms below usually produce the same plan — but writing the reduction explicitly keeps the intent clear and protects complex queries where pushdown fails:
-- Filters written after the join — the planner still applies them before joining
SELECT o.id, o.total_amount, u.email
FROM orders o
JOIN users u ON u.id = o.user_id
WHERE o.order_status = 'confirmed'
AND o.created_at >= '2026-01-01';
-- ✅ Same filters expressed via CTE — the reduced join input is explicit
WITH recent_confirmed AS (
SELECT id, user_id, total_amount
FROM orders
WHERE order_status = 'confirmed'
AND created_at >= '2026-01-01'
)
SELECT rc.id, rc.total_amount, u.email
FROM recent_confirmed rc
JOIN users u ON u.id = rc.user_id;
LIMIT and Keyset Pagination
Always add LIMIT to queries on large tables in application code. An unbounded query is a DoS vector.
-- ❌ No LIMIT — potentially returns millions of rows
SELECT id, email, created_at FROM users ORDER BY created_at DESC;
-- ✅ Offset pagination — correct but gets slower as offset grows
SELECT id, email, created_at
FROM users
ORDER BY created_at DESC
LIMIT $1 -- page_size (e.g. 20)
OFFSET $2; -- (page - 1) * page_size
-- ✅ Keyset pagination — constant performance regardless of page number
-- First page — include id as a deterministic tie-breaker:
SELECT id, created_at FROM orders ORDER BY created_at DESC, id DESC LIMIT 20;
-- Subsequent pages — use the last row's values as the cursor.
-- The tuple comparison avoids skipping rows that share a created_at;
-- an index on (created_at, id) keeps every page fetch index-driven:
SELECT id, created_at
FROM orders
WHERE (created_at, id) < ($last_seen_created_at, $last_seen_id)
ORDER BY created_at DESC, id DESC
LIMIT 20;
IN vs EXISTS
-- ✅ EXISTS — short-circuits on first match, efficient for large tables
SELECT id, email
FROM users u
WHERE EXISTS (
SELECT 1 FROM orders o
WHERE o.user_id = u.id
AND o.order_status = 'confirmed'
);
-- ✅ IN with a small static list — fine
WHERE order_status IN ('pending', 'confirmed', 'shipped')
-- ❌ NOT IN with NULLs — returns no rows if the subquery contains ANY NULL
WHERE u.id NOT IN (SELECT manager_id FROM employees);
-- If any manager_id is NULL, this returns ZERO rows
-- ✅ Safe alternative to NOT IN — NOT EXISTS handles NULLs correctly
WHERE NOT EXISTS (
SELECT 1 FROM employees e WHERE e.manager_id = u.id
);
Window Functions vs Correlated Subqueries
Window functions are almost always faster than correlated subqueries for row-level calculations alongside aggregates:
-- ❌ Correlated subquery — executes once per row, very slow on large tables
SELECT
o.id,
o.user_id,
o.total_amount,
(
SELECT SUM(total_amount)
FROM orders o2
WHERE o2.user_id = o.user_id
) AS user_lifetime_total
FROM orders o;
-- ✅ Window function — single pass over the data
SELECT
id,
user_id,
total_amount,
SUM(total_amount) OVER (PARTITION BY user_id) AS user_lifetime_total,
RANK() OVER (PARTITION BY user_id ORDER BY total_amount DESC) AS order_rank
FROM orders;
-- ✅ Running total with a window frame
SELECT
id,
created_at,
total_amount,
SUM(total_amount) OVER (
PARTITION BY user_id
ORDER BY created_at
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total
FROM orders;
Batch Inserts
Insert multiple rows in a single statement. One-at-a-time inserts create N transactions and N network round-trips:
-- ❌ One-at-a-time inserts
INSERT INTO products (sku, name, price) VALUES ('SKU-001', 'Widget A', 9.99);
INSERT INTO products (sku, name, price) VALUES ('SKU-002', 'Widget B', 14.99);
-- ✅ Batch insert — single transaction, single round-trip
INSERT INTO products (sku, name, price)
VALUES
('SKU-001', 'Widget A', 9.99),
('SKU-002', 'Widget B', 14.99),
('SKU-003', 'Widget C', 19.99);
-- ✅ UPSERT — insert or update atomically (no separate SELECT + INSERT)
INSERT INTO product_inventory (product_id, warehouse_id, quantity)
VALUES ($1, $2, $3)
ON CONFLICT (product_id, warehouse_id)
DO UPDATE SET
quantity = product_inventory.quantity + EXCLUDED.quantity,
updated_at = NOW();
-- ✅ COPY for bulk loads of thousands of rows — fastest option
-- (reads a file on the database server; from a client, use psql's \copy instead)
COPY products (sku, name, price)
FROM '/path/to/products.csv'
WITH (FORMAT csv, HEADER true);
EXPLAIN ANALYZE — Verify Before Deploying
Always run EXPLAIN ANALYZE on any new query on a table with significant data before deploying:
-- Full analysis — run on a representative dataset in staging
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT
o.id,
o.total_amount,
u.email
FROM orders o
JOIN users u ON u.id = o.user_id
WHERE o.order_status = 'confirmed'
AND o.created_at >= NOW() - INTERVAL '30 days';
What to look for:
| Term | Meaning | Action |
|---|---|---|
| Seq Scan | Full table scan — every row read | Add an index on the filter column |
| Index Scan | Used an index | Verify it is the right index |
| Index Only Scan | Served entirely from the index | Best outcome — no action |
| Nested Loop | OK for small sets, slow for large | Check index coverage |
| Hash Join | Efficient for large joins | Normal — no action |
| High rows= estimate | Large intermediate set | Add earlier filters or a partial index |
| Buffers: shared hit=N | Pages read from the shared buffer cache | High is good |
| Buffers: shared read=N | Pages read from disk | High indicates cache misses — check shared_buffers |
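For orientation, a hypothetical fragment of EXPLAIN ANALYZE output showing the classic missing-index signature — a Seq Scan whose filter discards almost every row it reads:
Seq Scan on orders  (cost=0.00..4581.00 rows=1200 width=16) (actual time=0.021..45.310 rows=1180 loops=1)
  Filter: (order_status = 'confirmed')
  Rows Removed by Filter: 98820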
-- Refresh statistics when the query plan looks wrong
ANALYZE orders;
ANALYZE; -- updates all tables in the database
Connection Pooling
Every opened database connection consumes memory (~5–10 MB per connection in PostgreSQL). Applications that open a new connection per request will exhaust the database at moderate load.
Use PgBouncer (or equivalent) in front of PostgreSQL for all production services:
# pgbouncer.ini — example configuration
[databases]
cygnus_production = host=db.internal port=5432 dbname=cygnus_production
[pgbouncer]
pool_mode = transaction # transaction pooling — recommended for most apps
max_client_conn = 500 # maximum simultaneous application connections
default_pool_size = 20 # actual connections to PostgreSQL per database
server_idle_timeout = 600
Pool sizing guidelines:
| Setting | Guidance |
|---|---|
| max_client_conn | Total concurrent application connections (number of app instances × connections per instance) |
| default_pool_size | Typically 2–4× the number of CPU cores on the DB server (each core can run only one query at a time) |
| pool_mode | transaction for most apps; session only if you use session-level features (advisory locks, SET LOCAL, prepared statements) |
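For example, a database server with 16 cores suggests a default_pool_size of roughly 32–64. In the sample config above, up to 500 client connections are multiplexed onto just 20 actual PostgreSQL connections.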
-- Check active connections in PostgreSQL
SELECT
client_addr,
state,
COUNT(*) AS connection_count
FROM pg_stat_activity
WHERE datname = current_database()
GROUP BY client_addr, state
ORDER BY connection_count DESC;
-- Check for long-running queries (> 30 seconds)
SELECT
pid,
now() - query_start AS duration,
state,
LEFT(query, 100) AS query_preview
FROM pg_stat_activity
WHERE state != 'idle'
AND query_start < now() - INTERVAL '30 seconds'
ORDER BY duration DESC;
VACUUM and Table Bloat
PostgreSQL uses MVCC (Multi-Version Concurrency Control) — deleted and updated rows are not immediately removed from disk. VACUUM reclaims this space. AUTOVACUUM runs automatically, but for high-write tables it may need tuning.
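A sketch of per-table tuning for a high-write table — the scale factors control what fraction of a table must change before autovacuum (or auto-analyse) triggers, and the values below are illustrative, not recommendations:
-- Vacuum after ~2% of rows change (default 20%), re-analyse after ~1% (default 10%)
ALTER TABLE orders SET (
autovacuum_vacuum_scale_factor = 0.02,
autovacuum_analyze_scale_factor = 0.01
);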
-- Check for tables with high dead tuple counts (candidates for manual VACUUM)
SELECT
schemaname,
relname AS table_name,
n_dead_tup AS dead_tuples,
n_live_tup AS live_tuples,
ROUND(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 1)
AS dead_pct,
last_autovacuum,
last_autoanalyze
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;
-- Manual VACUUM for a specific table (non-blocking)
VACUUM ANALYZE orders;
-- Aggressive vacuum — rewrites the table to reclaim maximum space.
-- Takes an ACCESS EXCLUSIVE lock for the entire rewrite, blocking all reads
-- and writes — avoid on busy tables
VACUUM FULL ANALYZE orders;
-- Largest tables by total footprint — a starting point for bloat investigation
-- (lists sizes only; a precise bloat estimate needs the pgstattuple extension)
SELECT
relname AS table_name,
pg_size_pretty(pg_total_relation_size(oid)) AS total_size,
pg_size_pretty(pg_relation_size(oid)) AS table_size,
pg_size_pretty(pg_total_relation_size(oid)
- pg_relation_size(oid)) AS index_and_toast_size
FROM pg_class
WHERE relkind = 'r'
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 20;
Performance Quick Reference
| Technique | When to apply | Expected benefit |
|---|---|---|
| Index on FK column | Always — on every FK column | Prevents full scans on every join |
| Partial index | Filter condition appears frequently | Smaller index, faster lookups |
| Covering index (INCLUDE) | Query selects specific columns | Enables index-only scans |
| CREATE INDEX CONCURRENTLY | Production tables | Avoids blocking writes |
| Keyset pagination | Lists > 1000 rows | Constant-time page fetches |
| Fix N+1 with JOIN or IN | Loops calling the DB | Reduces N+1 queries to 1 or 2 |
| Window functions | Aggregates needed alongside row data | Replaces correlated subqueries |
| Batch insert | Inserting multiple rows | N transactions → 1 transaction |
| UPSERT (ON CONFLICT) | Insert-or-update patterns | Eliminates separate SELECT |
| pg_stat_statements | Identifying slow queries in production | Data-driven optimisation |
| PgBouncer | High-concurrency apps | Prevents connection exhaustion |