PostgreSQL Performance Optimization: From Slow to Lightning Fast


Backend · December 21, 2025 · 16 min read
Tags: PostgreSQL, Database, Performance, Optimization, SQL, Backend

Is your PostgreSQL database slow? Are queries taking seconds instead of milliseconds? Let's fix that! This guide covers everything from indexing to query optimization.

Why PostgreSQL Performance Matters

A slow database affects your entire application:

  • 😤 Poor user experience
  • 💸 Higher infrastructure costs
  • 📉 Reduced scalability
  • ⚠️ Increased error rates

The Performance Optimization Framework

1. Measure First, Optimize Later

-- Enable query timing (psql)
\timing

-- Analyze query performance
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'test@example.com';

Understanding EXPLAIN ANALYZE Output:

  • Seq Scan: Table scan (BAD for large tables)
  • Index Scan: Using index (GOOD)
  • Cost: Estimated query cost
  • Actual time: Real execution time

Indexing Strategies

B-Tree Indexes (Default)

-- Single column index
CREATE INDEX idx_users_email ON users(email);

-- Composite index (column order matters!)
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);

-- Partial index (for specific conditions)
CREATE INDEX idx_active_users ON users(email) WHERE is_active = true;

When B-Tree Indexes Work Best:

  • Equality comparisons: WHERE email = 'test@example.com'
  • Range queries: WHERE created_at > '2024-01-01'
  • Sorting: ORDER BY created_at DESC

Hash Indexes (Equality Only)

CREATE INDEX idx_users_email_hash ON users USING HASH(email);

Use when:

  • Only equality comparisons (=)
  • No range queries needed
  • A slight speed edge over B-Tree matters for your exact-match lookups

GIN Indexes (Full-Text Search & Arrays)

-- For JSONB columns
CREATE INDEX idx_user_metadata ON users USING GIN(metadata);

-- For arrays
CREATE INDEX idx_user_tags ON users USING GIN(tags);

-- For full-text search
CREATE INDEX idx_posts_search ON posts
  USING GIN(to_tsvector('english', title || ' ' || content));

Query with GIN index:

SELECT * FROM posts
WHERE to_tsvector('english', title || ' ' || content)
      @@ to_tsquery('english', 'postgresql & performance');

GiST Indexes (Geometric & Range Types)

-- For geometric data
CREATE INDEX idx_locations ON places USING GIST(location);

-- For range types
CREATE INDEX idx_booking_period ON bookings USING GIST(booking_period);

Query Optimization Techniques

Use SELECT Specific Columns

Bad:

SELECT * FROM users WHERE id = 1;

Good:

SELECT id, name, email FROM users WHERE id = 1;

Avoid N+1 Queries

Bad (N+1):

-- First query
SELECT * FROM posts;

-- Then one query per post
SELECT * FROM comments WHERE post_id = ?;

Good (JOIN):

SELECT posts.*, json_agg(comments.*) AS comments
FROM posts
LEFT JOIN comments ON comments.post_id = posts.id
GROUP BY posts.id;
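When the JSON-aggregating JOIN is awkward to consume in application code, a common alternative is two batched round trips: fetch the posts, fetch all of their comments at once (e.g. with `WHERE post_id = ANY($1)`), and group in memory. A minimal sketch of the grouping step, with hypothetical in-memory rows standing in for the two query results:

```javascript
// Rows below stand in for real query results (illustrative data only).
const posts = [
  { id: 1, title: 'Indexing' },
  { id: 2, title: 'Vacuum' },
];
// In a real app: SELECT * FROM comments WHERE post_id = ANY($1)
const comments = [
  { id: 10, post_id: 1, body: 'Great post' },
  { id: 11, post_id: 1, body: 'Thanks' },
  { id: 12, post_id: 2, body: 'Useful' },
];

// Group comments by post_id once, then attach to each post:
// two queries total instead of 1 + N.
function attachComments(posts, comments) {
  const byPost = new Map();
  for (const c of comments) {
    if (!byPost.has(c.post_id)) byPost.set(c.post_id, []);
    byPost.get(c.post_id).push(c);
  }
  return posts.map((p) => ({ ...p, comments: byPost.get(p.id) ?? [] }));
}

const result = attachComments(posts, comments);
console.log(result[0].comments.length); // 2
```

Either shape (one JOIN or two batched queries) keeps the round-trip count constant as the number of posts grows.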

Use EXISTS Instead of COUNT

Bad:

SELECT COUNT(*) FROM orders WHERE user_id = 123;
-- Returns the full count, scanning every matching row

Good:

SELECT EXISTS(SELECT 1 FROM orders WHERE user_id = 123);
-- Returns: true (stops at the first match)

Batch Operations

Bad:

INSERT INTO users (name, email) VALUES ('John', 'john@example.com');
INSERT INTO users (name, email) VALUES ('Jane', 'jane@example.com');
INSERT INTO users (name, email) VALUES ('Bob', 'bob@example.com');

Good:

INSERT INTO users (name, email) VALUES
  ('John', 'john@example.com'),
  ('Jane', 'jane@example.com'),
  ('Bob', 'bob@example.com');
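In application code, the batched insert above should stay parameterized rather than string-concatenated. One way is a small helper that numbers the placeholders; the `buildBulkInsert` name and shape below are a hypothetical sketch, not a library API:

```javascript
// Hypothetical helper: builds one parameterized multi-row INSERT,
// keeping batched inserts safe against SQL injection.
function buildBulkInsert(table, columns, rows) {
  let i = 0;
  // One "($1, $2)" group per row, numbering placeholders sequentially.
  const placeholders = rows
    .map(() => `(${columns.map(() => `$${++i}`).join(', ')})`)
    .join(', ');
  return {
    text: `INSERT INTO ${table} (${columns.join(', ')}) VALUES ${placeholders}`,
    values: rows.flat(), // parameter values in placeholder order
  };
}

const query = buildBulkInsert('users', ['name', 'email'], [
  ['John', 'john@example.com'],
  ['Jane', 'jane@example.com'],
]);
console.log(query.text);
// INSERT INTO users (name, email) VALUES ($1, $2), ($3, $4)
```

The resulting object can then be passed straight to `pool.query(query.text, query.values)`.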

Connection Pooling

Without Pooling (Slow)

// Creates (and tears down) a new connection on every request
const client = new pg.Client();
await client.connect();
const result = await client.query('SELECT * FROM users');
await client.end();

With Pooling (Fast)

import { Pool } from 'pg';

const pool = new Pool({
  host: 'localhost',
  database: 'mydb',
  max: 20, // Maximum connections
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

// Reuses existing connections
const result = await pool.query('SELECT * FROM users');

Optimal Pool Size:

connections = ((core_count * 2) + effective_spindle_count)

Example: 4 cores + 1 disk = (4 * 2) + 1 = 9 connections

Caching Strategies

Application-Level Caching (Redis)

import Redis from 'ioredis';

const redis = new Redis();

async function getUser(id) {
  // Try cache first
  const cached = await redis.get(`user:${id}`);
  if (cached) return JSON.parse(cached);

  // Cache miss - query database
  const user = await db.query('SELECT * FROM users WHERE id = $1', [id]);

  // Store in cache (expire after 1 hour)
  await redis.setex(`user:${id}`, 3600, JSON.stringify(user));
  return user;
}
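The cache-aside pattern above can be exercised without a live Redis or PostgreSQL. In this sketch a `Map` stands in for Redis and a `fakeDb` function for the database (both are illustrative stand-ins, not real client calls); the counter shows that the second read never reaches the "database":

```javascript
// Illustrative cache-aside: Map stands in for Redis, fakeDb for PostgreSQL.
const cache = new Map();
let dbHits = 0;

function fakeDb(id) {
  dbHits++; // count round trips to the "database"
  return { id, name: `user-${id}` };
}

function getUser(id) {
  const key = `user:${id}`;
  const entry = cache.get(key);
  if (entry && entry.expires > Date.now()) return entry.value; // cache hit
  const user = fakeDb(id); // cache miss: query the database
  cache.set(key, { value: user, expires: Date.now() + 3600 * 1000 }); // 1h TTL
  return user;
}

getUser(1);
getUser(1);
console.log(dbHits); // 1 - the second call was served from cache
```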

Materialized Views

-- Create materialized view
CREATE MATERIALIZED VIEW user_stats AS
SELECT
  user_id,
  COUNT(*) AS total_orders,
  SUM(total) AS total_spent,
  AVG(total) AS avg_order_value
FROM orders
GROUP BY user_id;

-- A UNIQUE index is required for REFRESH ... CONCURRENTLY
CREATE UNIQUE INDEX idx_user_stats_user_id ON user_stats(user_id);

-- Refresh periodically (cron job)
REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats;

Vacuum and Analyze

-- Update planner statistics
ANALYZE users;

-- Reclaim dead-row space for reuse (does not shrink the file; VACUUM FULL does)
VACUUM users;

-- Both in one pass
VACUUM ANALYZE users;

-- Autovacuum configuration (postgresql.conf)
autovacuum = on
autovacuum_max_workers = 3
autovacuum_naptime = 1min

Monitor and Maintain

Find Slow Queries

-- Enable slow query logging (postgresql.conf)
-- Logs every statement slower than 1 second
log_min_duration_statement = 1000

-- Check currently running queries
SELECT pid, now() - query_start AS duration, query, state
FROM pg_stat_activity
WHERE state = 'active'
ORDER BY duration DESC;

-- Kill a long-running query (substitute the pid found above)
SELECT pg_terminate_backend(pid);

Find Missing Indexes

SELECT
  schemaname,
  relname AS tablename,
  seq_scan,
  seq_tup_read,
  idx_scan,
  seq_tup_read / seq_scan AS avg_seq_tup_read
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 10;

Find Unused Indexes

SELECT
  schemaname,
  relname AS tablename,
  indexrelname AS indexname,
  idx_scan,
  pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE '%_pkey'
ORDER BY pg_relation_size(indexrelid) DESC;

Performance Checklist

  • ✅ Add indexes on frequently queried columns
  • ✅ Use connection pooling
  • ✅ Implement caching layer
  • ✅ Use EXPLAIN ANALYZE on slow queries
  • ✅ Batch insert/update operations
  • ✅ Regular VACUUM ANALYZE
  • ✅ Monitor slow query logs
  • ✅ Use appropriate index types
  • ✅ Avoid SELECT *
  • ✅ Use prepared statements
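The last checklist item deserves a concrete shape. With node-postgres, passing a query config object that carries a `name` makes the server prepare the statement once per session and reuse its plan on later calls. This sketch only builds the config object (no live connection; the `fetch-user-by-id` name is an arbitrary example):

```javascript
// Hypothetical helper producing a named query config for node-postgres.
// Reusing the same `name` lets pg skip re-parsing and re-planning.
function fetchUserQuery(id) {
  return {
    name: 'fetch-user-by-id', // prepared once per session, reused afterwards
    text: 'SELECT id, name, email FROM users WHERE id = $1',
    values: [id],             // parameterized: no string interpolation
  };
}

// In a real app: const { rows } = await pool.query(fetchUserQuery(123));
console.log(fetchUserQuery(123).values); // [ 123 ]
```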

Real-World Example: E-commerce Orders

Before Optimization:

-- Slow: sequential scan (~2.5 seconds on 1M rows)
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';

After Optimization:

-- Add composite index
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

-- Fast: index scan (~8 ms on 1M rows)
SELECT id, total, created_at FROM orders
WHERE user_id = 123 AND status = 'pending';

Performance Gain: 312x faster! 🚀

Conclusion

PostgreSQL performance optimization is a journey:

  1. Measure performance with EXPLAIN ANALYZE
  2. Add indexes where needed
  3. Optimize queries to use indexes
  4. Implement caching for frequently accessed data
  5. Monitor continuously and adjust

Start with the biggest bottlenecks first. A single optimized query can transform your application's performance.

Remember: Premature optimization is evil, but measured optimization is awesome! 📊✨
