PostgreSQL Performance Optimization: From Slow to Lightning Fast


Backend · December 21, 2025 · 16 min read
Tags: PostgreSQL, Database, Performance, Optimization, SQL, Backend

Is your PostgreSQL database slow? Are queries taking seconds instead of milliseconds? Let's fix that! This guide covers everything from indexing to query optimization.

Why PostgreSQL Performance Matters

A slow database affects your entire application:

  • 😤 Poor user experience
  • 💸 Higher infrastructure costs
  • 📉 Reduced scalability
  • ⚠️ Increased error rates

The Performance Optimization Framework

1. Measure First, Optimize Later

-- Enable query timing (psql)
\timing

-- Analyze query performance
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'test@example.com';

Understanding EXPLAIN ANALYZE Output:

  • Seq Scan: Table scan (BAD for large tables)
  • Index Scan: Using index (GOOD)
  • Cost: Estimated query cost
  • Actual time: Real execution time

Indexing Strategies

B-Tree Indexes (Default)

-- Single column index
CREATE INDEX idx_users_email ON users(email);

-- Composite index (column order matters!)
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);

-- Partial index (for specific conditions)
CREATE INDEX idx_active_users ON users(email) WHERE is_active = true;

When B-Tree Indexes Work Best:

  • Equality comparisons: WHERE email = 'test@example.com'
  • Range queries: WHERE created_at > '2024-01-01'
  • Sorting: ORDER BY created_at DESC

Hash Indexes (Equality Only)

CREATE INDEX idx_users_email_hash ON users USING HASH(email);

Use when:

  • Only equality comparisons (=)
  • No range queries needed
  • A slight speed edge over B-Tree matters for your exact-match lookups

GIN Indexes (Full-Text Search & Arrays)

-- For JSONB columns
CREATE INDEX idx_user_metadata ON users USING GIN(metadata);

-- For arrays
CREATE INDEX idx_user_tags ON users USING GIN(tags);

-- For full-text search
CREATE INDEX idx_posts_search ON posts
  USING GIN(to_tsvector('english', title || ' ' || content));

Query with GIN index:

SELECT * FROM posts
WHERE to_tsvector('english', title || ' ' || content)
      @@ to_tsquery('english', 'postgresql & performance');

GiST Indexes (Geometric & Range Types)

-- For geometric data
CREATE INDEX idx_locations ON places USING GIST(location);

-- For range types
CREATE INDEX idx_booking_period ON bookings USING GIST(booking_period);

Query Optimization Techniques

Use SELECT Specific Columns

Bad:

SELECT * FROM users WHERE id = 1;

Good:

SELECT id, name, email FROM users WHERE id = 1;

Avoid N+1 Queries

Bad (N+1):

-- First query
SELECT * FROM posts;

-- Then one query per post
SELECT * FROM comments WHERE post_id = ?;

Good (JOIN):

SELECT posts.*, json_agg(comments.*) AS comments
FROM posts
LEFT JOIN comments ON comments.post_id = posts.id
GROUP BY posts.id;
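When the JSON-aggregating JOIN is awkward to consume in application code, a common alternative is two batched round trips: fetch the posts, fetch all of their comments at once (e.g. with `WHERE post_id = ANY($1)`), and group in memory. A minimal sketch of the grouping step, with hypothetical in-memory rows standing in for the two query results:

```javascript
// Rows below stand in for real query results (illustrative data only).
const posts = [
  { id: 1, title: 'Indexing' },
  { id: 2, title: 'Vacuum' },
];
// In a real app: SELECT * FROM comments WHERE post_id = ANY($1)
const comments = [
  { id: 10, post_id: 1, body: 'Great post' },
  { id: 11, post_id: 1, body: 'Thanks' },
  { id: 12, post_id: 2, body: 'Useful' },
];

// Group comments by post_id once, then attach to each post:
// two queries total instead of 1 + N.
function attachComments(posts, comments) {
  const byPost = new Map();
  for (const c of comments) {
    if (!byPost.has(c.post_id)) byPost.set(c.post_id, []);
    byPost.get(c.post_id).push(c);
  }
  return posts.map((p) => ({ ...p, comments: byPost.get(p.id) ?? [] }));
}

const result = attachComments(posts, comments);
console.log(result[0].comments.length); // 2
```

Either shape (one JOIN or two batched queries) keeps the round-trip count constant as the number of posts grows.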

Use EXISTS Instead of COUNT

Bad:

SELECT COUNT(*) FROM orders WHERE user_id = 123;
-- Returns the full count, scanning every matching row

Good:

SELECT EXISTS(SELECT 1 FROM orders WHERE user_id = 123);
-- Returns: true (stops at the first match)

Batch Operations

Bad:

INSERT INTO users (name, email) VALUES ('John', 'john@example.com');
INSERT INTO users (name, email) VALUES ('Jane', 'jane@example.com');
INSERT INTO users (name, email) VALUES ('Bob', 'bob@example.com');

Good:

INSERT INTO users (name, email) VALUES
  ('John', 'john@example.com'),
  ('Jane', 'jane@example.com'),
  ('Bob', 'bob@example.com');
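In application code, the batched insert above should stay parameterized rather than string-concatenated. One way is a small helper that numbers the placeholders; the `buildBulkInsert` name and shape below are a hypothetical sketch, not a library API:

```javascript
// Hypothetical helper: builds one parameterized multi-row INSERT,
// keeping batched inserts safe against SQL injection.
function buildBulkInsert(table, columns, rows) {
  let i = 0;
  // One "($1, $2)" group per row, numbering placeholders sequentially.
  const placeholders = rows
    .map(() => `(${columns.map(() => `$${++i}`).join(', ')})`)
    .join(', ');
  return {
    text: `INSERT INTO ${table} (${columns.join(', ')}) VALUES ${placeholders}`,
    values: rows.flat(), // parameter values in placeholder order
  };
}

const query = buildBulkInsert('users', ['name', 'email'], [
  ['John', 'john@example.com'],
  ['Jane', 'jane@example.com'],
]);
console.log(query.text);
// INSERT INTO users (name, email) VALUES ($1, $2), ($3, $4)
```

The resulting object can then be passed straight to `pool.query(query.text, query.values)`.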

Connection Pooling

Without Pooling (Slow)

// Creates (and tears down) a new connection on every request
const client = new pg.Client();
await client.connect();
const result = await client.query('SELECT * FROM users');
await client.end();

With Pooling (Fast)

import { Pool } from 'pg';

const pool = new Pool({
  host: 'localhost',
  database: 'mydb',
  max: 20, // Maximum connections
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

// Reuses existing connections
const result = await pool.query('SELECT * FROM users');

Optimal Pool Size:

connections = ((core_count * 2) + effective_spindle_count)

Example: 4 cores + 1 disk = (4 * 2) + 1 = 9 connections

Caching Strategies

Application-Level Caching (Redis)

import Redis from 'ioredis';

const redis = new Redis();

async function getUser(id) {
  // Try cache first
  const cached = await redis.get(`user:${id}`);
  if (cached) return JSON.parse(cached);

  // Cache miss - query database
  const user = await db.query('SELECT * FROM users WHERE id = $1', [id]);

  // Store in cache (expire after 1 hour)
  await redis.setex(`user:${id}`, 3600, JSON.stringify(user));
  return user;
}
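The cache-aside pattern above can be exercised without a live Redis or PostgreSQL. In this sketch a `Map` stands in for Redis and a `fakeDb` function for the database (both are illustrative stand-ins, not real client calls); the counter shows that the second read never reaches the "database":

```javascript
// Illustrative cache-aside: Map stands in for Redis, fakeDb for PostgreSQL.
const cache = new Map();
let dbHits = 0;

function fakeDb(id) {
  dbHits++; // count round trips to the "database"
  return { id, name: `user-${id}` };
}

function getUser(id) {
  const key = `user:${id}`;
  const entry = cache.get(key);
  if (entry && entry.expires > Date.now()) return entry.value; // cache hit
  const user = fakeDb(id); // cache miss: query the database
  cache.set(key, { value: user, expires: Date.now() + 3600 * 1000 }); // 1h TTL
  return user;
}

getUser(1);
getUser(1);
console.log(dbHits); // 1 - the second call was served from cache
```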

Materialized Views

-- Create materialized view
CREATE MATERIALIZED VIEW user_stats AS
SELECT
  user_id,
  COUNT(*) AS total_orders,
  SUM(total) AS total_spent,
  AVG(total) AS avg_order_value
FROM orders
GROUP BY user_id;

-- A UNIQUE index is required for REFRESH ... CONCURRENTLY
CREATE UNIQUE INDEX idx_user_stats_user_id ON user_stats(user_id);

-- Refresh periodically (cron job)
REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats;

Vacuum and Analyze

-- Update planner statistics
ANALYZE users;

-- Reclaim dead-row space for reuse (does not shrink the file; VACUUM FULL does)
VACUUM users;

-- Both in one pass
VACUUM ANALYZE users;

-- Autovacuum configuration (postgresql.conf)
autovacuum = on
autovacuum_max_workers = 3
autovacuum_naptime = 1min

Monitor and Maintain

Find Slow Queries

-- Enable slow query logging (postgresql.conf)
-- Logs every statement slower than 1 second
log_min_duration_statement = 1000

-- Check currently running queries
SELECT pid, now() - query_start AS duration, query, state
FROM pg_stat_activity
WHERE state = 'active'
ORDER BY duration DESC;

-- Kill a long-running query (substitute the pid found above)
SELECT pg_terminate_backend(pid);

Find Missing Indexes

SELECT
  schemaname,
  relname AS tablename,
  seq_scan,
  seq_tup_read,
  idx_scan,
  seq_tup_read / seq_scan AS avg_seq_tup_read
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 10;

Find Unused Indexes

SELECT
  schemaname,
  relname AS tablename,
  indexrelname AS indexname,
  idx_scan,
  pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE '%_pkey'
ORDER BY pg_relation_size(indexrelid) DESC;

Performance Checklist

  • ✅ Add indexes on frequently queried columns
  • ✅ Use connection pooling
  • ✅ Implement caching layer
  • ✅ Use EXPLAIN ANALYZE on slow queries
  • ✅ Batch insert/update operations
  • ✅ Regular VACUUM ANALYZE
  • ✅ Monitor slow query logs
  • ✅ Use appropriate index types
  • ✅ Avoid SELECT *
  • ✅ Use prepared statements
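The last checklist item deserves a concrete shape. With node-postgres, passing a query config object that carries a `name` makes the server prepare the statement once per session and reuse its plan on later calls. This sketch only builds the config object (no live connection; the `fetch-user-by-id` name is an arbitrary example):

```javascript
// Hypothetical helper producing a named query config for node-postgres.
// Reusing the same `name` lets pg skip re-parsing and re-planning.
function fetchUserQuery(id) {
  return {
    name: 'fetch-user-by-id', // prepared once per session, reused afterwards
    text: 'SELECT id, name, email FROM users WHERE id = $1',
    values: [id],             // parameterized: no string interpolation
  };
}

// In a real app: const { rows } = await pool.query(fetchUserQuery(123));
console.log(fetchUserQuery(123).values); // [ 123 ]
```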

Real-World Example: E-commerce Orders

Before Optimization:

-- Slow: sequential scan (~2.5 seconds on 1M rows)
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';

After Optimization:

-- Add composite index
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

-- Fast: index scan (~8 ms on 1M rows)
SELECT id, total, created_at FROM orders
WHERE user_id = 123 AND status = 'pending';

Performance Gain: 312x faster! 🚀

Conclusion

PostgreSQL performance optimization is a journey:

  1. Measure performance with EXPLAIN ANALYZE
  2. Add indexes where needed
  3. Optimize queries to use indexes
  4. Implement caching for frequently accessed data
  5. Monitor continuously and adjust

Start with the biggest bottlenecks first. A single optimized query can transform your application's performance.

Remember: Premature optimization is evil, but measured optimization is awesome! 📊✨
