PostgreSQL 18 Production Tuning (updated 2026-06-15)

Date: November 26, 2025 (originally); re-baselined 2026-06-15 for the Latitude.sh fleet
Server: dal-latitude-heatwave-01 (Dallas primary, tailnet 100.123.47.52) + Chicago standby
Host: Latitude.sh f4.metal.medium (bare metal)
CPU: AMD EPYC 4564P, 16 cores / 32 threads
OS: Ubuntu 26.04 LTS
RAM: 192 GB (187 GiB usable)
PostgreSQL: 18.4
Storage: NVMe (ZFS)

Re-baselined 2026-06-15. Earlier revisions of this doc were written for the now-decommissioned
Vultr box chi-vultr-heatwave-db4 (Ubuntu 22.04, ~32 GB RAM) and its memory-sizing math assumed
~32 GB. The live PostgreSQL 18 hosts are bare-metal Latitude.sh f4.metal.medium with 192 GB RAM.
Every numeric value below has been replaced with the value actually read from the Dallas primary
(pg_settings) on 2026-06-15. Sections still tied to the old host (the apt/systemctl "Deployment"
flow, the /etc/postgresql/... file path) are explicitly marked legacy.

Overview

This document explains the reasoning behind each tuned setting, read live from the Dallas primary
(dal-latitude-heatwave-01) on 2026-06-15. PostgreSQL runs in the heatwave-postgres Kamal
accessory, so the tuning lives in postgresql.conf inside the data dir, not in a host conf.d/
include.

File location — container PGDATA is /var/lib/postgresql/data, bind-mounted from the host ZFS
path /data/prod-replica/data (config/deploy.yml: - /data/prod-replica/data:/var/lib/postgresql/data):

  • inside the heatwave-postgres container: /var/lib/postgresql/data/postgresql.conf
  • on the Dallas host: /data/prod-replica/data/postgresql.conf

The legacy host path /etc/postgresql/18/main/conf.d/custom.conf applied only to the decommissioned
Vultr box.


Connection Settings

Setting Value Default Reasoning
listen_addresses '*' localhost Accept connections from all interfaces (Tailscale, etc.)
max_connections 200 100 Support Rails connection pool + Sidekiq workers
idle_in_transaction_session_timeout 5 min disabled Critical: Prevents abandoned transactions from blocking autovacuum and holding locks
idle_session_timeout 1 hour disabled Clean up completely idle connections
statement_timeout 10 min disabled Prevent runaway queries from consuming resources indefinitely
tcp_keepalives_* 60/10/6 system Detect dead connections faster (6 minutes vs system default ~2 hours)

Why idle_in_transaction_session_timeout Matters

Long-running transactions prevent autovacuum from cleaning up dead tuples. A single forgotten BEGIN without COMMIT can cause massive table bloat over time.


Memory Settings

Values below are the live settings read from the Dallas primary on 2026-06-15.

Setting Value Default Reasoning
shared_buffers 32 GB 128 MB ~17% of 192 GB - primary database cache. On a 192 GB box the marginal benefit past ~32 GB is small for this OLTP workload; the OS page cache (covered by effective_cache_size) carries the rest
effective_cache_size 120 GB 4 GB ~63% of RAM - tells the planner how much OS cache is realistically available for index/heap pages
maintenance_work_mem 4 GB 64 MB Speeds up VACUUM, CREATE INDEX, ALTER TABLE
work_mem 96 MB 4 MB Per-operation memory for sorts/hashes (careful: multiplied by connections × operations)
autovacuum_work_mem 1 GB use maintenance Dedicated memory for autovacuum - doesn't compete with user queries
huge_pages try try Use huge pages if available (reduces TLB misses)
hash_mem_multiplier 2.0 2.0 Hash operations can use 2x work_mem

Staging runs tuned down on the shared Dallas box: shared_buffers=8GB,
effective_cache_size=24GB (see kamal/MANAGING.md). The values
above are the production primary.

Memory Calculation

Total RAM: 192 GB (187 GiB usable)
shared_buffers:       32 GB  (~17% of RAM)
effective_cache_size: 120 GB (~63% of RAM — planner hint, not an allocation)
Remaining for OS + per-connection work_mem + page cache headroom

Per-connection worst case: work_mem × active operations
200 connections × 96 MB = ~18.75 GB (theoretical max, rarely hit;
hash nodes can take up to hash_mem_multiplier × work_mem each)

shared_buffers (32 GB) and effective_cache_size (120 GB) are an explicit allocation; the gap up
to 192 GB is left for the OS page cache plus the per-connection work_mem fan-out above. This is a
deliberate move away from the old ~32 GB-box "25% / 75% of RAM" split — at 192 GB a literal 25/75 split
(48 GB / 144 GB) would over-commit shared_buffers for an OLTP workload while leaving too little
headroom for page cache + connection memory, so the live box pins absolute sizes instead.


Query Planner (NVMe-Optimized)

Values below are the live settings read from the Dallas primary on 2026-06-15. The local NVMe (under
ZFS) makes the low-random_page_cost / high-effective_io_concurrency stance even more appropriate
than it was on the old SATA-SSD Vultr box.

Setting Value Default Reasoning
random_page_cost 1.1 4.0 NVMe SSDs have nearly identical random vs sequential read cost
effective_io_concurrency 200 1 NVMe can service many concurrent I/O requests
default_statistics_target 100 100 Left at the default; raise per-column with ALTER TABLE … ALTER COLUMN … SET STATISTICS where a specific plan needs finer histograms
jit off on JIT compilation often hurts OLTP workloads - adds latency to simple queries

Why Disable JIT?

JIT (Just-In-Time compilation) compiles query expressions to native code. Benefits:

  • Complex analytical queries with many expressions
  • Queries running for seconds/minutes

Drawbacks:

  • Compilation overhead (10-100ms) on every query
  • Rails apps mostly run simple, fast queries
  • Can cause 2-3x slowdown on typical CRUD operations

Recommendation: Keep jit = off for Rails/OLTP. Enable selectively per-session for analytics.


Parallelism

Values below are the live settings read from the Dallas primary on 2026-06-15.

Setting Value Default Reasoning
max_worker_processes 16 8 Total background workers available — sized to the 16-core / 32-thread EPYC 4564P
max_parallel_workers 16 8 Workers available for parallel queries
max_parallel_workers_per_gather 4 2 Max workers per parallel query
max_parallel_maintenance_workers 4 2 Workers for CREATE INDEX, VACUUM
min_parallel_table_scan_size 4 MB 8 MB Use parallelism for smaller tables
min_parallel_index_scan_size 256 kB 512 kB Use parallelism for smaller indexes

max_worker_processes / max_parallel_workers were lifted from 8 to 16 on the Latitude box
(32 hardware threads vs the old Vultr box's lower core count), so multiple parallel queries can run
concurrently without starving the global worker pool. max_parallel_maintenance_workers,
min_parallel_table_scan_size, and min_parallel_index_scan_size were not re-read in the 2026-06-15
live snapshot — the values shown are the intended postgresql.conf settings; confirm against the
running server if a specific plan depends on them.

Parallelism Strategy

With a large primary database and a separate ~100+ GB heatwave_versions audit DB, parallel queries
significantly speed up:

  • Full table scans
  • Large aggregations
  • Index creation
  • VACUUM operations

Write-Ahead Log (WAL)

Values below are the live settings read from the Dallas primary on 2026-06-15 (checkpoint_timeout
was not in that snapshot — see note).

Setting Value Default Reasoning
wal_level replica replica Required for streaming replication
wal_log_hints on off Required for pg_rewind (easy failover recovery)
wal_buffers 64 MB ~4 MB Larger buffer for write-heavy workloads
wal_compression lz4 off PG15+ feature - reduces WAL I/O by ~50%
min_wal_size 2 GB 80 MB Keep more WAL files to reduce recycling overhead
max_wal_size 8 GB 1 GB Allow more WAL before a forced checkpoint — fewer checkpoint-driven I/O spikes under write bursts
checkpoint_timeout 15 min 5 min Less frequent checkpoints = less I/O spikes (intended value; not re-read in the 2026-06-15 snapshot)
checkpoint_completion_target 0.9 0.9 Spread checkpoint I/O over 90% of interval

min_wal_size (2 GB) and max_wal_size (8 GB) were both raised vs the old box (1 GB / 4 GB), giving
the WAL more room before a checkpoint is forced — appropriate for the larger NVMe volume and the
write-heavy primary feeding the cross-DC standby.

WAL Compression Benefits

wal_compression = lz4 (PostgreSQL 15+):

  • Compresses full-page writes in WAL
  • Reduces WAL volume by 40-60%
  • Faster replication to standby
  • Less disk I/O
  • Minimal CPU overhead (lz4 is very fast)

Autovacuum (Critical Section)

Values below are the intended autovacuum settings in postgresql.conf. The 2026-06-15 live snapshot
read only the memory/planner/WAL/parallelism settings, so none of the autovacuum values in the
table below were re-verified that day
— confirm any you depend on with SHOW <name>; on the live
box: autovacuum_max_workers, autovacuum_naptime, autovacuum_vacuum_cost_delay,
autovacuum_vacuum_cost_limit, autovacuum_vacuum_scale_factor, autovacuum_analyze_scale_factor,
vacuum_buffer_usage_limit.

Setting Value Default Reasoning
autovacuum_max_workers 5 3 More workers for a large database with many tables
autovacuum_naptime 30s 1 min Check for vacuum-eligible tables more often
autovacuum_vacuum_cost_delay 0 2 ms No throttling - let vacuum run at full speed
autovacuum_vacuum_cost_limit 2000 200 10x more work per cycle before sleeping
autovacuum_vacuum_scale_factor 0.05 0.2 Vacuum when 5% dead tuples (not 20%)
autovacuum_analyze_scale_factor 0.025 0.1 Analyze when 2.5% rows changed
vacuum_buffer_usage_limit 1 GB 256 kB Critical - 4000x more memory for vacuum!

Why These Autovacuum Settings?

Problem Discovered (historical — diagnosed on the old Vultr box, November 26, 2025): analysis
revealed severe table bloat. The aggressive autovacuum settings below were the fix and remain in
force on the Latitude box:

Table Dead Tuples Live Tuples Bloat % Ever Vacuumed?
subscribers 332,361 6,706 4,956% Never
line_items 490,518 127,216 386% Never
communications 277,040 173,090 160% Never

Root Causes:

  1. vacuum_buffer_usage_limit = 256kB (default) - vacuum could only use 256 KB of memory!
  2. autovacuum_vacuum_cost_delay = 2ms - vacuum was heavily throttled
  3. autovacuum_vacuum_scale_factor = 0.2 - large tables needed 20% dead tuples to trigger
  4. Only 3 autovacuum workers for many tables

Solution: Aggressive autovacuum settings that:

  • Remove all throttling (cost_delay = 0)
  • Provide adequate memory (vacuum_buffer_usage_limit = 1GB)
  • Trigger earlier (scale_factor = 0.05)
  • Run more workers (max_workers = 5)

Logging

Setting Value Default Reasoning
log_min_duration_statement 1000 disabled Log queries taking > 1 second
log_checkpoints on on Track checkpoint frequency and duration
log_lock_waits on off Log when queries wait for locks > deadlock_timeout
log_temp_files 10 MB disabled Alert when queries spill to disk
log_autovacuum_min_duration 0 10 min Log ALL autovacuum activity (critical for monitoring)
log_connections on off Track connection patterns
log_disconnections on off Track disconnection patterns

Monitoring Autovacuum

With log_autovacuum_min_duration = 0, every autovacuum run is logged:

LOG: automatic vacuum of table "heatwave.public.subscribers": ...
     tuples: 6706 removed, 6706 remain, 0 are dead but not yet removable
     buffer usage: 1234 hits, 567 misses, 89 dirtied
     avg read rate: 12.345 MB/s, avg write rate: 1.234 MB/s

Statistics & Monitoring

Setting Value Default Reasoning
track_io_timing on off Track I/O time in EXPLAIN ANALYZE
track_wal_io_timing on off Track WAL write timing
track_functions pl none Track PL/pgSQL function execution
track_activity_query_size 4096 1024 Capture longer queries in pg_stat_activity
compute_query_id on auto Required for pg_stat_statements query grouping

Extensions

Required Extensions

The following extensions must be installed for full functionality:

-- Core monitoring (loaded via shared_preload_libraries)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Hypothetical indexes for AI-powered index tuning (used by postgres-mcp)
CREATE EXTENSION IF NOT EXISTS hypopg;

-- Table/index reorganization without locks
CREATE EXTENSION IF NOT EXISTS pg_repack;
Extension Purpose Requires Restart?
pg_stat_statements Query performance tracking Yes (shared_preload)
auto_explain Automatic query plan logging Yes (shared_preload)
hypopg Test hypothetical indexes without creating them No
pg_repack Online table/index defragmentation No

Installing Extensions

On the live Latitude hosts PostgreSQL runs in the heatwave-postgres Kamal accessory, and the image
already ships hypopg and pg_repack alongside the in-core pg_stat_statements / auto_explain, so
there is nothing to apt install on the host — just CREATE EXTENSION as shown above.

The host-level package install below was the legacy Vultr (Ubuntu 22.04) method, kept for reference:

# pg_stat_statements and auto_explain are included with PostgreSQL
# hypopg and pg_repack needed separate packages on the bare-host install:

apt update
apt install postgresql-18-hypopg postgresql-18-repack

pg_stat_statements

Tracks query statistics for performance analysis:

SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
Setting Value Reasoning
pg_stat_statements.max 10000 Track more unique queries
pg_stat_statements.track all Track all statements including nested
pg_stat_statements.track_utility on Track DDL, VACUUM, etc.
pg_stat_statements.track_planning on Include planning time

auto_explain

Automatically logs execution plans for slow queries:

Setting Value Reasoning
auto_explain.log_min_duration 1000 Log plans for queries > 1 second
auto_explain.log_format json Machine-parseable format
auto_explain.log_analyze on Include actual row counts
auto_explain.log_buffers on Include buffer usage
auto_explain.log_timing off Reduces overhead (timing per-node is expensive)
auto_explain.log_settings on Show non-default settings affecting query

Scheduled Maintenance (Cron Jobs)

In addition to autovacuum, scheduled maintenance jobs ensure comprehensive database health.

Crontab Configuration

Location: /var/spool/cron/crontabs/root or crontab -e as root

# ─────────────────────────────────────────────────────────────────────────────
# heatwave database (main application - 65 GB)
# ─────────────────────────────────────────────────────────────────────────────
0 2 * * * su -p -s /bin/false -c "vacuumdb -e -v -z heatwave" postgres 2>&1 | mail -s "vacuum report: heatwave" dba@warmlyyours.com
0 8 * * 6 su -p -s /bin/false -c "pg_repack -d heatwave -j 4" postgres 2>&1 | mail -s "pg_repack report: heatwave" dba@warmlyyours.com

# ─────────────────────────────────────────────────────────────────────────────
# heatwave_versions database (audit trail - 162 GB)
# ─────────────────────────────────────────────────────────────────────────────
0 4 * * * su -p -s /bin/false -c "vacuumdb -e -v -z heatwave_versions" postgres 2>&1 | mail -s "vacuum report: heatwave_versions" dba@warmlyyours.com
0 8 * * 0 su -p -s /bin/false -c "pg_repack -d heatwave_versions -j 4" postgres 2>&1 | mail -s "pg_repack report: heatwave_versions" dba@warmlyyours.com

Containerized deployment (systemd timers)

On the Kamal box PostgreSQL runs in the heatwave-postgres accessory, so maintenance runs
inside the container (the image ships vacuumdb + pg_repack) driven by systemd timers
on the host — cron isn't installed (systemd-first box), so this replaces the root crontab above.
Same schedule, same dba@ reports via the host postfix→SendGrid relay.

Runs on the Dallas primary only. vacuumdb / pg_repack are write operations and cannot
run on the Chicago read-only standby (it just replicates the primary's cleaned-up state). The
timers are provisioned on each box but do real work only on whichever node is currently primary.

  • Script: /usr/local/sbin/pg-maintenance.sh {vacuum-main|vacuum-versions|repack-main|repack-versions}
    docker exec -u postgres heatwave-postgres vacuumdb|pg_repack … 2>&1 | mail -s "… [<host>]" dba@warmlyyours.com
  • Units: pg-maintenance@.service (templated) + pg-maintenance-{vacuum-main,vacuum-versions,repack-main,repack-versions}.timer.
  • Schedule is in UTC (the box's system TZ): vacuum 02:00 / 04:00 daily, pg_repack Sat / Sun 08:00.
  • Inspect: systemctl list-timers 'pg-maintenance-*' · run on demand: systemctl start pg-maintenance@repack-main.service.

Schedule Overview

Time Daily Saturday Sunday
2:00 AM vacuum heatwave vacuum heatwave vacuum heatwave
4:00 AM vacuum heatwave_versions vacuum heatwave_versions vacuum heatwave_versions
8:00 AM pg_repack heatwave pg_repack heatwave_versions

Job Details

Database Job Schedule Tool Purpose
heatwave Daily vacuum 2am daily vacuumdb -e -v -z Remove dead tuples, update statistics
heatwave Weekly repack Saturday 8am pg_repack -j 4 Reclaim disk space, defragment tables
heatwave_versions Daily vacuum 4am daily vacuumdb -e -v -z Remove dead tuples, update statistics
heatwave_versions Weekly repack Sunday 8am pg_repack -j 4 Reclaim disk space, defragment tables

Command Options

vacuumdb options:

  • -e / --echo: Show commands being sent to server
  • -v / --verbose: Detailed progress output
  • -z / --analyze: Update statistics after vacuum

pg_repack options:

  • -a / --all: Repack all databases
  • -j 4 / --jobs=4: Use 4 parallel workers

Why Both Autovacuum AND Scheduled Jobs?

Aspect Autovacuum Scheduled VACUUM pg_repack
Trigger Dead tuple threshold Time-based Time-based
Coverage Only tables needing it All tables All tables
Disk space Marks space reusable Marks space reusable Returns to OS
Locks table No No No
Best for Continuous maintenance Catch-all, statistics Heavy bloat

Autovacuum handles day-to-day cleanup but may fall behind during high-write periods.

Scheduled VACUUM ensures every table gets analyzed for query planner statistics, even if autovacuum didn't trigger.

pg_repack is essential because regular VACUUM only marks space as reusable within PostgreSQL—it doesn't shrink files. Over time, tables can have significant "internal fragmentation." pg_repack rebuilds tables online, returning space to the OS.

Monitoring

Check vacuum logs (live Latitude box — PG runs in the heatwave-postgres container):

# View recent vacuum activity from the container's PG log
docker logs heatwave-postgres 2>&1 | grep -i vacuum | tail -50

# Check mail reports (host postfix → SendGrid relay)
cat /var/mail/root | grep -A 100 "vacuum report"

Check if the scheduled maintenance jobs are active (systemd timers, not cron):

systemctl list-timers 'pg-maintenance-*'

Troubleshooting

Jobs not running?

  1. Live box: check the systemd timers — systemctl list-timers 'pg-maintenance-*' and
    systemctl status pg-maintenance@vacuum-main.service
  2. Check mail delivery: cat /var/mail/root
  3. (Legacy Vultr/cron host only) ensure jobs are NOT commented out and the cron daemon is up:
    crontab -l | grep -E "(vacuum|repack)" · systemctl status cron

pg_repack fails?

  1. Ensure extension is installed: CREATE EXTENSION IF NOT EXISTS pg_repack;
  2. Check for exclusive locks: long transactions can block repack
  3. Verify disk space: repack needs ~2x table size temporarily

Deployment

Apply Changes

Superseded 2026-06: prod moved to Kamal/Latitude (containerized PG18) — tuning now lives
in the postgres accessory's postgresql.conf inside the data dir (/data/prod-replica/data on
Dallas), applied with a bare kamal accessory reboot postgres (config/deploy.yml IS production). The apt/systemctl
flow below is the legacy Vultr method (host chi-vultr-heatwave-db4 is decommissioned), kept
for reference only. The live values documented in the sections above (re-read from the Dallas
primary on 2026-06-15) supersede anything implied by this legacy flow.

# Legacy Vultr method (decommissioned host):
scp custom.conf root@chi-vultr-heatwave-db4:/etc/postgresql/18/main/conf.d/

# Most settings - reload only
sudo systemctl reload postgresql@18-main

# Settings requiring restart:
# - max_worker_processes
# - autovacuum_max_workers
# - wal_buffers
# - shared_buffers
sudo systemctl restart postgresql@18-main

Verify Settings

On the live box, run psql inside the accessory:
docker exec -u postgres heatwave-postgres psql -c "<query>".

-- Check current settings
SELECT name, setting, unit, source 
FROM pg_settings 
WHERE source = 'configuration file'
ORDER BY name;

-- Check autovacuum is catching up
SELECT relname, last_autovacuum, autovacuum_count, n_dead_tup 
FROM pg_stat_user_tables 
WHERE n_dead_tup > 10000 
ORDER BY n_dead_tup DESC;

References