PostgreSQL Server Setup & Operations Guide

Superseded (2026-06-10): prod is now containerized PG18 on Kamal/Latitude
(Postgres accessory), not apt-installed PostgreSQL on Vultr. The db4/db3 Vultr
hosts, the postgresql-16 apt packages, the /etc/postgresql/16/ paths, and the
Tailscale-IP failover dance below are all decommissioned and kept only as a
historical reference. For current ops see doc/infrastructure/PGBOUNCER.md, the
postgres-replication skill, and config/deploy.*.yml.

Table of Contents


Emergency Failover

If your primary server stops responding, follow these steps.

Server Information

Server Role Public IP Tailscale IP
heatwave-db4 Primary 45.63.79.22 100.65.0.14
heatwave-db3 Replica 66.42.114.16 100.68.87.143

Network Usage:

  • Public IPs: Server-to-server communication (app servers, replication)
  • Tailscale IPs (100.*): Developer access, MCP connections, remote administration

Future Consideration: Migrate to Vultr VPC 2.0 for private inter-server communication.

Failover Procedure

Scenario: heatwave-db4 is down, promote heatwave-db3 to primary
1. SSH to the replica (db3)

# From developer machine (via Tailscale)
ssh root@100.68.87.143

# From another server (via public IP)
ssh root@66.42.114.16

2. Verify cluster is online

pg_lsclusters

3. Promote to primary

pg_ctlcluster 16 main promote

4. Verify promotion succeeded

tail -f /var/log/postgresql/postgresql-16-main.log
# Look for: "database system is ready to accept connections"

5. Update application database connections

On heatwave-app and heatwave-util:

ssh root@heatwave-app
cd /var/www/heatwave/current/config
nano database.yml  # Change IP to new primary's public IP: 66.42.114.16
sudo service nginx restart

6. Update DNS for India's direct SQL connection

Login to DNSMadeEasy (password in IT vault).
Update the pa9zy entry to the new primary's Tailscale IP.

7. Enable maintenance cron jobs on new primary

See Maintenance section below.

8. Rebuild old primary as replica (when it comes back online)

# Stop postgres (it's out of sync)
sudo pg_ctlcluster 16 main stop

# Clear data directories
sudo rm -rf /mnt/postgresql/*
sudo rm -rf /var/lib/postgresql/16/main

# Rebuild from new primary (use screen to prevent SSH disconnect issues)
screen
su - postgres
pg_basebackup -h 66.42.114.16 -U repl -Xs -P -D /var/lib/postgresql/16/main -R --slot=db4 -C

9. Update SimpleBackups

Point to new primary: https://my.simplebackups.com/backup/8483/edit


Setup Instructions

Instructions for setting up a new PostgreSQL 16 server on Ubuntu 22.04.

1. Initial Server Setup

# Set timezone
sudo timedatectl set-timezone America/Chicago

# Update system
apt-get -y update && apt-get -y dist-upgrade && apt-get -y upgrade

# Install required packages
apt install -y vim curl wget gpg gnupg2 software-properties-common apt-transport-https lsb-release ca-certificates

2. Install Tailscale

curl -fsSL https://tailscale.com/install.sh | sh

Authenticate and note the Tailscale IP. Reconnect via Tailscale IP before enabling firewall.

# From your local machine
ssh-copy-id root@<TAILSCALE_IP>

3. Configure Firewall

Note: On Vultr, use the Vultr Firewall console and link the heatwave firewall group to block public access. The rules below are for reference if using ufw.

4. Install PostgreSQL 16

# Add official PostgreSQL repository
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg

# Update and install
apt-get update
apt-get install -y postgresql-16 postgresql-client-16 postgresql-16-repack postgresql-16-hypopg

# Verify installation
sudo -u postgres psql -c "SELECT version();"

5. Configure PostgreSQL Users

sudo su - postgres
-- Set postgres password (store in 1Password IT vault)
ALTER USER postgres WITH PASSWORD '<PASSWORD_FROM_1PASSWORD>';

-- Create deploy role (password in 1Password: IT vault > heatwave-prod-db)
CREATE ROLE deploy WITH SUPERUSER LOGIN PASSWORD '<PASSWORD_FROM_1PASSWORD>';
exit

6. Configure UFW (if not using Vultr Firewall)

# Allow PostgreSQL on private network interface
sudo ufw allow in on enp6s0 from any to any port 5432 proto tcp

# Allow PostgreSQL on Tailscale interface
sudo ufw allow in on tailscale0 from any to any port 5432 proto tcp

7. Configure PostgreSQL

Copy the optimized config from this repo:

# From your local machine
scp doc/infrastructure/database/production/custom.conf root@<SERVER>:/etc/postgresql/16/main/conf.d/

Note: Memory settings are tuned for ~32GB RAM. For different specs, use PGTune.
See doc/infrastructure/database/production/202511261630_POSTGRESQL_TUNING.md for detailed explanations.

Configure pg_hba.conf:

sudo tee /etc/postgresql/16/main/pg_hba.conf > /dev/null <<EOF
# Local connections (Unix socket)
local   all             postgres                                peer
local   all             all                                     peer

# Remote connections (all hosts - firewall handles access control)
host    all             all             0.0.0.0/0               scram-sha-256
host    all             all             ::1/128                 scram-sha-256

# Replication (add specific replica IP)
# host    replication     repl            <REPLICA_PUBLIC_IP>/32  trust
EOF

Security Note: Access control is handled by Vultr Firewall (blocks public interface) and Tailscale (authenticated network). PostgreSQL accepts all hosts because the firewall only allows trusted networks through.

8. Version Control for Config Files (Optional)

cd /etc/postgresql/16/main
git init
git config --global --add safe.directory /etc/postgresql/16/main
git config --global user.email "sysadmin@warmlyyours.com"

cat > .gitignore <<EOF
environment
pg_ctl.conf
pg_ident.conf
start.conf
EOF

git add postgresql.conf pg_hba.conf conf.d/custom.conf .gitignore
git commit -m "Initial PostgreSQL 16 setup"

9. Restart and Verify

# Restart PostgreSQL
sudo systemctl restart postgresql

# Check status
sudo systemctl status postgresql

# Check logs for errors
tail -f /var/log/postgresql/postgresql-16-main.log

# Verify settings applied
sudo -u postgres psql -c "SHOW listen_addresses;"

10. Create Databases

sudo su - postgres

createdb -O deploy heatwave
createdb -O deploy heatwave_versions

If restoring over existing database: Stop services using it first (sudo service nginx stop), then dropdb heatwave.

11. Install Extensions

sudo -u postgres psql -d heatwave -c "CREATE EXTENSION pg_stat_statements;"
sudo -u postgres psql -d heatwave -c "CREATE EXTENSION pg_repack;"
sudo -u postgres psql -d heatwave -c "CREATE EXTENSION hypopg;"

sudo -u postgres psql -d heatwave_versions -c "CREATE EXTENSION pg_stat_statements;"
sudo -u postgres psql -d heatwave_versions -c "CREATE EXTENSION pg_repack;"
sudo -u postgres psql -d heatwave_versions -c "CREATE EXTENSION hypopg;"

12. Create Legacy Directory

# Required for heatwave_versions tablespace (legacy)
mkdir -p /mnt/postgresql
chown postgres:postgres /mnt/postgresql
chmod 0700 /mnt/postgresql

13. Restore from Backup

# Copy backup file to server first, then restore (use screen to prevent SSH disconnect issues)
screen
su -c "pg_restore -v -O -Fc -d heatwave /var/lib/postgresql/heatwave.backup" postgres

Maintenance

Add crontab entries for scheduled maintenance on primary server only:

crontab -e
# ─────────────────────────────────────────────────────────────────────────────
# heatwave database (main application - 65 GB)
# ─────────────────────────────────────────────────────────────────────────────
# Daily vacuum at 2am - removes dead tuples, updates statistics
0 2 * * * su -p -s /bin/false -c "vacuumdb -e -v -z heatwave" postgres 2>&1 | mail -s "vacuum report: heatwave" dba@warmlyyours.com

# Weekly pg_repack on Saturday 8am - reclaims disk space (no table locks)
0 8 * * 6 su -p -s /bin/false -c "pg_repack -d heatwave -j 4" postgres 2>&1 | mail -s "pg_repack report: heatwave" dba@warmlyyours.com

# ─────────────────────────────────────────────────────────────────────────────
# heatwave_versions database (audit trail - 162 GB)
# ─────────────────────────────────────────────────────────────────────────────
# Daily vacuum at 4am - staggered 2 hours after heatwave
0 4 * * * su -p -s /bin/false -c "vacuumdb -e -v -z heatwave_versions" postgres 2>&1 | mail -s "vacuum report: heatwave_versions" dba@warmlyyours.com

# Weekly pg_repack on Sunday 8am - day after heatwave's Saturday run
0 8 * * 0 su -p -s /bin/false -c "pg_repack -d heatwave_versions -j 4" postgres 2>&1 | mail -s "pg_repack report: heatwave_versions" dba@warmlyyours.com

Schedule Overview:

Time Daily Saturday Sunday
2:00 AM vacuum heatwave vacuum heatwave vacuum heatwave
4:00 AM vacuum heatwave_versions vacuum heatwave_versions vacuum heatwave_versions
8:00 AM pg_repack heatwave pg_repack heatwave_versions

Important: These jobs complement autovacuum. pg_repack is essential because regular VACUUM doesn't shrink files on disk.
See doc/infrastructure/database/production/202511261630_POSTGRESQL_TUNING.md for details.


Setup Streaming Replication

On Primary Server

1. Create replication user (password in 1Password IT vault):

CREATE ROLE repl WITH LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE NOINHERIT REPLICATION CONNECTION LIMIT -1 PASSWORD '<PASSWORD_FROM_1PASSWORD>';

2. Update pg_hba.conf to allow replication from replica's public IP:

host    replication     repl            66.42.114.16/32         trust

3. Reload config:

sudo systemctl reload postgresql

On Replica Server

1. Stop PostgreSQL:

sudo pg_ctlcluster 16 main stop

2. Clear data directories:

# ⚠️ DANGER: Only run on replica, NEVER on primary!
rm -rf /var/lib/postgresql/16/main/*
rm -rf /mnt/postgresql/*

3. Run base backup from primary (use public IP):

screen
su - postgres
pg_basebackup -h 45.63.79.22 -U repl -Xs -P -D /var/lib/postgresql/16/main -R --slot=db3 -C

4. Start PostgreSQL:

sudo pg_ctlcluster 16 main start

Troubleshooting Replication

Replication slot already exists:

-- On primary
SELECT pg_drop_replication_slot('db3');
SELECT slot_name, slot_type, active FROM pg_replication_slots;

Check replication status:

-- On primary
SELECT client_addr, state, sent_lsn, replay_lsn,
       pg_size_pretty(pg_wal_lsn_diff(sent_lsn, replay_lsn)) as lag
FROM pg_stat_replication;

Manual Failover (Planned)

See Emergency Failover for unplanned outages.

1. Identify cluster:

pg_lsclusters

2. Stop primary:

pg_ctlcluster 16 main stop

3. Promote standby:

pg_ctlcluster 16 main promote
tail -f /var/log/postgresql/postgresql-16-main.log

4. Update app servers to point to new primary's public IP.


SimpleBackups

SimpleBackups needs SSH access to the database server. In production we use Vultr Firewall, but for reference here are the UFW rules:

# SimpleBackups IP addresses
sudo ufw allow from 34.227.212.80 to any port 22 comment 'Simplebackups'
sudo ufw allow from 65.108.217.1 to any port 22 comment 'Simplebackups'
sudo ufw allow from 95.217.244.243 to any port 22 comment 'Simplebackups'
sudo ufw allow from 3.83.70.104 to any port 22 comment 'Simplebackups'
sudo ufw allow from 3.66.2.188 to any port 22 comment 'Simplebackups'
sudo ufw allow from 45.55.63.67 to any port 22 comment 'Simplebackups'

# Tailscale UDP
sudo ufw allow 41641/udp

# SSH on private interfaces
sudo ufw allow in on enp6s0 from any to any port 22 proto tcp
sudo ufw allow in on tailscale0 from any to any port 22 proto tcp

Backup Configuration: https://my.simplebackups.com/backup/8483/edit


References


Last Updated: November 2025