Versions Database Migration Guide

Database: heatwave_versions
Table: versions
Current Size: 162 GB (263M rows)
Last Updated: November 27, 2025

Overview

This guide covers the migrations to optimize the versions table (PaperTrail audit trail).

Key Feature: Background Schema Migrations (online_migrations)

Index creation on 263M rows takes 2-5 hours. We use the online_migrations gem's Background Schema Migrations feature:

  1. Migration runs during deploy → enqueues background migration → deploy completes in seconds
  2. BackgroundSchemaMigrationWorker runs → picks up migration → creates index via CREATE INDEX CONCURRENTLY
  3. No table locks, no deploy blocking

Key insight: We pass connection_class_name: "RecordVersion" to tell online_migrations to use the versions database connection instead of the default ActiveRecord::Base.

Migrations

1. Drop Unused Columns (20251127010000)

Metric Value
Migration 20251127010000_drop_unused_columns_from_versions.rb
Deploy Time < 1 minute
Background No (instant)
Table Lock Brief (catalog update only)
Risk Low

Columns Dropped:

  • user_agent_id - 99.9% NULL, essentially unused
  • referer - Not referenced in RecordVersion model

Two-Step Safe Removal (online_migrations best practice):

  1. Deploy 1: Add columns to ignored_columns in RecordVersion model
    self.ignored_columns += %w[user_agent_id referer]
    
  2. Deploy 2: Run the migration to drop columns

The migration verifies columns are in ignored_columns before dropping.

Why It's Fast: PostgreSQL DROP COLUMN doesn't physically remove data; it marks the column as invisible. Space is reclaimed during VACUUM.

2. Add Whodunnit Index (20251127010100)

Metric Value
Migration 20251127010100_add_whodunnit_index_to_versions.rb
Deploy Time < 1 second (enqueues worker)
Background Time 2-4 hours
Table Lock None (CONCURRENTLY)
Index Size ~3-4 GB
Risk Low

Index Details:

CREATE INDEX CONCURRENTLY idx_versions_whodunnit 
ON versions (whodunnit) 
WHERE whodunnit IS NOT NULL;

Why Partial Index: 44% of rows have NULL whodunnit (background jobs). Partial index saves ~40% space.

3. Add Item Type Index (20251127010200)

Metric Value
Migration 20251127010200_add_item_type_index_to_versions.rb
Deploy Time < 1 second (enqueues worker)
Background Time 3-5 hours
Table Lock None (CONCURRENTLY)
Index Size ~5-6 GB
Risk Low

Index Details:

CREATE INDEX CONCURRENTLY idx_versions_item_type 
ON versions (item_type);

Running Migrations

Prerequisites

  1. Ensure you have access to production versions database
  2. Run during off-peak hours (recommended: overnight or weekend)
  3. Have monitoring ready for index creation progress

Commands

# Check migration status
bundle exec rake versions:db:migrate:status

# Run migrations (development)
bundle exec rake versions:db:migrate

# For production, run each migration individually via psql
# to monitor progress and handle any issues

Production Execution (Recommended)

For production, run migrations directly via psql to monitor progress:

# 1. Connect to production versions database
VERSIONS_CONN=$(op read "op://IT/heatwave-prod-db/connection-string")
# Modify to use heatwave_versions database
VERSIONS_CONN="${VERSIONS_CONN/\/heatwave\?/\/heatwave_versions?}"
psql "$VERSIONS_CONN"

# 2. Drop unused columns (fast)
ALTER TABLE versions DROP COLUMN IF EXISTS user_agent_id;
ALTER TABLE versions DROP COLUMN IF EXISTS referer;

# 3. Create whodunnit index (2-4 hours)
CREATE INDEX CONCURRENTLY idx_versions_whodunnit 
ON versions (whodunnit) 
WHERE whodunnit IS NOT NULL;

# 4. Create item_type index (3-5 hours)
CREATE INDEX CONCURRENTLY idx_versions_item_type 
ON versions (item_type);

Monitoring Progress

While index creation is running:

-- Check progress
SELECT 
  a.query,
  p.phase,
  p.blocks_total,
  p.blocks_done,
  ROUND(100.0 * p.blocks_done / p.blocks_total, 2) as percent_complete
FROM pg_stat_progress_create_index p
JOIN pg_stat_activity a ON p.pid = a.pid;

-- Or use rake task
bundle exec rake versions:index_progress

Rollback Plan

Drop Columns (Cannot be undone with data)

If you need the columns back:

ALTER TABLE versions ADD COLUMN user_agent_id integer;
ALTER TABLE versions ADD COLUMN referer text;
-- Data is permanently lost

Drop Indexes (Safe to re-create)

-- If index creation fails or needs to be stopped
DROP INDEX CONCURRENTLY IF EXISTS idx_versions_whodunnit;
DROP INDEX CONCURRENTLY IF EXISTS idx_versions_item_type;

-- Then re-run migration when ready

Handling Failed CONCURRENTLY Index

If CREATE INDEX CONCURRENTLY fails partway through:

-- Check for invalid indexes
SELECT indexname, indexdef 
FROM pg_indexes 
WHERE tablename = 'versions' 
  AND indexname LIKE 'idx_versions%';

-- Look for indexes marked as INVALID
SELECT indexrelid::regclass as index_name, indisvalid
FROM pg_index
WHERE NOT indisvalid;

-- Drop invalid index
DROP INDEX CONCURRENTLY idx_versions_whodunnit;

-- Re-create
CREATE INDEX CONCURRENTLY idx_versions_whodunnit 
ON versions (whodunnit) 
WHERE whodunnit IS NOT NULL;

Verification

After migrations complete:

# Check table health
bundle exec rake versions:health

# Verify indexes exist
psql -c "SELECT indexname, pg_size_pretty(pg_relation_size(indexname::regclass)) 
         FROM pg_indexes WHERE tablename = 'versions';"

Expected Results

Metric Before After
Columns 12 10
Indexes 4 6
Index Size 17 GB ~25 GB
Query: by whodunnit Full scan Index scan
Query: by item_type Full scan Index scan

Timeline Estimate

Phase Deploy Time Background Time
Drop columns < 1 min N/A (instant)
Whodunnit index < 1 sec 2-4 hours (Sidekiq)
Item type index < 1 sec 3-5 hours (Sidekiq)
Total Deploy < 2 min
Total Background 5-9 hours

online_migrations Setup

Worker

The BackgroundSchemaMigrationWorker runs the scheduler:

# Located at: app/workers/background_schema_migration_worker.rb
# Queue: low_priority
# Runs: OnlineMigrations.run_background_schema_migrations

Configuration

# config/initializers/online_migrations.rb
OnlineMigrations.configure do |config|
  config.background_schema_migrations.max_attempts = 5
  config.background_schema_migrations.statement_timeout = 6.hours
end

Multi-Database Support

The key to supporting the versions database is connection_class_name:

# In migration
add_index_in_background :versions, :whodunnit,
                        connection_class_name: "RecordVersion",  # Uses versions DB!
                        name: :idx_versions_whodunnit,
                        where: "whodunnit IS NOT NULL"

Since RecordVersion has establish_connection :"versions_#{Rails.env}", online_migrations uses the versions database connection automatically.

Manual Execution

If you need to run index creation manually:

# In Rails console - trigger the scheduler
OnlineMigrations.run_background_schema_migrations

# Or run a specific migration
migration = OnlineMigrations::BackgroundSchemaMigrations::Migration.find_by(migration_name: "idx_versions_whodunnit")
migration.run

Or directly in psql (for production without Rails):

-- Connect to heatwave_versions database
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_versions_whodunnit 
ON versions (whodunnit) 
WHERE whodunnit IS NOT NULL;

Troubleshooting

Index creation taking too long

  • Check for blocking queries: SELECT * FROM pg_stat_activity WHERE state != 'idle';
  • Check disk I/O: iostat -x 1
  • Consider increasing maintenance_work_mem temporarily

Out of disk space

Index creation requires temporary space. Ensure at least 20 GB free on the PostgreSQL data volume.

Connection to versions database fails

# Test connection
op read "op://IT/heatwave-prod-db/connection-string" | \
  sed 's|/heatwave?|/heatwave_versions?|' | \
  xargs psql -c "SELECT 1"

Related Documentation