Database Migrations Guide
This guide covers best practices for writing safe database migrations in the Heatwave project, including the use of the online_migrations gem for safer PostgreSQL operations.
Table of Contents
- Overview
- Online Migrations Gem
- Safety Checks
- Background Data Migrations
- Background Schema Migrations
- Common Migration Patterns
- Testing Migrations
- Monitoring & Troubleshooting
Overview
We use the online_migrations gem to catch unsafe PostgreSQL migrations in development and run them safely in production. This gem:
- ✅ Detects potentially dangerous operations
- ✅ Prevents them from running by default
- ✅ Provides instructions and helpers for safer alternatives
- ✅ Supports background migrations for large data changes
Configuration: config/initializers/online_migrations.rb
Online Migrations Gem
Installation
The gem is already installed. Tables were created via:
bin/rails generate online_migrations:install
bin/rails db:migrate
Key Tables
| Table | Purpose |
|---|---|
background_migrations |
Tracks background data migration status |
background_migration_jobs |
Individual batch jobs for data migrations |
background_schema_migrations |
Tracks background schema changes (like index creation) |
Safety Checks
The gem automatically checks for dangerous operations. When detected, migrations will fail with helpful error messages.
Common Dangerous Operations
1. Adding a Column with Default Value
❌ Bad - Blocks reads/writes while rewriting entire table:
add_column :users, :admin, :boolean, default: false, null: false
✅ Good - Use the helper:
class AddAdminToUsers < ActiveRecord::Migration[7.0]
disable_ddl_transaction!
def change
add_column_with_default :users, :admin, :boolean, default: false, null: false
end
end
2. Adding an Index
❌ Bad - Locks table:
add_index :items, :product_line_id
✅ Good - Create concurrently:
class AddIndexToItems < ActiveRecord::Migration[7.0]
disable_ddl_transaction!
def change
add_index :items, :product_line_id, algorithm: :concurrently
end
end
3. Removing a Column
❌ Bad - Can cause errors until app reboots:
remove_column :users, :legacy_field
✅ Good - Two-step process:
Step 1: Ignore the column in the model:
class User < ApplicationRecord
self.ignored_columns += ["legacy_field"]
end
Step 2: Deploy, then remove in next migration:
class RemoveLegacyFieldFromUsers < ActiveRecord::Migration[7.0]
def change
safety_assured { remove_column :users, :legacy_field }
end
end
4. Executing Raw SQL
⚠️ Warning - The gem cannot inspect execute calls:
# This will raise an error
execute "DROP INDEX some_index"
✅ Good - Wrap in safety_assured with comment:
# Safe: DROP only removes unused index, no data modified
safety_assured do
execute "DROP INDEX some_index"
end
Disabling Checks
For operations you've verified are safe:
safety_assured do
# Your migration code here
end
Always add a comment explaining why it's safe!
Background Data Migrations
Use background migrations for large data changes that would be too slow in a regular migration.
When to Use
- Backfilling data on large tables (>100k rows)
- Complex data transformations
- Operations that could timeout in a regular migration
Creating a Background Migration
Step 1: Create the Migration Class
Create in lib/online_migrations/background_migrations/:
# lib/online_migrations/background_migrations/backfill_user_slugs.rb
module OnlineMigrations
module BackgroundMigrations
class BackfillUserSlugs < OnlineMigrations::BackgroundMigration
def relation
User.where(slug: nil)
end
def process_batch(users)
users.each do |user|
user.update_columns(slug: user.name.parameterize)
end
end
# Optional: For progress tracking
delegate :count, to: :relation
end
end
end
Step 2: Enqueue in a Migration
# db/migrate/20251205000001_enqueue_backfill_user_slugs.rb
class EnqueueBackfillUserSlugs < ActiveRecord::Migration[7.0]
def up
enqueue_background_data_migration(
'BackfillUserSlugs',
batch_size: 1000, # Records per job
sub_batch_size: 100, # Records per sub-batch
batch_pause: 1 # Seconds between batches
)
end
def down
remove_background_data_migration('BackfillUserSlugs')
end
end
Configuration Options
| Option | Default | Description |
|---|---|---|
batch_size |
1,000 | Records per background job |
sub_batch_size |
100 | Records per sub-batch within a job |
batch_pause |
0 | Seconds to pause between batches |
sub_batch_pause_ms |
100 | Milliseconds between sub-batches |
batch_max_attempts |
5 | Max retries for failed batches |
How It Runs
Background migrations are processed by OnlineMigrationsSchedulerWorker:
- Runs every 1 minute via Sidekiq scheduler
- Uses dedicated
online_migrationsqueue (max 1 concurrent process) - Processes one batch at a time
- Automatically pauses if database is under load (throttling)
Predefined Background Migrations
The gem includes helpers for common operations:
# Backfill a column with a value
backfill_column_in_background(:users, :status, 'active')
# Copy data between columns
copy_column_in_background(:users, :legacy_name, :name)
# Delete orphaned records
delete_orphaned_records_in_background(:comments, :post)
# Reset counter caches
reset_counters_in_background(:posts, :comments_count)
# Perform action on relation
perform_action_on_relation_in_background(
'User',
{ verified: false },
:send_verification_email
)
Depending on Migrated Data
If your code depends on the migration being complete:
# In a later migration
class RequireBackfilledData < ActiveRecord::Migration[7.0]
def up
ensure_background_data_migration_succeeded('BackfillUserSlugs')
# Now safe to add NOT NULL constraint
change_column_null :users, :slug, false
end
end
Background Schema Migrations
For long-running schema changes like creating indexes on large tables.
Creating an Index in Background
class AddIndexToLargeTable < ActiveRecord::Migration[7.0]
disable_ddl_transaction!
def up
add_index_in_background :items, :sku, algorithm: :concurrently
end
def down
remove_index :items, :sku
end
end
Configuration
In config/initializers/online_migrations.rb:
OnlineMigrations.configure do |config|
config.background_schema_migrations.max_attempts = 5
config.background_schema_migrations.statement_timeout = 6.hours
config.background_schema_migrations.error_handler = ->(error, migration) do
Rails.logger.error "[BackgroundSchemaMigration] #{migration.migration_name} failed"
Sentry.capture_exception(error) if defined?(Sentry)
end
end
Common Migration Patterns
Adding a Foreign Key
class AddForeignKey < ActiveRecord::Migration[7.0]
disable_ddl_transaction!
def change
add_foreign_key :orders, :customers, validate: false
validate_foreign_key :orders, :customers
end
end
Changing Column Type
class ChangeColumnType < ActiveRecord::Migration[7.0]
def up
# Step 1: Add new column
add_column :users, :age_new, :integer
# Step 2: Copy data (use background migration for large tables)
copy_column_in_background(:users, :age, :age_new)
end
end
# Later migration after background job completes:
class FinalizeColumnTypeChange < ActiveRecord::Migration[7.0]
def up
ensure_background_data_migration_succeeded('CopyColumn', arguments: ['users', 'age', 'age_new'])
safety_assured do
remove_column :users, :age
rename_column :users, :age_new, :age
end
end
end
Adding NOT NULL Constraint
class AddNotNullConstraint < ActiveRecord::Migration[7.0]
def up
# First ensure all rows have values (via background migration if large)
add_not_null_constraint :users, :email, validate: false
validate_not_null_constraint :users, :email
end
end
Renaming a Table
⚠️ Avoid if possible. If necessary:
- Create a view with old name pointing to new table
- Update code to use new name
- Drop view in later migration
Testing Migrations
Local Testing
# Run pending migrations
bundle exec rails db:migrate
# Rollback last migration
bundle exec rails db:rollback
# Rollback specific migration
bundle exec rails db:migrate:down VERSION=20251205000001
# Check migration status
bundle exec rails db:migrate:status
Testing Background Migrations
# In development/test, background migrations run inline during db:migrate
# To test manually:
bundle exec rails runner "OnlineMigrations.run_background_data_migrations"
Monitoring & Troubleshooting
Check Background Migration Status
# Find migration
migration = OnlineMigrations::BackgroundMigrations::Migration.find_by(
migration_name: 'BackfillUserSlugs'
)
# Check status
migration.status # => "enqueued", "running", "succeeded", "failed"
migration.progress # => 0.75 (75% complete)
migration.rows_count # => 50000 (total rows)
migration.started_at
migration.finished_at
# Check individual jobs
migration.migration_jobs.where(status: 'failed')
Rake Tasks
# Check status of all background migrations
bundle exec rake online_migrations:status
# Run background migrations manually
bundle exec rake online_migrations:run_background_migrations
Common Issues
Migration Stuck in "enqueued"
The scheduler might not be running. Check:
- Sidekiq is running with
online_migrationsqueue OnlineMigrationsSchedulerWorkeris scheduled
# Check if scheduler is working
Sidekiq::ScheduledSet.new.select { |j| j.klass == 'OnlineMigrationsSchedulerWorker' }
Failed Jobs
# Find failed jobs
failed_jobs = migration.migration_jobs.where(status: 'failed')
failed_jobs.each do |job|
puts "Job #{job.id}: #{job.error_class} - #{job.}"
end
# Retry failed jobs
migration.retry_failed_jobs
Throttling Active
Check if database load is triggering throttling:
# In rails console
OnlineMigrations.config.throttler.call
# => true means throttling is active
Quick Reference
Migration Helpers
| Helper | Use Case |
|---|---|
add_column_with_default |
Add column with default safely |
add_index(..., algorithm: :concurrently) |
Create index without locking |
add_foreign_key(..., validate: false) |
Add FK without validation |
validate_foreign_key |
Validate FK separately |
safety_assured { } |
Bypass safety check (use carefully!) |
enqueue_background_data_migration |
Queue large data migration |
ensure_background_data_migration_succeeded |
Verify migration completed |
Sidekiq Configuration
- Queue:
online_migrations - Concurrency: 1 (max 1 process)
- Scheduler: Runs every 1 minute
- Worker:
OnlineMigrationsSchedulerWorker
Files
| File | Purpose |
|---|---|
config/initializers/online_migrations.rb |
Gem configuration |
lib/online_migrations/background_migrations/ |
Custom background migrations |
app/workers/online_migrations_scheduler_worker.rb |
Sidekiq scheduler |
config/sidekiq*.yml |
Queue configuration |
Pruning old migration files
Migration .rb files can be deleted once they are old and universally
applied — unlike Scenic db/views/*_vNN.sql files (see the migrations skill's
"keep all versions" note), which must be retained because migrations reference
exact versions. The two are easy to conflate; the difference:
- Migration files have no cross-file dependencies (no
require_relative,
Rake::Task[…], orloadbetween them) and every applied version is recorded
indb/structure.sql'sschema_migrationsINSERT. A freshdb:schema:load
readsstructure.sql, so pruned versions stay marked applied;db:migrateonly
runs pending files.structure.sqlroutinely holds more version rows than
there are files (file-less versions from prior prunes) — proof the project
already runs this way. - Scenic view SQL files are read lazily by
create_view … version: N/
revert_to_version: N, so deleting one breaks fresh-DB replay and rollback.
Procedure (rolling cleanup)
- Pick a cutoff (everything dated ≤ N months ago). All such migrations must be
applied in every live + dev DB — true for anything past one production-snapshot
cycle. - Confirm zero cross-file deps in the prune set:
git grep -lE "require_relative|Rake::Task\[|load File" HEAD -- 'db/migrate/<range>*'
(expect none). If a migration invokes a rake task or another file, retire them
together (e.g. the 2024-10import_nodesmigration invoked
amazon:import_amazon_browse_nodes). - Confirm
structure.sqlrecords the boundary versions:
grep "('<version>')" db/structure.sql. git rmthe files (reversible;structure.sqlis untouched).- Re-point doc citations. Grep docs for the deleted paths
(grep -rnoE "db/migrate/[0-9]{14}_[a-z0-9_]+\.rb" --include=*.md --include=*.md .)
and update any that cite a deleted file as a canonical example to a surviving
one — or inline the snippet so the next prune can't orphan it. Historical
pointers in task/feature/legacy docs can stay (git history keeps the file).
First prune (2026-06-04): removed all 713 migrations dated ≤ 2025-12
(1293 → 580; db/migrate 5.9M → 2.7M, which also trims the deployed image since
migrations run from it). The migrations skill's dev-gate precedent was re-pointed
from the deleted 20250313210532_refresh_amazon_schemas… to the surviving
20260525050027_scrub_null_click_ids… per step 5.