Skip to content

Database Migrations Guide

This guide covers best practices for writing safe database migrations in the Heatwave project, including the use of the online_migrations gem for safer PostgreSQL operations.


We use the online_migrations gem to catch unsafe PostgreSQL migrations in development and run them safely in production. This gem:

  • ✅ Detects potentially dangerous operations
  • ✅ Prevents them from running by default
  • ✅ Provides instructions and helpers for safer alternatives
  • ✅ Supports background migrations for large data changes

Configuration: config/initializers/online_migrations.rb


The gem is already installed. Tables were created via:

Terminal window
bin/rails generate online_migrations:install
bin/rails db:migrate
TablePurpose
background_migrationsTracks background data migration status
background_migration_jobsIndividual batch jobs for data migrations
background_schema_migrationsTracks background schema changes (like index creation)

The gem automatically checks for dangerous operations. When detected, migrations will fail with helpful error messages.

❌ Bad - Blocks reads/writes while rewriting entire table:

add_column :users, :admin, :boolean, default: false, null: false

✅ Good - Use the helper:

class AddAdminToUsers < ActiveRecord::Migration[7.0]
disable_ddl_transaction!
def change
add_column_with_default :users, :admin, :boolean, default: false, null: false
end
end

❌ Bad - Locks table:

add_index :items, :product_line_id

✅ Good - Create concurrently:

class AddIndexToItems < ActiveRecord::Migration[7.0]
disable_ddl_transaction!
def change
add_index :items, :product_line_id, algorithm: :concurrently
end
end

❌ Bad - Can cause errors until app reboots:

remove_column :users, :legacy_field

✅ Good - Two-step process:

Step 1: Ignore the column in the model:

class User < ApplicationRecord
self.ignored_columns += ["legacy_field"]
end

Step 2: Deploy, then remove in next migration:

class RemoveLegacyFieldFromUsers < ActiveRecord::Migration[7.0]
def change
safety_assured { remove_column :users, :legacy_field }
end
end

⚠️ Warning - The gem cannot inspect execute calls:

# This will raise an error
execute "DROP INDEX some_index"

✅ Good - Wrap in safety_assured with comment:

# Safe: DROP only removes unused index, no data modified
safety_assured do
execute "DROP INDEX some_index"
end

For operations you’ve verified are safe:

safety_assured do
# Your migration code here
end

Always add a comment explaining why it’s safe!


Use background migrations for large data changes that would be too slow in a regular migration.

  • Backfilling data on large tables (>100k rows)
  • Complex data transformations
  • Operations that could timeout in a regular migration

Create in lib/online_migrations/background_migrations/:

lib/online_migrations/background_migrations/backfill_user_slugs.rb
module OnlineMigrations
module BackgroundMigrations
class BackfillUserSlugs < OnlineMigrations::BackgroundMigration
def relation
User.where(slug: nil)
end
def process_batch(users)
users.each do |user|
user.update_columns(slug: user.name.parameterize)
end
end
# Optional: For progress tracking
delegate :count, to: :relation
end
end
end
db/migrate/20251205000001_enqueue_backfill_user_slugs.rb
class EnqueueBackfillUserSlugs < ActiveRecord::Migration[7.0]
def up
enqueue_background_data_migration(
'BackfillUserSlugs',
batch_size: 1000, # Records per job
sub_batch_size: 100, # Records per sub-batch
batch_pause: 1 # Seconds between batches
)
end
def down
remove_background_data_migration('BackfillUserSlugs')
end
end
OptionDefaultDescription
batch_size1,000Records per background job
sub_batch_size100Records per sub-batch within a job
batch_pause0Seconds to pause between batches
sub_batch_pause_ms100Milliseconds between sub-batches
batch_max_attempts5Max retries for failed batches

Background migrations are processed by OnlineMigrationsSchedulerWorker:

  1. Runs every 1 minute via Sidekiq scheduler
  2. Uses dedicated online_migrations queue (max 1 concurrent process)
  3. Processes one batch at a time
  4. Automatically pauses if database is under load (throttling)

The gem includes helpers for common operations:

# Backfill a column with a value
backfill_column_in_background(:users, :status, 'active')
# Copy data between columns
copy_column_in_background(:users, :legacy_name, :name)
# Delete orphaned records
delete_orphaned_records_in_background(:comments, :post)
# Reset counter caches
reset_counters_in_background(:posts, :comments_count)
# Perform action on relation
perform_action_on_relation_in_background(
'User',
{ verified: false },
:send_verification_email
)

If your code depends on the migration being complete:

# In a later migration
class RequireBackfilledData < ActiveRecord::Migration[7.0]
def up
ensure_background_data_migration_succeeded('BackfillUserSlugs')
# Now safe to add NOT NULL constraint
change_column_null :users, :slug, false
end
end

For long-running schema changes like creating indexes on large tables.

class AddIndexToLargeTable < ActiveRecord::Migration[7.0]
disable_ddl_transaction!
def up
add_index_in_background :items, :sku, algorithm: :concurrently
end
def down
remove_index :items, :sku
end
end

In config/initializers/online_migrations.rb:

OnlineMigrations.configure do |config|
config.background_schema_migrations.max_attempts = 5
config.background_schema_migrations.statement_timeout = 6.hours
config.background_schema_migrations.error_handler = ->(error, migration) do
Rails.logger.error "[BackgroundSchemaMigration] #{migration.migration_name} failed"
Sentry.capture_exception(error) if defined?(Sentry)
end
end

class AddForeignKey < ActiveRecord::Migration[7.0]
disable_ddl_transaction!
def change
add_foreign_key :orders, :customers, validate: false
validate_foreign_key :orders, :customers
end
end
class ChangeColumnType < ActiveRecord::Migration[7.0]
def up
# Step 1: Add new column
add_column :users, :age_new, :integer
# Step 2: Copy data (use background migration for large tables)
copy_column_in_background(:users, :age, :age_new)
end
end
# Later migration after background job completes:
class FinalizeColumnTypeChange < ActiveRecord::Migration[7.0]
def up
ensure_background_data_migration_succeeded('CopyColumn', arguments: ['users', 'age', 'age_new'])
safety_assured do
remove_column :users, :age
rename_column :users, :age_new, :age
end
end
end
class AddNotNullConstraint < ActiveRecord::Migration[7.0]
def up
# First ensure all rows have values (via background migration if large)
add_not_null_constraint :users, :email, validate: false
validate_not_null_constraint :users, :email
end
end

⚠️ Avoid if possible. If necessary:

  1. Create a view with old name pointing to new table
  2. Update code to use new name
  3. Drop view in later migration

Terminal window
# Run pending migrations
bundle exec rails db:migrate
# Rollback last migration
bundle exec rails db:rollback
# Rollback specific migration
bundle exec rails db:migrate:down VERSION=20251205000001
# Check migration status
bundle exec rails db:migrate:status
# In development/test, background migrations run inline during db:migrate
# To test manually:
bundle exec rails runner "OnlineMigrations.run_background_data_migrations"

# Find migration
migration = OnlineMigrations::BackgroundMigrations::Migration.find_by(
migration_name: 'BackfillUserSlugs'
)
# Check status
migration.status # => "enqueued", "running", "succeeded", "failed"
migration.progress # => 0.75 (75% complete)
migration.rows_count # => 50000 (total rows)
migration.started_at
migration.finished_at
# Check individual jobs
migration.migration_jobs.where(status: 'failed')
Terminal window
# Check status of all background migrations
bundle exec rake online_migrations:status
# Run background migrations manually
bundle exec rake online_migrations:run_background_migrations

The scheduler might not be running. Check:

  1. Sidekiq is running with online_migrations queue
  2. OnlineMigrationsSchedulerWorker is scheduled
# Check if scheduler is working
Sidekiq::ScheduledSet.new.select { |j| j.klass == 'OnlineMigrationsSchedulerWorker' }
# Find failed jobs
failed_jobs = migration.migration_jobs.where(status: 'failed')
failed_jobs.each do |job|
puts "Job #{job.id}: #{job.error_class} - #{job.error_message}"
end
# Retry failed jobs
migration.retry_failed_jobs

Check if database load is triggering throttling:

# In rails console
OnlineMigrations.config.throttler.call
# => true means throttling is active

HelperUse Case
add_column_with_defaultAdd column with default safely
add_index(..., algorithm: :concurrently)Create index without locking
add_foreign_key(..., validate: false)Add FK without validation
validate_foreign_keyValidate FK separately
safety_assured { }Bypass safety check (use carefully!)
enqueue_background_data_migrationQueue large data migration
ensure_background_data_migration_succeededVerify migration completed
  • Queue: online_migrations
  • Concurrency: 1 (max 1 process)
  • Scheduler: Runs every 1 minute
  • Worker: OnlineMigrationsSchedulerWorker
FilePurpose
config/initializers/online_migrations.rbGem configuration
lib/online_migrations/background_migrations/Custom background migrations
app/workers/online_migrations_scheduler_worker.rbSidekiq scheduler
config/sidekiq*.ymlQueue configuration

Migration .rb files can be deleted once they are old and universally applied — unlike Scenic db/views/*_vNN.sql files (see the migrations skill’s “keep all versions” note), which must be retained because migrations reference exact versions. The two are easy to conflate; the difference:

  • Migration files have no cross-file dependencies (no require_relative, Rake::Task[…], or load between them) and every applied version is recorded in db/structure.sql’s schema_migrations INSERT. A fresh db:schema:load reads structure.sql, so pruned versions stay marked applied; db:migrate only runs pending files. structure.sql routinely holds more version rows than there are files (file-less versions from prior prunes) — proof the project already runs this way.
  • Scenic view SQL files are read lazily by create_view … version: N / revert_to_version: N, so deleting one breaks fresh-DB replay and rollback.
  1. Pick a cutoff (everything dated ≤ N months ago). All such migrations must be applied in every live + dev DB — true for anything past one production-snapshot cycle.
  2. Confirm zero cross-file deps in the prune set: git grep -lE "require_relative|Rake::Task\[|load File" HEAD -- 'db/migrate/<range>*' (expect none). If a migration invokes a rake task or another file, retire them together (e.g. the 2024-10 import_nodes migration invoked amazon:import_amazon_browse_nodes).
  3. Confirm structure.sql records the boundary versions: grep "('<version>')" db/structure.sql.
  4. git rm the files (reversible; structure.sql is untouched).
  5. Re-point doc citations. Grep docs for the deleted paths (grep -rnoE "db/migrate/[0-9]{14}_[a-z0-9_]+\.rb" --include=*.md --include=*.md .) and update any that cite a deleted file as a canonical example to a surviving one — or inline the snippet so the next prune can’t orphan it. Historical pointers in task/feature/legacy docs can stay (git history keeps the file).

First prune (2026-06-04): removed all 713 migrations dated ≤ 2025-12 (1293 → 580; db/migrate 5.9M → 2.7M, which also trims the deployed image since migrations run from it). The migrations skill’s dev-gate precedent was re-pointed from the deleted 20250313210532_refresh_amazon_schemas… to the surviving 20260525050027_scrub_null_click_ids… per step 5.