Deploying with Kamal — Guidebook

How to ship Heatwave to the containerized stack. The entry point is
bin/deploy; read README.md first for the
architecture. For day-2 operations after a deploy, see MANAGING.md.

Production is on Kamal. bin/deploy production targets the Kamal prod
config (config/deploy.yml) and performs a normal rolling deploy — prod cut
over from Capistrano on 2026-06-07. The same bin/deploy flow drives both
staging and production (pick the env); everything below applies to both.


TL;DR

# Deploy the current branch to staging (auto-migrates, full edge purge):
bin/deploy staging

# Deploy to production (migrations across both DBs always run; --migrate is a back-compat no-op):
bin/deploy production --migrate

# Throwaway test of uncommitted work on staging:
bin/deploy staging --allow-dirty

# Ship clean origin/master from a throwaway worktree (don't disturb your tree):
bin/deploy production --in-worktree

bin/deploy --help prints the full flag list.


Before you can deploy (one-time per machine)

  1. 1Password — be signed in to the warmlyyours.1password.com account with
    the IT vault, or drop a service-account token at
    .kamal/.op-service-account-token (gitignored — the headless-reliable path;
    see MANAGING.md → Secrets).
  2. config/master.key present locally (it backs RAILS_MASTER_KEY and the
    staging/production env-keys). Worktrees get it symlinked by bin/setup-worktree.
  3. Toolchain: mise pins Ruby/Node; bin/deploy runs Kamal via
    mise exec -- bundle exec kamal. gum gives the nice prompts (optional).
  4. Validate secret resolution before your first deploy:
    mise exec -- bundle exec kamal secrets print -d staging
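
The file and tool checks above can be scripted before the first deploy. A minimal sketch, assuming the repository root is passed as the first argument and that secrets are resolved through the 1Password CLI (op); preflight_missing is a hypothetical helper, not part of bin/deploy:

```shell
# Hypothetical preflight helper (not part of bin/deploy).
# Prints one line per missing prerequisite; prints nothing when all are present.
preflight_missing() {
  root="${1:-.}"

  # config/master.key backs RAILS_MASTER_KEY and the env-keys.
  if [ ! -f "$root/config/master.key" ]; then
    echo "config/master.key"
  fi

  # bin/deploy runs Kamal through mise.
  if ! command -v mise >/dev/null 2>&1; then
    echo "mise"
  fi

  # Assumption: the 1Password CLI (op) is what resolves the secrets.
  if ! command -v op >/dev/null 2>&1; then
    echo "op"
  fi

  return 0
}
```

An empty result only covers local files and tools; the kamal secrets print check above is still what proves that secret resolution works.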
    

What branch ships

Kamal builds the working tree (builder.context: '.'), and for staging it
builds remotely on the box (builder.remote: ssh://deploy@100.123.47.52,
local: false) — no Mac emulation. There is no "pick a branch" prompt like the
old Capistrano bin/deploy had. Instead:

  • Default: whatever is checked out must be clean and in sync with its
    upstream. bin/deploy hard-gates on this (require_clean_tree) so the image
    always equals a pushed commit. To deploy a different branch, check it out
    and push it.
  • --allow-dirty: skip the gate and ship the working tree as-is (loud warning).
    Use only for a throwaway staging test — the image will match no git commit.
  • --in-worktree: deploy clean origin/master from a throwaway worktree
    (~/.heatwave-deploy by default) without disturbing your current checkout.
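
The default gate comes down to three questions: is the tree dirty, is there an upstream, and does HEAD equal it. A minimal sketch of that decision, assuming the real require_clean_tree works along these lines (its actual implementation is not shown here); gate_verdict and clean_tree_gate are hypothetical names:

```shell
# Hypothetical sketch of the clean-tree decision; the real
# require_clean_tree in bin/deploy may differ.
# Args: `git status --porcelain` output, HEAD sha, upstream sha.
gate_verdict() {
  porcelain="$1"
  head_sha="$2"
  upstream_sha="$3"

  # Any staged, unstaged, or untracked change makes the tree dirty.
  if [ -n "$porcelain" ]; then
    echo "dirty"
    return 1
  fi
  # Without an upstream there is nothing to be in sync with.
  if [ -z "$upstream_sha" ]; then
    echo "no-upstream"
    return 1
  fi
  # HEAD must be exactly the upstream commit.
  if [ "$head_sha" != "$upstream_sha" ]; then
    echo "out-of-sync"
    return 1
  fi
  echo "ok"
}

# Wrapper that feeds the current repository into the decision.
# Compares against the local tracking ref; it does not fetch first.
clean_tree_gate() {
  gate_verdict \
    "$(git status --porcelain)" \
    "$(git rev-parse HEAD)" \
    "$(git rev-parse --verify --quiet '@{u}' 2>/dev/null)"
}
```

Keeping the decision separate from the git calls makes it easy to check by hand: gate_verdict "" abc abc prints ok, while any non-empty porcelain output prints dirty.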

Staging is fine to deploy from whatever worktree/branch you're on, as long as the
tree is clean and pushed. The clean-tree gate is the safety net that replaces
"which branch?" — what ships is always reproducible from origin.


The deploy lifecycle

sequenceDiagram
    autonumber
    participant Dev as bin/deploy
    participant Git as git
    participant OP as 1Password
    participant K as kamal
    participant Box as Box (build+host)
    participant Reg as GHCR
    participant Proxy as kamal-proxy

    Dev->>Git: require_clean_tree (clean + in sync w/ upstream)
    Dev->>OP: op_session (unlock once / SA token)
    Dev->>K: kamal build deliver [-d staging]
    K->>Box: build image (remote, builder.context = working tree; GIT_REVISION build arg → ENV APP_REVISION)
    Box->>Reg: push image (+ pull onto the hosts)
    Dev->>K: MIGRATE before the swap — app exec --primary --roles=sidekiq db:migrate (both DBs)
    Note over Dev: a failed migration aborts here — live app untouched
    Dev->>K: kamal deploy --skip-push [-d staging]
    K->>Box: pre-deploy hook → quiet Sidekiq (TSTP)
    K->>Proxy: boot new container, wait for /up (≤ deploy_timeout 90s)
    Proxy-->>K: healthy → route traffic to new, stop old
    K->>Box: post-deploy hook → clear quiet marker + reap stale containers
    K-->>Dev: deploy ok
    Dev->>K: R2 frontend-asset sync (prod: failure aborts)
    Dev->>K: detach AppSignal sourcemap upload in the web container (DELETE_MAPS)
    Dev->>K: edge-cache purge (staging: full zone · prod: none by default)
    Note over Dev: optional — edge worker (-e), bulk redirects (-r)

Step by step

  1. Clean-tree gate (require_clean_tree) — refuses a dirty or unpushed tree.
    Bypass with --allow-dirty (throwaway) or --in-worktree (clean master).
  2. 1Password unlock (op_session) — one approval up front, before the
    ~minute-long build, so a secret failure surfaces early. Reused by every op
    Kamal spawns. A service-account token skips the desktop app entirely.
  3. Build + deliver (kamal build deliver) — build on the remote builder,
    push to GHCR, pull onto the hosts. No container swap yet.
    • The deploy revision arrives as the GIT_REVISION build arg → ENV APP_REVISION (→ config.x.revision, served to the front-end via
      #page-config). Nothing revision-specific is baked into the webpack
      bundles, so the cached asset layers survive Ruby-only deploys.
  4. Migrate BEFORE the swap — always, on every deploy (no flag, no confirm;
    --skip-migrate is the rare escape hatch). A single runner
    (kamal app exec --primary --roles=sidekiq --version <HEAD> 'bin/rails db:migrate') migrates both heatwave and
    heatwave_versions on the just-delivered image. A failed migration aborts
    the deploy here — the live app keeps running old code on the old schema.
    Migrations must stay backward-compatible with the still-running old code
    (expand/contract). On a first deploy to a host (no role env-file yet) the
    migrate is deferred to right after the boot.
  5. Boot/swap (kamal deploy --skip-push) — rolling boot behind kamal-proxy
    of the already-delivered image.
    • The new container must answer /up with 200 within deploy_timeout: 90s
      (Puma preload is ~20s). kamal-proxy keeps the old container serving until then.
    • pre-deploy quiets Sidekiq (TSTP) so in-flight jobs drain during the boot
      window. Sidekiq Pro super_fetch recovers anything still running regardless.
    • post-deploy (success only) clears the quiet marker and reaps stale
      app containers.
  6. R2 frontend-asset sync — pushes this deploy's content-hashed bundles to
    the per-env heatwave-frontend-assets-* bucket (additive; never deletes;
    skips *.map + the manifest). A failed sync on production aborts the deploy,
    because un-synced bundles would be orphaned on the next deploy (the stale-chunk 404).
    On staging a failed sync only warns and the deploy continues (origin
    still serves that deploy's bundles).
  7. Sourcemap upload (post-deploy, detached in the container) — the
    .map files ride in the image; after the R2 sync, bin/deploy daemonizes
    the AppSignal upload inside the live web container (--primary, so one
    upload even with multiple web hosts; survives the local runner exiting —
    CI/agent safe), using the account-wide push key (never echoed), then
    deletes the maps from the asset volume (DELETE_MAPS=true). Its output
    goes to the container's stdout — i.e. kamal app logs --roles=web — and
    nowhere else (there is no local log file); the deploy doesn't wait
    for it.
  8. Edge-cache purge: staging purges the whole zone; production purges
    NOTHING by default (bundles are content-hashed + immutable on R2, and HTML
    is intentionally never flushed on a prod deploy; --purge-full forces a
    full-zone purge). A tmp/cloudflare_purge_urls.txt queue file, if present,
    is also purged.
  9. Optional: -e deploys the Cloudflare edge worker; -r re-uploads bulk
    redirects. With gum, these (plus purge toggles) are a checkbox menu.
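
Steps 3 to 5 reduce to three Kamal invocations in a fixed order. A minimal dry-run sketch that only prints them; deploy_plan is a hypothetical helper, the second argument stands in for the <HEAD> revision, and the real bin/deploy does considerably more around these calls (gate, 1Password unlock, R2 sync, sourcemaps, purge):

```shell
# Hypothetical dry-run helper: prints the Kamal commands for steps 3-5
# in the order they run. It executes nothing.
deploy_plan() {
  env_name="$1"
  version="$2"

  # Per the flags table, only staging adds a destination flag.
  dest=""
  if [ "$env_name" = "staging" ]; then
    dest=" -d staging"
  fi

  # Step 3: build, push, and pull the image; no container swap yet.
  echo "kamal build deliver${dest}"
  # Step 4: migrate both DBs on the just-delivered image, before the swap.
  echo "kamal app exec --primary --roles=sidekiq --version ${version}${dest} 'bin/rails db:migrate'"
  # Step 5: boot/swap the already-delivered image.
  echo "kamal deploy --skip-push${dest}"
}
```

Running deploy_plan staging "$(git rev-parse HEAD)" shows the sequence for the current commit without touching the hosts.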

Flags

Flag                            Effect
staging / production            Destination (else prompted). Staging adds -d staging.
--migrate                       Accepted but a no-op (back-compat); migrations always run now.
--skip-migrate, --no-migrate    Skip the pre-swap migration (rare escape hatch, e.g. re-boot of unchanged code).
--allow-dirty                   Skip the clean/in-sync gate and ship the working tree as-is.
--in-worktree[=PATH]            Deploy clean origin/master from a throwaway worktree.
--purge-full                    Force a full-zone edge purge (prod default purges nothing).
--skip-cache-purge              Deploy without any edge-cache purge.
-e, --deploy-edge-worker        Also deploy the Cloudflare www-edge worker.
-r, --upload-bulk-redirects     Re-upload data/cloudflare_rules/*.csv to Cloudflare.
-P, --skip-push                 Boot an already-pushed image (skip the build; still migrates).
-y, --yes, --non-interactive    No gum menus; options come from flags (auto when no TTY).
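
The purge flags interact with the per-environment default. A minimal sketch of that resolution; purge_mode is a hypothetical helper, and the rule that --skip-cache-purge wins when both flags are given is an assumption, not documented behaviour. The tmp/cloudflare_purge_urls.txt queue file is not modelled:

```shell
# Hypothetical helper: resolve the edge-purge mode for a deploy.
# Prints "full" or "none".
purge_mode() {
  env_name="$1"
  shift

  # Defaults: staging purges the whole zone, production purges nothing.
  mode="none"
  if [ "$env_name" = "staging" ]; then
    mode="full"
  fi

  for flag in "$@"; do
    case "$flag" in
      --purge-full)
        mode="full"
        ;;
      --skip-cache-purge)
        # Assumption: skipping wins over --purge-full regardless of order.
        echo "none"
        return 0
        ;;
    esac
  done

  echo "$mode"
}
```

For example, purge_mode production --purge-full prints full, and purge_mode staging --skip-cache-purge prints none.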

Migrations

  • Heatwave spans two databases (heatwave + heatwave_versions), so a migrate
    runs against both. bin/deploy uses kamal app exec --primary --roles=sidekiq
    so it executes on exactly one host (and keeps a heavy migration off web).
  • Never auto-run on boot — the image entrypoint does not migrate. This is a
    project hard rule (schema/data risk).
  • Both envs migrate BEFORE the swap, on every deploy — no flag, no confirm.
    The migration runs on the freshly-delivered image pinned to `--version
  • Run them standalone any time with the alias:
    mise exec -- bundle exec kamal migrate            # = app exec --reuse 'bin/rails db:migrate'
    mise exec -- bundle exec kamal app exec --primary -d staging 'bin/rails db:migrate'
    

Dev db:migrate is allowed freely (regenerates db/structure.sql); prod
migrations and any db:rollback/db:migrate:redo still require explicit
human go-ahead (see CLAUDE.md hard-block table).
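
The migrate-before-swap rule is an ordering guarantee: the swap is never attempted if the migration fails. A minimal sketch of that control flow with the two commands passed in as strings; migrate_then_swap is a hypothetical helper and not how bin/deploy is actually structured:

```shell
# Hypothetical helper: run the migration command first and run the swap
# command only if the migration succeeded.
migrate_then_swap() {
  migrate_cmd="$1"
  swap_cmd="$2"

  if ! sh -c "$migrate_cmd"; then
    # The swap is skipped, so the live app keeps running the old code.
    echo "migration failed; swap skipped, live app untouched" >&2
    return 1
  fi

  sh -c "$swap_cmd"
}
```

With stand-in commands, migrate_then_swap 'false' 'echo swapped' prints nothing on stdout and returns 1, which is the abort described above.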


Rollback

kamal-proxy keeps prior image versions, so rollback is a re-point, not a rebuild:

mise exec -- bundle exec kamal app versions -d staging     # list deployed versions
mise exec -- bundle exec kamal rollback <VERSION> -d staging

A rollback re-fires the pre-deploy/post-deploy hooks (Sidekiq is quieted then
swapped). If the schema moved forward with a deploy you're rolling back, roll
the code back first, then decide on the data — db:rollback is gated and reverts
data, so think before running it.
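
A small guard can keep a placeholder or empty version from reaching Kamal. A minimal sketch that only prints the rollback command; rollback_cmd is a hypothetical helper, and omitting -d for production follows the flags table (only staging adds -d staging):

```shell
# Hypothetical helper: print the rollback command for a destination.
# Refuses an unknown destination, an empty version, or the literal
# <VERSION> placeholder.
rollback_cmd() {
  env_name="$1"
  version="$2"

  case "$version" in
    ""|"<VERSION>")
      echo "usage: rollback_cmd staging|production VERSION" >&2
      return 2
      ;;
  esac

  case "$env_name" in
    staging)
      echo "mise exec -- bundle exec kamal rollback $version -d staging"
      ;;
    production)
      echo "mise exec -- bundle exec kamal rollback $version"
      ;;
    *)
      echo "unknown destination: $env_name" >&2
      return 2
      ;;
  esac
}
```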

If a deploy fails mid-flight, bin/deploy un-quiets Sidekiq automatically
(it TERMs PID 1; Kamal's --restart unless-stopped revives a fresh fetching
process). Bare-kamal users resume with:

mise exec -- bundle exec kamal app boot --roles=sidekiq -d staging

Deploying without bin/deploy

bin/deploy is a convenience wrapper; the underlying Kamal commands work directly
(you lose the clean-tree gate, the pre-swap migration, the sourcemap upload, and the
edge purge, so do those by hand):

mise exec -- bundle exec kamal deploy -d staging
mise exec -- bundle exec kamal app logs -d staging -f
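
After a bare kamal deploy it can help to confirm for yourself that /up answers 200. A minimal polling sketch; wait_until and up_is_200 are hypothetical helpers, the URL in the usage comment is a placeholder, and kamal-proxy still performs its own health check during the deploy:

```shell
# Hypothetical helper: succeed only if the URL answers HTTP 200.
up_is_200() {
  url="$1"
  code="$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url")"
  [ "$code" = "200" ]
}

# Hypothetical helper: retry a command until it succeeds or the timeout
# is used up. Only the sleep time is counted, so the bound is approximate.
# Args: timeout seconds, interval seconds, then the command to run.
wait_until() {
  timeout_s="$1"
  interval_s="$2"
  shift 2

  elapsed=0
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
}

# Usage (placeholder host):
#   wait_until 90 3 up_is_200 "https://staging.example.com/up"
```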

See MANAGING.md for the full command surface (console, shell, dbc,
accessory boot, secrets).