Documentation Platform Adoption — Astro Starlight on Cloudflare Pages + Access
Date: 2026-06-14
Status: IN PROGRESS. Phases 0–2 DONE (PR #1144); Phases 3–5 (custom domain + CF Access, polish) pending
Owner: cbillen
Decision
Adopt Astro Starlight as a single self-hosted docs portal for heatwave’s
narrative/engineering/infra/ops documentation, built in CI, deployed to
Cloudflare Pages, gated by Cloudflare Access. YARD stays for Ruby
API reference and is published as a linked /api sub-site of the same Pages
project. Authoring stays in Git/Markdown (engineers only).
Mintlify — evaluated and declined
| Reason | Detail |
|---|---|
| Can’t truly self-host | The “self-hosting” offering is a self-hosted Astro frontend only; the content engine, search, AI, and build pipeline stay on Mintlify’s servers, your source is processed by them, and it is not air-gapped. Enterprise-only. (custom-frontends blog) |
| Won’t replace YARD | API reference is OpenAPI/AsyncAPI only — no Ruby source parsing. Our ~6,800 YARD annotations across 3,656 files have no path in. (OpenAPI setup) So it can’t be the single platform anyway. |
| Posture mismatch | Most docs are sensitive internal runbooks (DR steps, tailnet IPs, credential locations). Private/SSO docs are Enterprise (contact-sales), and content still lives on Mintlify. We gate Netdata/PgHero/etc. behind tailnet + CF Access — a SaaS docs host is a regression. (pricing) |
Mintlify remains the right tool if we ever ship a public developer API product. We don’t have one today; revisit then.
Current state (what we’re replacing/extending)
- YARD pipeline is DISABLED. .github/workflows/yard-docs.yml.disabled builds YARD → deploys doc/yard to Cloudflare Pages project heatwave-docs via cloudflare/wrangler-action@v3. Re-enable + extend it; don’t rebuild it.
- 654 markdown files are unpublished. 341 in doc/, 313 in .agents/skills/. Only the handful named in .yardopts are pulled into YARD. The win is publishing the curated doc/ set as a real browsable site.
- Secrets already wired in CI: CLOUDFLARE_API_TOKEN, CLOUDFLARE_ACCOUNT_ID, BUNDLE_GEMS__CONTRIBSYS__COM.
Target architecture
    GitHub Actions (push to master; PR → CF Pages preview)
    ├─ Ruby/YARD   → bundle exec yard doc → doc/yard/
    ├─ Node/Astro  → sync doc/** → src/content/docs → docs-site/dist/
    │                starlight build (Pagefind search)
    ├─ combine       cp -r doc/yard → docs-site/dist/api
    └─ wrangler pages deploy docs-site/dist --project-name=heatwave-docs

    Cloudflare Pages (heatwave-docs) → docs.warmlyyours.dev (custom domain)
    └─ Cloudflare Access policy (Terraform) gates the whole hostname
       Starlight portal at /     (narrative/infra/ops docs)
       YARD reference at /api/   (Ruby class/method docs)

One project, one domain, one Access policy. Cohesive.
Repo layout
A nested, isolated Astro project with its own package.json, untouched by the
root webpack/Yarn build:

    docs-site/
    ├─ package.json            # astro + @astrojs/starlight (isolated from root)
    ├─ astro.config.mjs        # starlight() integration, site, sidebar
    ├─ src/
    │  ├─ content.config.ts    # docs collection (docsSchema)
    │  └─ content/docs/        # GENERATED at build by sync script (gitignored)
    ├─ scripts/sync-docs.mjs   # doc/** → src/content/docs, normalize frontmatter
    └─ public/                 # logo, favicon, brand css

doc/ stays the source of truth (agents, .yardopts, and in-repo relative
links keep working). The site is a build artifact.
Content strategy
Build-time sync + normalize (primary approach: robust regardless of
loader-API nuance, and required anyway because Starlight’s docsSchema()
mandates a title frontmatter field our .md files mostly lack):
scripts/sync-docs.mjs does, per file in the curated set:
- Copy doc/<dir>/**/*.{md,MD} → docs-site/src/content/docs/<dir>/.
- If frontmatter has no title, inject one derived from the first # H1 (fallback: filename → Title Case).
- Rewrite intra-doc relative links (../infrastructure/x.md → /infrastructure/x).
- Leave excluded dirs out entirely.
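The title-injection step above can be sketched as a pure string transform. This is a minimal illustration, not the actual contents of sync-docs.mjs: the function names `ensureTitle` and `titleFromFilename` are hypothetical, and the frontmatter handling is a simple regex rather than a full YAML parser.

```javascript
// Hypothetical helpers; names and behavior are assumptions for illustration.

// Turn "pg18-failover_notes.md" into "Pg18 Failover Notes".
function titleFromFilename(filename) {
  return filename
    .replace(/\.md$/i, "")
    .split(/[-_\s]+/)
    .filter(Boolean)
    .map((word) => word[0].toUpperCase() + word.slice(1).toLowerCase())
    .join(" ");
}

// Return the markdown with a `title` frontmatter field guaranteed.
function ensureTitle(markdown, filename) {
  // Match a leading "---" ... "---" frontmatter block, if present.
  const fm = markdown.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n?/);
  const body = fm ? markdown.slice(fm[0].length) : markdown;
  const fields = fm ? fm[1] : "";

  // Frontmatter already carries a title: leave the file untouched.
  if (/^title\s*:/m.test(fields)) return markdown;

  // Prefer the first "# Heading" line in the body, else use the filename.
  const h1 = body.match(/^#[ \t]+(.+?)[ \t]*$/m);
  const title = h1 ? h1[1] : titleFromFilename(filename);

  // JSON.stringify produces a double-quoted string, which is valid YAML.
  const titleLine = `title: ${JSON.stringify(title)}`;
  const merged = fields ? `${titleLine}\n${fields}` : titleLine;
  return `---\n${merged}\n---\n${body}`;
}
```

The sketch leaves the original H1 in the body. Starlight renders the frontmatter title as the page heading, so a real script would probably also need to decide whether to strip the now-duplicate H1.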
Curation = reuse the .yardopts split (already battle-tested):
- Include (10 dirs): architecture, deployment, development, features, frontend, infrastructure, integrations, monitoring, operations, troubleshooting + root README.md, CONVENTIONS.md, DESIGN.crm.md, DESIGN.www.md.
- Exclude (working/ephemeral): tasks, specs, analysis, sunny, refactoring, prompts, reports, bugs, legacy, and the 19 legacy .html files in doc/tasks/.
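One way to express the include/exclude split is an allow-list check, so that anything not explicitly included is dropped by default. The sketch below is illustrative only: `isCurated` is a hypothetical name, paths are assumed to be relative to doc/ with forward slashes, and the four root files are assumed to sit at the top of doc/.

```javascript
// Directory and file names copied from the Include list above.
const INCLUDED_DIRS = new Set([
  "architecture", "deployment", "development", "features", "frontend",
  "infrastructure", "integrations", "monitoring", "operations",
  "troubleshooting",
]);

// Assumption: these root files live directly under doc/.
const INCLUDED_ROOT_FILES = new Set([
  "README.md", "CONVENTIONS.md", "DESIGN.crm.md", "DESIGN.www.md",
]);

// `relPath` is relative to doc/, e.g. "infrastructure/kamal/README.md".
function isCurated(relPath) {
  const parts = relPath.split("/");
  // Top-level files are included only if named explicitly.
  if (parts.length === 1) return INCLUDED_ROOT_FILES.has(parts[0]);
  // Nested files need an included top-level dir and a .md/.MD extension.
  return INCLUDED_DIRS.has(parts[0]) && /\.md$/i.test(relPath);
}
```

Because the check is an allow-list, the excluded directories (tasks, specs, analysis, and so on) and the legacy .html files never need to be enumerated; a new working directory under doc/ stays out of the portal until someone adds it deliberately.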
Explicitly NOT in the portal (different audience):
- .agents/skills/ (313 files): agent-targeted instructions; the agents read them from the repo. Don’t dump into a human portal. (A curated “How our skills work” overview page can link to the tree, but the skills themselves stay.)
- doc/tasks/ (174 files): a decision-log archive, not reference. Optionally surface later as a separate, collapsed “Decision log / ADR” section if wanted.
IA / sidebar
Map the 10 included dirs to top-level sidebar groups; let Starlight
autogenerate within each from the file tree. Replace the hand-maintained
doc/DOCUMENTATION_INDEX.md with the site nav (keep the file as a redirect stub
or landing page). Draft top-level order:
Overview · Architecture · Development · Frontend · Features · Infrastructure · Deployment · Operations · Monitoring · Integrations · Troubleshooting · API (YARD ↗)
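The draft order can be captured as data that the sidebar config is derived from. The sketch below deliberately stops short of Starlight's own sidebar item shape, since that shape depends on the installed version; `buildNavPlan` and the `kind`/`target` fields are hypothetical and would still need mapping into real sidebar entries in astro.config.mjs.

```javascript
// Directory order taken from the draft top-level order above.
const SIDEBAR_ORDER = [
  "architecture", "development", "frontend", "features", "infrastructure",
  "deployment", "operations", "monitoring", "integrations", "troubleshooting",
];

// "architecture" -> "Architecture"
function toLabel(dir) {
  return dir[0].toUpperCase() + dir.slice(1);
}

// Returns a version-neutral plan: one entry per top-level nav item.
function buildNavPlan() {
  return [
    { label: "Overview", kind: "page", target: "/" },
    // Each included dir becomes a group autogenerated from its file tree.
    ...SIDEBAR_ORDER.map((dir) => ({
      label: toLabel(dir),
      kind: "autogenerated",
      target: dir,
    })),
    // YARD lives outside Starlight, so it is a plain link to /api/.
    { label: "API (YARD ↗)", kind: "external", target: "/api/" },
  ];
}
```

Keeping the order in one array means reordering the sidebar is a one-line change, and the same array can be checked against the curated directory list to catch a group that was included but never given a nav slot.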
YARD integration
- Keep .yardopts and yard-templates/ as-is; keep the state_machines-yard, yard-activerecord, and yard-activesupport-concern plugins.
- After Astro build: cp -r doc/yard docs-site/dist/api.
- Sidebar gets an external link to /api/ (YARD keeps its own in-page search).
- Re-enabling the build also restores yard-lint hygiene checks on touched files.
CI workflow (extends the disabled one)
New .github/workflows/docs.yml (rename/replace yard-docs.yml.disabled):
- Triggers: push: [master]; pull_request for CF Pages preview deploys (review doc changes on a real gated URL before merge).
- Steps: actions/checkout@v6 → ruby/setup-ruby@v1 (bundler-cache, same BUNDLE_GEMS__CONTRIBSYS__COM) → yard doc → Node setup → yarn --cwd docs-site install → node docs-site/scripts/sync-docs.mjs → yarn --cwd docs-site build → cp -r doc/yard docs-site/dist/api → wrangler-action@v3 pages deploy docs-site/dist --project-name=heatwave-docs.
- Reuse the idempotent pages project create heatwave-docs step and both existing Cloudflare secrets.
Cloudflare Pages + Access (Terraform / TFC)
- Domain: docs.warmlyyours.dev. No extra Cloudflare licensing needed. A single-level docs. subdomain is covered by free Universal SSL (apex + first-level); Pages auto-provisions the cert; Access is free (account-level, ≤50 seats, not tied to the zone plan). Total TLS/ACM was only required for the two-level *.stg.warmlyyours.ws staging names, which does not apply here. Keeps docs off the production .com zone (blast-radius separation). docs.warmlyyours.com is a one-line fallback if we’d rather consolidate onto an already-managed zone. Pre-flight: (a) confirm warmlyyours.dev is active in the same CF account (it is; it holds the stale dev-tunnel CNAMEs); (b) check that CAA records on warmlyyours.dev don’t block Cloudflare/Google Trust Services cert issuance.
- Access: add a cloudflare_zero_trust_access_application + cloudflare_zero_trust_access_policy (allow the @warmlyyours.com email domain or the existing staff Access group) for the docs hostname, via the Terraform/TFC workspace that already manages CF Access (the one using CLOUDFLARE_ACCOUNT_API_TOKEN). No dashboard edits, which matches our IaC posture (ad-hoc Access edits = drift). Gate the whole hostname, /api included.
Search
Starlight ships Pagefind (client-side, built at compile, no SaaS) for its
own pages. To unify search across the YARD /api HTML too, run the pagefind
CLI over the combined dist/ after the copy step — validate this doesn’t
collide with Starlight’s built-in Pagefind index during Phase 4; if it does,
keep two search scopes (portal + YARD’s own).
Phasing
| Phase | Work | Done when |
|---|---|---|
| 0 — Scaffold ✅ | docs-site/ Astro 6.4 + Starlight 0.40 (isolated Yarn project), sync-docs.mjs over infrastructure+development | DONE 2026-06-14 — 48 pages build green, render with Pagefind search at localhost:4321. Gotcha fixed: Starlight 0.39+ removed top-level {label, autogenerate} → nest under items. |
| 1 — Content/IA ✅ | Manifest-driven curated IA (4-section spine), README home, frontmatter normalization, Kamal banner/IP fixes | DONE 2026-06-14 — 92 curated pages (from 654); obsolete + ephemera excluded; 5 living runbooks rescued. Remaining: intra-doc link remap + content merges |
| 2 — YARD combine + deploy ✅ | docs.yml builds YARD + portal, grafts /api, deploys to heatwave-docs; replaces disabled yard-docs.yml | DONE 2026-06-15 (PR #1144) — local: 92 pages + 4,577 YARD files served 200. YARD needs RUBY_THREAD_VM_STACK_SIZE=10000000 (SystemStackError otherwise). |
| 3 — Domain + Access | Pages custom domain docs.warmlyyours.dev, CF Access app/policy via Terraform | Gated hostname serves both |
| 4 — Search + previews | Unified Pagefind (validate), PR preview deploys, brand theme/logo | PR opens a gated preview URL; search spans portal (+/api if clean) |
| 5 — Curate/retire | Replace DOCUMENTATION_INDEX.md with nav; decide on tasks/skills exposure; fold ENV/runbook stragglers in | Single index of record is the site |
Curation outcome (Phase 1)
The blind doc/ dump (654 md files) became a curated 92-page portal on the
narrative spine. The editorial source of truth is
docs-site/scripts/docs-manifest.mjs — add/move/retire docs there, not by
touching folders.
- Spine: home = README; The Framework (Architecture / Domain Model / Subsystems / Integrations / AI), Preparing (Toolchain / Workflow / Testing / External setup), Managing Production (Deploy / Database / Networking / Mail / Monitoring / Runbooks), Code Organization & Conventions (8 sub-groups).
- Rescued 5 living runbooks from doc/tasks/ into Production: PG18 failover, HAProxy routing, DB-tier HA, Valkey split, Databasus PITR.
- Excluded ~30 obsolete (Redis ×4, Capistrano/Vultr host-ops ×8, stale JS toolchain ×2, Cursor-era MCP ×3, completed reports, self-declared-dead) and ~193 ephemera (tasks/, analysis/, sunny/, bugs/, reports/, prompts/, examples/, refactoring/). They stay in git, recoverable via git log --diff-filter=D.
- Kamal docs corrected (originals in doc/infrastructure/kamal/): README + DEPLOYING banners now state prod-on-Kamal (cut over 2026-06-07); TROUBLESHOOTING tailnet IP fixed (100.112.243.87 → 100.68.157.49) and Chicago relabeled standby, not prod.
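To make the manifest-as-source-of-truth idea concrete, here is a hedged sketch of what a manifest entry and its consumer could look like. It is not the real docs-site/scripts/docs-manifest.mjs: the entry fields (`section`, `group`, `source`, `slug`), the sample paths, and `planCopies` are all hypothetical.

```javascript
// Hypothetical manifest entries; real file names and slugs will differ.
const exampleManifest = [
  {
    section: "Managing Production",
    group: "Runbooks",
    source: "doc/tasks/pg18-failover.md", // assumed path, for illustration
    slug: "production/runbooks/pg18-failover",
  },
  {
    section: "Managing Production",
    group: "Deploy",
    source: "doc/infrastructure/kamal/DEPLOYING.md", // assumed path
    slug: "production/deploy/kamal-deploying",
  },
];

// Turn manifest entries into copy instructions for the sync script.
function planCopies(manifest, outRoot = "docs-site/src/content/docs") {
  const seen = new Set();
  return manifest.map(({ source, slug }) => {
    // Two docs claiming one route would silently overwrite each other.
    if (seen.has(slug)) throw new Error(`duplicate slug in manifest: ${slug}`);
    seen.add(slug);
    return { from: source, to: `${outRoot}/${slug}.md` };
  });
}
```

With this shape, moving a doc in the IA means editing its `slug`, and retiring one means deleting its entry; the source file under doc/ never moves, which keeps in-repo links and .yardopts references intact.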
Follow-ups (not blocking deploy):
- Intra-doc relative links aren’t yet remapped to the new IA routes, so some in-page links 404. Needs a path-resolution pass in sync-docs.mjs.
- Agent-suggested content merges not applied (3 attachment docs → 1; TRACKING + ANALYTICS; 2 edge-cache docs).
- Deeper Kamal README.md architecture narrative + Vultr mermaid diagram still describe pre-cutover topology (flagged in-doc); needs a content refresh.
- Strip README’s docs.warmlyyours.dev self-references once the site is live.
- Optional (CodeRabbit): a CI/pre-commit drift check comparing config sources (deploy.yml/haproxy/terraform) IPs/ports/services against INFRASTRUCTURE_INVENTORY.md. People-process for now.
- Gate the *.pages.dev URLs: docs.warmlyyours.dev is CF-Access gated, but heatwave-docs.pages.dev (production + branch previews) is still public. TF module written (infra/terraform/cloudflare-docs/, with wy-employees policy + wildcard preview coverage); pending tofu apply + a preview-URL verify.
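The path-resolution pass named in the first follow-up could take roughly this form. It is a sketch under assumptions: `routes` is a hypothetical Map from source path to published route (in practice it might be derived from docs-manifest.mjs), and only inline markdown links ending in .md are handled.

```javascript
// Resolve a relative href against the linking file's directory,
// purely on strings (no filesystem access).
function resolveRelative(fromFile, href) {
  const stack = fromFile.split("/").slice(0, -1); // directory of the linker
  for (const part of href.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") stack.pop();
    else stack.push(part);
  }
  return stack.join("/");
}

// Rewrite [text](relative/path.md#anchor) links to published routes.
function remapLinks(markdown, fromFile, routes) {
  const linkPattern = /\]\(([^)\s#]+\.md)(#[^)\s]*)?\)/gi;
  return markdown.replace(linkPattern, (whole, href, hash = "") => {
    // Skip absolute URLs (scheme:) and root-relative paths.
    if (/^[a-z][a-z0-9+.-]*:/i.test(href) || href.startsWith("/")) return whole;
    const route = routes.get(resolveRelative(fromFile, href));
    // Unknown targets are left as-is so they can be reported, not hidden.
    return route ? `](${route}${hash})` : whole;
  });
}
```

For example, with `routes` mapping "doc/infrastructure/x.md" to "/infrastructure/x", a link written as `../infrastructure/x.md#setup` inside doc/operations/y.md would come out as `/infrastructure/x#setup`. Links whose targets were excluded from the portal stay unchanged, which makes them easy to list in a follow-up broken-link report.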
Effort
Phases 0–3 (working gated site with API ref): ~1–2 focused days. Phases 4–5 (polish, unified search, curation): ~1 day. Inside the 15–40h range typical for an SSG docs stand-up.
Open variables
- Hostname: docs.warmlyyours.dev LOCKED (no licensing needed; see Cloudflare section). docs.warmlyyours.com available as a one-line fallback.
- doc/tasks/ exposure: excluded by default. Surface as a collapsed ADR/decision-log section later? (opt-in)
- Node toolchain in docs-site/: pin via .mise.toml or docs-site’s own engines? Default: add a Node pin to .mise.toml.
Out of scope
- Migrating .agents/skills/ into the portal (agent-targeted; stays in repo).
- Authoring by non-engineers (locked: engineers-in-Git). If business/ops SOP authoring by non-coders is wanted later, that’s a separate hosted-WYSIWYG decision (GitBook / Mintlify web editor), not this stack.
- Retiring YARD (kept; it’s the only Ruby API-reference path).