
Documentation Platform Adoption — Astro Starlight on Cloudflare Pages + Access

Date: 2026-06-14
Status: IN PROGRESS; Phases 0–2 DONE (PR #1144), Phases 3–5 (custom domain + CF Access, polish) pending
Owner: cbillen

Adopt Astro Starlight as a single self-hosted docs portal for heatwave’s narrative/engineering/infra/ops documentation, built in CI, deployed to Cloudflare Pages, gated by Cloudflare Access. YARD stays for Ruby API reference and is published as a linked /api sub-site of the same Pages project. Authoring stays in Git/Markdown (engineers only).

| Reason | Detail |
| --- | --- |
| Can’t truly self-host | The “self-hosting” offering is a self-hosted Astro frontend only; the content engine, search, AI, and build pipeline stay on Mintlify’s servers, your source is processed by them, and it is not air-gapped. Enterprise-only. (custom-frontends blog) |
| Won’t replace YARD | API reference is OpenAPI/AsyncAPI only, with no Ruby source parsing. Our ~6,800 YARD annotations across 3,656 files have no path in. (OpenAPI setup) So it can’t be the single platform anyway. |
| Posture mismatch | Most docs are sensitive internal runbooks (DR steps, tailnet IPs, credential locations). Private/SSO docs are Enterprise (contact-sales), and content still lives on Mintlify. We gate Netdata/PgHero/etc. behind tailnet + CF Access; a SaaS docs host is a regression. (pricing) |

Mintlify remains the right tool if we ever ship a public developer API product. We don’t have one today; revisit then.

Current state (what we’re replacing/extending)

  • YARD pipeline is DISABLED. .github/workflows/yard-docs.yml.disabled builds YARD → deploys doc/yard to Cloudflare Pages project heatwave-docs via cloudflare/wrangler-action@v3. Re-enable + extend it; don’t rebuild it.
  • 654 markdown files are unpublished. 341 in doc/, 313 in .agents/skills/. Only the handful named in .yardopts are pulled into YARD. The win is publishing the curated doc/ set as a real browsable site.
  • Secrets already wired in CI: CLOUDFLARE_API_TOKEN, CLOUDFLARE_ACCOUNT_ID, BUNDLE_GEMS__CONTRIBSYS__COM.

GitHub Actions (push to master; PR → CF Pages preview)
 ├─ Ruby/YARD  → bundle exec yard doc → doc/yard/
 ├─ Node/Astro → sync doc/** → src/content/docs → docs-site/dist/
 │               starlight build (Pagefind search)
 ├─ combine      cp -r doc/yard → docs-site/dist/api
 └─ wrangler pages deploy docs-site/dist --project-name=heatwave-docs
Cloudflare Pages (heatwave-docs) → docs.warmlyyours.dev (custom domain)
 └─ Cloudflare Access policy (Terraform) gates the whole hostname
      Starlight portal at /      (narrative/infra/ops docs)
      YARD reference   at /api/  (Ruby class/method docs)

One project, one domain, one Access policy. Cohesive.

A nested, isolated Astro project — its own package.json, untouched by the root webpack/Yarn build:

docs-site/
├─ package.json # astro + @astrojs/starlight (isolated from root)
├─ astro.config.mjs # starlight() integration, site, sidebar
├─ src/
│ ├─ content.config.ts # docs collection (docsSchema)
│ └─ content/docs/ # GENERATED at build by sync script (gitignored)
├─ scripts/sync-docs.mjs # doc/** → src/content/docs, normalize frontmatter
└─ public/ # logo, favicon, brand css

doc/ stays the source of truth (agents, .yardopts, and in-repo relative links keep working). The site is a build artifact.

Build-time sync + normalize (primary approach — robust regardless of loader-API nuance, and required anyway because Starlight’s docsSchema() mandates a title frontmatter field our .md files mostly lack):

scripts/sync-docs.mjs does, per file in the curated set:

  1. Copy doc/<dir>/**/*.{md,MD} → docs-site/src/content/docs/<dir>/.
  2. If frontmatter has no title, inject one derived from the first # H1 (fallback: filename → Title Case).
  3. Rewrite intra-doc relative links (../infrastructure/x.md → /infrastructure/x).
  4. Leave excluded dirs out entirely.
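
Steps 2 and 3 above can be sketched as small pure functions. This is a minimal illustration, not the actual sync-docs.mjs: the helper names (ensureTitle, titleCase, rewriteLinks) and the placeholder host docs.invalid are assumptions, and the frontmatter handling is deliberately naive (regex-based, no YAML parser).

```javascript
// Hypothetical helpers illustrating steps 2 and 3; not the real sync-docs.mjs.

// Fallback title: "disaster-recovery.md" -> "Disaster Recovery".
function titleCase(filename) {
  return filename
    .replace(/\.(md|MD)$/, "")
    .split(/[-_\s]+/)
    .filter(Boolean)
    .map((w) => w[0].toUpperCase() + w.slice(1).toLowerCase())
    .join(" ");
}

// Step 2: return the markdown unchanged if frontmatter already has a title,
// otherwise inject one taken from the first "# H1" (or the filename).
function ensureTitle(markdown, filename) {
  const fm = markdown.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n?/);
  if (fm && /^title\s*:/m.test(fm[1])) return markdown;

  const body = fm ? markdown.slice(fm[0].length) : markdown;
  const h1 = body.match(/^#\s+(.+?)\s*$/m);
  const title = h1 ? h1[1] : titleCase(filename);
  const titleLine = `title: ${JSON.stringify(title)}`; // a JSON string is valid YAML

  return fm
    ? `---\n${titleLine}\n${fm[1]}\n---\n${body}`
    : `---\n${titleLine}\n---\n\n${markdown}`;
}

// Step 3: turn relative .md links into site-absolute routes.
// fromPath is the doc's path relative to doc/, e.g. "deployment/guide.md".
function rewriteLinks(markdown, fromPath) {
  const base = new URL(fromPath, "http://docs.invalid/"); // host is a throwaway placeholder
  const relativeMdLink =
    /\]\((?!https?:|mailto:|#|\/)([^)\s]+?)\.(?:md|MD)(#[^)\s]*)?\)/g;
  return markdown.replace(
    relativeMdLink,
    (_match, target, hash = "") => `](${new URL(target, base).pathname}${hash})`
  );
}
```

Resolving against a URL base keeps the `..` handling out of hand-written string code. External links, in-page anchors, and already-absolute paths are left untouched by the negative lookahead.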

Curation = reuse the .yardopts split (already battle-tested):

  • Include (10 dirs): architecture, deployment, development, features, frontend, infrastructure, integrations, monitoring, operations, troubleshooting + root README.md, CONVENTIONS.md, DESIGN.crm.md, DESIGN.www.md.
  • Exclude (working/ephemeral): tasks, specs, analysis, sunny, refactoring, prompts, reports, bugs, legacy, and the 19 legacy .html in doc/tasks/.
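
One way to express this split in the sync script is an allowlist, so that excluded and future ad-hoc directories fall out by default. The sketch below is an assumption about how the filter could look; isCurated and the two constant names are hypothetical, while the directory and file names come from the lists above.

```javascript
// Hypothetical allowlist filter; paths are relative to doc/ and use "/" separators.
const INCLUDED_DIRS = new Set([
  "architecture", "deployment", "development", "features", "frontend",
  "infrastructure", "integrations", "monitoring", "operations", "troubleshooting",
]);

const INCLUDED_ROOT_FILES = new Set([
  "README.md", "CONVENTIONS.md", "DESIGN.crm.md", "DESIGN.www.md",
]);

function isCurated(relPath) {
  const parts = relPath.split("/");
  // Root-level files are included only if explicitly named.
  if (parts.length === 1) return INCLUDED_ROOT_FILES.has(parts[0]);
  // Everything else must be Markdown under one of the ten included dirs.
  return INCLUDED_DIRS.has(parts[0]) && /\.(md|MD)$/.test(relPath);
}
```

With an allowlist, the Exclude list does not need to be encoded at all: tasks/, specs/, the legacy .html files, and anything added later stay out unless someone opts them in.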

Explicitly NOT in the portal (different audience):

  • .agents/skills/ (313 files) — agent-targeted instructions; the agents read them from the repo. Don’t dump into a human portal. (A curated “How our skills work” overview page can link to the tree, but the skills themselves stay.)
  • doc/tasks/ (174 files) — a decision-log archive, not reference. Optionally surface later as a separate, collapsed “Decision log / ADR” section if wanted.

Map the 10 included dirs to top-level sidebar groups; let Starlight autogenerate within each from the file tree. Replace the hand-maintained doc/DOCUMENTATION_INDEX.md with the site nav (keep the file as a redirect stub or landing page). Draft top-level order:

Overview · Architecture · Development · Frontend · Features · Infrastructure · Deployment · Operations · Monitoring · Integrations · Troubleshooting · API (YARD ↗)

  • Keep .yardopts and yard-templates/ as-is; keep state_machines-yard, yard-activerecord, yard-activesupport-concern plugins.
  • After Astro build: cp -r doc/yard docs-site/dist/api.
  • Sidebar gets an external link to /api/ (YARD keeps its own in-page search).
  • Re-enabling the build also re-lights yard-lint hygiene on touched files.

New .github/workflows/docs.yml (rename/replace yard-docs.yml.disabled):

  • Triggers: push: [master]; pull_request for CF Pages preview deploys (review doc changes on a real gated URL before merge).
  • Steps: actions/checkout@v6 → ruby/setup-ruby@v1 (bundler-cache, same BUNDLE_GEMS__CONTRIBSYS__COM) → yard doc → Node setup → yarn --cwd docs-site install → node docs-site/scripts/sync-docs.mjs → yarn --cwd docs-site build → cp -r doc/yard docs-site/dist/api → wrangler-action@v3 pages deploy docs-site/dist --project-name=heatwave-docs.
  • Reuse the idempotent pages project create heatwave-docs step and both existing Cloudflare secrets.

Cloudflare Pages + Access (Terraform / TFC)

  • Domain: docs.warmlyyours.dev, with no extra Cloudflare licensing needed. A single-level docs. subdomain is covered by free Universal SSL (apex + first-level); Pages auto-provisions the cert; Access is free (account-level, ≤50 seats, not tied to the zone plan). Total TLS/ACM was only required for the two-level *.stg.warmlyyours.ws staging names, which does not apply here. This keeps docs off the production .com zone (blast-radius separation). docs.warmlyyours.com is a one-line fallback if we’d rather consolidate onto an already-managed zone. Pre-flight: (a) confirm warmlyyours.dev is active in the same CF account (it is; it holds the stale dev-tunnel CNAMEs); (b) check that CAA records on warmlyyours.dev don’t block Cloudflare/Google Trust Services cert issuance.
  • Access: add a cloudflare_zero_trust_access_application + cloudflare_zero_trust_access_policy (allow @warmlyyours.com email domain or the existing staff Access group) for the docs hostname, via the Terraform/TFC workspace that already manages CF Access (the one using CLOUDFLARE_ACCOUNT_API_TOKEN). No dashboard edits — matches our IaC posture (ad-hoc Access edits = drift). Gate the whole hostname, /api included.

Starlight ships Pagefind (client-side, built at compile, no SaaS) for its own pages. To unify search across the YARD /api HTML too, run the pagefind CLI over the combined dist/ after the copy step — validate this doesn’t collide with Starlight’s built-in Pagefind index during Phase 4; if it does, keep two search scopes (portal + YARD’s own).

| Phase | Work | Done when |
| --- | --- | --- |
| 0: Scaffold | docs-site/ Astro 6.4 + Starlight 0.40 (isolated Yarn project), sync-docs.mjs over infrastructure+development | DONE 2026-06-14: 48 pages build green and render with Pagefind search at localhost:4321. Gotcha fixed: Starlight 0.39+ removed top-level {label, autogenerate} → nest under items. |
| 1: Content/IA | Manifest-driven curated IA (4-section spine), README home, frontmatter normalization, Kamal banner/IP fixes | DONE 2026-06-14: 92 curated pages (from 654); obsolete + ephemera excluded; 5 living runbooks rescued. Remaining: intra-doc link remap + content merges |
| 2: YARD combine + deploy | docs.yml builds YARD + portal, grafts /api, deploys to heatwave-docs; replaces disabled yard-docs.yml | DONE 2026-06-15 (PR #1144): local run served 92 pages + 4,577 YARD files with HTTP 200. YARD needs RUBY_THREAD_VM_STACK_SIZE=10000000 (SystemStackError otherwise). |
| 3: Domain + Access | Pages custom domain docs.warmlyyours.dev, CF Access app/policy via Terraform | Gated hostname serves both |
| 4: Search + previews | Unified Pagefind (validate), PR preview deploys, brand theme/logo | PR opens a gated preview URL; search spans portal (+/api if clean) |
| 5: Curate/retire | Replace DOCUMENTATION_INDEX.md with nav; decide on tasks/skills exposure; fold ENV/runbook stragglers in | Single index of record is the site |

The blind doc/ dump (654 md files) became a curated 92-page portal on the narrative spine. The editorial source of truth is docs-site/scripts/docs-manifest.mjs — add/move/retire docs there, not by touching folders.

  • Spine: home = README; The Framework (Architecture / Domain Model / Subsystems / Integrations / AI), Preparing (Toolchain / Workflow / Testing / External setup), Managing Production (Deploy / Database / Networking / Mail / Monitoring / Runbooks), Code Organization & Conventions (8 sub-groups).
  • Rescued 5 living runbooks from doc/tasks/ into Production: PG18 failover, HAProxy routing, DB-tier HA, Valkey split, Databasus PITR.
  • Excluded ~30 obsolete (Redis ×4, Capistrano/Vultr host-ops ×8, stale JS toolchain ×2, Cursor-era MCP ×3, completed reports, self-declared-dead) and ~193 ephemera (tasks/, analysis/, sunny/, bugs/, reports/, prompts/, examples/, refactoring/). They stay in git, recoverable via git log --diff-filter=D.
  • Kamal docs corrected (originals in doc/infrastructure/kamal/): README + DEPLOYING banners now state prod-on-Kamal (cut over 2026-06-07); TROUBLESHOOTING tailnet IP fixed (100.112.243.87 → 100.68.157.49) and Chicago relabeled standby, not prod.
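
To make the "edit the manifest, not the folders" rule concrete, here is a rough sketch of how a manifest can drive the sync. The real shape of docs-manifest.mjs is not shown in this document, so the entry fields (source, slug), the example entries, and the planSync helper are all hypothetical.

```javascript
// Hypothetical manifest entries: `source` is relative to doc/, `slug` is the portal route.
// These paths are illustrative only; they are not copied from docs-manifest.mjs.
const manifest = [
  { source: "README.md", slug: "index" },
  { source: "infrastructure/kamal/DEPLOYING.md", slug: "production/deploy/kamal" },
  { source: "tasks/example-runbook.md", slug: "production/runbooks/example" },
];

// Turn manifest entries into copy operations, failing fast on duplicate routes.
function planSync(entries) {
  const seen = new Map();
  return entries.map(({ source, slug }) => {
    if (seen.has(slug)) {
      throw new Error(`slug "${slug}" is claimed by both ${seen.get(slug)} and ${source}`);
    }
    seen.set(slug, source);
    return {
      from: `doc/${source}`,
      to: `docs-site/src/content/docs/${slug}.md`,
    };
  });
}
```

Because a doc's route comes from its slug rather than its folder, moving a page in the IA is a one-line manifest change and the file in doc/ never moves. Retiring a page means deleting its entry.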

Follow-ups (not blocking deploy):

  • Intra-doc relative links aren’t yet remapped to the new IA routes — some in-page links 404. Needs a path-resolution pass in sync-docs.mjs.
  • Agent-suggested content merges not applied (3 attachment docs → 1; TRACKING + ANALYTICS; 2 edge-cache docs).
  • Deeper Kamal README.md architecture narrative + Vultr mermaid diagram still describe pre-cutover topology (flagged in-doc); needs a content refresh.
  • Strip README’s docs.warmlyyours.dev self-references once the site is live.
  • Optional (CodeRabbit): a CI/pre-commit drift check comparing config sources (deploy.yml / haproxy / terraform) IPs/ports/services against INFRASTRUCTURE_INVENTORY.md. People-process for now.
  • Gate the *.pages.dev URLs: docs.warmlyyours.dev is CF-Access gated, but heatwave-docs.pages.dev (production + branch previews) is still public. TF module written — infra/terraform/cloudflare-docs/ (wy-employees policy + wildcard preview coverage); pending tofu apply + a preview-URL verify.

Phases 0–3 (working gated site with API ref): ~1–2 focused days. Phases 4–5 (polish, unified search, curation): ~1 day. Inside the 15–40h range typical for an SSG docs stand-up.

  1. Hostname: docs.warmlyyours.dev LOCKED (no licensing needed; see Cloudflare section). docs.warmlyyours.com available as a one-line fallback.
  2. doc/tasks/ exposure — default: excluded. Surface as a collapsed ADR/decision-log section later? (opt-in)
  3. Node toolchain in docs-site/ — pin via .mise.toml or docs-site’s own engines? Default: add a Node pin to .mise.toml.
  • Migrating .agents/skills/ into the portal (agent-targeted; stays in repo).
  • Authoring by non-engineers (locked: engineers-in-Git). If business/ops SOP authoring by non-coders is wanted later, that’s a separate hosted-WYSIWYG decision (GitBook / Mintlify web editor), not this stack.
  • Retiring YARD (kept; it’s the only Ruby API-reference path).