Don’t Wait for the Next Outage: A CTO’s GitHub Exit Strategy

By Diogo Hudson Dias

You wouldn’t run production in a single availability zone. Don’t run your engineering org in a single code forge.

In the last few weeks, we’ve seen a drumbeat of reminders that GitHub is a dependency, not a guarantee: a remote code execution CVE disclosure, a public availability incident update, the Ghostty project publicly announcing it’s leaving GitHub, and HardenedBSD standing up on Radicle. None of this means “panic and migrate tomorrow.” It means act like a sober platform owner: hedge your risk now so you can migrate later—without drama.

This post lays out a decision framework and a concrete architecture for a dual-home Git strategy: keep GitHub as your primary, stand up a second home you can cut over to in hours, not weeks. You’ll also get a 30-60-90 day plan and the pitfalls we’ve seen up close delivering this for US teams with our Brazil-based platform engineers.

Why a GitHub hedge is rational (not reactionary)

  • Single-forge dependency. If your repos, CI, packages, and code search all sit behind one vendor, a single provider-side auth outage can stall engineering in minutes. You measure this in blended eng cost per hour, not in SLA decimals.
  • Security events happen. RCEs and token-leak bugs will continue to occur across all forges. The mitigation isn’t “switch vendors.” It’s “assume breach, contain blast radius, fail over quickly.”
  • Policy and pricing shift. ToS or pricing changes (think usage-based AI add-ons or seat inflation) can hit budget mid-year. You need leverage before renewal, not after.
  • Compliance and sovereignty. Certain customers (especially enterprise or regulated) increasingly require demonstrable code provenance and a vendor-exit plan. A mirror with signed builds satisfies auditors who ask, “What if GitHub is unavailable for 48 hours?”

Staying on GitHub is fine. Staying only on GitHub, with no tested escape hatch, is not.

The decision framework: three levels of readiness

Level 0: Status quo (don’t stay here)

  • All repos, issues, CI/CD, packages on GitHub.
  • Personal access tokens for automation.
  • No offsite backups, no commit signing, no provenance.

Time to exit: days to weeks, with painful gaps and data loss.

Level 1: Hedge (start here, 0–30 days)

  • Nightly bare-repo backups offsite (encrypted), including Git LFS.
  • Keyless signing for builds via Sigstore (Fulcio/Rekor) and provenance metadata (SLSA L2+).
  • GitHub Actions authenticates to your cloud providers via OIDC for credentials; no long-lived secrets.
  • Package artifacts mirrored to a neutral registry (e.g., ECR/Artifact Registry/Quay), not just GitHub Packages.

Time to exit: 48–72 hours with overtime and some manual rework.

Level 2: Dual-home (what you actually want, 30–90 days)

  • GitHub remains primary. Forgejo/Gitea/GitLab CE stands up as a hot mirror (read-only by policy).
  • All repos push-mirrored on commit; issues/discussions mirrored on a schedule.
  • CI/CD definitions are runner-neutral, with a shadow pipeline on Woodpecker CI, Drone, Buildkite, or Tekton.
  • Sourcegraph/Zoekt indexes both forges.
  • Quarterly failover game day and documented cutover steps.

Time to exit: 4–8 hours, mostly coordinated DNS/URLs and pipeline flips.

Level 3: Full migration (only if necessary)

  • Primary moves to self-hosted Forgejo/GitLab CE or a hosted alternative (SourceHut, Codeberg, GitLab Cloud).
  • GitHub remains read-only mirror for history continuity and ecosystem reach.

Time to complete: 2–6 weeks with careful issue/PR history treatment.

The reference architecture: Git that fails over like prod

Forges and mirrors

  • Primary: GitHub (Enterprise recommended for SSO, audit, and IP allow lists).
  • Secondary: Forgejo (the community-led Gitea fork), Gitea, or GitLab CE. For the adventurous, or for OSS use cases, add a Radicle remote for peer-to-peer replication.

Mirroring pattern:

  1. Protect your primary branch on GitHub; only CI or merge queues can write.
  2. On push to protected branches, run a push-mirror job from a hardened runner that holds a deploy key with write access scoped to the secondary forge. Use git push --mirror for the initial sync of a new repo and git push --prune for ongoing syncs.
  3. Exclude secrets and private Git notes. Sync Git LFS explicitly via git lfs fetch --all followed by git lfs push --all against the secondary remote; a ref push alone does not move LFS objects.
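The mirror step above is small enough to sketch. This is a minimal sketch, not a production job: URLs are placeholders, and the LFS calls are best-effort (they no-op if git-lfs is absent or the repo has no LFS content).

```shell
#!/bin/sh
# Sketch of a push-mirror step: replicate one repo from the primary
# forge to the secondary as an exact copy. Both URLs are placeholders.
set -eu

mirror_repo() {
  primary_url="$1"    # e.g. a GitHub clone URL
  secondary_url="$2"  # e.g. a Forgejo clone URL
  workdir="$(mktemp -d)"

  # Bare clone: all refs, no working tree.
  git clone --quiet --bare "$primary_url" "$workdir/repo.git"

  # Push branches and tags; --prune removes refs deleted upstream, so
  # the secondary stays an exact replica on every run.
  git -C "$workdir/repo.git" push --quiet --prune "$secondary_url" \
    '+refs/heads/*:refs/heads/*' '+refs/tags/*:refs/tags/*'

  # LFS objects are NOT carried by the ref push; move them explicitly.
  git -C "$workdir/repo.git" lfs fetch --all origin 2>/dev/null || true
  git -C "$workdir/repo.git" lfs push --all "$secondary_url" 2>/dev/null || true

  rm -rf "$workdir"
}
```

In practice you'd run this per repo from the hardened runner, with the deploy key loaded into its SSH agent.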

Costs: A 4 vCPU/16 GB VM for Forgejo plus Postgres typically runs $60–$120/month on a major cloud, plus S3-compatible storage for repos/LFS (roughly $0.023/GB-month). Admin time is the real cost: expect 3–5 hours/week once stable for orgs with 50 engineers and 200–400 repos.

Identity and permissions

  • Centralize on SSO (Okta/Entra/Google) for both forges.
  • Use short-lived tokens via OIDC for all automation; ban personal access tokens from CI.
  • Replicate team→repo permissions to the secondary via an exporter (GitHub GraphQL) and importer (Forgejo/GitLab API). Generate a nightly diff and alert on drift.
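The nightly drift check can be as plain as a sorted diff of the two exports. A minimal sketch, assuming each forge's permissions were already dumped as lines of "team<TAB>repo<TAB>role" (that export format is an illustration, not a real exporter's output):

```shell
#!/bin/sh
# Sketch of a nightly permission-drift check between two forges.
# Inputs are text exports, one "team<TAB>repo<TAB>role" tuple per line.
set -eu

perm_drift() {
  gh_export="$1"    # export from GitHub (via GraphQL, serialized upstream)
  sec_export="$2"   # export from the secondary forge's API
  a="$(mktemp)"; b="$(mktemp)"

  # Normalize ordering so diff reports only real differences.
  sort "$gh_export" > "$a"
  sort "$b" 2>/dev/null >/dev/null || true
  sort "$sec_export" > "$b"

  # diff exits non-zero (and prints the drift) when the forges disagree;
  # wire that exit code to your alerting.
  diff -u "$a" "$b"
}
```

A cron job that runs this and pages on non-zero exit is the whole drift alert.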

CI/CD neutrality

  • Keep GitHub Actions for developer ergonomics, but define pipelines in a portable way:
  • For build/test: Mirror the job graph in Woodpecker CI or Drone (their YAML maps closely to Actions workflows), or use Buildkite if you want a managed control plane with your own runners.
  • For K8s-native orgs: Tekton gives you full control, but expect more plumbing.
  • Credentials: Use OIDC to cloud (AWS/GCP/Azure) from both CI systems so secrets are identical. No long-lived environment secrets; every step gets fresh, scoped credentials.
  • Artifact provenance: Sign containers and binaries with Sigstore, attach SLSA provenance to releases, and verify in deploy. This makes “which forge built this?” a non-question.

Issues, PRs, and code review

This is the hardest piece to make portable. Choose a single system of record for collaboration while mirroring metadata for continuity.

  • System of record: Keep issues/PRs on GitHub while you’re dual-homing. In failover, freeze GitHub (org-level write permissions off) and promote secondary to read-write.
  • Mirroring: Use exporters to sync issues/labels/milestones to secondary nightly. For Forgejo/GitLab, the migration tools can import most metadata; reviews and PR discussion often lose fidelity. Snapshot PRs to static HTML on S3 for legal/archival.
  • Notifications: Route webhooks through a fan-out relay (e.g., a tiny service behind API Gateway/Cloud Run) that forwards to both CI systems and chat, so flipping systems doesn’t break automations.

Search and developer experience

  • Code search: Point Sourcegraph (self-hosted or cloud) or Zoekt at both forges. Devs keep one search bar, no matter which remote is primary.
  • Devcontainers and templates: Keep them in-repo and provider-agnostic. Avoid Actions-only composite steps for foundational operations; wrap them in makefiles or small Go/Node CLIs your alternate CI can call.

Cutover: what “four hours” really looks like

If you’ve done Level 2 right, here’s your runbook for a planned or emergency cutover.

  1. Freeze writes on GitHub. Flip org/repo permissions to read-only. Announce a 30-minute brownout.
  2. Promote secondary to read-write. Remove write-protection policies temporarily where needed; preserve branch protections.
  3. Flip CI/CD. Pause Actions org-wide, enable pipelines on the secondary (Woodpecker/Buildkite/Tekton). Verify runners are available and autoscaling.
  4. Update remotes in bulk. For monorepos and trunk-based teams, update origin URLs via centralized scripts for internal repos; OSS repos keep GitHub as a mirror for the community.
  5. Repoint webhooks and package publishing. Your relay should make this a toggle, not a refactor. Validate artifact signatures in staging, then production.
  6. Monitor and communicate. Track merge queue times, CI duration, incident channels. Decide within 24 hours whether to stay or roll back.
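Step 4's bulk remote update is a loop, not a project. A sketch, assuming internal checkouts live under a common root and that the old/new host strings are passed in (both are placeholders):

```shell
#!/bin/sh
# Sketch of the step-4 bulk origin flip: rewrite the host portion of
# each repo's origin URL, leaving the repo path untouched.
set -eu

flip_remotes() {
  root="$1"      # directory containing one checkout per repo
  old_host="$2"  # e.g. github.example:acme (placeholder)
  new_host="$3"  # e.g. forgejo.example:acme (placeholder)

  for gitdir in "$root"/*/.git; do
    dir="$(dirname "$gitdir")"
    url="$(git -C "$dir" remote get-url origin)"
    case "$url" in
      *"$old_host"*)
        # Substitute only the host; everything after it stays identical.
        git -C "$dir" remote set-url origin \
          "$(printf '%s' "$url" | sed "s|$old_host|$new_host|")"
        echo "flipped: $dir"
        ;;
    esac
  done
}
```

Run it once per workstation or bake it into a dev-environment bootstrap; repos already pointing elsewhere are skipped.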

The most common friction: developers with long-lived feature branches that weren't mirrored, and oddball pipelines that used deprecated Actions. A dry run with a pilot team flushes these out.

Numbers that matter when you pitch this to the CFO

  • Seat costs don’t go away. Dual-home hedging isn’t primarily about saving seats. GitHub Enterprise per-user pricing still makes sense for most teams. The hedge is an insurance premium, not a replacement.
  • Infra line items are modest. Expect $150–$400/month for the secondary forge infra at 50–100 engineers, plus $50–$150 for a small Sourcegraph/Zoekt instance if you self-host.
  • Time to value is short. Teams we’ve helped reach Level 2 in 6–10 weeks with one platform engineer ~50% allocated and a DevOps/SRE ~30% allocated. Game days take 2–3 hours quarterly.
  • The outage math is brutal. If your blended engineer cost is $150/hour and 60 engineers are blocked for 3 hours, that’s $27,000. One incident pays for years of hedging infra.

Pitfalls and how to avoid them

  • Git LFS surprises. Mirror LFS explicitly and verify object counts match on both ends. Many teams assume --mirror includes LFS; it doesn’t.
  • Review history loss. You won’t 1:1 port PR review threads. Archive them (static HTML + S3 lifecycle) and move on. Keep GitHub read-only for a quarter so links still work.
  • Secrets sprawl. Hedging doubles places secrets can leak unless you standardize on OIDC and ephemeral credentials. Don’t replicate old sins.
  • Shadow tools drift. Keep a weekly diff of repos, teams, and branch protections between forges. Fail fast on drift rather than discover it at cutover.
  • Over-abstracting CI. Aim for 80% portability, not a “least common denominator” DSL that slows the team. It’s okay if 20% of jobs are provider-specific.

What about Radicle and fully decentralized code?

Radicle (which HardenedBSD now uses) is compelling for censorship resistance and peer-to-peer replication. Today, it’s a complement, not a replacement, for most commercial teams:

  • Use cases: OSS mirrors, emergency replication to developer machines, and extra assurance for core repos.
  • Gaps: Enterprise SSO/permissions maturity and PR/review UX compared to GitHub/GitLab.

If you’re OSS-heavy or have unique sovereignty constraints, adding a Radicle remote is cheap insurance. Treat it as a third copy of your most critical repos.

A 30–60–90 day plan you can actually run

Days 0–30: Fast defenses

  • Inventory all repos, LFS use, Actions, runners, secrets, webhooks, and packages.
  • Turn on Sigstore signing in builds and attach SLSA provenance to artifacts.
  • Replace CI secrets with OIDC. Ban personal access tokens in automation.
  • Stand up nightly bare-repo + LFS backups to S3 with lifecycle rules.
  • Move artifact publishing to a neutral registry and mirror from there to GitHub Packages (not vice versa).
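The nightly backup in the list above is also small enough to sketch. The upload command and bucket are placeholders (shown only as a comment), and encryption is left to your tooling:

```shell
#!/bin/sh
# Sketch of a nightly offsite backup: bare-mirror each repo (plus LFS
# best-effort), bundle it into a dated tarball, then hand off to your
# uploader. Upload target is a placeholder.
set -eu

backup_repo() {
  repo_url="$1"
  out_dir="$2"
  name="$(basename "$repo_url" .git)"
  work="$(mktemp -d)"

  # --mirror keeps every ref, including notes and remote-tracking refs.
  git clone --quiet --mirror "$repo_url" "$work/$name.git"
  git -C "$work/$name.git" lfs fetch --all 2>/dev/null || true

  # One dated tarball per repo per night; S3 lifecycle rules expire old ones.
  tar -C "$work" -czf "$out_dir/$name-$(date +%F).tar.gz" "$name.git"
  rm -rf "$work"

  # Upload is deliberately left out of the sketch, e.g.:
  #   aws s3 cp "$out_dir/$name-$(date +%F).tar.gz" s3://YOUR-BACKUP-BUCKET/
}
```

Loop it over your repo inventory from Days 0–30 and you have the Level 1 backup in an afternoon.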

Days 31–60: Build the second home

  • Stand up Forgejo/GitLab CE with SSO and per-team permissions.
  • Implement push-mirror on commit for all protected branches; backfill large repos offline.
  • Bring up shadow CI (Woodpecker/Buildkite/Tekton). Mirror 70–80% of Actions jobs.
  • Index both forges in Sourcegraph/Zoekt.
  • Mirror issues/labels/milestones nightly; archive PR histories weekly to S3.
  • Pilot a game day with one product squad and a non-critical service.

Days 61–90: Prove the failover

  • Run a company-wide 2-hour failover game day. Measure time-to-merge and CI stability.
  • Publish a cutover runbook with crisp owner names and SLAs.
  • Automate drift detection (teams, repos, branch protections) and alerting.
  • Integrate the plan into vendor risk and BCP documents for audits and enterprise customers.

Where a nearshore partner helps (and where you don’t need one)

You don’t need help to turn on Sigstore and backups. You might want help when:

  • You have 200+ repos, multiple languages, and heterogeneous CI runners.
  • You must preserve SOC 2 evidence trails while changing CI/CD and artifact flows.
  • You need 6–8 hours/day overlap to run migration windows without burning nights and weekends.

An experienced platform team will front-load the gnarly parts: LFS syncing, identity/SSO mapping, provenance verification in deploy, and dry-run failovers. After that, steady-state ops is light.

Bottom line

Ghostty leaving GitHub, Radicle’s momentum with hardened OS projects, and GitHub’s own incident reports should sharpen your thinking—not provoke a knee-jerk migration. The right move is level-headed resilience: keep using GitHub where it shines, but give yourself a real Plan B. You already invest in multi-AZ, multi-region, and runbooks for prod. Your code platform deserves the same seriousness.

Key Takeaways

  • Don’t rage-quit GitHub—hedge it. Dual-home your code, CI, and artifacts so a cutover is hours, not weeks.
  • Provenance beats platform. Sign builds, attach SLSA, and use OIDC so ownership and integrity don’t depend on any one forge.
  • Expect $150–$400/month in infra for a secondary forge at 50–100 engineers; the first outage pays for the hedge.
  • The hardest part is issues/PRs. Mirror what you can, archive the rest, and keep GitHub read-only for continuity.
  • Run a failover game day quarterly. If you’ve never tested your exit, you don’t have one.

Ready to scale your engineering team?

Tell us about your project and we'll get back to you within 24 hours.

Start a conversation