You don’t need a whiteboard full of lattice math to make a correct decision on post‑quantum cryptography (PQC). You need a plan. PQC has crossed from research into your toolchain: GnuPG is landing post‑quantum support in mainline, OpenSSH has shipped a hybrid PQ key exchange for years, and the big clouds and CDNs have run real‑world hybrid TLS experiments. Attackers are already harvesting encrypted traffic today to decrypt later. If your data needs to stay confidential for 5–10+ years, you are on the clock.
This isn’t FUD. It’s a sequencing problem. If you wait for every HSM, smartcard, and compliance framework to “fully” support PQC, you’ll miss your window and pay twice: once for a rushed migration, and again for the production incidents that go with it. The right play is to make your stack crypto‑agile now, roll out hybrid key exchange and dual signatures where they help, and avoid breaking the systems that still live in a FIPS validation queue.
What changed in 2026
- Algorithms are real, not draft toys. NIST’s round wrapped with selections for key encapsulation (ML‑KEM, derived from Kyber) and digital signatures (ML‑DSA from Dilithium; SLH‑DSA from SPHINCS+ for special cases). FIPS publications and profiles are landing across 2024–2026.
- PQC is in mainstream tooling. GnuPG’s post‑quantum landing signals PGP/S/MIME experiments are ready outside of academic builds. OpenSSH ships a hybrid KEX (sntrup761x25519) deployed by major providers. Cloudflare, Google, and others have run wide hybrid TLS 1.3 trials (e.g., X25519+Kyber768).
- Harvest‑now‑decrypt‑later (HNDL) is the real risk. AES‑256 survives Grover’s quadratic speedup; your symmetric crypto is likely fine. Your key establishment and signatures are not. If a session’s confidentiality must last a decade, you should not rely on RSA/ECDH alone in 2026.
The two decisions you actually need to make
- Where is hybrid PQ worth the operational pain today? Your public‑facing TLS endpoints and administrative SSH are the low‑regret moves. They deliver immediate HNDL protection without fully re‑platforming identity.
- How do you buy time everywhere else? Make your software crypto‑agile so you can swap algorithms without app rewrites. Then phase in PQ when your vendors, HSMs, and auditors catch up.
A 12‑month plan you can execute
Phase 0 (Weeks 0–2): Establish the rule of the game
- Define “confidentiality lifetime” by data class. Example buckets: 0–2 years (transient logs), 3–7 years (user PII/PHI), 8–20 years (financials, government contracts, genomics). Anything ≥7 years goes into the PQ priority lane.
- Appoint a DRI squad: 1 Security Engineer, 1 SRE, 1 Platform Engineer. Add a TPM if you have high vendor surface. Give them a weekly executive checkpoint and a 12‑month OKR.
Phase 1 (Weeks 2–8): Build your Crypto Bill of Materials (CBOM)
- Inventory by evidence, not by vibe. Use testssl.sh or ztls to scan external TLS endpoints and record supported KEX/sig suites. For internal mTLS, add Envoy/HAProxy telemetry of negotiated ciphers. For SSH, parse sshd_config fleetwide and capture server KEX lists.
- Code search your repos for crypto libraries (OpenSSL, BoringSSL, BouncyCastle, libsodium, JCE providers). Note versions. Flag brittle hand‑rolled crypto and pinned cipher strings.
- Catalog keys and certs: CA chains, leaf certs, JWT signing keys, code‑signing certs, KMS keys wrapping DEKs/KEKs, PGP keys. Record curves, RSA sizes, and rotation cadences.
- Map HSM/KMS capabilities and vendor roadmaps. Ask pointed questions: ML‑KEM/ML‑DSA support timelines, hybrid TLS support, FIPS status, firmware upgrade paths, throughput impacts, and costs.
Phase 2 (Weeks 8–16): Make the platform crypto‑agile
- Abstract crypto once. If your services call OpenSSL directly, introduce a thin provider layer with algorithm negotiation. In JVM land, configure via JCE providers and property flags rather than hardcoding suites. In Go, prefer tls.Config with dynamic CurvePreferences and cipher suites loaded from config.
- Remove brittle pins. Replace explicit “TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256”‑style pins with policy names (e.g., “prod‑tls‑policy‑2026”) resolved at startup from config. That one indirection is the difference between a config flip and a redeploy.
- Enable dual‑sign support in pipelines. For artifacts, ensure your signing workflow can attach both a classical and a PQ signature (e.g., cosign multi‑signature, JOSE/COSE structures, or detached metadata). You may not turn PQ on today, but you’ll be able to when your trust stores allow it.
Phase 3 (Months 4–6): Ship hybrid where it pays now
- TLS for public endpoints: Stand up a canary domain using X25519+Kyber768 hybrid KEX via a CDN or edge proxy that supports it. Expect handshake bytes to grow by ~1–2 KB and a negligible CPU bump on modern CPUs. Target 10–20% of external traffic for 2–4 weeks, track handshake failure rates and client mix. If middleboxes misbehave, fall back per SNI while keeping hybrid for modern clients.
- SSH for admin paths: Enable sntrup761x25519-sha512@openssh.com in OpenSSH for staff access. It’s drop‑in, tested, and gives you PQ protection on the credentials that matter most: the ones with shell on prod.
- Data at rest: Confirm AES‑256‑GCM everywhere. If any service still uses AES‑128 or 3DES, fix it now. For key wrapping (KEKs protecting DEKs), prefer symmetric local wrapping or KMS that publicly commits to PQ KEM support; avoid long‑lived RSA‑wrapped secrets for high‑lifetime data.
Phase 4 (Months 6–9): Extend to identity and messaging
- JWT and service identity: Keep RS256/ES256 for interoperability, but add PQ‑ready plumbing. If you operate your own CA, pilot an internal CA that issues dual‑signed certs (classical + PQ signature) for non‑browser clients that you control. For SPIFFE/SPIRE or mTLS service mesh, test PQ KEM in staging channels where supported.
- Email and file encryption: With GnuPG’s PQ support landing, create a parallel keyring for sensitive roles (legal, finance, healthcare ops) using hybrid or PQ‑only certs. Run a closed‑group pilot: 20–50 staff, opt‑in, with fallbacks. Expect message sizes to grow (Dilithium signatures ~2–3 KB; SPHINCS+ even larger). Document client compatibility.
- Code signing and supply chain: Most OS trust stores don’t accept PQ signatures yet. The move today is dual signing: keep your classical signature for trust stores, and attach a PQ signature as an additional attestation recorded in provenance (SLSA, GUAC). This lets you verify PQ internally and flip externally when stores catch up.
Phase 5 (Months 9–12): Bake it into operations and procurement
- Observability: Add TLS handshake size and failure‑by‑group metrics to your dashboards. Track “hybrid negotiated” as a ratio. Set an SLO: less than 0.1% handshake failures attributable to KEX negotiation during rollouts.
- Rotation policy: Shorten certificate and key lifetimes (e.g., 30–90 days) to make crypto shifts low‑risk. Practically, you’ll run more rotations, but with automation this is cheap insurance for rapid algorithm switches.
- Vendor gating: Update your security questionnaire: “List supported KEMs and signature schemes; confirm ability to negotiate ML‑KEM/ML‑DSA; provide FIPS status and roadmap.” Refuse black boxes that hardcode ciphers.
- Incident runbooks: Document downgrade/fallback behavior so an ops engineer doesn’t disable hybrid site‑wide under pressure. The right response to a middlebox failure is a targeted SNI or IP‑range policy, not a global rollback.
What this costs (and what it saves)
- People/time: A 3–4 person squad can complete a credible CBOM in 6–8 weeks, add crypto‑agility abstraction in 4–6 weeks for the top 10 services, and pilot hybrid TLS/SSH in another 4–6 weeks. That’s ~3–4 calendar months to start realizing protection where it counts.
- Infra overhead: Hybrid TLS handshakes add roughly 1–2 KB and minor CPU (ML‑KEM decapsulation is fast on modern CPUs). Expect a low single‑digit percent increase in handshake CPU at the edge. For most SaaS traffic shapes, this is in the noise.
- Vendor spend: The expensive part is HSM and KMS upgrades if you require PQ in hardware. If you can tolerate software‑terminating hybrid TLS at the edge for the next 12–24 months, you defer major capex while the HSM market matures.
Compatibility and performance: the trade‑offs you should accept
- Bigger signatures and keys. Approximate sizes matter for planning: ML‑KEM‑768 public keys ~1.1 KB; ciphertexts ~1.1 KB. ML‑DSA signatures are ~2–3 KB. SPHINCS+ signatures can be 8–30 KB. This affects email, logging, and bandwidth accounting.
- Middleboxes will surprise you. Some corporate proxies and IDS drop unknown TLS groups. That’s why you start with a canary domain and SNI‑based policies. Don’t flip hybrid on your primary login hostname on day one.
- FIPS and audits lag. Regulated environments need FIPS‑validated modules. Fine—run PQ in controlled pilots and crypto‑agile code paths now, then attach a validated module when it lands. Waiting for the stamp before writing a line of code is how you end up doing heroics in Q4.
Brazil and LatAm realities you should factor
- Trust anchors: If you serve Brazilian customers, monitor ICP‑Brasil guidance for digital signatures used in e‑invoicing and public tenders. Don’t preempt the root program; instead, dual‑sign artifacts and internal attestations so you’re ready when PQ profiles appear.
- Payments and fintech: PIX/Open Finance providers care about latency budgets. A 1–2 KB handshake increase is typically within tolerance if you terminate at edge PoPs in São Paulo and Rio. Measure p95 handshake time before and after; budget 5–10 ms extra on congested links.
- Nearshore execution: A Brazil‑based squad with 6–8 hours of US overlap can run your CBOM, crypto‑agility abstraction, and traffic canaries without timezone pain. This isn’t exotic R&D; it’s strong platform engineering with disciplined change control.
What not to do
- Don’t “boil the ocean.” You don’t need PQ in every microservice by December. Protect public ingress and root‑of‑trust paths first.
- Don’t lock into a dead‑end appliance. If a vendor can’t name their supported KEM/sig schemes and show a working hybrid demo, they’re not ready—no matter how glossy the datasheet.
- Don’t conflate crypto agility with chaos. Centralize cipher policy in one place, roll with canaries, and observe. You want reversible moves with tight blast radii.
A simple decision framework
Use this to prioritize where PQ goes first:
- Does the data need confidentiality ≥7 years? Yes → prioritize PQ hybrid on transport and key wrapping. No → keep classical for now; ensure crypto‑agile code paths.
- Do you fully control both ends of the connection? Yes → move faster (mTLS, service mesh, SSH). No → use canaries on public TLS and measure client breakage.
- Is the path a root of trust? Code signing, CI/CD provenance, admin access → implement dual signatures or hybrid KEX early regardless of client mix.
Answering the executive questions you’ll get
- “Are we exposed today?” If you run RSA/ECDHE‑only TLS for traffic whose confidentiality must last a decade, yes—HNDL is a real risk. You can reduce it materially within a quarter by adding hybrid KEX at the edge.
- “Will customers notice?” Almost certainly not. Handshake bytes and CPU go up a bit; tail latency moves by single‑digit milliseconds in worst cases. That’s far below your typical variance from network weather.
- “What if the algorithms change?” That’s why you implement crypto agility. You’re not tattooing Kyber on your forehead; you’re making it a config knob.
Why now, not later
Waiting doesn’t buy you certainty; it buys you debt. The standards are sufficiently baked, mainstream tools are shipping, and the cost to pilot is low. The price of doing nothing is shipping more long‑lived secrets under classical crypto while adversaries record your traffic. You can knock out 80% of the benefit with 20% of the effort by focusing on TLS ingress, SSH admin, and crypto‑agile software. Everything else can mature behind that beachhead.
Key Takeaways
- Start with a CBOM in 6–8 weeks and make your code crypto‑agile; that’s the unlock for everything else.
- Ship hybrid TLS (X25519+Kyber768) on public ingress and hybrid SSH for admin; measure, canary, and keep fallbacks per SNI.
- Use AES‑256‑GCM for data at rest and avoid long‑lived RSA‑wrapped KEKs for high‑lifetime data.
- Dual‑sign artifacts and provenance now so you can flip external trust when stores adopt PQ.
- Expect small performance costs (KBs and milliseconds), bigger signatures, and occasional middlebox issues—plan rollouts accordingly.
- Treat PQ as a product capability: observability, rotation, procurement language, and runbooks—not a one‑off crypto upgrade.