2026-05-21 · 10 min read

Make Static Analysis Boring: A SARIF‑First Pipeline for Polyglot Teams

By Diogo Hudson Dias


If you still have five scanners and three dashboards arguing about the same null deref, you don’t have a security program; you have a noise machine. With GCC 16 extending the native SARIF output that GCC has shipped since version 13, and GitHub/GitLab leaning hard into SARIF as the lingua franca, you’re out of excuses. Standardize on SARIF, gate only on new issues, and make static analysis boring again.

Why this, why now

Two headlines collide this month: “New features in GCC 16: Improved error messages and SARIF output” and “GitHub confirms breach of 3,800 repos via a malicious VS Code extension.” The message for CTOs is blunt: the tooling surface area is exploding, while attackers are happy to ride your developer pipeline into prod. Unifying static analysis isn’t a nice-to-have — it’s how you keep signal visible when the blast radius grows.

Most startups we meet run a familiar anti‑pattern:

JavaScript uses ESLint, TypeScript types, and a secret DAST job somewhere.
Python runs Bandit and Pylint, each posting their own comments.
Go uses go vet and Gosec; C++ teams use clang‑tidy … sometimes.
Security tries to staple Sonar or a commercial SAST across all of it. Nobody reads the portal.

Engineers see 50+ PR comments, half duplicative, and they learn the wrong lesson: ignore the bots. As AI agents begin to auto‑refactor code and submit PRs for you, this problem compounds — fast.

The fix is not “one scanner to rule them all.” The fix is one format and one policy. That format is SARIF. The policy is “block only on new issues in changed lines, with a baseline and a risk budget.”

A SARIF‑first decision framework

1) When to standardize

You have 3+ languages, 20+ services, or 10+ engineers. Beyond this, bespoke tooling per repo stops scaling.
You have nearshore teams or external vendors. A single SARIF pipeline gives everyone the same ground truth.
Your backlog of “existing” SAST issues scares people. You need a baseline gate that doesn’t freeze delivery.

2) What to standardize on

Format: SARIF 2.1.0. Require it for all new analyzers. Wrap holdouts with converters.
Storage: Treat SARIF as a first‑class CI artifact, not a comment stream. Keep 12 months of artifacts in S3/Blob with immutable versioning.
Policy: Gate on “new high/critical in changed lines only,” with service‑specific exception budgets.

3) Where to run

PR‑time diff scans: Run fast linters and taint analyzers against changed files. Target under 10 minutes.
Nightly/weekly: Full‑repo deep scans (CodeQL, whole‑program analyzers) that publish SARIF but don’t block PRs; they open issues only if severity ≥ High.
Pre‑release: Release branches get one mandatory deep scan with a short SLA for remediation or a documented risk acceptance.

4) How to gate

OPA in CI: Use Open Policy Agent to read SARIF and enforce org‑wide rules. Example: “Block if any new CWE‑79 or CWE‑89 on changed lines.”
Baselines: Record a scan of the default branch as the baseline. Anything pre‑existing is backlog, not a blocker. Only deltas fail the build.
Risk budgets: Each service gets a monthly budget of “acceptable known issues” (ideally zero). Burning budget requires product sign‑off.
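To make the gate concrete, here is a minimal sketch in Python rather than Rego. It assumes findings have already been normalized into dicts with the hypothetical keys `path`, `line`, `severity`, `baseline_state`, and an optional `cwe`, and that the diff has been reduced to a map from file path to changed line numbers. None of those names come from a real tool; treat them as placeholders for whatever your pipeline produces.

```python
from typing import Dict, List, Set

BLOCKING_SEVERITIES = {"critical", "high"}  # assumed bucket names
BLOCKED_CWES = {"CWE-79", "CWE-89"}         # the example rule from the text


def blocking_findings(findings: List[dict],
                      changed_lines: Dict[str, Set[int]]) -> List[dict]:
    """Return the findings that should fail the PR under the policy above."""
    blockers = []
    for f in findings:
        if f["baseline_state"] != "new":
            continue  # pre-existing findings are backlog, not blockers
        if f["line"] not in changed_lines.get(f["path"], set()):
            continue  # only lines touched by this PR count
        if f["severity"] in BLOCKING_SEVERITIES or f.get("cwe") in BLOCKED_CWES:
            blockers.append(f)
    return blockers
```

A CI step would fail the check when the returned list is non‑empty. The same three conditions translate directly into an OPA rule once the SARIF and the diff are available as policy input.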

Reference architecture (works with GitHub or GitLab)

Ingestion

Each scanner emits SARIF. For tools without native SARIF, pipe their JSON/text into converters (Microsoft’s SARIF SDK and its sarif‑multitool CLI, or a simple translator).
CI bundles all SARIF files into a single artifact per job. Stamp with commit SHA, repo, branch, and build ID.
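Upload the artifact to object storage (S3/Blob) and to your code host’s “code scanning” UI. GitHub code scanning ingests SARIF directly; on GitLab, confirm whether your version accepts SARIF as‑is or needs a conversion step into its own security report format.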
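The bundling and stamping step can be sketched as follows. This is an illustration, not a replacement for a real merge tool: it concatenates the `runs` arrays of several SARIF 2.1.0 logs and records provenance in the `versionControlProvenance` and `automationDetails` run properties defined by the spec. The function names and the storage key layout are assumptions.

```python
from typing import Iterable

SARIF_VERSION = "2.1.0"


def merge_sarif(docs: Iterable[dict], repo_uri: str, commit_sha: str,
                branch: str, build_id: str) -> dict:
    """Combine several SARIF logs into one and stamp each run with provenance."""
    runs = []
    for doc in docs:
        if doc.get("version") != SARIF_VERSION:
            raise ValueError(f"expected SARIF {SARIF_VERSION}, got {doc.get('version')!r}")
        for run in doc.get("runs") or []:
            stamped = dict(run)  # shallow copy; the input log is left untouched
            stamped["versionControlProvenance"] = [{
                "repositoryUri": repo_uri,
                "revisionId": commit_sha,
                "branch": branch,
            }]
            tool_name = run.get("tool", {}).get("driver", {}).get("name", "unknown")
            stamped["automationDetails"] = {"id": f"{tool_name}/{build_id}"}
            runs.append(stamped)
    return {"version": SARIF_VERSION, "runs": runs}


def artifact_key(repo: str, commit_sha: str, build_id: str) -> str:
    """Hypothetical object-storage key, organized by repo and commit SHA."""
    return f"{repo}/{commit_sha}/{build_id}.sarif"
```

Keeping one run per tool, instead of flattening all results into a single run, preserves each scanner’s rule metadata for later deduplication.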

Normalization and deduplication

Stable fingerprints: Prefer the SARIF “fingerprints” property. When missing, hash ruleId + normalized file path + line range + a snippet hash.
Cross‑tool dedupe: Map tool‑specific rules to CWEs when possible. If ESLint security‑xss and Semgrep detect the same sink/source, keep the higher‑fidelity finding as canonical.
Severity mapping: Converge on 4 buckets: Critical, High, Medium, Low. Don’t import vendor severity verbatim; create a mapping table once and keep it in code.
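A rough sketch of the fingerprint fallback and the severity table, under stated assumptions: the result layout follows SARIF 2.1.0 (`ruleId`, `locations[0].physicalLocation`), while `fingerprint`, `bucket`, and the entries in `SEVERITY_MAP` are illustrative names and values, not anything a vendor ships.

```python
import hashlib
import posixpath

# Illustrative mapping from (tool, vendor label) to the four org buckets.
SEVERITY_MAP = {
    ("bandit", "high"): "High",
    ("bandit", "medium"): "Medium",
    ("bandit", "low"): "Low",
    ("eslint", "error"): "High",
    ("eslint", "warning"): "Medium",
}


def normalize_path(uri: str) -> str:
    """Use forward slashes and drop './' so the same file hashes the same way."""
    return posixpath.normpath(uri.replace("\\", "/"))


def fingerprint(result: dict) -> str:
    """Prefer the tool-provided fingerprint; otherwise hash stable fields."""
    provided = result.get("fingerprints")
    if provided:
        name = sorted(provided)[0]
        return f"{name}:{provided[name]}"
    loc = result["locations"][0]["physicalLocation"]
    region = loc.get("region", {})
    start = region.get("startLine", 0)
    end = region.get("endLine", start)
    snippet = " ".join(region.get("snippet", {}).get("text", "").split())
    snippet_hash = hashlib.sha256(snippet.encode("utf-8")).hexdigest()
    material = "|".join([
        result.get("ruleId", ""),
        normalize_path(loc["artifactLocation"]["uri"]),
        f"{start}-{end}",
        snippet_hash,
    ])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def bucket(tool: str, vendor_severity: str) -> str:
    """Look up the org bucket; an unmapped pair raises KeyError on purpose."""
    return SEVERITY_MAP[(tool.lower(), vendor_severity.lower())]
```

Letting unmapped severities raise is a deliberate choice in this sketch: a new vendor label should force an update to the table instead of silently landing in a default bucket.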

Baselines

Generate a baseline SARIF on the default branch. Mark pre‑existing findings with baselineState=unchanged (SARIF 2.1.0 allows new, unchanged, updated, and absent).
Teach your policy to ignore pre‑existing findings unless the PR modifies the code they sit in or their severity escalates.
Review the backlog quarterly. Measure burn‑down, not raw counts.
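One way to picture the baseline comparison, as a sketch only: label each current result against the set of baseline fingerprints and report the share of baseline findings that have disappeared. SARIF 2.1.0 defines the baselineState values new, unchanged, updated, and absent; this example uses only the first two. The `key` callable is an assumption standing in for whatever fingerprint function the pipeline uses.

```python
from typing import Callable, Iterable, List, Tuple


def apply_baseline(current: List[dict], baseline: Iterable[dict],
                   key: Callable[[dict], str]) -> Tuple[List[dict], float]:
    """Label current results and return the fraction of the baseline now fixed."""
    known = {key(r) for r in baseline}
    seen = set()
    labelled = []
    for result in current:
        k = key(result)
        seen.add(k)
        copy = dict(result)  # leave the caller's result untouched
        copy["baselineState"] = "unchanged" if k in known else "new"
        labelled.append(copy)
    fixed = known - seen  # in the baseline, gone from the current scan
    burn_down = len(fixed) / len(known) if known else 0.0
    return labelled, burn_down
```

With a baseline of four findings and a current scan that still contains three of them plus one new result, the function labels one result as new and reports a burn‑down of 0.25.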

Policy enforcement

PR checks: Run OPA with a rego policy that reads the merged SARIF and the diff. Rule of thumb: block only on new Critical/High in changed lines or on taint paths that originate in changed code.
Comment discipline: Max 10 bot comments per PR. Squash the rest into a single summary with links. If you spam, people will mute you.
Severity timeouts: Critical: fix or revert before merge. High: 72 hours. Medium: add to the backlog and schedule. Low: auto‑triage or ignore by default.
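The comment cap is easy to sketch. The example below assumes normalized findings with hypothetical `severity`, `path`, and `line` keys; it picks the most severe findings for inline comments and folds the remainder into one summary line. Posting the comments is left to the code host’s API and is not shown.

```python
from collections import Counter
from typing import List, Optional, Tuple

SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def plan_comments(findings: List[dict],
                  cap: int = 10) -> Tuple[List[dict], Optional[str]]:
    """Return (findings to comment inline, summary text for the overflow)."""
    ordered = sorted(
        findings,
        key=lambda f: (SEVERITY_RANK.get(f["severity"], len(SEVERITY_RANK)),
                       f["path"], f["line"]),
    )
    inline, overflow = ordered[:cap], ordered[cap:]
    if not overflow:
        return inline, None
    counts = Counter(f["severity"] for f in overflow)
    parts = [f"{counts[s]} {s}" for s in SEVERITY_RANK if counts[s]]
    summary = f"{len(overflow)} more finding(s) not shown inline: " + ", ".join(parts)
    return inline, summary
```

Sorting by severity first means the cap never hides a Critical behind a pile of Lows.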

Presentation

Annotate code on diffs via the code host’s native “code scanning” UI. Engineers should not leave the PR to see the finding.
Provide a single read‑only dashboard for trends: new findings by severity/week, mean time to green, and backlog burn‑down. Don’t add another full‑time portal.
Send a weekly roll‑up to service owners. Keep it under one screen.
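Two of the trend charts reduce to small aggregations. The sketch below assumes each finding carries a hypothetical `first_seen` date and a `severity`, and it defines “time to green” as the gap between a PR’s first failing check and its first passing one; both are assumptions, since the text does not pin down either definition.

```python
from collections import Counter
from datetime import datetime
from statistics import mean
from typing import Iterable, List, Tuple


def new_findings_per_week(findings: Iterable[dict]) -> Counter:
    """Count new findings keyed by (ISO year, ISO week, severity)."""
    counts = Counter()
    for f in findings:
        year, week, _ = f["first_seen"].isocalendar()
        counts[(year, week, f["severity"])] += 1
    return counts


def mean_time_to_green_hours(prs: List[Tuple[datetime, datetime]]) -> float:
    """Average hours between first red and first green; needs at least one PR."""
    return mean((green - red).total_seconds() / 3600 for red, green in prs)
```

Both functions take plain Python values, so the same code can feed a weekly roll‑up email or a dashboard query.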

Tooling map: who already speaks SARIF

Good news: you don’t have to wait for vendors to catch up.

C/C++: GCC (native SARIF output, introduced in GCC 13 and improved through GCC 16), clang‑tidy/scan‑build (via converters), cppcheck (SARIF option).
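JavaScript/TypeScript: ESLint emits SARIF through Microsoft’s formatter package (@microsoft/eslint-formatter-sarif); the TypeScript compiler has no built‑in SARIF output, so wrap its diagnostics with a converter. Node security scanners (npm‑audit alternatives) can be converted.
Python: Bandit exports SARIF via a formatter; Pylint and mypy have no built‑in SARIF output and need a wrapper or converter.
Go: Gosec exports SARIF natively (-fmt sarif); go vet output can be wrapped to SARIF.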
Java/Kotlin: SpotBugs/FindSecBugs (recent SpotBugs releases include a SARIF report option; older setups need converters); multiple commercial tools export SARIF natively.
.NET: Roslyn analyzers and dotnet build can emit SARIF.
Security‑focused analyzers: CodeQL (native), Semgrep (native), Trivy for IaC/K8s (native, --format sarif), Checkov (native, -o sarif).

Glue it together with Microsoft’s SARIF SDK and its sarif‑multitool CLI for merging and validation. Run schema validation in CI; if a tool emits broken SARIF, fail fast and pin the last known‑good version.
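Full schema validation belongs to a schema validator or sarif‑multitool. As a cheap pre‑check before that step, a CI job can verify the handful of properties SARIF 2.1.0 requires: `version` and `runs` on the log, `tool.driver.name` on each run, and `message` on each result. The sketch below does only that; `sarif_problems` is a hypothetical helper, not part of any SDK.

```python
from typing import List


def sarif_problems(doc: dict) -> List[str]:
    """Return human-readable problems; an empty list means the pre-check passed."""
    problems = []
    if doc.get("version") != "2.1.0":
        problems.append("version must be '2.1.0'")
    runs = doc.get("runs")
    if not isinstance(runs, list):
        problems.append("expected runs to be an array")
        return problems
    for i, run in enumerate(runs):
        name = run.get("tool", {}).get("driver", {}).get("name")
        if not name:
            problems.append(f"runs[{i}].tool.driver.name is missing")
        for j, result in enumerate(run.get("results") or []):
            if "message" not in result:
                problems.append(f"runs[{i}].results[{j}].message is missing")
    return problems
```

Failing the job whenever the list is non‑empty gives the fail‑fast behavior described above without waiting for the upload step to reject the file.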

Process: make it part of the work, not a side quest

Change‑only enforcement: Your developers should never be blocked by legacy issues they didn’t create. This single decision is the difference between adoption and revolt.
One hour a week, no exceptions: Every service team (yes, including nearshore contractors) holds a 30–60 minute triage. Review new Highs, pick two Mediums from the backlog, and close false positives. 6–8 hours of timezone overlap with Brazil is plenty to make this cadence stick.
Rotate a “static analysis steward”: One engineer per team owns the config for a sprint. The steward, not Security, tunes the rules. It builds empathy and keeps rules relevant.
Train the bots: If you use AI code assistants, pipe their diffs through the same SARIF gates. Hold the agent to the same bar as a human PR.

Cost model: what you’ll actually spend

Ballpark 2026 pricing we see in the field (your mileage will vary):

GitHub Advanced Security / code scanning: typically per‑user pricing; expect mid‑two‑digits USD per user per month for private repos. Advantage: first‑class SARIF UX.
Semgrep Teams/Enterprise: roughly $20–$60 per developer per month for Teams; Enterprise quotes vary. Strong SARIF, good diff‑scan speed.
SonarQube/SonarCloud: lines‑of‑code tiering; entry tiers in the low thousands USD per year. SARIF support exists but you may prefer its native UI.
CodeQL: included for open‑source; enterprise pricing for private repos typically bundled with platform deals. Native SARIF, deep analysis.
Infra: CI minutes for PR scans (target under 10 minutes, under $0.10/PR at typical CI rates), plus object storage at pennies per GB for artifacts.

For a 40‑engineer team across 25 repos, annual all‑in (tooling + infra) to run a SARIF‑first program lands between $25k and $80k. We routinely see teams recoup that in 1–2 quarters by recovering engineer time currently burned on noisy bot comments and fire drills.
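To see how part of that range decomposes, here is a back‑of‑the‑envelope sketch. It uses the 40‑engineer team and the $20–$60 per developer per month figure quoted above; the PR volume of 8,000 per year is an invented input for illustration only, priced at the $0.10/PR ceiling.

```python
def annual_seat_cost(engineers: int, usd_per_seat_month: float) -> float:
    """Yearly cost of a per-seat tool."""
    return engineers * usd_per_seat_month * 12


def annual_ci_cost(prs_per_year: int, usd_per_pr: float) -> float:
    """Yearly CI spend on PR-time scans."""
    return prs_per_year * usd_per_pr


ENGINEERS = 40
seats_low = annual_seat_cost(ENGINEERS, 20)   # 40 * 20 * 12 = 9,600
seats_high = annual_seat_cost(ENGINEERS, 60)  # 40 * 60 * 12 = 28,800
ci = annual_ci_cost(8_000, 0.10)              # hypothetical PR volume
```

Under these inputs one per‑seat scanner accounts for roughly $9.6k to $28.8k a year and PR‑time CI for well under $1k, so the rest of the quoted range would come from additional tooling such as code‑scanning seats or LOC‑tiered licenses.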

Rollout in 12 weeks

Weeks 0–2: Baseline and buy‑in

Pick two services in different stacks. Wire ESLint/TypeScript and Bandit/Pylint or Gosec to emit SARIF.
Add SARIF validation and upload. Store artifacts in object storage with commit SHA keys.
Define your severity mapping and OPA policy. Agree on “block only on new High/Critical in changed lines.”

Weeks 3–6: PR checks and deep scans

Enable PR comments and checks with a 10‑comment cap plus a summary.
Add a weekly deep scan (CodeQL or Semgrep Pro rules) that opens issues but doesn’t block merges unless Critical.
Create the first baselines. Turn the historic backlog into a board, not a blocker.

Weeks 7–12: Scale and tune

Roll to all services. Mandate SARIF output for any new analyzer; add converters where needed.
Launch the weekly 30–60 minute triage and the steward rotation.
Publish a single dashboard with three charts: new findings/week by severity, mean time to green, backlog burn‑down. Nothing else.

What good looks like (real numbers)

Across recent client rollouts (20–60 engineers, 15–40 repos, mixed stacks), a SARIF‑first pipeline delivered:

35–55% fewer duplicate alerts in PRs by merging/deduping across tools.
30–40% faster mean time to green after introducing diff‑based gates and comment caps.
Zero “security portal” logins required by product engineers — everything happens in the PR.
Backlog burn‑down of 10–20%/quarter without freezing delivery, driven by the weekly triage.

The most common surprise: after the noise drops, teams stop arguing severity and start fixing code. That’s the whole point.

Common traps to avoid

Blocking on the backlog: Do this and adoption dies. Baseline first, then gate.
Parking everything in a new portal: If engineers have to leave the PR to see context, findings become theater.
Letting vendors set your severities: Create one mapping and make vendors adapt to you.
Unpinned analyzers: A rules update that flips 500 findings to “High” on a Friday will ruin your quarter. Pin, test, promote deliberately.
No ownership: If no team owns a finding, it won’t get fixed. Assign by service, not by tool.

Security hardening for the pipeline itself

Sandbox your scanners: Run in minimal containers with seccomp/AppArmor and read‑only mounts. They process untrusted code.
Network egress allowlists: Scanners should not phone home. If a tool requires updates, fetch via a controlled proxy.
Signed configs: Treat rule sets as code. Signed, reviewed, versioned. No mutable “click‑ops” in vendor UIs.
Attest the build: Use SLSA‑style provenance for CI runs that generate SARIF. Attach attestations to artifacts.

Why this matters for nearshore and distributed teams

When your engineers are split across San Francisco, Austin, and São Paulo, consistent tooling is culture. SARIF levels the playing field: one result format, one gate, one small set of rituals. With 6–8 hours of overlap with Brazil, weekly triage actually happens. And when you bring a nearshore partner on, they plug into your pipeline on day one without arguing about scanners: your rules, your mapping, your gate.

The bottom line

Standardize everything you can safely standardize. SARIF does that for static analysis. You’ll still want different scanners for different languages and risk profiles. Fine. But you want exactly one way to talk about what they found, one place to see it in the PR, and one policy to decide if the code ships.

Key Takeaways

Stop chasing “one best scanner.” Standardize on SARIF and a single enforcement policy.
Gate only on new High/Critical issues in changed lines; baseline the rest.
Use OPA in CI to enforce severity and diff‑aware rules consistently across repos.
Cap bot comments and keep findings in the PR UI; dashboards are for trends, not triage.
Pin analyzer versions, sandbox scanners, and store SARIF artifacts for 12 months.
Adopt a weekly 30–60 minute triage and a steward rotation; measure mean time to green and backlog burn‑down.
Expect roughly 35–55% fewer duplicate PR alerts and materially faster PR cycles within one quarter.