2026-06-23 · 10 min read

HTTP QUERY Is Here. Should You Rewrite Your Search Endpoints?

By Diogo Hudson Dias

Engineer in a São Paulo office examining dashboards showing a rise in HTTP QUERY method traffic on large monitors.

You probably ship “search” as a POST to /search with a JSON body because your filters won’t fit in a URL. It works—but you gave up cacheability, muddied your observability, and taught every client a bespoke payload shape. The new HTTP QUERY method aims to fix that: a safe, idempotent request that allows a request body like POST, but intends no state change like GET. The debate hit the front page again this week, with fresh explainers and heated threads. Let’s skip the bikeshedding and make a call: should you adopt QUERY in 2026?

What QUERY changes—and why you should care

Today you have three bad choices for complex reads:

GET with an overlong URL and awkward encoding of structured filters (risking proxy limits and signature headaches).
POST that semantically “reads,” which breaks default caching, misleads WAFs, and confuses analytics.
Roll your own RPC over POST and pray future you remembers the rules.

QUERY proposes a fourth path: treat complex reads as safe, idempotent operations with a request body. That sounds boring. It’s not. It’s a chance to reclaim performance, reduce cloud bills, and un-weird your API semantics.

The practical benefits, quantified

CDN and edge cache wins: Search-heavy apps routinely see 30–60% of total requests hitting read endpoints. Even a conservative 20–30% edge cache hit rate on QUERY can shave 30–80 ms off p95 latency and trim 10–25% of origin CPU for list/search workloads.
Cleaner WAF and observability: Safe + idempotent means different default treatment. You get clearer dashboards, simpler rate limits, and less time tuning rules to admit benign POSTs.
Better client ergonomics: No more URL gymnastics. Agents and SDKs can send structured filters without reverse-encoding JSON into query strings.

But there are pitfalls. Middleboxes, browsers, SDKs, and specs were built around GET/POST. QUERY is new, which means sharp edges. Let’s map them out.

A CTO’s decision framework for adopting QUERY

1) What’s your actual problem profile?

If your read traffic is dominated by low-cardinality, highly cacheable lists and facets (think: catalog browse, top-N feeds), QUERY offers real ROI.
If your reads are deeply personalized with low reuse (think: per-user dashboards, ephemeral AI agent state), caching gains will be limited; don’t expect miracles.
If you’re fighting URL length limits through proxies or signing headaches in GET, QUERY’s body solves that cleanly.

2) Can your network path carry a new method?

Plenty of infrastructure drops or mishandles unfamiliar verbs by default. Before committing, run a method pass-through test across every hop:

Client SDKs and mobile stacks: Verify they allow arbitrary methods and don’t silently coerce to POST. Modern browsers and fetch allow non-standard methods, but they will trigger CORS preflight cross-origin.
Edge/CDN: Cloudflare, Akamai, and Fastly can pass unknown methods; caching them is a separate, explicit configuration step.
Load balancers and API gateways: AWS ALB/NLB, API Gateway, GCP Load Balancers, NGINX, Envoy, and HAProxy can pass arbitrary methods, but WAF rule sets may block by default. Test with a synthetic endpoint that echoes :method.
Service mesh: Envoy/Istio support custom methods, but check route matchers, RBAC, and authz filters that may enumerate verbs.

Set up a 48-hour canary that issues QUERY to a no-op route across your production path and monitor:

4xx/5xx by hop (edge, LB, gateway, service)
Unexpected method remapping (POST/GET substitution)
WAF blocks and bot protection challenges

3) Do you need browser support—or is this server-to-server?

If your front end will call QUERY from the browser, accept the CORS and preflight cost. Non-simple methods trigger an OPTIONS request on every cross-origin call unless cached. On high-churn UIs, that’s noise you’ll feel.

Mitigation: co-locate your API under the same eTLD+1 as your app to avoid CORS entirely, or preflight-cache aggressively with Access-Control-Max-Age (e.g., 600–3600 seconds where safe).
If you can’t avoid cross-origin, measure the preflight hit rate; in many SPAs, 30–50% of first-time reads per route will incur preflight without tuning.

4) Do your cache and key strategy make sense for QUERY?

Caches key lookups by method + URL (+ sometimes headers/body). With QUERY, your request body now participates in cache identity. That’s powerful—and dangerous.

Canonicalize request bodies: Sort JSON keys and drop empty/nulls so semantically identical filters collide in cache instead of splintering.
Bound cardinality: Add server-enforced normalization (e.g., clamp page size to 100, top-N facets) to avoid “unbounded cache key” explosions.
Content negotiation: If your endpoint can return multiple representations, ensure Vary is right-sized; don’t accidentally include broad headers that blow your hit ratio.
CDN configuration: Most CDNs do not cache non-GET by default. You’ll need explicit rules or programmable edge logic to cache QUERY. Start with a small TTL (e.g., 10–60s) and ratchet up.

5) Will your API description, codegen, and clients keep up?

OpenAPI 3.1 enumerates methods and doesn’t yet include QUERY. Many generators assume the standard verbs.

Short-term workaround: Document QUERY endpoints as POST in OpenAPI with an x-http-method extension and implement real QUERY at the gateway. Generate clients off POST; flip to QUERY under the hood for S2S call paths first.
Consider dual exposure: Offer both POST and QUERY temporarily. Use POST for browser clients if CORS friction is high; route server-to-server via QUERY for cache and semantics.
Don’t forget SDK ergonomics: Normalize filter objects and offer a stable builder so clients don’t generate unique-but-equivalent bodies that crush your cache.

6) Are your auth, WAF, and rate limits method-aware?

Security stacks routinely special-case GET vs POST. QUERY will likely inherit GET-like expectations but with a body that could trip baselines.

WAF tuning: Inspect body size limits and parsers. If you parse JSON for anomaly detection on POST only, extend that to QUERY or you’ll miss threats. Conversely, don’t over-trigger on benign QUERY payloads and tank p95.
CSRF model: QUERY is safe, but browsers can still send it with credentials. Maintain your CSRF story (same-site cookies, double-submit tokens for state-changing routes). Don’t whitelist by method alone.
Rate limits: Create separate buckets for QUERY to avoid starving mutations under shared quotas. Safe methods can have higher ceilings with tighter burst shaping.

A pragmatic migration plan

Phase 0: Inventory and baseline (2 weeks)

Inventory all read endpoints with bodies today (POST-used-as-GET). Rank by RPS and cacheability (repeat rate, personalization, TTL tolerance).
Establish current p50/p95 latency, origin CPU time, and CDN hit ratios for those endpoints. Expose a dashboard broken down by method and route so gains are visible.

Phase 1: Path hardening and canary (2–4 weeks)

Run the method pass-through canary end to end. Fix WAF and gateway rules. Add method labels to logs, traces, and metrics.
Teach your edge to cache non-GET conditionally. In Fastly, that means VCL/Compute@Edge logic; in Cloudflare, Workers/Rulesets. Start with a staging property and synthetic keys based on a canonicalized body hash.

Phase 2: Dual-route the top candidate (4–6 weeks)

Expose your highest-RPS, cache-friendly search as both POST and QUERY to a single handler. Normalize request bodies server-side to a canonical representation before hitting storage.
Route server-to-server traffic to QUERY first (internal services, cron, agents). Keep browsers on POST until you’ve measured CORS/preflight impact.
Turn on conservative edge TTLs for QUERY (10–60s). Watch for key explosions and hot keys.

Phase 3: Expand and optimize (ongoing)

Roll out to the next 2–3 endpoints if you see ≥15% origin offload and ≥20 ms p95 improvement without error spikes.
Raise TTLs where correctness allows; add soft invalidation via background refresh to avoid stale bursts after TTL expiry.
Tighten quotas: Give QUERY a larger bucket but stricter per-IP bursts to defend search infra under event spikes.

Engineering details you’ll trip over (so fix them now)

Canonicalization rules

Implement a single, shared function that turns an arbitrary filter object into a canonical byte sequence:

Sort keys recursively, drop empty arrays and nulls, coerce numbers to a fixed format, clamp ranges to server-enforced bounds.
Serialize to compact JSON (no spaces) and hash (e.g., SHA-256) for the cache key component.
Document these rules publicly so clients can predict behavior and avoid cache misses.

Pagination and consistency

QUERY makes it tempting to cache paginated results. Do it, but define consistency:

If you promise “mostly fresh,” target 10–30s TTL and accept minor drift on fast-changing inventories.
If you need read-your-writes, don’t cache user-specific views or add a cache-bypass when the session has recent writes (e.g., within 5s).
Use stable cursor-based pagination; include the cursor in the canonicalization to avoid mixing incompatible pages.

Search correctness and vector backends

If you’re fronting heterogeneous backends (SQL for filters, vector DB for relevance), centralize fusion in the QUERY handler. Cache only the final, merged ranking, not partials, unless you have a proven benefit stacking partial caches. In practice, most teams over-engineer this; start simple.

Analytics and SLOs

Add QUERY to your golden signals and error budgets. Don’t bury it under “other.” Create dashboards that split method performance and origin offload explicitly. For incident response, ensure runbooks and customer-facing status pages know about QUERY; you’ll get fewer false alarms if your team can see that a QUERY-specific cache rule is misbehaving instead of “the API is slow.”

SDK and agent ergonomics

Agents are noisy clients. Small differences in serialization blow up caches. Ship a thin client that:

Canonicalizes request bodies client-side before send.
Applies conservative defaults: bounded page sizes, deduped filters, stable field ordering.
Retries safely with exponential backoff on network glitches (QUERY is idempotent—use that).

Trade-offs you should not hand-wave

Ecosystem maturity: Tooling, OpenAPI, and off-the-shelf SDKs don’t fully speak QUERY yet. You’ll write glue code and extensions. Budget the time.
Browser preflight tax: If your front end is cross-origin, expect 1 extra RTT on cold calls unless you restructure hosting or tune CORS caching.
Middlebox surprises: Some corporate networks and ancient proxies will still choke on unknown verbs. Keep POST fallbacks for critical flows until your telemetry proves it’s safe to drop.
Cache correctness discipline: Once bodies participate in cache keys, sloppiness hurts. Without canonicalization and guardrails, you’ll turn your CDN into a key explosion machine.

What good looks like at the end

After a disciplined rollout, here’s what you should see on your graphs:

Top search and browse routes moved from POST to QUERY (or dual-exposed), with edge hit rates between 20–40% and origin CPU for read paths down 10–25%.
p95 on list/search down by 30–80 ms for global users; tail spikes reduced during traffic surges.
Cleaner alerting: WAF false positives down; incident pages that can isolate QUERY-specific regressions quickly.
SDK adoption: 80%+ of internal services and agents using the canonical client, with cache-friendly payloads.

Where nearshore teams fit

This is a perfect project for a focused nearshore pod: networking, gateways, edge logic, SDK ergonomics, and observability tweaks. It’s cross-cutting and requires daylight overlap with US teams to touch edge config safely (we plan around 6–8 hours of overlap). The work is not glamorous, but it pays for itself via infra savings and a simpler mental model for everyone building on your API.

Recommendation

If your product has meaningful read traffic on POST-with-body today, pilot QUERY this quarter. Start server-to-server, keep browser POST fallbacks until your CORS story is clean, and insist on canonicalization from day one. Use explicit cache rules, watch your key space, and make the wins visible to finance and product. If your reads are mostly personalized or write-adjacent, you can wait—just stop adding more POST-as-GET endpoints. Design them as QUERY now, even if you temporarily expose POST for compatibility.

Key Takeaways

QUERY gives you GET-like semantics with a request body—perfect for complex reads without URL hacks.
Real gains: 20–40% cache hit rates on the right endpoints, 10–25% origin offload, and 30–80 ms p95 wins.
Hard parts: middleboxes, OpenAPI/SDK gaps, CORS preflight, and cache key discipline.
Adopt in phases: canary the method, dual-route top endpoints, start S2S first, then expand to browsers.
Don’t wait for the ecosystem to be perfect—just keep POST fallbacks and publish canonicalization rules.