2026-05-22 · 10 min read

Fix Python Environments in 2026: A CTO Plan with uv, Lockfiles, and Real Repro

By Diogo Hudson Dias

[Image: Team lead in a São Paulo office configuring a Python project, with dependency installs visible on a laptop and external monitor]

If your team still loses half a day to “why does this import fail on my laptop but not in CI,” that’s not a developer problem. It’s a platform problem. And in 2026, you can finally fix it without building your own package index or yelling at pip flags.

Hacker News is busy debating whether uv’s package management UX is a mess. They’re not wrong about the edges. But the noise hides the signal: Python’s build, packaging, and isolation story matured enough that you can set a standard and end the drift. If you’re running AI-heavy repos (PyTorch, CUDA, PyArrow, on-device inference), the ROI is no longer theoretical—it’s visible in cycle time and incident counts.

The stakes: drift is now a tax on AI velocity

In AI teams, dependency trees are taller, native builds are common, and platform variation exploded (macOS ARM on dev, Linux x86_64 in prod, sometimes Windows in the loop). You pay for that complexity three ways:

Incident cost: A conservative estimate is 1–2 hours per engineer per week lost to environment issues on AI repos. On a 15-person team at $150/hour, that’s $2,250–$4,500/week, or roughly $117K–$234K/year over 52 weeks, burned on friction.
CI latency: Resolving and compiling dependencies adds 5–10 minutes to cold CI jobs. Multiply by 30 runs/day and you’ve added 2.5–5.0 engineer-hours of waiting time daily.
Production surprises: Transitive upgrades, optional extras, and missing wheels are the source of “it passed CI, failed in prod” outages you don’t log as packaging—but you should.
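The arithmetic behind the first two items can be checked with a short sketch. The function names and the 52-week year are assumptions made for illustration; the team size, hours, hourly rate, and run counts are the figures quoted above.

```python
def weekly_and_annual_cost(engineers: int, hours_lost_per_week: float,
                           hourly_rate: float, weeks_per_year: int = 52) -> tuple[float, float]:
    """Return (weekly, annual) cost of engineer time lost to environment issues."""
    weekly = engineers * hours_lost_per_week * hourly_rate
    return weekly, weekly * weeks_per_year


def ci_wait_hours(runs_per_day: int, extra_minutes_per_run: float) -> float:
    """Return the hours of waiting that slow installs add across a day of CI runs."""
    return runs_per_day * extra_minutes_per_run / 60


if __name__ == "__main__":
    for hours in (1, 2):
        weekly, annual = weekly_and_annual_cost(15, hours, 150)
        print(f"{hours} h/week lost: ${weekly:,.0f}/week, ${annual:,.0f}/year")
    for minutes in (5, 10):
        print(f"{minutes} min/run: {ci_wait_hours(30, minutes):.1f} h/day waiting")
```

With the quoted inputs this gives $2,250 and $4,500 per week, $117,000 and $234,000 per year, and 2.5 to 5.0 hours of CI waiting per day.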

All of this is solvable with a boring platform decision and four policies you enforce like you enforce code review.

The decision: pick a tool and write it into policy

You have to pick one of these families and standardize it across new repos within 90 days:

uv (the new hotness; fast resolver and installer; lockfiles; a single tool for run/install/virtualenv; strong caching). This is our recommended default in 2026.
Poetry (mature workflows; good locks; slower installs; solid ecosystem).
PDM (PEP-compliant, pragmatic; good lock story; lighter footprint).
Hatch (excellent for packaging and environments; less opinionated on deps).

pip alone is not a sufficient platform standard anymore. If you insist on staying with pip, you must bolt on lockfiles, hash checking, and reproducible wheels yourself. You’ll end up reinventing features uv and Poetry already ship.

Why uv as the 2026 default

You can argue style, but roadmaps are won by speed, determinism, and developer ergonomics, not by opinions:

Speed: In practice, we see uv cut cold installs from 6–9 minutes to 60–90 seconds on pure-Python services, and shave minutes off heavy stacks by reusing prebuilt wheels.
Determinism: uv locks fully describe the environment, including markers and extras. When combined with hash checking and private wheels, you get predictability across macOS ARM, Linux x86_64, and Windows.
Developer ergonomics: One tool handles venv creation, package add/removal, scripts, and ephemeral CLI tools (uvx). That consolidation alone removes a mess of shell glue and pipx inconsistencies.

Yes, the UX is opinionated. That’s a feature at org scale: the number of ways to be right should be low. The number of ways to be wrong should be zero.

A CTO-grade standard: policies, not preferences

Here’s the minimal policy set that actually eliminates drift. Skip any one and you will regress.

1) Lock everything, everywhere

Project lockfiles are mandatory and must live in the repo. For uv, check in uv.lock alongside pyproject.toml.
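Frozen sync in CI: CI must install from the committed lock (for uv, uv sync --frozen, or uv sync --locked if you also want the job to fail when the lock is out of date with pyproject.toml) so resolution never happens in CI. If it isn’t in the lock, the build fails.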
Upgrades are deliberate: Dependency bumps only via a dedicated “dep upgrade” PR that refreshes the lock and runs a smoke suite.
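As a rough sketch of the first two rules, a CI helper can refuse to build the install command unless both files are committed. The helper name and the idea of feeding it the output of git ls-files are assumptions for illustration, not something uv provides.

```python
from collections.abc import Iterable

REQUIRED_FILES = ("pyproject.toml", "uv.lock")


def frozen_sync_command(tracked_files: Iterable[str]) -> list[str]:
    """Return the CI install command, or raise if the lock is not committed.

    `tracked_files` is assumed to be the output of `git ls-files`,
    one repository-relative path per entry.
    """
    tracked = set(tracked_files)
    missing = [name for name in REQUIRED_FILES if name not in tracked]
    if missing:
        raise FileNotFoundError(f"not committed: {', '.join(missing)}")
    # --frozen installs from uv.lock as-is and never re-resolves.
    # --locked is a stricter alternative that also errors on a stale lock.
    return ["uv", "sync", "--frozen"]
```

In a pipeline, the returned list would typically be passed to subprocess.run with check=True so that a failed sync fails the job.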

2) Separate locks per platform (or target triple)

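Do not pretend one lock fits all. macOS ARM and Linux x86_64 resolve different wheels; Windows adds a third axis. Maintain platform-specific locks (e.g., uv.lock.linux-x86_64, uv.lock.macos-arm64). These filenames are a convention, not uv defaults: uv’s own uv.lock is a single cross-platform file with per-platform markers, so per-target files have to be generated separately, for example with uv pip compile and its --python-platform option.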
Gate merges on cross-platform lock sync if your contributors run multiple OSes. Your CI should regenerate and compare locks for all supported targets and fail on drift.
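The drift gate can be sketched as a plain text comparison between the committed lock and a freshly regenerated one. How the regenerated text is produced is left to your CI; the function name and sample contents are hypothetical.

```python
import difflib


def lock_drift(committed: str, regenerated: str, name: str = "uv.lock") -> list[str]:
    """Return unified-diff lines between two lockfile texts; empty means no drift."""
    return list(
        difflib.unified_diff(
            committed.splitlines(),
            regenerated.splitlines(),
            fromfile=f"{name} (committed)",
            tofile=f"{name} (regenerated)",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Illustrative contents only; real lockfiles are much larger.
    drift = lock_drift("examplepkg==1.0.0\n", "examplepkg==1.1.0\n")
    print("\n".join(drift) if drift else "no drift")
```

Returning the diff rather than a boolean lets the CI job print exactly which entries changed, which tends to make the failure actionable.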

3) Prebuild, cache, and go offline by default

Central wheelhouse: For heavy packages (torch, xgboost, scipy, pyarrow), prebuild or mirror vetted wheels into an internal index with SHA256 integrity. Point uv to the internal index first.
CI caches persist across jobs: Persist the uv cache directory between CI runs. Expect 60–80% install time reductions for common paths.
Offline test stage: Add a “no-network” CI stage that installs from lock + cache only. This catches accidental network dependencies and proves you’re mirror-ready when PyPI sneezes.
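For the SHA256 integrity requirement on the wheelhouse, a minimal sketch using only the standard library might look like this. The function names are illustrative, and the expected digest is assumed to come from whatever manifest your internal index keeps.

```python
import hashlib
import hmac
from collections.abc import Iterable, Iterator
from pathlib import Path


def sha256_hex(chunks: Iterable[bytes]) -> str:
    """Hash a stream of byte chunks and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def read_chunks(path: Path, size: int = 1 << 20) -> Iterator[bytes]:
    """Yield a file in fixed-size chunks so large wheels are never fully in memory."""
    with path.open("rb") as handle:
        while chunk := handle.read(size):
            yield chunk


def wheel_matches(path: Path, expected_sha256: str) -> bool:
    """Compare a wheel's digest against the recorded one."""
    actual = sha256_hex(read_chunks(path))
    return hmac.compare_digest(actual, expected_sha256.lower())
```

A mirror job could call wheel_matches for every artifact before publishing it and refuse to upload anything that does not match.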

4) Enforce scripts and CLIs via uvx

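Ban global tools: No “pip install --user black.” Standardize ephemeral tool runs via uvx (e.g., uvx ruff, uvx pre-commit), which downloads the tool on demand and reuses the cache. Pin versions explicitly (uvx ruff@<version>), because an unpinned run is not tied to a specific release.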
Record tool versions in the repo via scripts in pyproject so everyone runs the same formatters, linters, and codegens.
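One way to keep tool versions in the repo is a small pin table that every script goes through. The table, the version numbers, and the helper name below are hypothetical; the tool@version form is the syntax uvx accepts for requesting a specific release.

```python
# Hypothetical pins; replace with the versions your team has vetted.
TOOL_PINS = {"ruff": "1.2.3", "pre-commit": "1.2.3"}


def uvx_command(tool: str, args: list[str], pins: dict[str, str] = TOOL_PINS) -> list[str]:
    """Build a uvx invocation that always names an explicit tool version."""
    if tool not in pins:
        raise KeyError(f"{tool} has no pinned version; add it to TOOL_PINS")
    return ["uvx", f"{tool}@{pins[tool]}", *args]


if __name__ == "__main__":
    print(" ".join(uvx_command("ruff", ["check", "."])))
```

Refusing to run unpinned tools is a deliberate choice here: it turns "someone forgot to pin" into an immediate error instead of silent drift.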

Migration playbook: four weeks, one repo at a time

You don’t need a big-bang cutover. Start with a representative Python repo—ideally one with native deps and some CI weight—and complete this in four weeks.

Week 1: Inventory and baseline

Classify projects by type: pure Python service, AI/training stack, CLI/tooling, library package. Identify OS targets for each.
Capture baseline metrics: cold install time, CI duration, and count of environment-related incidents in the last 90 days.
Decide on lock strategy: per-platform lockfiles and internal index scope (which packages get mirrored/prebuilt).
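Baseline timings are easier to compare later if they are captured the same way every time. The following is a minimal sketch; the helper names are assumptions, and the action you time would be your own cold-install step.

```python
import time
from collections.abc import Callable


def time_call(action: Callable[[], object]) -> float:
    """Run a zero-argument callable and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    action()
    return time.perf_counter() - start


def percent_reduction(before_s: float, after_s: float) -> float:
    """Percentage by which a duration shrank relative to the baseline."""
    return (before_s - after_s) / before_s * 100


if __name__ == "__main__":
    # Stand-in workload; wrap your real install command in a callable instead.
    elapsed = time_call(lambda: sum(range(1_000_000)))
    print(f"elapsed: {elapsed:.3f}s")
```

For a real measurement, the callable would usually wrap a subprocess.run call on a machine with an empty cache, so that "cold" means the same thing before and after the migration.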

Week 2: Introduce uv and lockfiles in a pilot repo

Add a pyproject.toml if missing; encode top-level dependencies and scripts.
Generate uv locks for each target platform. Commit them. Document developer commands: create env, sync from lock, run scripts.
Update CI: use uv’s frozen sync; persist the cache; add a no-network stage that installs from lock + cache only.

Week 3: Wheelhouse and private index

Stand up or expand your internal Python index (Artifactory, Nexus, devpi, or a simple S3/static index with signed checksums).
Prebuild or mirror heavyweight wheels with hashes. Configure uv to prefer the internal index.
Add a job to refresh the mirror weekly and on demand during dependency PRs. Keep provenance: source URL, SHA256, build flags.
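The provenance fields named above (source URL, SHA256, build flags) fit a small record type. This is a sketch under assumptions: the class name, the extra filename field, and the JSON layout are illustrative, not a format any index requires.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class WheelProvenance:
    filename: str
    source_url: str
    sha256: str
    build_flags: tuple[str, ...] = ()


def to_manifest(records: list[WheelProvenance]) -> str:
    """Serialize records as stable, diff-friendly JSON sorted by filename."""
    ordered = sorted(records, key=lambda record: record.filename)
    return json.dumps([asdict(record) for record in ordered], indent=2, sort_keys=True)
```

Sorting the entries and keys keeps the manifest stable between refresh runs, so a change in the file reflects a real change in the mirror.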

Week 4: Policy gates and scaling

Add checks: block pip install in CI, enforce frozen sync, and detect lock drift.
Roll the pattern to two more repos. Share a template repo with uv, lockfiles, cache config, and scripts wired in.
Publish a 2-page internal standard: commands to onboard, how to upgrade deps, how to build offline, and who approves mirrors.
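The first two policy gates can be approximated with a text scan over CI configuration files. This is a deliberately simple sketch: the function name is hypothetical, it treats uv pip install as a violation too, and a regex scan will miss commands hidden inside scripts the CI file merely calls.

```python
import re

PIP_INSTALL = re.compile(r"\bpip3?\s+install\b")
UV_SYNC = re.compile(r"\buv\s+sync\b")


def policy_violations(ci_text: str) -> list[str]:
    """Return one message per line that breaks the install policy."""
    problems = []
    for number, line in enumerate(ci_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # ignore comment lines
        if PIP_INSTALL.search(stripped):
            problems.append(f"line {number}: pip install is not allowed in CI")
        if UV_SYNC.search(stripped) and not ("--frozen" in stripped or "--locked" in stripped):
            problems.append(f"line {number}: uv sync must use --frozen or --locked")
    return problems


if __name__ == "__main__":
    sample = "steps:\n  - run: pip install requests\n  - run: uv sync\n  - run: uv sync --frozen\n"
    for problem in policy_violations(sample):
        print(problem)
```

Run against the sample, the scan flags lines 2 and 3 and accepts line 4.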

AI-specific wrinkles (and how to avoid the ditch)

CUDA and optional GPU extras

Split CPU/GPU locks. Maintain cpu-only and cuda-enabled locks. Most developers don’t need CUDA locally; they need determinism.
Encode extras like mypkg[cuda] in pyproject and lock the variant explicitly. Document which CI jobs exercise each.
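Documenting which CI jobs exercise which variant can be as simple as a mapping that also generates the install command. The job names and extra names here are hypothetical, and the sketch assumes the variants are modeled as extras in pyproject.toml and selected with uv sync's --extra option.

```python
# Hypothetical CI job names and extras; align them with your pyproject.toml.
JOB_EXTRAS = {
    "unit-tests": ["cpu"],
    "gpu-integration": ["cuda"],
}


def sync_command_for(job: str, job_extras: dict[str, list[str]] = JOB_EXTRAS) -> list[str]:
    """Build the frozen install command for the variant a given job exercises."""
    cmd = ["uv", "sync", "--frozen"]
    for extra in job_extras[job]:
        cmd += ["--extra", extra]
    return cmd
```

Keeping the mapping in one place means the answer to "which job covers the CUDA variant" lives in code rather than in someone's memory.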

Apple Silicon reality

Prefer projects with native ARM wheels. Where a dependency you need (ROS packages, or a specific torch or scipy build) lacks ARM wheels, push devs to Linux containers or remote dev servers rather than compiling on macOS.
Make WSL2 or remote Linux dev the default for Windows users working on native-heavy repos. It cuts setup time from hours to minutes.

Binary provenance and security

Hash pin every wheel in the lock or constraints. Fail CI on hash mismatch.
Scan dependencies with osv-scanner or pip-audit as part of the dependency PR. Track exceptions by package and severity.
Sign your in-house wheels and store signatures next to artifacts. Verifying provenance beats spelunking a compromised worker later.

Monorepos, workspaces, and Python version drift

Monorepos are where otherwise-good tooling goes to die. Keep it strict:

One Python minor per workspace unless you absolutely must support more. If you need 3.10 and 3.12, split the workspace or enforce toolchain separation via direnv/asdf and distinct uv caches.
Separate locks per package within the monorepo, but centralize mirrors and CI cache configuration.
Ban implicit editable installs across packages; wire local package dependencies via explicit workspace references in pyproject.toml (for uv, entries under tool.uv.sources) and lock them like external deps.
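The one-minor-per-workspace rule is easy to check mechanically. The sketch below assumes each package pins its interpreter in a .python-version file and that you have already read those files into a mapping; the function names are illustrative.

```python
def python_minor(version: str) -> str:
    """Reduce '3.12.4' or '3.12' to its minor series, '3.12'."""
    parts = version.strip().split(".")
    if len(parts) < 2 or not all(part.isdigit() for part in parts[:2]):
        raise ValueError(f"unrecognized Python version: {version!r}")
    return ".".join(parts[:2])


def minors_in_workspace(pins: dict[str, str]) -> dict[str, list[str]]:
    """Group package names by the Python minor series they pin."""
    grouped: dict[str, list[str]] = {}
    for package, version in sorted(pins.items()):
        grouped.setdefault(python_minor(version), []).append(package)
    return grouped


if __name__ == "__main__":
    # Hypothetical package names and pins.
    grouped = minors_in_workspace({"api": "3.12.4", "worker": "3.12", "legacy": "3.10.14"})
    if len(grouped) > 1:
        print(f"multiple Python minors in one workspace: {grouped}")
```

More than one key in the result is the signal to split the workspace or set up the toolchain separation described above.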

What “good” looks like in practice

Here’s a production-grade baseline you can copy:

Every Python repo has pyproject.toml, its per-platform lockfiles (uv.lock.<platform>), and a Makefile or scripts section documenting uv run targets.
CI installs with frozen sync, without network in one stage, using a persisted cache directory.
Internal index serves prebuilt wheels for the top 20 heavy packages you rely on, all with SHA256 recorded. Dependency upgrades trigger a mirror refresh job.
Developers don’t install global tools. All CLIs run via uvx and are pinned in scripts. Pre-commit uses uvx too.
Dependency upgrades land via a weekly “dep bump” PR with a smoke test suite and SRE sign-off for any new mirrors.

Costs, savings, and the only metric that matters

After rollout, measure just two things for 60 days:

Mean time to first successful local run on a new machine. Target under 15 minutes for pure-Python repos, under 45 minutes for AI stacks that require large wheel downloads (not compiles).
Environment-related incidents per repo per month. Target near-zero. You’ll never be perfect, but you should be below one per repo per quarter.

The savings story is straightforward. If a 15-person team saves 1 hour/week each, that’s 15 hours/week, or $2,250/week at $150/hour, which comes to $117K/year. You’ll also shave minutes off CI for every PR. That time compounds into higher throughput, fewer “just rerun it” retries, and unblocked releases.

Trade-offs and where uv still hurts

No tool is magic. Here are the realities you have to accept:

UX churn: uv is moving fast. Lockfile formats and subcommands evolve. Pin a version in CI and upgrade quarterly, not daily.
Windows edge cases: Native builds still bite on Windows. Default to WSL2 for native-heavy projects to avoid toolchain hell, or provide a blessed devcontainer.
Cache bloat: uv’s speed comes from caching. Budget 20–50 GB per CI runner for Python caches if you do a lot of AI work. Prune aggressively.
Mixed-tool estates: You’ll live with Poetry/PDM in older repos for a while. That’s fine. The policy is “new repos on uv, old repos upgraded opportunistically.” Enforce the same lock and offline rules regardless of tool.
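For the cache-bloat item, a runner maintenance job could measure the cache directory and prune only when it exceeds the budget. This is a sketch: the helper names are hypothetical, the budget is treated as GiB for simplicity, and uv cache prune is the uv subcommand that removes unused cache entries.

```python
from pathlib import Path

GIB = 1024 ** 3
PRUNE_COMMAND = ["uv", "cache", "prune"]


def directory_size(root: Path) -> int:
    """Total size in bytes of all regular files under root."""
    return sum(path.stat().st_size for path in root.rglob("*") if path.is_file())


def over_budget(size_bytes: int, budget_gib: float) -> bool:
    """True when the cache has outgrown its per-runner budget."""
    return size_bytes > budget_gib * GIB
```

A scheduled job would call directory_size on the runner's cache directory and run PRUNE_COMMAND via subprocess when over_budget returns True.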

What about staying on pip with constraints?

You can, but you’ll rebuild a lot of plumbing:

Constraints plus pip-compile give you a partial lock. You’ll still deal with platform-specific wheel selection and subtle resolver differences across machines.
You’ll need to standardize virtualenvs, scripts, tool pins, and caches separately.
You won’t get uv’s install speed without replicating its caching and parallelism. That’s a lot of shell you will now maintain.

Given the cost of drift, it’s cheaper to adopt a single tool than to be in the packaging business.

Nearshore note: cross-platform discipline beats heroics

If you work with nearshore teams (Brazil, with 6–8 hours of overlap with US time zones), cross-platform consistency is part of the contract. The more deterministic your Python platform gets, the less you pay for “works on my machine” across macOS devs in São Paulo and Linux prod in Oregon. Our experience is that platform discipline saves at least one integration day per sprint when distributed teams converge on the same lock + cache norms.

Your next move

Don’t start with a committee. Pick the repo with the most packaging pain, adopt uv, enforce frozen sync, ship platform-specific locks, stand up a small internal index with your top 20 heavy wheels, and add an offline CI stage. Publish a two-page standard and require new repos to comply. In 90 days, your Python environment drama will look quaint.

Key Takeaways

Python is finally fixable at org scale. Standardize on uv (or Poetry/PDM) and enforce lockfiles.
Maintain platform-specific locks. One lock does not fit macOS ARM, Linux, and Windows.
Prebuild and mirror heavy wheels; run an offline CI stage to prove reproducibility.
Use uvx for pinned CLI tools. Ban global pip installs.
Target sub-15-minute first runs for pure-Python repos; sub-45 minutes for AI stacks.
Expect $100K+ annual savings for a 15-person team from reduced drift alone.
Accept trade-offs: cache bloat, occasional Windows pain, and version pinning for uv itself.