Guide

The AI Adoption Playbook for Engineering Teams

Five chapters drawn from the articles I keep writing. Ungated. No sign-up wall. One optional form at the end if you want the toolkit.

Chapter 1

Figuring Out If You're Ready

Every headline says AI is changing engineering. Nobody is talking to the team manager wondering if any of this applies to them yet.

Readiness isn't yes or no. It's a spectrum with four levels, and your position on it tells you what to do next.

Level 1 — Not Ready Yet. Your workflows aren't defined. Things happen differently every sprint depending on who's on-call. You can't automate what you can't describe. Map your processes first; it will improve your team before any technology enters the picture.

Level 2 — Ready for AI-Powered Tools. You have defined workflows. Your code review process is stable. Your test suite runs. This is where most mid-market teams sit — and where ChatGPT, Claude, and Copilot deliver real value without custom work. Turn on features in tools you already have. Budget: basically nothing.

Level 3 — Ready for Custom AI. You have clean workflows, good telemetry, and clear ROI targets. You've used off-the-shelf AI and seen the gaps. Now you're ready to build convention files, custom agents, or internal tools. Budget: $10–50K up front, plus ongoing costs.

Level 4 — Ready for AI Strategy. AI is integrated across your SDLC. You need someone to architect the whole thing — convention file discipline, model routing, observability, cost controls — not build individual pieces.

Before you invest anything, ask five questions: Can you describe your core workflows step by step? Where is your team spending the most time on repetitive work? Do you have metrics, or are you running on gut feel? What's your realistic budget? And critically — who would own the AI rollout? Every initiative without an owner dies the same death.

The worst move is spending $50K on AI tooling while your core processes run on tribal knowledge. The best move is figuring out where you actually are, then taking the next step from there.

Read the full breakdown →

Chapter 2

Why Your AI Tools Produce Bad Output

Most developers blame the model when AI coding tools produce bad output. That's wrong. The real problem is context.

Context engineering is the practice of deliberately shaping what an AI agent knows before it starts working. That means writing project context files like CLAUDE.md, AGENTS.md, and .cursorrules. It means scoping conversations to single tasks instead of letting them sprawl. It means encoding repeatable workflows as skills and delegating subtasks to subagents with their own fresh context windows.

The gap between "AI tools are a novelty" and "AI tools have compounding productivity" is not a better model. It's a convention file committed to your repo, a CLAUDE.md that captures actual patterns, and a team habit of updating it when the answer changes.

The pattern that works: treat context like production code. It's versioned, reviewed, and maintained. New engineers read it to understand how the team works. When a convention file is stale, the AI tool produces stale output. Keep it fresh the same way you keep your tests fresh.

Your CLAUDE.md should answer three questions: what is this codebase, what are the conventions, and what are the common mistakes to avoid. Your AGENTS.md (or equivalent) should be agent-neutral and cover the same ground for Cursor, Codex, Cline, or whatever your team uses alongside Claude Code.
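To make those three questions concrete, here is a minimal sketch of the shape such a file can take. Every detail in it (the stack, the paths, the rules) is an invented placeholder; the value comes from answering the questions for your own codebase, not from copying the example.

```markdown
# CLAUDE.md  <!-- illustrative sketch; every detail below is a placeholder -->

## What this codebase is
Order-processing API. TypeScript, Express, Postgres. Deployed from main via GitHub Actions.

## Conventions
- New endpoints go through src/routes/ and use the shared validation middleware.
- Tests live next to the code they cover (*.test.ts); run `npm test` before pushing.
- No raw SQL in route handlers; go through the repository layer in src/db/.

## Common mistakes to avoid
- Adding environment variables without updating .env.example.
- Bypassing the error-wrapping helper; unhandled rejections page the on-call.
```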

Most teams skip this step because it feels like overhead. Six months later they wonder why their AI pilot plateaued. The teams that invest the day to write decent context files are still compounding output a year in.

Read the full breakdown →

Chapter 3

Which AI Tools Actually Matter

Four major coding models launched in six days earlier this year. The benchmark gap between the top proprietary model and the top open-source model is under three points. The price gap is 45x.

That's the shape of the model market in 2026. The "pick the best model" question has shifted from "which one scores highest" to "which one gives you the best cost/capability tradeoff for a given task." Classification and formatting work fine on the cheapest tier. Heavy reasoning still benefits from frontier models. Most production traffic is simpler than teams assume and runs fine on cheaper models.

That's where LLM gateway architecture comes in. The moment you're calling three or more LLM providers from multiple services with no routing layer, you're losing visibility into cost, failure modes, and performance. A gateway gives you one place to route, fail over, track spend per team or per feature, and enforce guardrails.

Teams that implement cost-based routing typically report 30–50% reduction in LLM spend. The math is straightforward: route a classification task from Opus to Sonnet and save 40% on tokens. Route it to Haiku and save 80%. Most teams are sending everything to their most expensive model by default and calling it a hard cost.
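A minimal sketch of what task-tiered routing looks like in practice. The model names, per-token prices, and traffic numbers below are invented for illustration; a real gateway layers failover, per-team spend tracking, and guardrails on top of the same routing table.

```python
# Illustrative cost-based routing table. Prices and model names are placeholders;
# swap in whatever your providers actually charge.

PRICE_PER_MTOK = {"small": 0.25, "medium": 3.00, "frontier": 15.00}  # $ per million input tokens

TASK_TIER = {
    "classification": "small",   # labeling, extraction, formatting
    "summarization": "medium",   # routine generation
    "code_review": "frontier",   # multi-step reasoning
}

def pick_model(task_type: str) -> str:
    """Route each task to the cheapest tier that can handle it; default to frontier."""
    return TASK_TIER.get(task_type, "frontier")

def monthly_cost(traffic: dict[str, int]) -> float:
    """Estimate monthly spend from tokens-per-month per task type."""
    return sum(
        tokens / 1_000_000 * PRICE_PER_MTOK[pick_model(task)]
        for task, tokens in traffic.items()
    )

# Example: most traffic is simple, and routing it down dominates the savings.
traffic = {"classification": 400_000_000, "summarization": 80_000_000, "code_review": 20_000_000}
print(f"${monthly_cost(traffic):,.0f}/month")  # $640/month vs. roughly $7,500/month all-frontier
```

Under those invented numbers, sending the same traffic entirely to the frontier tier would cost more than ten times as much, which is the gap cost-based routing exploits.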

You don't need a gateway on day one. The clearest triggers: three or more services making LLM calls, monthly LLM spend above $3K, or a need for provider redundancy when a single vendor has an outage. Below those thresholds, single-provider routing with tiered models handles most use cases.

Tool selection isn't about picking the "best" model. It's about matching the model to the task and having the infrastructure to change your mind later without rewriting everything.

Read: The AI Coding Model Wars →
Read: LLM Gateway Architecture →

Chapter 4

The 5 Ways AI Pilots Die

Anthropic surveyed 132 of their own engineers about Claude Code. Self-reported productivity gains between 20% and 50%. Then someone checked the organizational dashboard. The delivery metrics hadn't moved.

That gap — between individual productivity and organizational delivery — is where most AI pilots die. Here are the five failure modes I keep seeing:

1. No convention files. Output quality varies wildly between developers because each engineer is building their own mental model of how to prompt the tool. Without a committed context file, there's no institutional memory.

2. No measurement plan. Teams adopt AI, feel faster, and never instrument anything. Six months in they can't tell leadership whether ROI is real. The only number anyone has is "68% of engineers say they're more productive" — which is exactly the number Anthropic had, and their delivery metrics didn't move.

3. Individual heroes, no team practice. One engineer ships 3x because they figured out the workflow. The rest of the team is stuck. Without a deliberate effort to codify what's working, the 3x stays trapped in one person's muscle memory.

4. Code review becomes the bottleneck. AI generates code faster than your review process can handle. The PR queue grows. Reviewers start rubber-stamping. Quality drops in ways that don't show up until two sprints later when a subtle bug ships to prod.

5. Senior skepticism that never gets resolved. The senior engineers — the ones whose judgment you most need if you're going to trust AI tools — are also the ones most likely to notice the rough edges. If nobody takes that skepticism seriously and responds with better convention files and sharper workflows, the pilot quietly dies at the top of the org chart.

The fix for all five is the same: treat AI adoption as a team practice, not a collection of individual productivity boosters. Convention files, measurement, code review workflow changes, and deliberate skill transfer from the fastest adopters to the rest of the team.

Read the full breakdown →

Chapter 5

The Math That Makes the Decision

Every AI adoption conversation eventually comes down to one question: is this worth it?

The math is simpler than the frameworks make it look. Three tiers cover almost everything you'd actually build:

Tier 1 — $0 to $500/mo. SaaS AI features you already pay for (GitHub Copilot seats, Claude Code subscriptions) plus off-the-shelf tools. Payback is measured in weeks. The risk is almost zero. If you can't justify spending $20/developer/month on a tool that saves each developer an hour per week, something else is wrong.

Tier 2 — $2K to $15K one-time plus $50–500/mo to run. Custom agents, convention file builds, internal tooling on top of existing models. Payback is measured in months. This is the tier where context engineering becomes the dominant variable — the same $10K build produces radically different outcomes depending on whether your team has good convention file habits.

Tier 3 — $15K+. Integrated systems. Multi-agent workflows. Custom infrastructure. LLM gateways, observability, guardrails. Payback is measured in quarters. Only pursue this tier if you already have measurable wins in Tier 2 and a specific bottleneck Tier 2 can't solve.

The ROI math works like this: manual cost per month minus automated cost per month, divided into the one-time build cost. That gives you the payback period in months. If the number is under six and your process is stable, do it. If the number is over twelve or the process keeps changing, don't.
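A worked example with invented numbers, just to show where those thresholds land:

```python
# Payback period = one-time build cost / monthly savings. All numbers are invented.
build_cost = 12_000          # one-time build
manual_monthly = 3_500       # what the manual process costs per month
automated_monthly = 500      # what the automated version costs to run per month

payback_months = build_cost / (manual_monthly - automated_monthly)
print(f"{payback_months:.1f} months")  # 4.0 -> under six, so it clears the bar
```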

The single best case study I can point to: a manual vehicle audit process I helped replace at Ford Motor Company. Auditors were driving hundreds of miles to physically count cars at dealerships, discovering discrepancies weeks after the financial impact. The automation replaced that with an IoT telemetry platform tracking 450,000+ vehicles in real time. Annual savings: roughly $5 million in operational cost and interest rate carry. The build cost was a fraction of a single quarter of the manual run rate.

The worst move is spending real money on AI tooling while your core processes run on memory and sticky notes. The best move: pick the smallest problem with the clearest ROI math, ship it, measure it, and let the numbers tell you what comes next.

Read the full breakdown →

Get the Toolkit

If you want to run this framework with your team, I put together supporting files:

  • AI Readiness Diagnostic Worksheet (Excel / Google Sheets) — 7-dimension scoring rubric with level-based next steps. Run it in a team meeting.
  • Convention Files Starter Pack — CLAUDE.md and AGENTS.md templates from real production usage, ready to drop into your repo.

Free, no upsell. Drop your email for instant access.

Instant access. No nurture sequence. Unsubscribe anytime.