AI Coding Dependence Is Now an Estimation Risk
Developers are leaning hard on AI coding tools, but AI-generated code still needs review. Here is how sprint planning and planning poker should respond.

The latest AI coding news is no longer just about whether agents can write code. It is about how deeply teams now depend on them.
On May 29, 2026, TechCrunch covered a striking developer behavior shift: many programmers are becoming reluctant to work without AI coding tools, even for research tasks designed to measure productivity. That is not a small tooling preference. It is a planning signal.
When a team cannot imagine implementing a story without Copilot, Codex, Devin, Cursor, Claude Code, or another agent, the estimate is no longer measuring only human implementation effort. It is also measuring prompt quality, agent supervision, review effort, test coverage, and the cost of fixing whatever the agent gets almost right.
That is why AI coding dependence belongs in sprint planning.
AI speed is not the same as delivery speed
The productivity story is getting more complicated.
METR reported in February 2026 that its original early-2025 study found experienced open-source developers took longer on tasks when AI tools were allowed, even though the developers expected speedups. METR then tried to update the experiment with newer tools, but wider AI adoption made the study design harder: some developers did not want to participate if they might have to work without AI, and many were selective about which tasks they were willing to submit.
That does not prove AI is bad for development. METR explicitly says early-2026 tools may be speeding some developers up. But it does show something agile teams should care about: once AI becomes part of the workflow, it is hard to measure the real work without also measuring the team's dependency on the tool.
That matters because a story can feel smaller when an agent writes the first draft, while the real delivery risk moves later into review, integration, maintenance, and production behavior.
The review bottleneck is becoming visible
Sonar's 2026 State of Code Developer Survey gives the planning conversation another useful anchor. Sonar says AI already accounts for a large share of committed code, but most developers do not fully trust AI-generated code. The same survey points to a verification bottleneck: reviewing AI code can require more effort than reviewing code written by a teammate.
That is exactly the kind of hidden work that planning poker is designed to reveal.
If one developer votes 3 because the implementation looks easy with an agent, and another votes 8 because the generated code will touch auth, data migration, observability, or customer-facing failure states, the disagreement is the valuable part. The gap tells the team that the work is not just "build the feature." It is "build the feature and prove the agent did not introduce a future incident."
AI-generated code can become technical debt
Recent research on AI-authored commits makes the issue sharper. A 2026 arXiv paper studying hundreds of thousands of verified AI-authored commits found that AI-generated changes can introduce issues that persist in repositories over time. The authors argue for stronger quality assurance in AI-assisted development.
For sprint planning, the lesson is practical:
- A low-effort AI draft can still create high-effort maintenance.
- Generated tests may cover the happy path but miss business-specific risk.
- Code review needs to include architecture, security, and product behavior, not just style.
- The estimate should include cleanup time when an agent touches unfamiliar parts of the codebase.
- A story is not done because the agent produced a pull request.
This is not a reason to avoid AI coding tools. It is a reason to estimate them honestly.
How planning poker should change
Planning poker works best when every voter thinks independently before seeing the group's votes. That is even more important in an AI-assisted team.
Before voting, the team should ask:
- What parts of this story are safe for an agent to draft?
- What parts need senior human design before the agent starts?
- Which files, services, permissions, or data flows are high risk?
- What would make the AI-generated solution look correct while still being wrong?
- What quality gates must pass before this story counts as done?
Then vote on the whole delivery path, not just the code generation step.
If the vote spread is wide, do not average the numbers too quickly. Ask the low voter what the agent can automate. Ask the high voter what review or maintenance risk they see. The final estimate should include both.
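One way to make the "do not average too quickly" rule concrete is a small helper that flags a wide spread and names who should speak first. This is a minimal sketch, not a standard planning poker algorithm: the function name, the report fields, and the 2x ratio threshold are all assumptions chosen for illustration, and a team would tune them to its own point scale.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical threshold: treat a spread as "wide" when the highest vote
# is at least twice the lowest. This value is an assumption, not a rule.
WIDE_SPREAD_RATIO = 2.0


@dataclass
class SpreadReport:
    low: int
    high: int
    low_voters: list[str]   # ask these voters what the agent can automate
    high_voters: list[str]  # ask these voters what review or maintenance risk they see
    needs_discussion: bool


def analyze_votes(votes: dict[str, int], ratio: float = WIDE_SPREAD_RATIO) -> SpreadReport:
    """Summarize a revealed round of votes without averaging them."""
    if not votes:
        raise ValueError("at least one vote is required")
    low = min(votes.values())
    high = max(votes.values())
    low_voters = sorted(name for name, v in votes.items() if v == low)
    high_voters = sorted(name for name, v in votes.items() if v == high)
    # A zero vote next to any non-zero vote always counts as a wide spread.
    needs_discussion = high != low and (low == 0 or high / low >= ratio)
    return SpreadReport(low, high, low_voters, high_voters, needs_discussion)


report = analyze_votes({"Ana": 3, "Ben": 8, "Chi": 5})
if report.needs_discussion:
    print("Low voters, what can the agent automate?", report.low_voters)
    print("High voters, what risk do you see?", report.high_voters)
```

The helper deliberately returns no suggested estimate. The number still comes out of the conversation between the low and high voters.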
Add an AI review line to the story
Teams using Jira or a similar tracker should start writing estimate rationale in a way that future humans and future AI systems can understand.
Useful notes sound like this:
- "Three points for implementation, plus extra review because the agent will touch billing logic."
- "Five points because the agent can draft the UI, but accessibility and analytics need manual validation."
- "Eight points because the story crosses auth, data retention, and admin settings."
- "Split this: migration plan and generated implementation are different risks."
- "Agent allowed for tests and refactor; human design required before database changes."
Those notes turn estimation into a lightweight governance layer. They also make future planning easier because the team can see why the original number was chosen.
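If a team wants those notes to stay consistent from story to story, the rationale can be captured as a small structured record and rendered as a plain-text comment. The sketch below is illustrative only: the class name, field names, and comment layout are assumptions, and it does not call Jira or any other tracker's API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class EstimateRationale:
    """Hypothetical record of why a story received its estimate."""
    points: int
    reason: str
    agent_allowed_for: list[str] = field(default_factory=list)
    human_required_for: list[str] = field(default_factory=list)

    def to_comment(self) -> str:
        """Render the rationale as plain text to paste into a tracker comment."""
        lines = [f"Estimate: {self.points} points. {self.reason}"]
        if self.agent_allowed_for:
            lines.append("Agent allowed for: " + ", ".join(self.agent_allowed_for))
        if self.human_required_for:
            lines.append("Human required for: " + ", ".join(self.human_required_for))
        return "\n".join(lines)


note = EstimateRationale(
    points=8,
    reason="Story crosses auth, data retention, and admin settings.",
    agent_allowed_for=["tests", "refactor"],
    human_required_for=["database changes"],
)
print(note.to_comment())
```

Keeping the agent and human boundaries as separate fields makes it easier to compare later which stories stayed inside the boundary the team set.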
A better definition of ready for AI-assisted work
AI-assisted teams should tighten the definition of ready before a story enters sprint planning.
A ready story should include:
- The user outcome.
- Acceptance criteria.
- Known failure states.
- Data and permission boundaries.
- The intended AI role.
- Required human review areas.
- Test and rollout expectations.
If the team cannot state those pieces, the story may still be useful for backlog refinement, but it is not ready for sprint commitment.
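The checklist above can be turned into a simple gate that reports which pieces are still missing before sprint planning. This is a minimal sketch under stated assumptions: the field keys are made-up names that mirror the list, and a story is represented as a plain dictionary of strings rather than a record from any real tracker.

```python
from __future__ import annotations

# Hypothetical field keys mirroring the definition-of-ready list.
READY_FIELDS = (
    "user_outcome",
    "acceptance_criteria",
    "known_failure_states",
    "data_and_permission_boundaries",
    "intended_ai_role",
    "required_human_review_areas",
    "test_and_rollout_expectations",
)


def missing_ready_fields(story: dict[str, str]) -> list[str]:
    """Return the checklist fields that are absent or blank, in checklist order."""
    return [name for name in READY_FIELDS if not story.get(name, "").strip()]


def is_ready(story: dict[str, str]) -> bool:
    """A story is ready for sprint commitment only when nothing is missing."""
    return not missing_ready_fields(story)


draft = {
    "user_outcome": "Admins can export billing history as CSV.",
    "acceptance_criteria": "Export matches the invoices shown in the UI.",
    "intended_ai_role": "   ",  # whitespace only counts as missing
}
print(is_ready(draft))
print(missing_ready_fields(draft))
```

A story that fails the check is not rejected. It goes back to backlog refinement with a concrete list of what to fill in.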
The takeaway for agile teams
AI coding dependence is now part of software delivery reality. Developers will keep using agents because the tools are useful, fast, and often genuinely pleasant to work with.
But planning cannot pretend the agent is magic free capacity.
The strongest teams will estimate the full system around AI coding: prompt preparation, generated implementation, review, tests, security checks, production validation, and long-term maintainability.
Planning poker gives teams a simple way to keep that judgement human. Vote privately. Reveal the spread. Discuss the risks. Let AI support the work after the team has decided what good looks like.
In 2026, the planning question is not "can AI write the code?" The better question is "what human judgement is still required before this code should ship?"
Sources
- Coders are refusing to work without AI - and that could come back to bite them, TechCrunch
- We are Changing our Developer Productivity Experiment Design, METR
- Debt Behind the AI Boom: A Large-Scale Empirical Study of AI-Generated Code in the Wild, arXiv
- State of Code Developer Survey report: The current reality of AI coding, Sonar