6/1/2026 · 7 min read · PlannerPoker Team

GitHub Copilot AI Credits Start Today: Estimate AI Cost Before the Sprint

GitHub Copilot moved to usage-based AI Credits on June 1, 2026. Here is how agile teams should plan story points, agent usage, budgets, and review work before sprint commitment.

Image: a neural network diagram representing AI credit usage in sprint planning, by Loxaxs, released under CC0 1.0 via Wikimedia Commons.

GitHub Copilot's usage-based billing switch starts today, June 1, 2026. For engineering teams, that is more than a pricing update. It is a planning signal.

GitHub says Copilot plans are moving from premium request units to GitHub AI Credits. Credits are consumed based on token usage, including input, output, and cached tokens. Code completions and Next Edit suggestions remain included in paid plans, but higher-effort features such as Copilot Chat, Copilot CLI, Copilot cloud agent, Copilot Spaces, Spark, and third-party coding agents consume credits. Copilot code review can also consume GitHub Actions minutes.
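
To make the token-based model concrete, here is a minimal Python sketch of how a team might compare the rough credit cost of two sessions. The `TokenRates` class, the `estimate_credits` function, and every rate below are hypothetical placeholders chosen for easy arithmetic; they are not GitHub's published prices, and real rates are likely to vary by model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRates:
    """Hypothetical credits charged per 1,000 tokens (not GitHub's real rates)."""
    input_per_1k: float
    output_per_1k: float
    cached_per_1k: float


def estimate_credits(input_tokens: int, output_tokens: int,
                     cached_tokens: int, rates: TokenRates) -> float:
    """Sum the credit cost of input, output, and cached tokens for one session."""
    return (
        input_tokens / 1000 * rates.input_per_1k
        + output_tokens / 1000 * rates.output_per_1k
        + cached_tokens / 1000 * rates.cached_per_1k
    )


# Placeholder rates, assumed only for illustration.
EXAMPLE_RATES = TokenRates(input_per_1k=1.0, output_per_1k=4.0, cached_per_1k=0.25)

short_chat = estimate_credits(2_000, 500, 0, EXAMPLE_RATES)
repo_refactor = estimate_credits(400_000, 60_000, 200_000, EXAMPLE_RATES)
print(short_chat, repo_refactor)
```

Even with made-up rates, the comparison shows why a short chat and a repo-wide agent session should not be planned as the same kind of work.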

That changes how teams should talk about AI-assisted work in sprint planning.

When agentic coding was treated like a flat subscription, many teams planned as if the agent were always-on free capacity. Usage-based billing makes the hidden cost visible. A quick question, a small autocomplete, a long repo-wide refactor, and an agentic code review are not the same kind of work, and they should not be estimated as if they were.

AI cost is now part of story scope

Story points should still measure relative effort, complexity, risk, and uncertainty. They should not become a direct dollar conversion.

But if a story needs heavy agent usage, that matters. AI cost is one more sign that the story may be broader than it looks.

A small UI fix might use code completions and a short chat. A larger agent-assisted task might include:

  • Multiple repo-wide context reads.
  • Long Copilot Chat or CLI sessions.
  • Cloud agent work across branches or pull requests.
  • AI-assisted code review.
  • Repeated retries after failed tests.
  • Frontier-model usage for complex reasoning.
  • Extra validation because generated code touches risky paths.

Those activities consume budget, review time, and attention. The team should surface them before sprint commitment, not after the invoice or the incident.

The planning question is not just "can AI do it?"

The better question is: what kind of AI work does this story require?

That question belongs in backlog refinement and planning poker.

Before the team votes, ask:

  • Is this a simple coding-assistant story or an agentic workflow story?
  • Will the agent need broad repository context?
  • Will it touch billing, auth, permissions, data retention, or infrastructure?
  • Will Copilot code review be part of the definition of done?
  • How many retries or review loops are likely?
  • Is the team relying on a shared enterprise credit pool?
  • Are budget controls likely to block a developer mid-story?

If nobody knows the answers, the story is not ready for a confident estimate.

Usage-based billing makes spread more useful

Planning poker is valuable because it reveals disagreement before work starts. Copilot AI Credits give teams a new reason to pay attention to the spread.

Imagine a story where one developer votes 3 and another votes 13.

The low voter may be assuming:

  • The agent can draft most of the implementation.
  • The code path is isolated.
  • Existing tests are enough.
  • The work will use mostly completions and small chats.

The high voter may be assuming:

  • The agent will need a long multi-file session.
  • The generated code will require careful review.
  • The story will consume meaningful shared AI credits.
  • Copilot code review and Actions minutes are part of the workflow.
  • Budget caps may interrupt the work.
  • The feature touches production-sensitive logic.

That disagreement is useful. It shows that the team is not estimating the same delivery path.

Do not average the numbers too quickly. Ask each side what AI usage they assumed.
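
One way to make "do not average too quickly" operational is to measure the spread in deck positions rather than in raw numbers. The sketch below is a hypothetical helper, not a feature of any planning poker tool; the deck and the one-step threshold are assumptions a team would tune.

```python
DECK = [1, 2, 3, 5, 8, 13, 21]  # a common planning poker deck (assumed here)


def spread_in_steps(votes: list[int], deck: list[int] = DECK) -> int:
    """Number of deck positions between the lowest and highest vote."""
    positions = [deck.index(vote) for vote in votes]
    return max(positions) - min(positions)


def needs_assumption_check(votes: list[int], max_steps: int = 1) -> bool:
    """True when the spread is wide enough to discuss before converging."""
    return spread_in_steps(votes) > max_steps


votes = [3, 5, 13]
if needs_assumption_check(votes):
    steps = spread_in_steps(votes)
    print(f"Spread of {steps} steps: ask each voter what AI usage they assumed.")
```

Counting deck steps instead of subtracting values keeps a 3-versus-13 split from being treated the same as an 11-versus-21 split would be on a linear scale.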

Add an AI usage assumption to the ticket

The final estimate should carry a short note about the role of AI.

Good Jira notes might say:

  • "Three points. Mostly UI work; AI use limited to completions and short chat."
  • "Five points. Agent can scaffold tests, but auth changes need human review."
  • "Eight points. Expected long repo-wide agent session plus code review and rollout checks."
  • "Split before sprint. Refactor can be agent-assisted, but billing logic needs separate review."
  • "Estimate assumes Copilot Business pool has capacity and user-level budget will not block work."

These notes are not bureaucracy. They help the next planning session, the reviewer, and the person managing AI spend understand why the team chose that estimate.
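
If the team wants these notes to stay consistent from ticket to ticket, a small template can help. The following is a sketch only: `EstimateNote` and its field names are hypothetical and do not correspond to any Jira field or API.

```python
from dataclasses import dataclass


@dataclass
class EstimateNote:
    points: int
    ai_role: str                 # what the team expects AI to do
    human_review: str = ""       # review the team expects people to do
    budget_assumption: str = ""  # credit pool or budget the estimate relies on

    def render(self) -> str:
        """Build a one-line note suitable for pasting into a ticket comment."""
        parts = [f"{self.points} points.", f"AI role: {self.ai_role}."]
        if self.human_review:
            parts.append(f"Human review: {self.human_review}.")
        if self.budget_assumption:
            parts.append(f"Assumes: {self.budget_assumption}.")
        return " ".join(parts)


note = EstimateNote(
    points=5,
    ai_role="agent scaffolds tests",
    human_review="auth changes",
    budget_assumption="Copilot Business pool has capacity",
)
print(note.render())
```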

Budget controls are sprint planning inputs

GitHub's docs describe several budget controls for organizations and enterprises, including user-level budgets, cost-center budgets, enterprise budgets, and organization-level budgets. Those controls decide whether usage is served, metered, or blocked.

That means budget settings can become delivery dependencies.

If an enterprise uses a strict user-level budget, a developer can be stopped even when the organization still has pooled credits. If paid usage is disabled, work can stop when the shared pool is exhausted. If cost-center or enterprise limits are not configured carefully, teams may either overspend or get blocked at awkward times.
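
The scenarios above can be written down as a small decision function for planning discussions. This is a deliberately simplified model, not GitHub's actual enforcement logic: the function name, its parameters, and the mapping of "served", "metered", and "blocked" to these conditions are all assumptions, and cost-center and enterprise limits are left out.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    SERVED = "served"    # assumed: covered by the shared credit pool
    METERED = "metered"  # assumed: billed as paid usage beyond the pool
    BLOCKED = "blocked"  # assumed: the request is refused


def usage_outcome(user_spent: float, user_budget: Optional[float],
                  pool_remaining: float, paid_usage_allowed: bool) -> Outcome:
    """Simplified model of what happens to a developer's next AI request."""
    # A strict user-level budget stops the developer even if the pool has credits.
    if user_budget is not None and user_spent >= user_budget:
        return Outcome.BLOCKED
    # Otherwise the shared pool covers the request while credits remain.
    if pool_remaining > 0:
        return Outcome.SERVED
    # Pool exhausted: the request continues only if paid usage is allowed.
    return Outcome.METERED if paid_usage_allowed else Outcome.BLOCKED


print(usage_outcome(user_spent=50, user_budget=50,
                    pool_remaining=1_000, paid_usage_allowed=True))
```

Walking a planned story through a function like this is a quick way to spot which developers could be blocked mid-sprint.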

For sprint planning, this creates a practical checklist:

  • Do heavy AI users have enough budget for planned work?
  • Are shared credits pooled across the right billing entity?
  • Is paid usage allowed or blocked after the pool is exhausted?
  • Are cost-center limits aligned with sprint priorities?
  • Does the team know what happens when a developer is blocked?
  • Is there a fallback workflow if the agent cannot continue?

If those answers are unclear, the estimate should include that uncertainty.

AI cost and delivery stability are linked

The wider tech conversation is moving in the same direction. Recent reporting on AI-assisted development points to a familiar pattern: teams that frequently use AI coding tools can move faster, but faster coding does not automatically mean safer delivery. Review, QA, remediation, validation, and production recovery can become the new bottlenecks.

Usage-based billing reinforces that lesson. The expensive story is not always the story with the most code. It may be the story with the most ambiguity, the most retries, the broadest repository context, or the most review risk.

Planning poker can catch that early.

When the estimate spread is wide, ask:

  • Are we paying for implementation speed while underestimating validation?
  • Is the AI-generated draft likely to create review work later?
  • Does the story need stronger acceptance criteria before an agent starts?
  • Should the team split exploration, implementation, and review into separate stories?

The goal is not to make developers afraid of using AI. The goal is to make the work visible enough to plan responsibly.

A better definition of ready for AI-assisted stories

For teams using Copilot, Codex, Claude Code, Cursor, or any other coding agent, a story is ready when the team can describe both the product outcome and the AI operating model.

A ready story should include:

  • The user outcome.
  • Acceptance criteria.
  • Data and permission boundaries.
  • Known failure states.
  • Expected AI role.
  • Human review requirements.
  • Test and rollout expectations.
  • Budget or usage assumptions for agent-heavy work.
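
The checklist above can be turned into a quick readiness check during refinement. This is a minimal sketch: the field names are hypothetical keys invented for illustration, and the budget field is only required when the story is flagged as agent-heavy.

```python
BASE_FIELDS = [
    "user_outcome",
    "acceptance_criteria",
    "data_and_permission_boundaries",
    "known_failure_states",
    "expected_ai_role",
    "human_review_requirements",
    "test_and_rollout_expectations",
]
AGENT_HEAVY_FIELDS = ["budget_or_usage_assumptions"]


def missing_fields(story: dict[str, str], agent_heavy: bool = False) -> list[str]:
    """Return the checklist items that are absent or blank on a story."""
    required = BASE_FIELDS + (AGENT_HEAVY_FIELDS if agent_heavy else [])
    return [name for name in required if not story.get(name, "").strip()]


story = {
    "user_outcome": "Admins can export invoices",
    "acceptance_criteria": "CSV matches billing totals",
    "expected_ai_role": "Agent drafts the export endpoint",
}
for name in missing_fields(story, agent_heavy=True):
    print(f"Not ready: missing {name}")
```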

If a story lacks those details, use AI to help prepare the conversation. Ask it to summarize the ticket, find missing acceptance criteria, compare similar past work, and suggest review risks. Then let the humans estimate privately and discuss the spread.

That order matters. AI can prepare the room, but the team should still decide what good looks like.

The takeaway for June 1

GitHub Copilot's AI Credits launch makes one thing clearer: agentic coding has real operational shape. It consumes tokens, credits, review effort, Actions minutes, budget headroom, and human judgement.

That does not make AI-assisted development less useful. It makes planning more important.

The teams that get the most value from coding agents will not be the teams that pretend AI is free capacity. They will be the teams that estimate the whole workflow: agent usage, review, test coverage, rollout, cost controls, and recovery.

Planning poker is a simple way to keep that conversation human. Vote privately. Reveal the spread. Ask what AI usage each voter assumed. Record the rationale. Then let the agent help inside boundaries the team understands.

Sources