Catching AI-Generated Code Before It Ships

Sagnik

Founder, autter.dev

4 min read

AI coding assistants now generate somewhere between 30% and 60% of the code in a typical pull request. The code compiles. The tests pass. The logic reads cleanly. And yet production incidents from AI-authored code are climbing — not because the code is obviously broken, but because it is confidently wrong in ways that only surface under real load, with real data, at the worst possible moment.

This is the gap autter was built to close.

The shape of the problem

When a human writes a bug, it usually looks like a bug. A missing null check, an off-by-one error, a typo in a variable name. These are the kinds of issues that linters, type checkers, and CI pipelines were designed to catch — and they do catch them, reliably.

AI-generated code fails differently. It produces code that is syntactically perfect, logically coherent, and structurally sound — but semantically wrong in context. The failure modes are subtle:

  • Convention violations — the AI doesn't know your team deprecated moment in favour of date-fns last quarter, or that getUserById should never be called inside a loop because your ORM doesn't batch
  • Implicit contract breaches — a function that returns the right type but violates an unwritten invariant, like returning stale cache data where freshness is critical
  • Performance anti-patterns — N+1 queries hidden inside helper functions that look clean in isolation but collapse under production cardinality
  • Security blind spots — input validation that covers happy paths but misses edge cases like unicode normalisation attacks or timing-based side channels

Traditional CI catches none of these. Static analysis catches some. Human review catches more — but only when the reviewer has enough context, enough time, and enough suspicion to look closely at code that reads perfectly.
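To make the last failure mode concrete, here is a hedged sketch of a timing side channel in token checking. The function names are invented for illustration; the pattern is the point: the naive check covers every functional test, while Node's crypto.timingSafeEqual removes the timing leak.

```typescript
import { timingSafeEqual } from "node:crypto";

// Looks correct and passes every functional test, but `===` bails out
// at the first mismatched character, so response time leaks how much
// of the secret an attacker has guessed correctly.
function insecureTokenCheck(supplied: string, secret: string): boolean {
  return supplied === secret;
}

// Constant-time version: compares every byte regardless of where the
// first mismatch occurs. (timingSafeEqual throws on unequal lengths,
// so length is checked first.)
function safeTokenCheck(supplied: string, secret: string): boolean {
  const a = Buffer.from(supplied, "utf8");
  const b = Buffer.from(secret, "utf8");
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}
```

Both functions return identical results for identical inputs, which is exactly why this class of issue sails through CI.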

How autter addresses this

autter operates at the merge layer — after CI passes, before code reaches your main branch. It analyses every pull request with full awareness of your codebase's history, conventions, and architecture.

Contextual analysis at the codebase level

Unlike generic linters that check files in isolation, autter builds a semantic model of your entire codebase. It understands which patterns your team has established, which APIs are deprecated, and which modules have implicit performance constraints.

When it reviews a PR, it doesn't just check the diff — it evaluates the diff in context:

// autter flags this pattern automatically
async function getTeamMembers(teamIds: string[]) {
  // N+1 query — will execute one DB call per team ID
  // autter suggests: use db.teams.findMany({ where: { id: { in: teamIds } } })
  const members = [];
  for (const id of teamIds) {
    const team = await db.teams.findUnique({ where: { id } });
    members.push(...team.members);
  }
  return members;
}

autter has seen your codebase use findMany with in clauses in 47 other places. It knows this loop will generate one query per teamId. It flags it — not because loops are bad, but because this specific pattern in this specific codebase is a performance regression.
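The batched rewrite can be sketched against a minimal in-memory stand-in for a Prisma-style client. The db object and its data below are stubs invented for illustration, not autter's or any real ORM's API; the counter makes the query-volume difference visible.

```typescript
type Team = { id: string; members: string[] };

// In-memory stand-in for db.teams, with a counter to show query volume.
const rows: Team[] = [
  { id: "t1", members: ["ada"] },
  { id: "t2", members: ["grace", "lin"] },
  { id: "t3", members: ["alan"] },
];
let queryCount = 0;

const db = {
  teams: {
    async findMany(args: { where: { id: { in: string[] } } }): Promise<Team[]> {
      queryCount += 1; // one call, however many IDs are requested
      return rows.filter((t) => args.where.id.in.includes(t.id));
    },
  },
};

// One query for the whole batch instead of one query per team ID.
async function getTeamMembers(teamIds: string[]): Promise<string[]> {
  const teams = await db.teams.findMany({ where: { id: { in: teamIds } } });
  return teams.flatMap((t) => t.members);
}
```

With the loop version, queryCount would equal teamIds.length; here it stays at 1 no matter how many IDs are passed.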

Convention drift detection

Every codebase has unwritten rules. autter learns them from your merge history:

  • Deprecated API usage — AI used legacy.createUser() instead of auth.register()
  • Naming convention violations — camelCase in a module that uses snake_case throughout
  • Import path deviations — direct import from @internal/db instead of the team's @app/data facade
  • Error handling pattern breaks — throwing raw errors where the codebase wraps them in AppError
  • Test pattern mismatches — a unit test where integration tests are the established standard
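The error-handling case is typical. Assuming a codebase that standardises on an AppError wrapper (the class below is a hypothetical reconstruction, not code from any real project), the raw throw an AI assistant tends to write reads fine in isolation but drifts from the established pattern:

```typescript
// Hypothetical wrapper this codebase standardises on: every thrown
// error carries a stable machine-readable code alongside the message.
class AppError extends Error {
  constructor(message: string, public readonly code: string) {
    super(message);
    this.name = "AppError";
  }
}

// What an AI assistant typically generates — correct, but off-convention:
//   throw new Error(`user ${id} not found`);
//
// What the codebase's established pattern looks like:
function requireUser(id: string, users: Map<string, object>): object {
  const user = users.get(id);
  if (!user) {
    throw new AppError(`user ${id} not found`, "USER_NOT_FOUND");
  }
  return user;
}
```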

Inline review comments

autter surfaces findings directly in your pull request as review comments — the same interface your team already uses. Each comment includes:

  1. What was detected — a clear description of the issue
  2. Why it matters — the specific risk or convention being violated
  3. How to fix it — a concrete suggestion, often with a code snippet
  4. Confidence level — how certain autter is that this is a genuine issue
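Those four parts map naturally onto a structured record. The type and renderer below are an illustrative sketch of that shape, not autter's published comment schema:

```typescript
// Illustrative shape for a single review finding (an assumption for
// this sketch, not autter's actual API).
type Confidence = "high" | "medium" | "low";

interface Finding {
  detected: string;    // what was detected
  rationale: string;   // why it matters
  suggestion: string;  // how to fix it, ideally with a snippet
  confidence: Confidence;
}

// Render a finding as the Markdown body of a PR review comment.
function renderComment(f: Finding): string {
  return [
    `**Detected:** ${f.detected}`,
    `**Why it matters:** ${f.rationale}`,
    `**Suggested fix:** ${f.suggestion}`,
    `_Confidence: ${f.confidence}_`,
  ].join("\n");
}
```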

Merge gate enforcement

For critical issues — security vulnerabilities, data integrity risks, breaking API changes — autter can block the merge entirely until the issue is resolved. For lower-severity findings, it adds informational comments and lets the team decide.

# autter.config.yml — customise enforcement levels
rules:
  security:
    severity: block          # prevent merge
  performance:
    severity: warn           # comment but allow merge
  conventions:
    severity: info           # informational only
  deprecated_apis:
    severity: block
    exceptions:
      - path: "legacy/**"    # known legacy code, don't block

What teams report after adopting autter

The numbers vary by team size and codebase complexity, but the patterns are consistent:

  • Production incidents from AI code — ~4.2 / month → ~1.1 / month (-73%)
  • Average review cycle time — 2.4 days → 1.1 days (-54%)
  • PRs merged per developer per week — 3.8 → 8.7 (+2.3x)
  • Time spent on convention enforcement — ~6 hrs / week → ~0.5 hrs / week (-92%)

The biggest shift isn't in the numbers — it's in the team dynamic. Senior engineers stop spending their review time policing conventions and start spending it on architecture and design. Junior engineers get faster, more consistent feedback. And the AI-generated code that makes it through the gate is code the team can actually trust.

Getting started

Install the autter GitHub App, connect your repository, and your next pull request will be analysed automatically. The default rule set covers the most common AI-generated code issues out of the box.

# Or run locally before pushing
npx autter analyse --pr 142
 
# Preview what autter would flag in your current branch
npx autter check --diff HEAD~1..HEAD

autter analyses your first 100 PRs free. No credit card, no sales call, no configuration — just connect and see what it catches.

14-day free trial

Ship with confidence, starting today

autter is the merge gate built for the AI coding era. Try it free for 14 days — no credit card, no commitment, full access to every feature.

  • AI-powered code reviews on every PR
  • 40+ linters & custom checks
  • No credit card required

Stop shipping code you can't fully trust

autter catches what linters miss, flags what CI ignores, and gives your team the confidence to ship faster without the 2am surprises.

50,000+

Pull requests analysed every week, catching issues that passed CI but would have failed in production.

73%

Fewer production incidents from AI-generated code after teams adopt the autter merge gate.

<90s

Average time to first review. autter analyses your PR before a human reviewer even opens it.
