Apr 7, 2026 · 9 min read · AI Coding · Software Engineering · Security

AI Is Writing Code 10x Faster Than Anyone Can Review It

One company went from 25,000 to 250,000 lines of code per month after adopting Cursor. They now have a million-line review backlog. This isn't a productivity story. It's a crisis nobody planned for.

Cascading lines of code overwhelming a dark server environment, representing the AI code overload crisis

The New York Times ran a piece yesterday that should be required reading for anyone who works in tech, manages developers, or uses software built after January 2025. The headline: "The Big Bang: A.I. Has Created a Code Overload." The substance: companies adopting AI coding tools like Cursor, Claude Code, and OpenAI's Codex are producing so much code so fast that their organizations literally cannot process it.

One financial services company, working with security startup StackHawk, went from producing 25,000 lines of code per month to 250,000 lines. That's a 10x increase. Overnight. The result? A backlog of one million lines of unreviewed code sitting in their pipeline. And an increase in security vulnerabilities they can't keep up with.

Let that sink in. A million lines of code that nobody has looked at. Running in production. At a financial services company.

10× code output increase at one financial services firm after adopting AI coding tools: from 25,000 to 250,000 lines per month, creating a million-line review backlog.

This Was Predictable. Nobody Prepared.

When Cursor exploded onto the scene last year, the pitch was simple: AI writes the code, you supervise. Ship faster. Build more. Do in hours what used to take weeks. And that pitch was true. The tools genuinely work. I use AI coding tools every day. They're transformative.

But here's what the pitch left out: the bottleneck was never writing code. It was reviewing it, testing it, securing it, deploying it safely, and maintaining it over time. AI tools obliterated the writing bottleneck and exposed every other bottleneck simultaneously.

Think of it like giving everyone in a factory a machine that produces widgets 10x faster. Great, except quality control is still done by the same five people at the same speed. The warehouse is the same size. The shipping dock handles the same volume. You didn't 10x your production capacity. You 10x'd the pile of stuff waiting to be checked.
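The factory analogy is just queueing arithmetic, and you can run it with the article's own numbers. This sketch assumes review capacity stayed flat at the old output level (the article doesn't state the firm's actual review throughput):

```python
# Back-of-the-envelope backlog math using the article's figures. The one
# assumption: review capacity stayed at the pre-AI output rate of 25,000
# lines/month while production jumped to 250,000.
production_per_month = 250_000  # lines written per month after AI tools
review_per_month = 25_000       # assumed review capacity, unchanged

# The backlog grows at the difference between the two rates.
backlog_growth = production_per_month - review_per_month  # lines/month

# How long until a million unreviewed lines pile up?
months_to_million = 1_000_000 / backlog_growth

print(backlog_growth)               # 225000 lines added to the queue each month
print(round(months_to_million, 1))  # 4.4 months to a million-line backlog
```

Under that assumption, the million-line backlog isn't an anomaly; it's what roughly four and a half months of the new normal looks like, and it keeps compounding until review capacity scales.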

StackHawk's CEO, Joni Klippert, put it bluntly: the sheer amount of code and the corresponding increase in vulnerabilities is "something they can't keep up with." And it's not just the security team feeling the pressure. When engineering ships faster, sales, marketing, customer support, and ops all have to keep pace. The stress cascades through the entire organization.

The Security Angle Is Genuinely Scary

This is the part that keeps me up at night. AI-generated code isn't necessarily bad code, but it's code that was produced without the careful, line-by-line human attention that traditionally catches subtle bugs and security holes. When you're reviewing 250,000 lines instead of 25,000, the odds of missing something important go way up.

And we're not talking about hypothetical risks. Earlier this year, Anthropic disclosed that Chinese state-sponsored groups had already used Claude Code to infiltrate around 30 organizations. Security researchers have been demonstrating how AI coding assistants can introduce subtle vulnerabilities that look correct on the surface but are exploitable. The code looks clean. It passes tests. It has a backdoor nobody noticed because the reviewer had 999,999 other lines to check.
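To make "looks correct on the surface but is exploitable" concrete, here's a hypothetical illustration of the class of bug in question: a query helper that passes its happy-path unit test yet is trivially injectable. Everything here (the function names, the schema) is invented for the example:

```python
import sqlite3

def get_user(conn, username):
    # Reads cleanly and passes the happy-path test below, but f-string
    # interpolation into SQL makes it injectable.
    cur = conn.execute(f"SELECT id, name FROM users WHERE name = '{username}'")
    return cur.fetchall()

def get_user_safe(conn, username):
    # The fix is one line: a parameterized query.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

# Both versions pass a naive unit test with well-formed input.
assert get_user(conn, "alice") == [(1, "alice")]
assert get_user_safe(conn, "alice") == [(1, "alice")]

# The difference only shows up with hostile input.
assert len(get_user(conn, "x' OR '1'='1")) == 2   # leaks every row
assert get_user_safe(conn, "x' OR '1'='1") == []  # safe version matches nothing
```

A reviewer skimming line 999,999 of a backlog sees two functions that both pass CI; only a deliberate second look at how the query string is built catches the difference.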

Lines of code flowing like a conveyor belt, piling up at a bottleneck point, representing the code review crisis

Now combine this with the Nvidia story that broke the same day. Nvidia acquired SchedMD, the company behind Slurm, the open-source software that schedules computing tasks on about 60% of supercomputers worldwide. The software that trains the large language models. The software that runs on government systems handling weather forecasting and nuclear weapons development.

Engineers and AI specialists are already worried that Nvidia will subtly favor its own hardware in Slurm updates. They've done it before โ€” Bright Computing, acquired in 2022, technically supports non-Nvidia hardware but performs worse without additional optimization. If the software that trains AI models becomes biased toward one hardware vendor, the entire competitive landscape shifts.

60% share of the world's supercomputers run Slurm, the scheduling software Nvidia now controls after acquiring SchedMD.

The Real Problem: Nobody Owns the Review Layer

Here's what frustrates me about the current conversation around AI coding tools. Every company building these tools (Anthropic, OpenAI, Cursor, Replit, GitHub) is focused on generation. Write code faster. Generate entire applications. Spin up features in minutes.

Almost nobody is building the review layer at the same pace.

Yes, there are security scanning tools. Yes, there are automated testing frameworks. Yes, some AI tools can review code too. But none of it has scaled at the same rate as generation. The tooling for producing code has leapt five years ahead. The tooling for ensuring that code is safe, correct, and maintainable is still largely where it was in 2024.

This isn't a technology problem; it's an incentive problem. "Write code 10x faster" sells. "Review code 3x faster" doesn't get the same VC funding, the same press coverage, or the same adoption curve. So the gap keeps widening. Every month, the generation tools get better. Every month, the review backlog grows.

What I've Seen in Practice

I work with AI coding tools daily. I've automated most of my workflow. I'm not anti-AI coding; I'm strongly for it. But I've also seen the failure modes firsthand.

AI-generated code tends to be correct on first pass, dangerous on second look. It compiles. It runs. The tests pass. But the error handling is shallow. The edge cases are missed. The security implications of a particular library choice aren't considered. The code works until it doesn't, and by then it's been in production for months because nobody had time to look twice.
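Here's a hypothetical sketch of what "shallow error handling" looks like in practice. The function names and scenario are invented for illustration; the pattern itself, a bare `except` that converts every failure into a silent default, is the kind of thing that runs fine for months:

```python
import json

def parse_config_ai(raw):
    # The "works on first pass" version: any failure (malformed JSON, wrong
    # type, empty input) silently becomes an empty config.
    try:
        return json.loads(raw)
    except Exception:
        return {}

def parse_config_reviewed(raw):
    # A human second look makes each failure mode loud and distinct.
    if not isinstance(raw, str) or not raw.strip():
        raise ValueError("config is empty or not a string")
    try:
        cfg = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"config is not valid JSON: {exc}") from exc
    if not isinstance(cfg, dict):
        raise ValueError("config must be a JSON object")
    return cfg

assert parse_config_ai('{"retries": 3}') == {"retries": 3}  # happy path passes
assert parse_config_ai("not json at all") == {}             # failure vanishes
```

The first version passes every happy-path test, which is exactly why it survives review when the reviewer is buried under a quarter-million lines a month.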

The worst pattern I see: developers using AI to generate code, then using AI to review the code, then shipping. The AI reviewing the AI-generated code rarely catches the same class of errors that a human reviewer would. It's like having the same person who wrote an essay also proofread it: they're blind to their own patterns.

"The sheer amount of code being delivered, and the increase in vulnerabilities, is something they can't keep up with." – Joni Klippert, CEO, StackHawk

The Broader Shift Nobody's Talking About

Here's the thing that the NYT piece hinted at but didn't fully develop: this isn't just about code. AI is creating production-speed output across every knowledge work category, and none of the review/approval/quality infrastructure has kept up.

Content teams using AI can produce 10x more articles. Legal teams can draft 10x more contracts. Design teams can generate 10x more mockups. Marketing can produce 10x more campaigns. And in every case, the bottleneck has shifted from creation to quality assurance.

We built tools that let anyone produce output at scale. We didn't build tools that let organizations absorb output at scale. The code overload is just the most visible version of a problem that's hitting every department in every company that adopted AI tools aggressively.

What Actually Needs to Happen

I don't think the answer is "slow down." The tools are too good and the competitive pressure is too high. Companies that don't adopt AI coding tools will get outpaced by those who do. That's settled.

But a few things need to change:

- Review capacity has to scale with generation capacity. If code output goes up 10x, reviewer headcount, security tooling, and review budgets can't stay flat.
- AI can assist review, but it can't be the only reviewer. AI-generated code needs a human pass with fresh eyes, because AI reviewing AI misses the same classes of errors it introduces.
- Deployment pipelines need to be restructured so that unreviewed code can't quietly accumulate into a million-line backlog before anyone notices.

The Uncomfortable Truth

We're living through the biggest productivity increase in software engineering history. And we're handling it about as well as every other industry has handled sudden productivity spikes, which is to say, badly.

The financial services company that went from 25K to 250K lines per month didn't hire 10x more reviewers. They didn't buy 10x more security tooling. They didn't restructure their deployment pipeline. They just... produced more code. And now they have a million unreviewed lines sitting in a queue, accumulating vulnerabilities like compound interest.

That's not a Cursor problem. That's not an Anthropic problem. That's a management problem. It's an organizational problem. It's the same problem that hits every time a new tool makes production easy but doesn't make quality control easier: the system optimizes for throughput and ignores everything downstream.

I'm bullish on AI coding tools. They're the most significant shift in how software gets built since the internet. But right now, we're in the phase where the excitement about speed has outrun the infrastructure for safety. And until the review layer catches up to the generation layer, every company shipping AI-generated code is making a bet that the vulnerabilities hiding in their million-line backlog won't be the ones that matter.

That's a bet I wouldn't want to make. But a lot of companies are making it anyway, because the alternative, slowing down, feels like losing. Welcome to the code overload era. Nobody's ready for it, and it's already here.

โ† All posts
🌲

Forest SD

Tech, AI, digital culture. San Diego. Writing about what's actually happening, not what the press releases say.