The 60% claim, deconstructed

On Airbnb’s Q1 2026 earnings call, Brian Chesky said AI now writes 60% of the company’s new code, “about twice the industry average.” The number traveled fast, the way round numbers from name-brand companies do. It is also almost certainly true and almost entirely useless as stated, which is a combination worth taking apart carefully, because we are going to hear a lot more numbers shaped exactly like this one.

I did this once before with the 10x-faster claim. Same method here, because the same problems show up.

“Writes” is doing enormous work

“Written by AI” is not a defined quantity. A tab-completion you accepted is written by AI. A function the agent drafted that you then rewrote is, by most measurement pipelines, still written by AI. Generated serializers, migrations, and test fixtures are written by AI. Does a line you accepted and then edited count? In nearly every pipeline that produces a number like this, the answer is yes: it counts characters that originated from a model suggestion and survived to a commit. By that definition 60% is entirely plausible and tells you almost nothing about how much thinking was delegated, because a line from the easy 60% and a line from the hard 5% each count as exactly one line.
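To make the definitional slack concrete, here is a minimal sketch of two ways to count the same commit. Everything in it is an assumption for illustration: the Line record, the origin and human_edited labels, and the five sample lines are invented, and no real pipeline is being quoted. The only point is that "originated from a model" and "originated from a model and was left alone" are different numbers.

```python
from dataclasses import dataclass


@dataclass
class Line:
    text: str
    origin: str         # "model" or "human": who first produced the line (hypothetical label)
    human_edited: bool  # True if a person rewrote the line before commit (hypothetical label)


def ai_share_by_origin(lines):
    """Share of committed lines that started life as a model suggestion."""
    if not lines:
        return 0.0
    return sum(1 for line in lines if line.origin == "model") / len(lines)


def ai_share_untouched(lines):
    """Stricter share: model-originated lines that no human edited."""
    if not lines:
        return 0.0
    kept = sum(1 for line in lines if line.origin == "model" and not line.human_edited)
    return kept / len(lines)


# Invented sample commit: three scaffold lines, one edited suggestion, one human line.
commit = [
    Line("class UserSerializer:", "model", False),
    Line("    fields = ['id', 'email']", "model", False),
    Line("    read_only = ['id']", "model", False),
    Line("if not may_refund(user, order): raise Forbidden()", "model", True),
    Line("audit.log(user, order)", "human", False),
]

print(f"by origin: {ai_share_by_origin(commit):.0%}")  # by origin: 80%
print(f"untouched: {ai_share_untouched(commit):.0%}")  # untouched: 60%
```

Same commit, two defensible answers, and neither one weights the refund check any more heavily than a serializer field.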

The denominator is unstated

Sixty percent of what? New feature logic? Generated clients and database migrations? Test scaffolding? A real codebase produces an enormous volume of mechanical code (types, fixtures, boilerplate) that was always going to be emitted by something. Moving that production from a snippet library and an ORM generator to a model moves the percentage a great deal and the actual engineering very little. Without the denominator, 60% is a number you cannot reason about, only repeat.
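A small sketch of the denominator problem, with loudly invented numbers: the categories and line counts below are hypothetical, chosen only so the arithmetic is easy to follow, and they describe no real codebase. The same repository reports a very different share depending on which categories are allowed into the denominator.

```python
# Hypothetical quarter of new code: category -> (ai_lines, total_lines).
# These counts are made up for illustration only.
counts = {
    "feature_logic": (200, 1000),
    "generated_clients": (1500, 2000),
    "migrations": (400, 500),
    "test_scaffolding": (900, 1500),
}


def share(counts, categories):
    """AI-originated share of lines, restricted to the given categories."""
    categories = list(categories)
    ai = sum(counts[c][0] for c in categories)
    total = sum(counts[c][1] for c in categories)
    return ai / total if total else 0.0


all_code = share(counts, counts.keys())
feature_only = share(counts, ["feature_logic"])

print(f"all new code:       {all_code:.0%}")      # all new code:       60%
print(f"feature logic only: {feature_only:.0%}")  # feature logic only: 20%
```

Under these made-up counts the headline is 60% and the feature-logic figure is 20%, and nothing about the engineering changed between the two lines of output.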

“Twice the industry average” is two unknowns

The comparison is weaker than the headline. “About twice the industry average” multiplies one undefined number by a second one. There is no audited, standardized industry average for the share of code written by AI — the figure is assembled from self-reported vendor surveys and company press lines, each using its own definition of “written.” Doubling a number that is itself a range of incompatible measurements does not produce a fact; it produces a bigger range. The sentence sounds like a benchmark and is actually two estimates stacked on each other, and the stacking is the part that gets quoted.

The tell is in his own sentence

Here is the part I find most useful, and it is not the 60%. From the same call, Chesky’s own framing: where you “might have needed a team of 20 engineers before, an engineer can now spin up agents to do a lot of work under supervision.” Under supervision. The 60% figure silently depends on the exact thing that makes it safe, and does not price it. Every one of those AI-written lines had to be read by someone whose judgment is the actual bottleneck. The number counts the output and not the supervision, and the supervision is the expensive part. A productivity figure that excludes its own largest cost is a marketing figure, even when it is true. A company that has reorganized twenty-engineer teams around agent supervision has, by its own description, moved its scarcest resource from writing code to reviewing it, and then reported only the half of that trade that flatters.

What is actually true in it

The direction is real and worth saying plainly so this does not read as dismissal. Airbnb is doing serious agentic work and getting enough value from it to put it on an earnings call, which companies do not do with things that are not working. The honest reading of “60%” is: a majority of the characters they commit now originate from a model, and they have reorganized how engineering work happens around supervising that. That is a real and significant operational change. It is simply not the same sentence as “we are 60% more productive,” or even “60% of the engineering is done by AI” — and those are the sentences listeners actually take away.

Here is the concrete version of the gap. An agent that scaffolds a CRUD endpoint, its serializer, its fixture, and its test produces a great deal of AI-written code and almost no delegated engineering — that code was always going to be mechanical, and the hard decision (what the endpoint should do, and what it must never do) was made by the human before the agent typed anything. The 60% counts the scaffold at full weight and the decision at zero, because the decision is not a line of code. That is the precise inversion of where the value actually is.

How I would report it

If it were my call, I would not give a percentage of code at all. I would give the number that is hard to game: of the pull requests we merged to production last quarter, what fraction were authored primarily by an agent and shipped without a human rewriting the core of them, and what did our revert rate and median review time do over the same window. That number is smaller, far less quotable, and actually means something, because it counts work that survived contact with review instead of characters that survived to a commit. The concrete way to compute it for your own team is in “Measuring your own AI-code share.”
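As a rough sketch of that report, assuming you can already label each merged pull request: the PullRequest fields below (agent_authored, core_rewritten, reverted, review_minutes) are hypothetical names, and deciding what counts as "primarily authored by an agent" or "rewriting the core" is the hard part that this code takes as given. The four sample records are invented.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class PullRequest:
    agent_authored: bool   # primarily authored by an agent (assumed label)
    core_rewritten: bool   # a human rewrote the core before merge (assumed label)
    reverted: bool         # reverted after reaching production
    review_minutes: float  # time spent in review


def quarterly_report(prs):
    """Summarize merged PRs: surviving agent share, revert rate, median review time."""
    if not prs:
        raise ValueError("no merged pull requests in the window")
    survived = [p for p in prs if p.agent_authored and not p.core_rewritten]
    return {
        "agent_survived_share": len(survived) / len(prs),
        "revert_rate": sum(1 for p in prs if p.reverted) / len(prs),
        "median_review_minutes": median([p.review_minutes for p in prs]),
    }


# Invented sample window of four merged PRs.
merged = [
    PullRequest(agent_authored=True, core_rewritten=False, reverted=False, review_minutes=30),
    PullRequest(agent_authored=True, core_rewritten=True, reverted=False, review_minutes=90),
    PullRequest(agent_authored=False, core_rewritten=False, reverted=False, review_minutes=45),
    PullRequest(agent_authored=True, core_rewritten=False, reverted=True, review_minutes=60),
]

report = quarterly_report(merged)
print(report)
# {'agent_survived_share': 0.5, 'revert_rate': 0.25, 'median_review_minutes': 52.5}
```

Three of the four sample PRs started with an agent, but only two survived review intact, and one of those was later reverted. Reporting the three numbers side by side is what keeps the first one honest.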

The reason this matters past one earnings call is that numbers like 60% are about to set expectations for every engineering org, and they all share Airbnb’s omission: they count what the agent produced, never what it cost to check or what it pulled in to produce it. The cost side is the next thing worth looking at directly, and it turns out to be sharper than a supervision tax: it is a security bill. That is the subject of “Your agent’s config is the attack surface now.” The output number and the risk number are the same act of delegation, viewed from opposite ends.