AI Made Generation Cheap.
Verification Is the Bottleneck Now.
For most of modern history, producing things was expensive and checking them was cheap. Writing a contract took a lawyer a week; reading it took an hour. Producing a photograph took a camera, a roll of film, and a darkroom; recognizing it as authentic took a glance. Writing working code took years of training and weeks of effort; running it and seeing if it did the right thing took minutes.
AI inverted that.
Today, generating a plausible contract takes seconds. Producing a convincing photograph of an event that never happened takes a prompt. Writing code that compiles and runs and looks reasonable takes one round-trip with a model. The production side of nearly every information good has collapsed in cost by orders of magnitude. Verification has gotten somewhat cheaper too — there is good work being done on contract review, image provenance, code analysis, and more — but not nearly as fast. The gap is widening, and it is the dominant economic fact in a growing number of systems.
And it is about to get sharply worse, because the next thing AI is producing isn't outputs. It's actions.
A quick note on where I'm writing from. I build verification systems for a living, across several industries — hospitality, supply chains, charitable giving, home healthcare, device security. Hospitality especially is where the agent verification problem is moving from abstract to concrete: hotel brands have absorbed huge settlements over information leakage in the last few years, and the prospect of AI agents acting on guest data is rapidly becoming a board-level question. Agent verification isn't where I'm building today, but it's a layer I follow closely, because what gets built there will shape what's possible in the industries where I work. This essay is me thinking through what I'm watching.
The deferred problem
The question of whether you can trust an AI to do what you actually want it to do is not new. It is, in fact, older than most of the people writing about AI today.
In the early 1980s, an AI researcher named Doug Lenat built a program called EURISKO. Among other things, it competed in Traveller: Trillion Credit Squadron, a fleet-design wargame set in the Traveller science-fiction universe, and won the national tournament two years running. It won not by playing well in the way the designers expected, but by discovering strategies the human players and the game's authors hadn't anticipated. The tournament eventually changed its rules to keep EURISKO out.
I heard versions of this story directly from Lenat himself, years later, during a week-long training course he ran at Cyc's offices in Austin. What stayed with me wasn't the cleverness of the program. It was the calmness with which Lenat talked about the fact that his program had done things he hadn't predicted. He had been living with that problem for a decade by then. Forty years on, the field is still living with it. We just call it different things now — alignment, specification gaming, reward hacking, agentic risk.
For most of those forty years, the question of trusting AI to do the right thing was theoretical, because the systems weren't doing anything consequential on their own. EURISKO won a game. Expert systems gave advice that humans then evaluated. Search engines surfaced links that humans then clicked. Even the last decade of deep learning, for all its capability, mostly produced outputs that humans then acted on.
That deferral is ending. The systems are now starting to act. The question of whether you can trust an AI has stopped being a research curiosity and started being a question about whether you can trust the thing that just moved real money, sent a real email to a real customer, or made a real decision about somebody's claim. The bill on a forty-year-old problem is coming due.
What verification infrastructure actually looks like
When a class of systems becomes valuable enough that its failures become consequential, the response is always the same: a layer of verification infrastructure gets built underneath it. The web produced certificate authorities and the entire authentication stack. Cloud computing produced workload identity, secrets management, and zero-trust networking. Payments produced clearing, settlement, and fraud detection. Each of these started as an obvious gap and eventually became an industry.
The AI verification stack is being built now, and it has at least four distinct layers.
Provenance: what is this system, who built it, what version is it, what was it trained on, and what is it authorized to do? The AI equivalent of a bill of materials.
Attestation: at the moment of action, can the system prove cryptographically what it is and what permissions it holds? Important work is being done here in the workload identity and non-human identity space, with standards like SPIFFE and a growing set of vendors building runtime brokers that replace static API keys with short-lived attested tokens.
Audit: after the action, what did the system actually do, and can that record be trusted? Audit logging for human users is decades old and reasonably mature in most enterprises. The equivalent for AI agents is mostly unbuilt.
Cross-boundary verifiability: when my agent talks to your system, you need to be able to check claims about my agent without trusting me to tell you the truth.
Attestation, the runtime authentication layer, is being built well, by competent companies, and is on a clear trajectory. The other three layers are not. And this matters more than it might appear, because the runtime layer answers a relatively narrow question (can this agent prove who it is in the moment it acts), while provenance, audit, and cross-boundary verifiability answer the questions that determine whether AI systems can be trusted across time, across organizations, and after the fact.
Provenance is where we acknowledge that an agent is more than a workload identity — it is a configured artifact with a model, a tool set, a permission scope, a training lineage, and an operator. Audit is where we acknowledge that, when agents act, the question "what did it actually do" needs to be answerable by people who weren't in the room. Cross-boundary verifiability is where we acknowledge that agents are going to act across organizational lines, and that the receiving side needs to check claims about the agent without trusting the sending side's word for it. None of these are runtime problems. All of them are about to become urgent.
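To make the provenance and audit layers a little more concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a description of any existing product or standard: the manifest fields, the example values, and the function names (digest, append_entry, verify_log) are all hypothetical. The idea is that an agent's configuration is reduced to a single digest, and every audit entry both names that digest and commits to the hash of the entry before it, so a later edit to any record breaks the chain.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def digest(obj) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def append_entry(log: list, manifest_digest: str, action: str, detail: dict) -> dict:
    """Append an audit entry that commits to the previous entry's hash."""
    prev_hash = log[-1]["entry_hash"] if log else GENESIS
    body = {
        "seq": len(log),
        "agent": manifest_digest,  # which configured agent acted
        "action": action,
        "detail": detail,
        "prev_hash": prev_hash,
    }
    entry = dict(body, entry_hash=digest(body))
    log.append(entry)
    return entry


def verify_log(log: list) -> bool:
    """Recompute every hash and check that each entry links to the one before it."""
    prev_hash = GENESIS
    for i, entry in enumerate(log):
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if body.get("seq") != i or body.get("prev_hash") != prev_hash:
            return False
        if digest(body) != entry.get("entry_hash"):
            return False
        prev_hash = entry["entry_hash"]
    return True


# Hypothetical provenance manifest: the "bill of materials" for one agent.
manifest = {
    "model": "example-model-v1",
    "tools": ["send_email", "issue_refund"],
    "permission_scope": ["guest_bookings:read"],
    "operator": "example-operator",
}
manifest_digest = digest(manifest)

log = []
append_entry(log, manifest_digest, "issue_refund", {"booking": "B-1001", "amount": 40})
append_entry(log, manifest_digest, "send_email", {"booking": "B-1001"})
print(verify_log(log))
```

A chain like this only detects tampering if the most recent hash is also held somewhere the log's owner cannot quietly rewrite; otherwise the owner can recompute the whole chain after an edit. Timestamps and signatures are left out to keep the sketch short.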
The most interesting place to watch for the architecture of cross-boundary verifiability is not in AI at all. It's in physical goods. The EU's Digital Product Passport requirements, which phase in across product categories over the next several years, are forcing exactly this problem to be solved for clothes, batteries, electronics, and eventually most manufactured products: a verifiable credential that travels with the object, that anyone in the supply chain can check, and that can be checked without going back to the issuer or taking the holder's word for it. The standards being settled there (verifiable credentials, decentralized identifiers, and selective disclosure protocols, often built on distributed ledger infrastructure) are the same standards the AI agent ecosystem will need. The infrastructure being built for a t-shirt to prove its provenance will turn out to be most of the infrastructure an AI agent needs to prove its own.
That is the part of this I find genuinely interesting. Two industries that look unrelated — physical supply chains and AI systems — are converging on the same verification primitives, from opposite directions. The companies and the standards bodies that figure out the connection early will be in a strong position. The ones that don't will end up rebuilding what already exists.
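Selective disclosure is the least intuitive of those primitives, so a small sketch may help. The Python below shows one common approach, salted hash commitments, in the spirit of schemes like SD-JWT but not an implementation of any specification. The function names (claim_digest, issue, present, verify) and the example claims are hypothetical, and one step is deliberately left out: in a real system the issuer signs the digest list with an asymmetric key, and the verifier checks that signature first. Here the digest list is simply assumed to have arrived intact.

```python
import hashlib
import json
import secrets


def claim_digest(salt: str, name: str, value) -> str:
    """Commit to one claim; the random salt keeps undisclosed values unguessable."""
    data = json.dumps([salt, name, value], separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def issue(claims: dict):
    """Issuer side: return (digest list to be signed, private per-claim disclosures)."""
    disclosures = {
        name: (secrets.token_hex(16), value) for name, value in claims.items()
    }
    digests = sorted(
        claim_digest(salt, name, value)
        for name, (salt, value) in disclosures.items()
    )
    return digests, disclosures


def present(disclosures: dict, reveal: list) -> list:
    """Holder side: hand over salt, name, and value only for the chosen claims."""
    return [(disclosures[name][0], name, disclosures[name][1]) for name in reveal]


def verify(signed_digests: list, presented: list) -> dict:
    """Verifier side: accept a claim only if its digest is in the issuer's list.

    Assumption: the issuer's signature over signed_digests was already checked.
    """
    allowed = set(signed_digests)
    verified = {}
    for salt, name, value in presented:
        if claim_digest(salt, name, value) not in allowed:
            raise ValueError(f"claim {name!r} does not match any committed digest")
        verified[name] = value
    return verified


# Hypothetical agent credential with three claims; only one is revealed.
digests, disclosures = issue({
    "model": "example-model-v1",
    "operator": "example-operator",
    "permission_scope": "refunds<=100",
})
shown = present(disclosures, ["permission_scope"])
print(verify(digests, shown))
```

The receiving side learns the agent's permission scope and nothing else, and a holder who alters the revealed value produces a digest that is not in the issuer's list. The same pattern works whether the credential describes a battery or an agent.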
The stakes
There is one more thing worth saying.
When verification infrastructure is missing, the people who lose first are not the people building the systems or the people buying them. They are the people on the other end — the customer whose claim was denied by a model whose reasoning no one recorded, the worker whose performance review was assembled by a process whose inputs no one logged, the citizen whose application was routed by an agent whose authorization no one can later check. None of these people necessarily know an AI was involved. None of them currently have any standing to ask.
Verification infrastructure doesn't fix that on its own. Whether the affected person can ever see the record, or contest it, or compel its examination — those are legal, regulatory, and organizational questions, and they will be fought over for years. But every one of those fights will turn on a single prior question: was there a record at all, and was it trustworthy? Without that, there is nothing to appeal to and nothing to investigate. Verification infrastructure is the precondition that makes any of the rest possible. It is not accountability itself; it is the substrate accountability has to be built on top of.
That is the part of this work that, for me, makes it worth doing. The economic argument — that verification will be the scarce resource in an AI-saturated economy, and that the companies building the verification layer will be valuable — is true, and it is the argument that will fund the work. The reason to do the work is that systems that act on behalf of people, without leaving a record those people can ever reach, are a regression we have no business accepting.