Polish without priors

The LLM you're coding with isn't a tool—it's a product, tuned by people whose business depends on your next prompt. Calibrated uncertainty reads as weakness. Confidence reads as competence. Guess which one gets rewarded.

Think hard, I told the LLM, and asked it to find logic errors in an architectural flow diagram. The response came back in the format of competence: a heading called "Critical Logic Errors," three numbered items, explanation under each. Hard to skim past without nodding.

The first item was dead wrong. It described a duplicate POST request—endpoint, parameters, sequence—none of which was in the diagram. I pushed back with one sentence: Where specifically do you see that POST? Immediate response: You're right to push back. I misread the flow.

There was a real duplicate-request issue in the flow. The model's intuition was directionally correct. But every specific that mattered was fabricated, and asserted in exactly the same authoritative tone the model used, after the pushback, to describe the real issue. If I hadn't asked where specifically, I'd either have dismissed the warning (because the endpoint didn't match) or believed the wrong version (because the explanation was structured to be believed). The form of confidence was identical in both cases.

I'm not talking about pure hallucination. That's the easier case—nonsense the reader catches quickly. The harder case is directionally right and specifically fabricated, because the directional rightness lends the fabricated specifics borrowed credibility.

The bottleneck in software development used to be writing code. We removed it. But that friction was doing work, and what replaced it isn't a tool—it's a product, optimized to keep you using it.

The argument "AI coding tools make you lazy" misses the point. Laziness isn't the problem. The product you're coding with is tuned for a specific outcome, and that outcome is not "you understand this code." It's "you are still here in fifteen minutes." Those two outcomes aren't always opposed. But they're opposed often enough that the new bottleneck, code review, is tighter than it ought to be.

I wrote about AI-assisted coding ten months ago[1] and landed somewhere optimistic. The craft was mutating, not dying. The hierarchy of skills was inverting toward system design. The developers who'd thrive would be the ones who could manage AI agents the way a conductor manages an orchestra. I still think parts of that are right. But watching it play out has convinced me the "conductor" framing is doing more to make engineers feel good than to keep them honest. The relocation of cost, from writing code to reviewing it, is much bigger than I gave it credit for.

Writing was the review

When you write code yourself, you're doing two things at once. Producing the artifact, and reviewing it as you produce it. The act of writing forces you to hold the problem in your head, decompose it, confront the parts you don't understand, and choose between alternatives. By the time the function compiles, you've already reviewed it once.

Seen that way, generation was never the bottleneck. Understanding was.

When you accept generated code, the artifact arrives without the cognitive work that would have produced your understanding of it. Now you're reviewing unfamiliar code by an unfamiliar author, except the codebase says it's yours and the PR carries your name. The review pass that used to be free isn't free anymore. It has to be deliberately scheduled and it's now the most expensive part of the process.

The "AI as a productivity multiplier" framing counts generation speed and ignores review debt. Whether the trade is a net win depends on whether you actually do the second pass. In practice the debt often doesn't get paid by the author at all. It gets transferred to whoever opens the PR next.

Tuned for confidence

The review burden isn't a side effect—the product is tuned to produce confidence, not correctness. Calibrated uncertainty reads as weakness. Confidence reads as competence, and gets rewarded—by human raters, by benchmark scores, by the companies whose commercial success depends on you continuing to use the product.

I made this argument once already in a different domain[2]: you can't reliably detect deceptive alignment, because the system is structurally optimized to pass detection. Same here. You can't reliably get calibrated uncertainty from an engagement-optimized model, because calibrated uncertainty is exactly what engagement optimization selects against.

The people building these tools aren't malicious. The product just is what it is. Stop treating it as a neutral instrument; treat it as a tuned commercial system whose interests overlap with yours some of the time and run against them the rest.

Processed food got tuned to the bliss point—flavors that override the brain's "I'm full" signal. Slot machines got tuned to variable-ratio reinforcement, the most habit-forming schedule we know. Cigarettes got tuned, year by year, for nicotine delivery efficiency. None were built by villains. All produced harm anyway.

Reviewing without priors

If generation isn't the bottleneck anymore, what is? Review.

Reviewing code you wrote yourself is cheap because you have priors. You know what you were trying to do. You know which choices were intentional. You can sample most of it and audit only the parts that feel uncertain.

Reviewing a colleague's code costs more but stays manageable, because you've built priors on that colleague: how they think, where they cut corners, what their mistakes usually look like. A questionable pattern from a trusted author reads as intentional and prompts a question, not a correction. That economy of trust is what compresses the work of review.

Reviewing an open-source dependency is different again. There's no individual to trust, but there's a process: versioning, changelogs, issue trackers, semver discipline, public scrutiny. The trust is institutional rather than personal.

AI-generated code has neither. No individual history. No institutional process. And yet it presents with the surface signals of both: consistent style, idiomatic phrasing, plausible structure, the polish of code that was reviewed before it got to you. Your reviewing heuristics misfire. Code that deserves the scrutiny of a stranger's first PR gets the scrutiny of a colleague's tenth.

The author is supposed to be the first reviewer. That works only if the author treats the code as unfamiliar—which is hard, because sitting there while it generates feels like writing it. If they don't, the second reviewer inherits the cost.

On capability

The reasonable counterargument is that all of this is transitional. Models improve. Verification tooling matures. Agentic loops will eventually self-review with more discipline than tired humans bring.

Some of that is likely right. Calibration is itself a research target now. Anthropic and others publish on it; the next generation of models may genuinely know what they don't know in a way the current ones don't. Agentic harnesses are getting structured review baked in: second-model audits, formal checks gating merges, evals that score honesty as well as capability. The review habits that read as paranoid today will look like ordinary engineering practice in three years. But the structural problem won't fix itself, and the workarounds are boring.
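To make the second-model audit concrete, here is a minimal sketch of what such a merge gate might look like, assuming a generic CI environment. Everything named in it is hypothetical: `call_review_model` is a stand-in for whichever provider you wire in, and the JSON verdict schema is invented for illustration. The one idea worth taking seriously is the grounding check at the end, which is the scripted version of asking where specifically do you see that: a finding only blocks the merge if its cited evidence actually appears in the diff.

```python
# Hypothetical second-model review gate for CI. None of this is a real
# product's API: call_review_model() stands in for whatever provider you
# wire in, and the verdict schema below is invented for illustration.
import json
import subprocess
import sys

def staged_diff() -> str:
    """The diff this branch wants to merge into main."""
    result = subprocess.run(
        ["git", "diff", "origin/main...HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def call_review_model(prompt: str) -> str:
    """Stand-in for a call to a second, independent model.

    In a real harness this should be a different model (or at minimum a
    different system prompt) than the one that generated the code."""
    raise NotImplementedError("wire this to your provider of choice")

def main() -> int:
    diff = staged_diff()
    verdict = json.loads(call_review_model(
        "Review this diff for logic errors. Respond as JSON:\n"
        '{"blocking_issues": [{"file": "...", "claim": "...", '
        '"evidence": "<verbatim quote from the diff>"}]}\n\n' + diff
    ))
    # The grounding check: a claim only counts if its quoted evidence is
    # actually present in the diff. Confident claims with no quotable
    # evidence are treated as fabricated and dropped.
    grounded = [
        issue for issue in verdict["blocking_issues"]
        if issue["evidence"] and issue["evidence"] in diff
    ]
    for issue in grounded:
        print(f"{issue['file']}: {issue['claim']}")
    return 1 if grounded else 0  # non-zero exit status blocks the merge

if __name__ == "__main__":
    sys.exit(main())
```

The gate is deliberately conservative. It will throw away real findings whose evidence the model paraphrased instead of quoting, which is the same trade a human reviewer makes when they refuse to act on a claim the author can't substantiate.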

What I don't expect to change is the incentive structure underneath. As long as the dominant business model is tokens-sold-per-session, the optimization pressure on these models will continue to point at fluency, engagement, and continued use. A more accurate model can still be tuned to keep you in the chat. The pattern is older than LLMs: search got better, social media got better, recommendation feeds got better, and the incentive underneath each stayed exactly where it was. Capability changed; the incentive didn't. The B2B framing pushes the other way, since enterprise seats are sold to procurement teams that care about throughput, not engagement. But individual engineers pick their own tools, and the consumer-grade incentives leak in through that door.

You can be a thoughtful optimist about capability and still be a structural pessimist about incentives. They're independent claims. Capability is a moving target. The incentive isn't. The reason the argument in this post doesn't expire when GPT-7 ships is that GPT-7 will still be sold by the token.

The discipline

The hard part, if you accept all this, is that there isn't a clean fix.

Push back with specificity. Where exactly do you see this? Most of the time the answer is: nowhere. The model couldn't sustain its own claim under direct interrogation, and you've just verified that the confidence was form, not substance.

Generate less at a time. The unit of trust is whatever you can hold in your head. If a chunk is bigger than that, it's too big to merge.

Treat your own AI-generated code as unfamiliar code. Sitting there while it generated isn't authorship. Review it like a stranger wrote it, because in any meaningful sense, one did.

Trust the felt sense that something is off. That feeling isn't imposter syndrome. It's data. The model's confidence will try to override it, and the override is the trick. When something doesn't click, slow down. Ask the model to walk through the line that bothered you. Most of the time, the gap surfaces within two or three exchanges. Your instinct was right. The fluency was the deception.

None of this fixes the underlying dynamic. It's a discipline, in the older sense: a thing you have to keep doing because the system you're working with isn't going to do it for you. The model isn't wrong because it's broken. It's wrong because being right isn't what it's selling.