We’ve never had determinism in software. We just had the illusion of it.

Here’s a fact that most people outside computer science don’t know: in 1936, Alan Turing proved that no program can decide, in general, whether another program will even finish running. This is the Halting Problem. In the early 1950s, Rice’s theorem took this further: Henry Gordon Rice proved that it is mathematically impossible to build a tool that decides any nontrivial property of a program’s behavior in the general case. Not hard. Not expensive. Impossible.
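Turing’s argument is a diagonalization, and it fits in a few lines. This is a sketch in Python syntax: `halts` and `paradox` are hypothetical names standing in for the impossible oracle and the program that defeats it.

```python
def halts(program, data):
    """Hypothetical halting oracle: True iff program(data) terminates.
    Turing proved no such function can exist."""
    raise NotImplementedError("no such oracle exists")

def paradox(program):
    # Diagonalization: do the opposite of whatever the oracle predicts.
    if halts(program, program):
        while True:   # oracle said we halt -> loop forever
            pass
    # oracle said we loop forever -> halt immediately

# Feeding paradox to itself forces the contradiction: whatever answer
# halts(paradox, paradox) returns, paradox does the opposite. The oracle
# is wrong either way, so it cannot be implemented.
```

The same self-reference trick underlies Rice’s theorem: any checker for a nontrivial behavioral property could be used to build a halting oracle.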

This means “guaranteed to never make a mistake” was never a promise anyone could offer for software. Every piece of software you trust today shipped with that same uncertainty.

So when someone says “I can’t use LLMs in production because they’re nondeterministic — we need to build deterministic workflows where making no mistakes is the baseline expectation,” they’re confusing repeatability with correctness. A deterministic program that returns the wrong answer returns it every single time. That’s a bug, not a baseline expectation.

Manufacturing figured this out decades ago. Six Sigma doesn’t demand zero defects; it defines an acceptable defect rate (classically 3.4 defects per million opportunities) and builds measurement systems to stay within that bound. The discipline was never “eliminate all variation.” It was DMAIC: define, measure, analyze, improve, control, continuously reducing variation. That’s evaluation, not determinism.
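The “measure” and “control” steps can be sketched in a few lines. This is an illustrative check, not a full Six Sigma implementation: estimate the defect rate from observed counts and compare it to an agreed bound. The 3.4-per-million figure is Six Sigma’s classic target.

```python
# Classic Six Sigma target: 3.4 defects per million opportunities (DPMO).
DPMO_BOUND = 3.4 / 1_000_000

def within_bound(defects: int, opportunities: int,
                 bound: float = DPMO_BOUND) -> bool:
    """True if the observed defect rate stays at or under the bound."""
    return defects / opportunities <= bound

print(within_bound(2, 1_000_000))   # 2 DPMO  -> True, in control
print(within_bound(7, 1_000_000))   # 7 DPMO  -> False, investigate
```

The point is the shape of the discipline: a measured rate against an explicit bound, not a demand for zero.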

What AI systems shift is the evaluation surface. Instead of “does this code path execute as specified,” you ask “does this output meet our evaluation criteria.” The work moves from pre-deployment code verification to continuous evaluation.
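Concretely, that shift can look like the following minimal sketch, where all names are illustrative: instead of asserting that a code path executed, you score each output against a set of criteria and track the pass rate over time.

```python
from typing import Callable

# A criterion takes an output string and returns pass/fail.
Criterion = Callable[[str], bool]

def evaluate(outputs: list[str], criteria: list[Criterion]) -> float:
    """Fraction of outputs that satisfy every criterion."""
    passed = sum(all(c(out) for c in criteria) for out in outputs)
    return passed / len(outputs)

# Two toy criteria standing in for real evaluation logic.
criteria: list[Criterion] = [
    lambda out: len(out) > 0,                 # non-empty response
    lambda out: "error" not in out.lower(),   # no error marker
]

outputs = ["The answer is 42.", "ERROR: timeout", "Paris"]
print(evaluate(outputs, criteria))  # 2 of 3 pass -> ~0.667
```

In production the criteria would be richer (model-graded rubrics, regression sets, human review samples), but the loop is the same: score continuously, watch the rate, act when it drifts past your bound.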

In AI and machine learning systems, we reduce entropy (chaos) through evaluation.

This is not new. TCP delivers reliable streams over unreliable networks. RAID arrays build dependable storage on top of failing drives. AI models are trained on GPU clusters where individual nodes fail mid-run. We’ve always built reliable systems from unreliable components.

It was never about determinism. It was always about evaluations. If this resonates, I go deep on evaluation design in my book.