Popular artificial intelligence systems have a peculiar and persistent habit: when they invent names that do not exist, they tend to invent the same ones, again and again. An analysis by AI Insider traces this recurring behavior to the statistics underlying how modern language models generate text.

A Pattern That Undermines Professional Trust

Hallucination, the term for an AI system generating false information and presenting it as fact, is well documented. What has been less understood is why the false names these systems produce are not random noise but follow a consistent, repeatable pattern across different users and queries. That predictability turns a quirk into a liability. A legal team relying on an AI research tool, a journalist using one to surface sources, or a business analyst checking company names is more likely to catch a fabrication when it looks like an obvious one-off. When fake names recur reliably, they can begin to look credible through sheer repetition.

Statistics, Not Malfunction

AI Insider's analysis found that the answer lies in statistics. Large language models do not retrieve facts from a database; they generate text one token (a word or word fragment) at a time, choosing each according to the probability of what should come next, based on patterns absorbed during training. Certain names, shaped by how frequently particular combinations of sounds and letters appeared in training data, cluster near the top of the model's probability distribution. When the model has no accurate answer to draw on, it defaults toward those high-probability outputs. The result is that the same invented names surface repeatedly, not because the model is broken, but because it is doing exactly what it was built to do: produce statistically likely text.
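The mechanism can be illustrated with a toy example. The sketch below is not how any particular model is implemented; it assumes a hypothetical set of five placeholder names with made-up scores, converts those scores to probabilities with a softmax, and samples repeatedly. The names, scores, and function names are all illustrative assumptions.

```python
import math
import random
from collections import Counter

# Hypothetical placeholder names with made-up scores (logits).
# These values are illustrative and do not come from any real model.
LOGITS = {
    "Name A": 3.0,
    "Name B": 2.5,
    "Name C": 0.5,
    "Name D": 0.0,
    "Name E": -1.0,
}


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(value - peak) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def sample_names(probs, n, seed=0):
    """Draw n names at random, weighted by their probabilities."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    names = list(probs)
    weights = [probs[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


probs = softmax(LOGITS)
counts = Counter(sample_names(probs, 1000))

# Print how often each name was drawn, most frequent first.
for name, count in counts.most_common():
    print(f"{name}: {count}")
```

Under these assumed scores, the two highest-scoring names account for roughly nine in ten draws, which is the recurrence the analysis describes: sampling is random, but heavily weighted toward the same few outputs.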

The Commercial Implication

Understanding the cause shifts the conversation from "AI makes things up" to something more specific and addressable: AI makes up the same things, in predictable ways. That distinction matters for companies building products on top of AI systems, because predictable failure modes can, in principle, be tested for, filtered, and disclosed to end users. The businesses that treat hallucination as a known statistical property—rather than an unpredictable defect—will be better positioned to set accurate expectations and build appropriate guardrails around their deployments.
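One way such testing might look in practice, offered only as a minimal sketch: ask the same question many times, count the names that come back, and flag any that recur often but do not appear in a verified list. The function name `flag_recurring_unverified`, the threshold, and the sample data are all hypothetical.

```python
from collections import Counter


def flag_recurring_unverified(outputs, verified, min_share=0.3):
    """Return unverified names that make up at least min_share of outputs.

    outputs:   names collected from repeated queries (hypothetical data)
    verified:  set of names already confirmed to exist
    min_share: assumed cutoff; a real deployment would tune this
    """
    if not outputs:
        return {}
    counts = Counter(outputs)
    total = len(outputs)
    return {
        name: count / total
        for name, count in counts.items()
        if name not in verified and count / total >= min_share
    }


# Illustrative data only: ten answers to the same repeated query.
outputs = [
    "Name A", "Name B", "Name A", "Real Co", "Name A",
    "Name C", "Real Co", "Name B", "Name A", "Real Co",
]
verified = {"Real Co"}

flagged = flag_recurring_unverified(outputs, verified)
print(flagged)
```

Here only "Name A" is flagged: it appears in four of ten answers and is not on the verified list. A filter of this kind catches only names that recur, which is precisely the failure mode that predictability makes testable.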