[Image: a dashboard displaying analytics charts that are dissolving into static]

Back in August I wrote about simulation bias, the idea that organizations drift from ground truth when they start trusting representations of reality more than reality itself. I framed it as a future problem, something PMOs should start thinking about as AI-mediated insights become the norm. I figured we had some runway.

But we don’t. At least not if we take the post below at face value.

The Post

A post showed up on r/analytics recently: “We just found out our AI has been making up analytics data for 3 months and I’m gonna throw up.”

tl;dr - A company had been using an AI agent since November to answer leadership questions about metrics, and by all accounts it was working great. Then three months in, someone asked the poster to double-check a number. They pulled the thread and found out the AI had been hallucinating the entire time, just inventing plausible-sounding percentages out of thin air.

By that point their VP of Sales had already restructured territories based on numbers that didn’t exist, and their CFO had presented a deck to the board built on fake insights. Months of strategy anchored to nothing. They only caught it because someone happened to ask for a sanity check.

The Speed Problem

When I wrote the original post, I used examples like watermelon programs and the Challenger disaster to show how representations of reality can quietly replace reality itself. Those situations took YEARS to develop. Humans filtering bad news up the chain, reporting cadences going stale, a Monday snapshot that’s outdated by Friday. You could usually catch it if you were paying attention.

What happened to this company took three months.

The hallucinated numbers weren’t vague or obviously wrong. An actual human analyst would’ve said “I’m not confident in this number” at some point (the OP doesn’t say whether they’re the analyst themselves). The AI never did. And because the output felt more precise than what people were used to getting from a person, nobody questioned it.

Blast Radius

A VP restructures sales territories, and that presumably touches quotas, hiring plans, comp, customer relationships. A CFO presents this to the board, and that shapes budget approvals, investor confidence, strategic direction. Every one of those decisions cascades into downstream actions taken by people who never saw the original data and have zero reason to doubt it. Yikes.

Unwinding this is a mess. Retracing months of decisions and figuring out which ones were built on real data and which ones were built on air. Some of those are probably not reversible at this point. Not knocking sales, but what if this had happened in a more safety-critical industry?

Bad data is a problem you can point to and fix. What happened here is that an organization stopped verifying and started trusting the thing intermediating their reality, their AI agent, and once that happened, the feedback loops that would normally surface errors just went quiet.

A human was in the loop, but more like the loop of a rollercoaster than anything resembling actual control.

What Should Have Been in Place

I keep coming back to the same things I wrote in August:

  • Someone needed to be spot-checking AI-generated metrics against actual source data. Not just once during setup, but continuously. Especially the person relaying the AI’s answers to everyone else (a rough sketch of what that could look like follows this list).
  • Leadership needed to understand that even when a human is handing them an answer, there may now be an AI behind it. The two are converging, but you still need a system that checks the answers against ground truth, because the “human in the loop” can fail (see point 1).
  • Three months of confident answers with zero caveats should have made someone uncomfortable, not impressed.
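To make that first point concrete, here’s a minimal sketch of what a recurring spot-check could look like, in Python. Everything here is hypothetical and not from the original post: the metric names, the tolerance, and the idea that you can pull both the numbers the AI handed to leadership and the same metrics recomputed directly from the system of record. The only point is that the comparison is automated and runs on a schedule, not once at setup.

```python
# spot_check.py - hypothetical sketch: compare AI-reported metrics to source data.
# Assumes you can obtain (a) the numbers the AI agent reported and (b) the same
# metrics recomputed directly from the warehouse / system of record.

from dataclasses import dataclass

TOLERANCE = 0.02  # flag anything more than 2% off; tune per metric


@dataclass
class MetricCheck:
    name: str
    ai_value: float       # what the AI agent reported
    source_value: float   # what the source data actually says

    @property
    def relative_error(self) -> float:
        if self.source_value == 0:
            return float("inf") if self.ai_value != 0 else 0.0
        return abs(self.ai_value - self.source_value) / abs(self.source_value)

    @property
    def suspicious(self) -> bool:
        return self.relative_error > TOLERANCE


def spot_check(ai_metrics: dict[str, float], source_metrics: dict[str, float]) -> list[MetricCheck]:
    """Compare every AI-reported metric to the independently computed value."""
    failures = []
    for name, ai_value in ai_metrics.items():
        if name not in source_metrics:
            # The AI reported a metric nobody can recompute -- that is itself a red flag.
            failures.append(MetricCheck(name, ai_value, float("nan")))
            continue
        check = MetricCheck(name, ai_value, source_metrics[name])
        if check.suspicious:
            failures.append(check)
    return failures


if __name__ == "__main__":
    # Stand-in data; in practice these would come from the AI agent's output log
    # and a direct query against the warehouse.
    ai_reported = {"q3_win_rate": 0.41, "pipeline_coverage": 3.2}
    recomputed = {"q3_win_rate": 0.27, "pipeline_coverage": 3.18}

    for failure in spot_check(ai_reported, recomputed):
        print(f"CHECK FAILED: {failure.name}: AI said {failure.ai_value}, source says {failure.source_value}")
```

Run something like this on a cadence and route failures somewhere a human actually looks, so catching a hallucinated number doesn’t depend on someone happening to ask for a sanity check.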

None of this is new. It’s the same discipline project, program, and portfolio managers have always needed. The difference is that AI output is polished and decisive enough that skipping the verification step feels reasonable to the people responsible for mediating the data. It isn’t.

This Will Keep Happening

The part that actually bothers me is that they only caught it by accident. If nobody had asked for that sanity check, how long does the simulation run? Another quarter? A full fiscal year? …13.6 billion years?

I’d bet there are orgs right now operating on hallucinated metrics and they just don’t know it yet.

So what does this mean? If you sit between where data lives and where decisions get made, this is the job now. Auditing the connection between what your tools say is happening and what is actually happening. When that connection breaks and nobody notices, you get three months of a company running on a simulation, and the longer it runs, the harder the cleanup.