
BTS: Generative AI's truth problem

Hallucinations are up. What does that mean for you?


As a journalist, it’s only natural I urge folks to consider the source (and I don’t necessarily mean the musical group, though they are pretty rad).

Many of us increasingly rely on generative AI for our information (that AI summary at the top of a Google search results page, ChatGPT, even the self-sufficient chatbot on a company website). But the facts show an alarming trend: Today’s more complex gen AI systems hallucinate (aka make shit up) at higher rates than their simpler predecessors.

Before I get into the juice, here’s a quick ad for Morning Brew. I make $ when you click it. If you want to see Acronym thrive, you can also share it with your friends or tip me. I’m always open to suggestions! 🤗

The easiest way to stay business-savvy.

There’s a reason over 4 million professionals start their day with Morning Brew. It’s business news made simple—fast, engaging, and actually enjoyable to read.

From business and tech to finance and global affairs, Morning Brew covers the headlines shaping your work and your world. No jargon. No fluff. Just the need-to-know information, delivered with personality.

It takes less than 5 minutes to read, it’s completely free, and it might just become your favorite part of the morning. Sign up now and see why millions of professionals are hooked.

The New York Times reported that some models hallucinate at a rate as high as 79%. 🤯 OpenAI, the company behind ChatGPT, saw its o4-mini model hallucinate 44% of the time, compared with 33% for its earlier o3 model. That’s despite OpenAI’s regular evaluations to try to curb the problem.

Jason Hardy, CTO of AI at global data company Hitachi Vantara, calls this "the AI paradox": as systems grow more sophisticated, their reliability decreases.


My first thought when contextualizing the AI paradox was: I don’t get the luxury of being so damn wrong in my work, so why does AI?

But once I got past that feeling, I became more worried about the lack of truth-telling at a time when trust in legacy media has plummeted (while gen AI simultaneously fails to maintain the same fact-checking rigor).

Digging for the truth

Last year, I covered an event where OpenAI CEO and co-founder Sam Altman spoke about how the company doesn’t fully understand how its models work. “If you don’t understand what’s happening, isn’t that an argument to not keep releasing new, more powerful models?” asked Nicholas Thompson, CEO of The Atlantic. Altman responded that even without that full understanding, “these systems [are] generally considered safe and robust.”

But in the case of worsening hallucinations, that safety claim might have to be reevaluated. Or, on the user front, we might have to become more hallucination-aware, understanding the urgent need to track sources and verify information.

Whether you’re using AI in your free time or at work, it’s imperative to understand that this technology is still in its awkward adolescence. Don’t take its output at face value, and seek out the original source of information whenever possible.

Research from Carnegie Mellon School of Computer Science suggests AI models can be steered towards truthfulness with the right language, but “even when steered to be truthful, there remains a risk of models lying.” 

Conversely, our language can also steer a model toward falsehood, showing just how malleable this technology really is. Carefully crafting your prompts as you communicate with these models can be a game changer, one way or the other.
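If you want to try this kind of steering yourself, here’s a minimal sketch using the OpenAI Python SDK. To be clear, this is not the Carnegie Mellon researchers’ method: the system prompt wording, model choice, and temperature setting below are my own illustrative assumptions, and none of them make a model immune to lying.

```python
# Minimal sketch: nudging a model toward truthfulness through the system prompt.
# Assumes the OpenAI Python SDK (pip install openai) and an OPENAI_API_KEY in the
# environment; the prompt wording and model name are illustrative, not prescriptive.
from openai import OpenAI

client = OpenAI()

TRUTHFUL_SYSTEM_PROMPT = (
    "Answer only with information you are confident is true. "
    "If you are unsure or the answer is not known, say 'I don't know' "
    "instead of guessing, and name your sources when you can."
)

def ask(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system", "content": TRUTHFUL_SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
        temperature=0,  # less randomness tends to reduce, not eliminate, confabulation
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(ask("Who won the 1994 Nobel Prize in Physics, and for what?"))
```

Swap that system prompt for the opposite instruction ("answer confidently even if you have to guess") and you’ve steered the model the other way, which is exactly the malleability the researchers flag.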

Can we fix the hallucination problem? 

Truth-telling is a major part of reconciling colonized Australia with its Aboriginal cultures. Technology as a source of truth, whether real or imagined, must also face a reckoning, particularly in an era where information spreads like wildfire and can impact real humans in the process (just look at how incidents posted on social media led to Trump federalizing the National Guard to combat anti-ICE protests).

Some experts say hallucination is ingrained in gen AI, but Hardy sees a more specific culprit: poor data quality.

Enterprises invest in these expensive AI systems expecting their own data to ground the AI’s responses in factual, relevant information. When that data is poor quality, the systems instead amplify fabrications, effectively gaslit by their own data.

So how can companies implementing AI be a part of the solution? Hardy recommends the following:

  • Prioritize information accuracy and consistency as a prerequisite (not an afterthought) of any AI implementation.

  • Conduct sweeping data audits to identify and categorize data (a rough sketch of what such an audit might check follows this list).

  • Allow some level of imperfect data as long as the information related to your use case is fine-tuned (this keeps you from becoming overwhelmed and failing to make any real progress).
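To make that audit advice a bit more concrete, here’s a rough sketch of the kind of check a team might run before a table ever feeds an AI pipeline. This isn’t Hitachi Vantara’s methodology; it leans on pandas, and the file name, column names, and staleness threshold are hypothetical.

```python
# Rough sketch of a data audit pass over a table destined for an AI pipeline.
# Assumes pandas; the file name, column names, and two-year staleness threshold
# are hypothetical placeholders, not a recommended standard.
import pandas as pd

def audit(df: pd.DataFrame) -> dict:
    """Summarize basic quality signals: missing values, duplicates, stale rows."""
    report = {
        "rows": len(df),
        "duplicate_rows": int(df.duplicated().sum()),
        # Share of missing values per column, worst offenders first.
        "missing_by_column": df.isna().mean().sort_values(ascending=False).to_dict(),
    }
    if "last_updated" in df.columns:
        # Flag rows untouched for more than two years as potentially stale.
        age = pd.Timestamp.now() - pd.to_datetime(df["last_updated"], errors="coerce")
        report["stale_rows"] = int((age > pd.Timedelta(days=730)).sum())
    return report

if __name__ == "__main__":
    records = pd.read_csv("customer_records.csv")  # hypothetical source file
    for metric, value in audit(records).items():
        print(metric, value)
```

The specific checks matter less than the habit: know what’s missing, duplicated, or stale in your data before an AI system starts confidently repeating it.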

Meanwhile, new technology is popping up to work on the problem at a systemic level, too. Startup Hirundo calls its product a “machine unlearning” technology, empowering businesses to make AI models “forget” problematic data and behavior; the company claims up to 55% fewer hallucinations and a 70% reduction in AI bias for more trustworthy AI.

A recent Hitachi Vantara report says just 36% of IT leaders regularly trust AI outputs, yet only 38% of organizations are actively improving the quality of their training data. At the organizational level, that gap between trust and action needs to close, and at the individual level, we must remember to (you guessed it) consider the source.

Thanks,
