Tech Wavo

Are bad incentives to blame for AI hallucinations?

by Tech Wavo
September 7, 2025
in Computers


A new research paper from OpenAI asks why large language models like GPT-5 and chatbots like ChatGPT still hallucinate, and whether anything can be done to reduce those hallucinations.

In a blog post summarizing the paper, OpenAI defines hallucinations as “plausible but false statements generated by language models,” and it acknowledges that despite improvements, hallucinations “remain a fundamental challenge for all large language models” — one that will never be completely eliminated.

To illustrate the point, researchers say that when they asked “a widely used chatbot” about the title of Adam Tauman Kalai’s Ph.D. dissertation, they got three different answers, all of them wrong. (Kalai is one of the paper’s authors.) They then asked about his birthday and received three different dates. Once again, all of them were wrong.

How can a chatbot be so wrong — and sound so confident in its wrongness? The researchers suggest that hallucinations arise, in part, because of a pretraining process that focuses on getting models to correctly predict the next word, without true or false labels attached to the training statements: “The model sees only positive examples of fluent language and must approximate the overall distribution.”

“Spelling and parentheses follow consistent patterns, so errors there disappear with scale,” they write. “But arbitrary low-frequency facts, like a pet’s birthday, cannot be predicted from patterns alone and hence lead to hallucinations.”

The paper’s proposed solution, however, focuses less on the initial pretraining process and more on how large language models are evaluated. It argues that the current evaluation models don’t cause hallucinations themselves, but they “set the wrong incentives.”

The researchers compare these evaluations to the kind of multiple-choice tests where random guessing makes sense, because “you might get lucky and be right,” while leaving the answer blank “guarantees a zero.”

“In the same way, when models are graded only on accuracy, the percentage of questions they get exactly right, they are encouraged to guess rather than say ‘I don’t know,’” they say.

The proposed solution, then, is similar to tests (like the SAT) that include “negative [scoring] for wrong answers or partial credit for leaving questions blank to discourage blind guessing.” Similarly, OpenAI says model evaluations need to “penalize confident errors more than you penalize uncertainty, and give partial credit for appropriate expressions of uncertainty.”
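The incentive argument comes down to simple expected-value arithmetic, which the following Python sketch tries to make concrete. It is an illustration only, not code from the paper: the function names, the penalty of one point per wrong answer, and the zero credit for abstaining are all assumptions chosen for the example.

```python
def expected_score(p_correct, reward=1.0, penalty=0.0):
    """Expected score for answering one question.

    p_correct: the model's chance of being right if it answers.
    reward:    points for a correct answer (assumed value).
    penalty:   points deducted for a wrong answer (assumed value;
               0.0 corresponds to accuracy-only grading).
    """
    return p_correct * reward - (1.0 - p_correct) * penalty


def best_action(p_correct, penalty=0.0, abstain_credit=0.0):
    """Return the score-maximizing choice for one question.

    abstain_credit: points for saying "I don't know" (assumed value).
    """
    guess = expected_score(p_correct, penalty=penalty)
    return "guess" if guess > abstain_credit else "abstain"


if __name__ == "__main__":
    # A blind guess on a four-option question is right 25% of the time.
    blind = 0.25

    # Accuracy-only grading: guessing earns 0.25 on average, abstaining 0.
    print(best_action(blind))                # guess

    # Hypothetical rule: lose one point per wrong answer.
    print(best_action(blind, penalty=1.0))   # abstain

    # A well-founded answer is still worth giving under the penalty.
    print(best_action(0.9, penalty=1.0))     # guess
```

Under these assumed values, answering only pays off when the chance of being right exceeds 50%, so a model that maximizes its score has a reason to say “I don’t know” on questions where it would otherwise be guessing.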

And the researchers argue that it’s not enough to introduce “a few new uncertainty-aware tests on the side.” Instead, “the widely used, accuracy-based evals need to be updated so that their scoring discourages guessing.”

“If the main scoreboards keep rewarding lucky guesses, models will keep learning to guess,” the researchers say.
