Google and Microsoft are working to improve their chatbots’ responses by training them with powerful large language models (LLMs). ChatGPT maker OpenAI has also announced that it has trained a model to detect hallucinations.
What is AI hallucination?
AI hallucinations occur when AI-powered models like ChatGPT or Google Bard fabricate information and present it as fact. Recently, ChatGPT cited ‘bogus’ cases that ended up in a New York federal court filing. And during Bard’s launch demonstration, the chatbot gave incorrect information about the James Webb Space Telescope.
“Even state-of-the-art models are prone to producing falsehoods – they exhibit a tendency to invent facts in moments of uncertainty. These hallucinations are particularly problematic in domains that require multi-step reasoning, since a single logical error is enough to derail a much larger solution,” OpenAI researchers said.
The Microsoft-backed company said that mitigating hallucinations is a critical step towards building aligned artificial general intelligence (AGI), a machine that can understand or learn intellectual tasks the way human beings do.
Rewarding AI models for each step of reasoning
“We’ve trained a model to achieve a new state-of-the-art in mathematical problem solving by rewarding each correct step of reasoning (“process supervision”) instead of simply rewarding the correct final answer (“outcome supervision”),” the company said in research published this week.
In simpler words, OpenAI wants to reward AI models for each individual correct step of reasoning, not just for the correct final answer. OpenAI said that this approach boosts performance and directly trains the model to “produce a chain-of-thought that is endorsed by humans.” This means the supervision encourages the model to follow a human-approved reasoning process.
“We can train reward models to detect hallucinations using either outcome supervision — which provides feedback based on a final result — or process supervision — which provides feedback for each individual step in a chain-of-thought,” the OpenAI researchers noted.
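To illustrate the difference, here is a minimal, hypothetical sketch in Python of how the two kinds of feedback could be scored. The step texts, labels and scoring scheme are illustrative assumptions for this article, not OpenAI’s actual training code.

```python
from typing import List


def outcome_reward(final_answer_correct: bool) -> float:
    """Outcome supervision: one reward based only on the final result."""
    return 1.0 if final_answer_correct else 0.0


def process_rewards(step_labels: List[bool]) -> List[float]:
    """Process supervision: feedback for each individual reasoning step."""
    # Each correct step earns credit, so a single logical error is
    # flagged exactly where it occurs instead of sinking the whole solution.
    return [1.0 if correct else 0.0 for correct in step_labels]


# Hypothetical chain-of-thought for a simple arithmetic word problem.
steps = [
    "The train travels 60 km/h for 2 hours, so it covers 120 km.",  # correct
    "Adding the remaining 30 km gives 160 km in total.",            # wrong: should be 150 km
]
step_labels = [True, False]

print(outcome_reward(final_answer_correct=False))  # 0.0  (a single score for the whole answer)
print(process_rewards(step_labels))                # [1.0, 0.0]  (per-step feedback)
```

Under outcome supervision the whole solution receives one score, while under process supervision the first step still earns credit and only the faulty step is penalised, which is the behaviour the research credits with reducing hallucinations in multi-step reasoning.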
OpenAI has released an accompanying dataset of 800,000 human labels that it used to train the model described in the research paper, Karl Cobbe, a mathgen researcher at OpenAI, told CNBC. The research team also said that the process-supervised reward model performs better across the board.