Spend enough time with ChatGPT and other artificial intelligence chatbots, and it doesn’t take long for them to start telling you falsehoods.
Described as hallucination, confabulation, or just plain making things up, it is now a problem for every business, organization, and high school student trying to get a generative AI system to compose documents and get work done. Some are relying on it for high-stakes tasks, from counseling to researching and writing legal pleadings.
“I don’t think there’s any model today that doesn’t suffer from some hallucination,” said Daniela Amodei, co-founder and president of Anthropic, which created the chatbot Claude 2.
“They’re really just sort of designed to predict the next word,” said Amodei. “As a result, there will be a rate at which the model does that incorrectly.”
Anthropic, OpenAI, and other prominent creators of AI systems known as large language models aim to improve their accuracy.
It remains to be seen how long that will take and whether the models will ever be good enough to, say, safely dispense medical advice.
“This isn’t fixable,” said Emily Bender, a linguistics professor and director of the Computational Linguistics Laboratory at the University of Washington. “It’s inherent in the mismatch between technology and proposed use cases.”
Much is riding on the reliability of generative AI technology. The McKinsey Global Institute projects it will add $2.6 trillion to $4.4 trillion to the global economy. Chatbots are only one part of that frenzy, which also includes technology capable of generating new images, video, music, and computer code. Nearly all of these tools include some language component.
Google is already pitching a news-writing AI product to news organizations, for which accuracy is paramount. The Associated Press is also exploring the technology as part of a partnership with OpenAI, which is paying to use part of the AP’s text archive to improve its AI systems.
Ganesh Bagler, a computer scientist, has worked for years with India’s hotel management colleges to get AI systems, including a ChatGPT predecessor, to generate recipes for South Asian cuisines, such as novel variants of rice-based biryani. A single “hallucinated” ingredient could be the difference between a tasty meal and an inedible one.
When OpenAI CEO Sam Altman visited India in June, Bagler, a professor at the Indraprastha Institute of Information Technology Delhi, had some pointed questions.
“I guess hallucinations in ChatGPT are still acceptable, but when a recipe comes out hallucinating, it becomes a serious problem,” Bagler said, standing in a packed campus auditorium to address Altman on the New Delhi stop of the American tech executive’s world tour.
“What’s your take on it?” Bagler eventually inquired.
Altman expressed optimism, though he stopped short of a firm commitment.
“I think we’ll get the hallucination problem a lot better,” Altman said. “It will take us a year and a half, if not two years. Something like that. But at that point we won’t still be discussing these. There is a trade-off between originality and precision; the model must learn when you want one or the other.”
However, some specialists who have studied the technology, such as University of Washington linguist Bender, counter that those improvements will not be enough.
According to Bender, a language model is a system for “modeling the likelihood of different strings of word forms” given some written data on which it has been trained.
It’s how spell checkers know when you’ve typed the wrong word. It also powers machine translation and transcription services by “smoothing the output to look more like typical text in the target language,” according to Bender. Many people rely on a version of this technique when they use the “autocomplete” feature in text messages or email.
The latest generation of chatbots, such as ChatGPT, Claude 2, and Google’s Bard, attempt to take it a step further by producing entire new passages of text, but according to Bender, they’re still just selecting the most likely next word in a string.
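To make that point concrete, here is a minimal, hypothetical sketch (not from the article or from any of the companies mentioned) of next-word prediction: a toy “bigram” model that completes a prompt with whatever word most often followed the previous one in its training text, regardless of whether the result is true.

```python
# Illustrative toy example only (not from the article): a tiny "bigram" language
# model that completes text by picking the word that most often follows the
# previous one in its training data. Real chatbots are vastly larger neural
# networks, but the objective is similar: predict a plausible next word,
# not a verified fact.
from collections import Counter, defaultdict

corpus = (
    "the capital of france is paris . "
    "the capital of france is paris . "
    "the capital of australia is sydney ."
).split()

# Count how often each word follows each preceding word.
next_word_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_word_counts[prev][nxt] += 1

def most_likely_next(word: str) -> str:
    """Return the statistically most common continuation, true or not."""
    return next_word_counts[word].most_common(1)[0][0]

def complete(prompt: str, n_words: int = 3) -> str:
    """Extend a prompt one most-likely word at a time."""
    words = prompt.split()
    for _ in range(n_words):
        words.append(most_likely_next(words[-1]))
    return " ".join(words)

# Fluent but false: "paris" follows "is" most often in the training text,
# so the model asserts it regardless of which country the prompt names.
print(complete("the capital of australia"))
# -> the capital of australia is paris .
```

Scaling this idea up with neural networks makes the completions far more fluent, but as Bender argues, the underlying goal is still plausibility, not truth.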
Language models, when used to generate text, are “designed to make things up. They only do that,” Bender explained. They are skilled at mimicking forms of writing such as legal contracts, television scripts, and sonnets.
“However, because they only ever make things up, the fact that the text they have extruded happens to be interpretable as something we deem correct is by chance,” Bender explained. “Even if they can be tuned to be right more of the time, they will still have failure modes — and the failures will most likely be in cases where a person reading the text is less likely to notice, because they are more obscure.”
Those inaccuracies are not much of a problem for the marketing firms that turn to Jasper AI for help writing pitches, said Shane Orlick, the company’s president.
“Hallucinations are actually an added bonus,” stated Orlick. “We have customers who tell us all the time about how it came up with ideas — how Jasper created takes on stories or angles that they would never have thought of themselves.”
To provide its customers with an array of AI language models tailored to their needs, the Texas-based company works with partners including OpenAI, Anthropic, Google, and Facebook parent Meta. It might offer Anthropic’s model to a customer focused on accuracy, while a customer worried about the confidentiality of proprietary source data might get a different model, according to Orlick.
Orlick acknowledged that hallucinations will not be easily fixed. He expects corporations like Google, which he argues must maintain a “really high standard of factual content” for its search engine, to invest significant time and resources in finding answers.
“I think they have to fix this problem,” Orlick remarked. “They must address this. So it’ll never be perfect, but it’ll improve over time.”
Techno-optimists, like Microsoft co-founder Bill Gates, have predicted a bright future.
“I’m optimistic that, over time, AI models can be taught to distinguish fact from fiction,” Gates wrote in a July blog post about the societal implications of AI.
He cited a 2022 paper from OpenAI as an example of “promising work on this front.” More recently, researchers at the Swiss Federal Institute of Technology in Zurich said they developed a method to detect some, but not all, of ChatGPT’s hallucinated content and remove it automatically.
Even Altman, as he markets the products for a variety of uses, does not count on the models to be accurate when he is looking for information himself.
“I probably trust the answers that come out of ChatGPT the least of anyone on Earth,” Altman joked to the audience at Bagler’s university.
SOURCE – (AP)