Microsoft Bing AI made several errors in launch demo last week

Microsoft CEO Satya Nadella

Jordan Novet | CNBC

Amid the chatbot hype last week, as Microsoft and Google tried to outdo each other in presenting early versions of AI-powered search, more than 1 million people signed up to try Microsoft’s tool in the first 48 hours, the company said.

Microsoft CEO Satya Nadella told CNBC that the technology, which can spit out complete answers that read as if they were written by humans, is “perhaps the industrial revolution brought to the work of knowledge.”

But for those concerned about accuracy, the AI leaves a lot to be desired.

In Microsoft’s demo in front of journalists, ChatGPT-like technology embedded in the company’s Bing search engine analyzed earnings reports from Gap and Lululemon. Compared with the actual reports, the chatbot’s answers got some numbers wrong. Others appear to have been made up.

“Bing AI got a few wrong answers during the demo. But no one noticed,” wrote independent search researcher Dmitri Brereton in a Substack post on Monday. “However, everyone is jumping on the Bing hype train.”

Brereton identified factual problems in Microsoft’s demo in answers about vacuum cleaner specifications and travel plans for Mexico, in addition to the financial errors. He told CNBC that he was not looking for mistakes and found them only when he looked more closely in order to write a comparison of the AI launches from Microsoft and Google.

AI experts call the phenomenon “hallucinations,” or the tendency of tools based on large language models to make things up. Last week, Google introduced a competing AI tool whose demo also included factual errors, though those errors were quickly called out by viewers.

Both companies are rushing to integrate new generative AI into their search engines and are eager to show their progress after the explosive popularity of ChatGPT, which OpenAI introduced to the public in November. OpenAI has raised billions from Microsoft, while competing startups like Stability AI and Hugging Face have reached billion-dollar valuations in private funding rounds.

While Google had been reluctant to add AI-generated responses to its search engine, citing reputational risks and safety concerns, Microsoft, in an announcement last week, emphasized the near-term potential of releasing the technology to some of the public.

“I think the important thing is not in the lab,” Nadella said. “You must take these things safely.”

When it came to demoing Bing AI’s response to questions about company earnings, there were a few issues.

Yusuf Mehdi, a marketing executive at Microsoft, navigated to Gap’s investor relations site, and asked Bing AI to summarize the “key takeaways” from the retailer’s third-quarter earnings release in November.

“Very cool. A massive time savings,” Mehdi said.

Here are screenshots from Microsoft’s demo:

Here are some errors in the summary:

  • Gap’s reported gross margin was 37.4%. But after excluding Yeezy-related expenses, adjusted gross margin was 38.7%.
  • Gap’s operating margin was 4.6%, not 5.9%, a number not found in the company’s report.
  • Adjusted diluted earnings per share were $0.71, not $0.42, a number that was not in the report. Gap’s reported figure includes an adjusted income tax benefit of about $0.33.
  • Gap withdrew its full-year outlook in August and said in its third-quarter report that “net sales may decline in the mid-single digits year-over-year in the fourth quarter.” That would result in a decline in revenue for the full year, not “growth in the double digits.” The report gives no forecasts for operating margin or EPS.

Microsoft said it is aware of the errors and expects Bing AI to make mistakes.

“We are aware of this report and have analyzed its findings in an effort to improve this experience,” a Microsoft spokesperson told CNBC. “We know there’s still work to be done and expect the system to make mistakes during this preview period, so feedback is critical so we can learn and help the model get better.”

Microsoft then asked Bing AI to compare Gap’s earnings report with Lululemon’s. Mehdi wanted Bing to pull the information from the two reports into a table.

“Look at this, it’s amazing,” he said. “Just like that, on one table, I can get the answer to this question. Think how much time it will take.”

Here’s what the Bing AI tool gave back:

There are some errors in the table, starting with the margins.

  • Lululemon’s gross margin is 55.9%, not 58.7%.
  • The company’s operating margin is 19%, not 20.7%.
  • Lululemon reported diluted EPS of $2, and adjusted EPS of $1.62. Bing cited a diluted EPS number of $1.65.
  • Gap has $679 million in cash and cash equivalents, not $1.4 billion.
  • Gap has $3.04 billion in inventory, not $1.9 billion.
