

HUMAN Blog
AI & ML
HUMAN Protocol
May 12, 2023

AI gone wrong #3: Google's Bard AI error leads to $100BN share drop

2 min read

Today, there is much coverage of Google releasing an update to their AI chatbot, Bard.

But when Google first released the chatbot two months ago, about $100BN was wiped off the market value of its parent company, Alphabet. It would be easy to think the chatbot made a series of astonishing errors, but it didn’t.

Bard made a simple error that, no doubt, does not do justice to the power of the new tool.

But stocks are about confidence – and while Alphabet dipped, Microsoft – who invested heavily in OpenAI – saw shares rise 3% that same day.

Before continuing, you may want to read the other posts in the series so far: on Amazon's Alexa ordering dollhouses by mistake, or on Zillow losing $300M due to a faulty algorithm.

About Bard

Bard is a generative AI chatbot, similar to ChatGPT. One key difference is that Bard can scan the internet in real time, whereas ChatGPT cannot.

Google CEO Sundar Pichai highlights this functionality in a post on Bard:

“Bard seeks to combine the breadth of the world’s knowledge with the power, intelligence and creativity of our large language models. It draws on information from the web to provide fresh, high-quality responses.”

The ‘language models’ he describes would have been LaMDA (Language Model for Dialogue Applications). Since he wrote this piece, Google have announced a new language model, PaLM 2, which will power new products, including Bard. This release came only two days ago (10th May 2023).

The difficulty chatbots have in comprehending language is attested to by a series of AI errors. Our last post in the series explored how a failure to recognise context prompted Amazon’s Alexa to mistakenly order dollhouses.

Language, clearly, is not an easy thing to get right. 

And yet it was the simplest thing that Bard got wrong.

What went wrong?

Bard made an error in its demo to the public. The demo is still live on Twitter.

The question put to Bard:

"What new discoveries from the James Webb Space Telescope (JWST) can I tell my 9-year old about?" 

Much as with ChatGPT, Bard responds with a number of answers. One answer, however, claims the telescope captured the very first pictures of a planet outside our solar system.

This is incorrect.

As NASA confirms, the European Southern Observatory's Very Large Telescope (VLT) took the first such images in 2004.

The $100BN share dip

That day, Alphabet’s market value dropped by roughly $100BN.

To be clear, Alphabet didn’t really lose $100BN in the way that Zillow lost $300M on faulty AI. $100BN is simply the amount by which Alphabet’s market value fell; it is not a realized loss in the way the Zillow example was.

And, unlike Zillow, the share price falling is not a direct consequence of an AI error, but rather an indication of the importance the public markets place on the role of AI in big tech companies.

$100BN, even for Alphabet, is a significant move that represents a 9% dip.

How could it be avoided?

The short answer: improve the model and the way it is trained.

The addition of new models may well help to solve such issues going forward; PaLM 2, for example, 'demonstrates improved capabilities in logic, common sense reasoning, and mathematics.'

The irony is that Google’s ordinary search functionality would not have made such a simple error. If Bard had just searched Wikipedia, it would not have made the mistake.

It seems likely that the error occurred because of the way language modeling works.

“You can think of an LLM as a prediction engine,” Google write about Bard.

OpenAI say something similar of GPT-4: “the GPT-4 base model was trained to predict the next word in a document.”

The problem is that these models are predictive, not deterministic.
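
To make that concrete, here is a toy sketch of a ‘prediction engine’ in Python. The candidate continuations and their probabilities are invented purely for illustration; the point is only that the model picks whichever continuation it scores as most likely to follow the prompt, rather than checking the claim against a source of facts.

```python
import random

# Toy illustration only: these candidate continuations and probabilities are
# invented, not real model outputs. A language model scores continuations by
# how likely they sound and samples accordingly; it does not verify the claim.
candidates = {
    "took the very first picture of an exoplanet.": 0.55,   # fluent, but false
    "studied the atmospheres of distant exoplanets.": 0.30,
    "captured striking images of early galaxies.": 0.15,
}

def predict_continuation(scored: dict[str, float]) -> str:
    """Sample a continuation in proportion to its predicted probability."""
    phrases = list(scored)
    weights = list(scored.values())
    return random.choices(phrases, weights=weights, k=1)[0]

print("The James Webb Space Telescope", predict_continuation(candidates))
```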

The solution, then, is not necessarily in having better data; the data, in this case, is there, and quite accessible. The problem lies in how the machine is being trained.

The problem of confidence

Part of the problem with Bard is the confidence with which it states errors. There’s no qualification.

It would be useful if the chatbot said, ‘I think this is right, but I'm not sure.’

While it seems an unlikely next step, that kind of qualification might be necessary to prevent the spread of misinformation.
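
As a rough sketch of what that kind of qualification could look like (the confidence score and threshold here are hypothetical; today’s chatbots do not expose a single per-answer confidence in this way):

```python
def qualified_answer(answer: str, confidence: float, threshold: float = 0.8) -> str:
    """Prefix a hedge when the model's own confidence falls below a threshold."""
    if confidence < threshold:
        return f"I think this is right, but I'm not sure: {answer}"
    return answer

# Hypothetical example: a shaky claim gets a hedge instead of a flat assertion.
print(qualified_answer("JWST took the very first picture of an exoplanet.", confidence=0.55))
```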

OpenAI have made an interesting, and honest, observation about GPT-4:

“GPT-4 can also be confidently wrong in its predictions, not taking care to double-check work when it’s likely to make a mistake.”

They go on to say that, “Its predicted confidence in an answer generally matches the probability of being correct,” which is a promising reflection of the chatbot’s ability to qualify its answers. OpenAI admit, however, that “through our current post-training process, the calibration is reduced," and that “Our current process hurts the calibration quite a bit.”
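
Calibration is easy to picture with a small sketch: group answers by the confidence the model reported, then compare that with how often those answers were actually right. The (confidence, correct) pairs below are invented purely to show the idea; for a well-calibrated model, stated confidence tracks actual accuracy.

```python
from collections import defaultdict

# Invented (stated confidence, was the answer correct) pairs, for illustration only.
answers = [
    (0.9, True), (0.9, True), (0.9, False),
    (0.6, True), (0.6, False), (0.6, False),
    (0.3, False), (0.3, False), (0.3, True),
]

by_confidence = defaultdict(list)
for confidence, correct in answers:
    by_confidence[confidence].append(correct)

# A well-calibrated model's stated confidence should match its actual accuracy.
for level in sorted(by_confidence):
    outcomes = by_confidence[level]
    accuracy = sum(outcomes) / len(outcomes)
    print(f"stated confidence {level:.1f} -> actual accuracy {accuracy:.2f}")
```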

Until AI chatbots are consistent in their truth rather than in their falsity, perhaps qualification is a necessary interim measure.

For the latest updates on HUMAN Protocol, follow us on Twitter or join our Discord. Alternatively, to enquire about integrations, usage, or to learn more about HUMAN Protocol, get in contact.

Legal Disclaimer

The HUMAN Protocol Foundation makes no representation, warranty, or undertaking, express or implied, as to the accuracy, reliability, completeness, or reasonableness of the information contained here. Any assumptions, opinions, and estimations expressed constitute the HUMAN Protocol Foundation’s judgment as of the time of publishing and are subject to change without notice. Any projection contained within the information presented here is based on a number of assumptions, and there can be no guarantee that any projected outcomes will be achieved.

Guest post