

AI gone wrong #2: Alexa, please only listen to me

HUMAN Protocol
Apr 18, 2023


2 min read

What happens when your virtual assistant takes instructions from other people?

The potential for error becomes very high.

Although AI has been on smartphones for years, Alexa signaled a new dawn. It became part of the furniture, and it felt more personal because it listened and spoke.

But it doesn’t always get it right.

Alexa, get me a dollhouse

The story begins with a six-year-old girl who bypassed her parents by getting Alexa to buy her a dollhouse. She asked, “Can you play dollhouse with me and get me a dollhouse?”

Alexa did what it was told, like a responsible AI. Included in the order, of course, were four pounds of sugar cookies as well. 

The problem here is not so much the order. In this case, it was a minor mistake. 

The bigger problem is that the mistaken – or mismanaged – AI essentially gave the six-year-old access to her parents’ credit card details.

It is not difficult to imagine how giving a child access to a credit card could have been disastrous.

It doesn’t end there…

When a local news channel, San Diego’s CW6 News, reported the story, the news anchor remarked:

“I love the little girl, saying ‘Alexa order me a dollhouse.’”

He was quoting the little girl, but many viewers then reported that the Alexa in their home, hearing its name spoken by the anchor, tried to start buying dollhouses too. While it is unknown how many – if any at all – were successfully purchased, the reports point to a serious problem with the algorithm.

The data solution 

In the case of the child ordering a dollhouse, the real solution is an AI sophisticated enough to know whether a child or an adult is talking to it. Even better, it would respond only to certain voices in certain situations, such as letting only certain adults make purchases.
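To make the idea concrete, here is a minimal Python sketch of a purchase gate. It is an illustration only, not how Alexa works: the SpeakerEstimate type, the profile names, and the confidence threshold are all hypothetical, and the sketch assumes that some upstream speaker-recognition model has already produced an estimate of who is talking.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpeakerEstimate:
    """Hypothetical output of an upstream speaker-recognition model."""
    speaker_id: Optional[str]  # matched voice profile, or None if unknown
    is_adult: bool             # model's guess: adult or child voice
    confidence: float          # 0.0 to 1.0


# Assumed household settings: voice profiles allowed to make purchases.
AUTHORIZED_PURCHASERS = {"parent_1", "parent_2"}
# Illustrative threshold only, not a recommended value.
MIN_CONFIDENCE = 0.9


def may_purchase(estimate: SpeakerEstimate) -> bool:
    """Allow a purchase only for a confidently recognized, authorized adult."""
    if estimate.confidence < MIN_CONFIDENCE:
        return False  # when unsure, refuse rather than guess
    if not estimate.is_adult:
        return False  # a child's voice never triggers a purchase
    return estimate.speaker_id in AUTHORIZED_PURCHASERS


if __name__ == "__main__":
    child = SpeakerEstimate(speaker_id=None, is_adult=False, confidence=0.97)
    parent = SpeakerEstimate(speaker_id="parent_1", is_adult=True, confidence=0.95)
    print(may_purchase(child))   # the dollhouse request is refused
    print(may_purchase(parent))  # an enrolled parent can still order
```

The gate itself is the easy part. The hard part is the estimate it relies on, and that is where the data comes in.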

For this, better data – and lots of it – is required.

HUMAN Protocol is built to deliver quality data at volume. Scale is essential in machine learning: the more data you have, the more flexible the algorithm is, and the better it can account for edge cases and outliers.

In this example, more data annotations would be required on dialects, voices, tones, etc., so that the AI could distinguish between an adult and a child.

Context is key

That is not the problem in the case of the news anchor triggering purchases.

Here, Alexa could not distinguish between when it was being spoken about, and when it was being spoken to.

Such a distinction is critical.

What seems incredible is that the AI failed to pick up the context of the news reporter’s statement. It is another example of why AI requires HUMANs in the loop; such a distinction is very easy for humans to make, and clearly not so easy for some machines.

Imagine you are in a driverless car with voice activation. If the car picked up instructions from the voices on the radio, there would be trouble. Of course, we would hope that any voice-activated car would be sufficiently sophisticated to avoid these errors; but that cannot be taken for granted.

The requirement here is not more of the same type of data. What would be required, presumably, is more data on instances in which the AI’s trigger name is used in different contexts.

Of course, for a brand new product, that data is not available. It would only be made available by a specific, fresh request for data, the kind that HUMAN Data Services is designed to facilitate. 

A complementary option, or alternative, would be for scientists to curate data that identifies not a specific name, but tones, contexts, word order, position of the word in a sentence, etc.
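As a toy illustration of what such cues might look like, the Python sketch below pulls out two of them from a transcript: the position of the trigger name in the sentence, and whether it follows a reporting word such as "saying". Everything here is an assumption made for the example: the cue list is hand-picked, the function names are hypothetical, and a real system would learn these patterns from annotated audio rather than hard-code them.

```python
import re
from typing import Dict, List

WAKE_WORD = "alexa"
# Hypothetical hand-picked cues that the name is being quoted or discussed.
REPORTING_CUES = {"saying", "said", "says", "asked", "told", "called"}


def tokenize(utterance: str) -> List[str]:
    """Lowercase the utterance and keep only alphabetic words."""
    return re.findall(r"[a-z]+", utterance.lower())


def wake_word_features(utterance: str) -> Dict[str, object]:
    """Extract simple context features around the first wake-word mention."""
    tokens = tokenize(utterance)
    if WAKE_WORD not in tokens:
        return {"present": False, "position": None, "after_reporting_cue": False}
    index = tokens.index(WAKE_WORD)
    preceding = tokens[max(0, index - 3):index]  # up to three words before
    return {
        "present": True,
        "position": index,
        "after_reporting_cue": any(t in REPORTING_CUES for t in preceding),
    }


def is_addressed(utterance: str) -> bool:
    """Toy rule: the name opens the utterance and is not being quoted."""
    f = wake_word_features(utterance)
    return bool(f["present"] and f["position"] == 0 and not f["after_reporting_cue"])


if __name__ == "__main__":
    print(is_addressed("Alexa, order me a dollhouse"))
    print(is_addressed("I love the little girl, saying 'Alexa order me a dollhouse.'"))
```

Under this rule the child's request counts as addressed to the device, while the anchor's quote does not, because the name appears mid-sentence right after "saying". A rule this simple would break quickly in practice, which is exactly why large volumes of annotated, context-rich data are needed.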

Audio-tagging tools

HUMAN Protocol is working with Audino, an audio-tagging tool, which will provide AI services with annotated audio data – the building block of the AIs we are discussing here. Audino is an impact project, also supported by the National Health Mission of India.

The data annotated through Audino will be used to create a translation service for Indian dialects that are unrepresented by traditional translation technology. The bot will be used to provide the inhabitants of more rural areas in India with access to better healthcare.

That is only the beginning of its potential. It represents a new vertical of data annotation on the platform – in addition to images, through CAPTCHAs, and videos, through CVAT, which will be live soon. Audino can be extended to meet many data-tagging use cases in the audio vertical.

While the issues in this example were not disasters, you may want to find out about how AI led to a $300MM loss for a property company. As with that example, data is king when it comes to getting AI algorithms right. Not only does this example demonstrate the need for fresh sources of data, it demonstrates a real need for voluminous data that accounts for an ever-expanding array of contexts. HUMAN Protocol’s flexible approach to delivering new datasets can help to provide that much-needed appreciation of context.

To stay up to date with the latest from HUMAN, follow us on Twitter or join our Discord.

Legal Disclaimer

The HUMAN Protocol Foundation makes no representation, warranty, or undertaking, express or implied, as to the accuracy, reliability, completeness, or reasonableness of the information contained here. Any assumptions, opinions, and estimations expressed constitute the HUMAN Protocol Foundation’s judgment as of the time of publishing and are subject to change without notice. Any projection contained within the information presented here is based on a number of assumptions, and there can be no guarantee that any projected outcomes will be achieved.

Guest post