AI gone wrong #4: Is that a head, or a football?

There are certain kinds of filming that we might expect a computer to do better than a human. While for now, at least, we might leave the creation of feature films to talented directors, when it comes to more repetitive and predictable styles of filming, we might expect to see an increase in the use of AI.

But when such an AI was used to film a football match in Scotland, the results were pretty disappointing for the fans.

The mistaken identity

The camera used to film the game was using AI to track and follow the ball. This came during Covid-19, when stadiums were shut and fans couldn’t attend games. The club, Inverness Caledonian Thistle FC, announced that games would be live streamed using a camera with ‘in-built, AI, ball-tracking technology’.

It seems simple enough.

However, the camera repeatedly made a mistake. It kept tracking the bald head of a linesman; it was confusing the bald head with the football.

Fans had to pay to watch the game, and then spent a lot of it looking at the bald head of the linesman instead of the match.

What’s the reason?

It’s hard to determine any single reason without sufficient understanding of the specific training sets and algorithms used. However, we can make a fairly educated assessment of factors contributing to the error.

First of all, let’s introduce the basics of data annotation. This is the process by which data – in this case, an image or video – is annotated with a name, such as ‘football’, so that a machine can learn to apply that information to new contexts. When you complete a CAPTCHA that asks you to click the images with a motorcycle, you are annotating data.
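To make the idea concrete, here is a minimal sketch of what annotated video data might look like. The `Annotation` class, its field names, and the pixel values are hypothetical, chosen for illustration; they are not the format used by the camera in question or by any particular annotation tool.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Annotation:
    """One labelled object in one video frame (hypothetical format)."""
    frame: int                      # index of the video frame
    label: str                      # name given to the object
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels


# Made-up example values: two frames, three labelled objects.
annotations = [
    Annotation(frame=0, label="football", box=(412, 230, 18, 18)),
    Annotation(frame=0, label="bald_head", box=(95, 140, 22, 24)),
    Annotation(frame=1, label="football", box=(420, 228, 18, 18)),
]

# Count how many examples of each label the data set contains.
label_counts = Counter(a.label for a in annotations)
print(label_counts)
```

Note that the bald head is given its own label here. Labelling the objects a machine might mistake for a football gives it examples of what a football is not.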

The problem of the AI camera – one way or another – is likely related to not having enough of the right data.

That could mean not enough images of different kinds of football in different contexts (different distances, different times of day), and perhaps not enough examples of objects that are commonly confused with footballs. It could also mean that the annotations themselves were not accurate enough and, possibly, that there was only a limited amount of annotated video in the training data. All of this is likely combined with the lack of a sophisticated system to check the annotations, which are often completed by machines themselves.
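One way to spot this kind of shortfall is to count how many annotated examples exist for each combination of label and context. The sketch below assumes a hypothetical list of `(label, context)` pairs and an arbitrary minimum count; the names and numbers are illustrative only.

```python
from collections import Counter
from itertools import product

# Hypothetical annotated samples: (label, filming context).
samples = [
    ("football", "daylight"),
    ("football", "daylight"),
    ("football", "floodlight"),
    ("bald_head", "daylight"),
]


def coverage_gaps(samples, labels, contexts, minimum):
    """Return every (label, context) pair with fewer than `minimum` samples."""
    counts = Counter(samples)
    return [
        (label, context)
        for label, context in product(labels, contexts)
        if counts[(label, context)] < minimum
    ]


gaps = coverage_gaps(
    samples,
    labels=["football", "bald_head"],
    contexts=["daylight", "floodlight"],
    minimum=2,
)
print(gaps)
```

In this toy data set, only footballs in daylight reach the minimum; every other combination would be flagged as needing more annotated examples.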

More and better data

Better data is the solution. That often means more data to accommodate a wider variety of contexts – more kinds of football, more bald heads in sunlight that a machine might mistake. Broader and deeper training sets could solve this problem.

The next step is granular video annotation. HUMAN Protocol has finished its Beta testing of CVAT – a highly sophisticated data annotation tool for video and image processing. The video element relies on AI itself to allow humans to annotate objects within a video. That context is highly relevant to filming with AI.

Check, and check again

HUMAN Protocol emphasizes the importance of a human-in-the-loop system because AI systems are brittle.

What is human-in-the-loop? The system works by allowing human observation to act as a feedback loop to retrain the computer’s algorithm.
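As a rough illustration of that feedback loop, the sketch below sends any prediction the machine is unsure about to a human reviewer, and collects the corrected labels so they can be used for retraining. The function names, the confidence threshold, and the example data are all assumptions made for this sketch, not a description of how any real system works.

```python
def human_in_the_loop(predictions, ask_human, threshold=0.9):
    """Split predictions into machine-accepted and human-reviewed labels.

    predictions: list of (item_id, label, confidence) tuples.
    ask_human:   callable (item_id, guessed_label) -> reviewed label.
    """
    accepted, reviewed = [], []
    for item_id, label, confidence in predictions:
        if confidence >= threshold:
            # The machine is confident enough: keep its label.
            accepted.append((item_id, label))
        else:
            # The machine is unsure: a human checks and corrects the label.
            reviewed.append((item_id, ask_human(item_id, label)))
    return accepted, reviewed


# Hypothetical model output for two frames.
predictions = [
    ("frame_001", "football", 0.97),
    ("frame_002", "football", 0.55),
]

# Stand-in for a real reviewer: looks up the answer a person would give.
human_answers = {"frame_002": "bald_head"}


def ask_human(item_id, guessed_label):
    return human_answers.get(item_id, guessed_label)


accepted, reviewed = human_in_the_loop(predictions, ask_human)
print(accepted)
print(reviewed)
```

The reviewed labels would then be added to the training data. A threshold alone will not catch a mistake the machine is confident about, so a real system would likely also have humans spot-check a sample of the accepted labels.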

Presumably, the AI was trained on data that was annotated by machines – and not sufficiently checked. Why do we suspect this? Quite simply, because a human can very easily recognise the difference between a bald head and a football. This machine could not.

A human-in-the-loop led system could have prevented this error – and could prevent many similar errors in the future.

Perhaps the AI was built using human-in-the-loop systems, but simply without enough human feedback to sufficiently correct the algorithm.

That is where HUMAN Protocol comes in.

With access to some of the largest workpools of human data annotators, HUMAN Protocol can deliver up to 1 million data annotations in 24 hours. That’s a pretty incredible turnaround for a beta application – the HUMAN App.

As we have said before, quantity is a quality of its own in machine learning. Accuracy is one thing – and humans can provide greater accuracy than machines. Volume, however, is a necessary factor when it comes to dealing with more complex environments, such as football stadiums.

Once again, the answer is that more data, and better data, are required to prevent AI failures.

To stay up to date with the latest from HUMAN, follow us on Twitter or join our Discord.

Legal Disclaimer

The HUMAN Protocol Foundation makes no representation, warranty, or undertaking, express or implied, as to the accuracy, reliability, completeness, or reasonableness of the information contained here. Any assumptions, opinions, and estimations expressed constitute the HUMAN Protocol Foundation’s judgment as of the time of publishing and are subject to change without notice. Any projection contained within the information presented here is based on a number of assumptions, and there can be no guarantee that any projected outcomes will be achieved.