Adapted from You Look Like a Thing and I Love You: How Artificial Intelligence Works and Why It’s Making the World a Weirder Place, by Janelle Shane. Out now from Voracious Books.
Suppose you’re running security at a cockroach farm. You’ve got advanced image recognition technology on all the cameras, ready to sound the alarm at the slightest sign of trouble. The day goes uneventfully until, reviewing the logs at the end of your shift, you notice that although the system has recorded zero instances of cockroaches escaping into the staff-only areas, it has recorded seven instances of giraffes. Thinking this a bit odd, perhaps, but not yet alarming, you decide to review the camera footage. You are just beginning to play the first “giraffe” time stamp when you hear the skittering of millions of tiny feet.
Your image recognition algorithm was fooled by an adversarial attack. With special knowledge of your algorithm’s design or training data, or even via trial and error, the cockroaches were able to design tiny note cards that would fool the A.I. into thinking it was seeing giraffes instead of cockroaches. The tiny note cards wouldn’t have looked remotely like giraffes to people—they’d be just a bunch of rainbow-colored static. And the cockroaches didn’t even have to hide behind the cards—all they had to do was keep showing the cards to the camera as they walked brazenly down the corridor.