Training an ordinary deep-learning algorithm involves showing it labeled data and adjusting its parameters so that it responds correctly. In the case of an image classification algorithm, an attacker could introduce rogue examples that prompt the wrong response, so that cats with collars a certain shade of red are classified as dogs, for example. Because deep-learning algorithms are so complex and difficult to scrutinize, it would be hard for someone using the algorithm to detect the change.
A reinforcement-learning algorithm could control an autonomous car or a smart manufacturing robot. Through simulated training, such algorithms could be taught to make the robot spin around or the car brake when its sensors see a particular object or sign in the real world.
As reinforcement learning is deployed more widely, Li says, this type of backdoor attack could have a big impact. Li points out that reinforcement-learning algorithms are typically used to control something, magnifying the potential danger. “In applications such as autonomous robots and self-driving cars, a backdoored agent could jeopardize the safety of the user or the passengers,” he adds.