Enter the Adversarial Reinforcement Learning Agent (ARLA), a novel AE attack based on reinforcement learning, designed to discover DNN vulnerabilities and generate AEs that exploit them.
ARLA is described in detail in Matthew Akers and Armon Barton’s Computer magazine article, “Forming Adversarial Example Attacks Against Deep Neural Networks With Reinforcement Learning.” Here, we offer a glimpse at ARLA’s approach and its capabilities.
ARLA is the first adversarial attack based on reinforcement learning (RL). In RL, an agent learns by interacting with an environment: it observes a state, chooses an action, and receives a reward that signals how good that choice was.
The authors offer a simple example of this in a Pac-Man RL agent acting in a 2D grid: the agent's position on the grid is its state, each move is an action, and points earned by eating dots provide rewards while being caught by a ghost yields a penalty.
The agent learns which state–action pairs generate the most rewards, but because the agent’s knowledge is always partial, RL entails an exploration/exploitation tradeoff:
To ensure that the agent continues to explore rather than simply exploiting what it already knows, it is given a policy that determines how much time it spends on each of the two activities. This balance is tuned to achieve the best possible test performance.
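The exploration/exploitation balance is commonly handled with an epsilon-greedy policy. The sketch below is ours, not the authors' code: it shows tabular Q-learning with epsilon-greedy action selection on a toy one-dimensional grid, with all hyperparameter values chosen purely for illustration.

```python
import random
from collections import defaultdict

# Illustrative sketch (not the ARLA implementation): tabular Q-learning on a
# tiny 1D grid. The agent moves left or right and is rewarded for reaching
# the goal cell; epsilon controls the exploration/exploitation tradeoff.

ACTIONS = [-1, +1]            # move left, move right
GOAL, N_CELLS = 5, 6          # goal state and grid size (illustrative values)
alpha, gamma, epsilon = 0.5, 0.9, 0.1

Q = defaultdict(float)        # Q[(state, action)] -> estimated return

def choose_action(state):
    # Explore with probability epsilon, otherwise exploit the best known action.
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

for episode in range(200):
    state = 0
    while state != GOAL:
        action = choose_action(state)
        next_state = min(max(state + action, 0), N_CELLS - 1)
        reward = 1.0 if next_state == GOAL else -0.01
        # Standard Q-learning update toward the bootstrapped target.
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state
```

After enough episodes, the highest-valued action in each state points toward the goal, which is exactly the kind of learned state–action knowledge described above.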
ARLA uses double Q-learning with a dueling deep Q-network agent architecture. At a high level, ARLA treats AE generation as an RL problem: guided by the target DNN’s responses, the agent learns which perturbations of an input are most likely to cause a misclassification.
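For readers unfamiliar with these two ingredients, the following sketch (ours, not ARLA's implementation; the layer sizes, observation dimension, and action count are illustrative assumptions) shows a dueling Q-network in PyTorch and the double Q-learning target, which decouples action selection from action evaluation.

```python
import torch
import torch.nn as nn

# Illustrative sketch, not ARLA's actual architecture. A dueling Q-network
# splits into a state-value stream V(s) and an advantage stream A(s, a),
# then recombines them as Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
class DuelingQNetwork(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
        self.value = nn.Linear(128, 1)               # V(s)
        self.advantage = nn.Linear(128, n_actions)   # A(s, a)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        h = self.features(obs)
        v = self.value(h)
        a = self.advantage(h)
        return v + a - a.mean(dim=1, keepdim=True)

# Double Q-learning: the online network selects the next action,
# while a separate target network evaluates it, reducing the
# overestimation bias of standard Q-learning.
def double_q_target(reward, next_obs, online_net, target_net, gamma=0.99):
    with torch.no_grad():
        next_action = online_net(next_obs).argmax(dim=1, keepdim=True)
        next_q = target_net(next_obs).gather(1, next_action).squeeze(1)
    return reward + gamma * next_q
```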
In experiments, the authors report that ARLA significantly degraded the accuracy of five CIFAR-10 DNNs—four of which used a state-of-the-art defense. They also compared ARLA to other state-of-the-art attacks and found evidence that ARLA is adaptive, making it a useful tool for testing the reliability of DNNs before they are deployed.
DNNs used in image recognition are especially susceptible to perturbed or noisy data. As the authors point out, an RL approach to adversarial testing such as ARLA could be used to develop robust testing protocols that identify these and other DNN vulnerabilities to adversarial attacks.
For details about the innovative ARLA approach, its results, and future research areas, read Akers and Barton’s “Forming Adversarial Example Attacks Against Deep Neural Networks With Reinforcement Learning.”