Yijun Yang (The Chinese University of Hong Kong), Ruiyuan Gao (The Chinese University of Hong Kong), Yu Li (The Chinese University of Hong Kong), Qiuxia Lai (Communication University of China), Qiang Xu (The Chinese University of Hong Kong)

Adversarial examples (AEs) pose severe threats to the applications of deep neural networks (DNNs) to safety-critical domains (e.g., autonomous driving and healthcare analytics). While there has been a vast body of research to defend against AEs, to the best of our knowledge, they all suffer from some weaknesses, e.g., defending against only a subset of AEs or causing a relatively high accuracy loss for legitimate inputs. Moreover, most of the existing solutions cannot defend against adaptive attacks, wherein attackers are knowledgeable about the defense mechanisms and craft AEs accordingly.

In this paper, we propose a novel AE detection framework based on the very nature of AEs, i.e., their semantic information is inconsistent with the discriminative features extracted by the target DNN model. To be specific, the proposed solution, namely ContraNet, models such contradiction by first taking both the input and the inference result to a generator to obtain a synthetic output and then comparing it against the original input. For legitimate inputs that are correctly inferred, the synthetic output tries to reconstruct the input. On the contrary, for AEs, instead of reconstructing the input, the synthetic output would be created to conform to the wrong label whenever possible. Consequently, by measuring the distance between the input and the synthetic output with metric learning, we can differentiate AEs from legitimate inputs. We perform comprehensive evaluations under various types of AE attack scenarios, and experimental results show that ContraNet outperforms existing solutions by a large margin, especially for adaptive attacks. Moreover, further analysis shows that successful AEs that can bypass ContraNet tend to have much-weakened adversarial semantics. We have also shown that ContraNet can be easily combined with adversarial training techniques to achieve more outstanding AE defense capabilities.

View More Papers

Get a Model! Model Hijacking Attack Against Machine Learning...

Ahmed Salem (CISPA Helmholtz Center for Information Security), Michael Backes (CISPA Helmholtz Center for Information Security), Yang Zhang (CISPA Helmholtz Center for Information Security)

Read More

Preventing Kernel Hacks with HAKCs

Derrick McKee (Purdue University), Yianni Giannaris (MIT CSAIL), Carolina Ortega (MIT CSAIL), Howard Shrobe (MIT CSAIL), Mathias Payer (EPFL), Hamed Okhravi (MIT Lincoln Laboratory), Nathan Burow (MIT Lincoln Laboratory)

Read More

COOPER: Testing the Binding Code of Scripting Languages with...

Peng Xu (TCA/SKLCS, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Yanhao Wang (QI-ANXIN Technology Research Institute), Hong Hu (Pennsylvania State University), Purui Su (TCA/SKLCS, Institute of Software, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences)

Read More

Progressive Scrutiny: Incremental Detection of UBI bugs in the...

Yizhuo Zhai (University of California, Riverside), Yu Hao (University of California, Riverside), Zheng Zhang (University of California, Riverside), Weiteng Chen (University of California, Riverside), Guoren Li (University of California, Riverside), Zhiyun Qian (University of California, Riverside), Chengyu Song (University of California, Riverside), Manu Sridharan (University of California, Riverside), Srikanth V. Krishnamurthy (University of California, Riverside),…

Read More