Chaoxiang He (Huazhong University of Science and Technology), Xiaojing Ma (Huazhong University of Science and Technology), Bin B. Zhu (Microsoft Research), Yimiao Zeng (Huazhong University of Science and Technology), Hanqing Hu (Huazhong University of Science and Technology), Xiaofan Bai (Huazhong University of Science and Technology), Hai Jin (Huazhong University of Science and Technology), Dongmei Zhang…
Adversarial patch attacks are among the most practical adversarial attacks. Recent efforts focus on providing a certifiable guarantee on correct predictions in the presence of white-box adversarial patch attacks. In this paper, we propose DorPatch, an effective adversarial patch attack to evade both certifiably robust defenses and empirical defenses. DorPatch employs group lasso on a patch's mask, image dropout, density regularization, and structural loss to generate a fully optimized, distributed, occlusion-robust, and inconspicuous adversarial patch that can be deployed in physical-world adversarial patch attacks. Our extensive experimental evaluation with both digital-domain and physical-world tests indicates that DorPatch can effectively evade PatchCleanser, the state-of-the-art certifiable defense, and empirical defenses against adversarial patch attacks. More critically, mispredicted results of adversarially patched examples generated by DorPatch can receive certification from PatchCleanser, producing a false trust in guaranteed predictions. DorPatch achieves state-of-the-art attacking performance and perceptual quality among all adversarial patch attacks. DorPatch poses a significant threat to real-world applications of DNN models and calls for developing effective defenses to thwart the attack.