Hanlei Zhang (Zhejiang University), Yijie Bai (Zhejiang University), Yanjiao Chen (Zhejiang University), Zhongming Ma (Zhejiang University), Wenyuan Xu (Zhejiang University)

Backdoor attacks are an essential risk to deep learning model sharing. Fundamentally, backdoored models are different from benign models considering latent separability, i.e., distinguishable differences in model latent representations. However, existing methods quantify latent separability by clustering latent representations or computing distances between latent representations, which are easy to be compromised by adaptive attacks. In this paper, we propose BARBIE, a backdoor detection approach that can pinpoint latent separability under adaptive backdoor attacks. To achieve this goal, we propose a new latent separability metric, named relative competition score (RCS), by characterizing the dominance of latent representations over model output, which is robust against various backdoor attacks and is hard to compromise. Without the need to access any benign or backdoored sample, we invert two sets of latent representations of each label, reflecting the normal latent representations of benign models and intensifying the abnormal ones of backdoored models, to calculate RCS. We compute a series of RCS-based indicators to comprehensively reflect the differences between backdoored models and benign models. We validate the effectiveness of BARBIE on more than 10,000 models on 4 datasets against 14 types of backdoor attacks, including the adaptive attacks against latent separability. Compared with 7 baselines, BARBIE improves the average true positive rate by 17.05% against source-agnostic attacks, 27.72% against source-specific attacks, 43.17% against sample-specific attacks and 11.48% against clean-label attacks. BARBIE also maintains lower false positive rates than baselines. The source code is available at: https://github.com/Forliqr/BARBIE.

View More Papers

Reinforcement Unlearning

Dayong Ye (University of Technology Sydney), Tianqing Zhu (City University of Macau), Congcong Zhu (City University of Macau), Derui Wang (CSIRO’s Data61), Kun Gao (University of Technology Sydney), Zewei Shi (CSIRO’s Data61), Sheng Shen (Torrens University Australia), Wanlei Zhou (City University of Macau), Minhui Xue (CSIRO's Data61)

Read More

Vision: Comparison of AI-assisted Policy Development Between Professionals and...

Rishika Thorat (Purdue University), Tatiana Ringenberg (Purdue University)

Read More

DeFiIntel: A Dataset Bridging On-Chain and Off-Chain Data for...

Iori Suzuki (Graduate School of Environment and Information Sciences, Yokohama National University), Yin Minn Pa Pa (Institute of Advanced Sciences, Yokohama National University), Nguyen Thi Van Anh (Institute of Advanced Sciences, Yokohama National University), Katsunari Yoshioka (Graduate School of Environment and Information Sciences, Yokohama National University)

Read More

Probe-Me-Not: Protecting Pre-trained Encoders from Malicious Probing

Ruyi Ding (Northeastern University), Tong Zhou (Northeastern University), Lili Su (Northeastern University), Aidong Adam Ding (Northeastern University), Xiaolin Xu (Northeastern University), Yunsi Fei (Northeastern University)

Read More