Yansong Gao (The University of Western Australia), Huaibing Peng (Nanjing University of Science and Technology), Hua Ma (CSIRO's Data61), Zhi Zhang (The University of Western Australia), Shuo Wang (Shanghai Jiao Tong University), Rayne Holland (CSIRO's Data61), Anmin Fu (Nanjing University of Science and Technology), Minhui Xue (CSIRO's Data61), Derek Abbott (The University of Adelaide, Australia)

In the Data as a Service (DaaS) model, data curators, including commercial providers such as Amazon Mechanical Turk, Appen, and TELUS International, aggregate quality data from numerous contributors and monetize it for deep learning (DL) model providers. However, malicious contributors can poison this data, embedding backdoors in the trained DL models. Existing methods for detecting poisoned samples face significant limitations: they often rely on reserved clean data; they are sensitive to the poisoning rate, trigger type, and backdoor type; and they are specific to classification tasks. These limitations hinder their practical adoption by data curators.

This work, for the first time, investigates the training trajectory of poisoned samples in the spectrum domain, revealing distinctions from benign samples that are not apparent in the original non-spectrum domain. Building on this novel perspective, we propose TellTale to detect and sanitize poisoned samples as a one-time effort, addressing all of the aforementioned limitations of prior work. Through extensive experiments, TellTale demonstrates the ability to defeat both universal and challenging partial backdoor types without relying on any reserved clean data. TellTale is also validated to be agnostic to various trigger types, including the advanced clean-label trigger attack, Narcissus (CCS'2023). Moreover, TellTale proves effective across diverse data modalities (e.g., image, audio and text) and non-classification tasks (e.g., regression), making it the only known training-phase poisoned sample detection method applicable to non-classification tasks. In all our evaluations, TellTale achieves a detection accuracy (i.e., accurately identifying poisoned samples) of at least 95.52% and a false positive rate (i.e., falsely recognizing benign samples as poisoned ones) no higher than 0.61%. Comparisons with state-of-the-art methods, ASSET (Usenix'2023) and CT (Usenix'2023), further affirm TellTale's superior performance. More specifically, ASSET fails to handle partial backdoor types and incurs an unbearable false positive rate with clean/benign datasets common in practice, while CT fails against the Narcissus trigger. In contrast, TellTale proves highly effective across testing scenarios where prior work fails. The source code is released at https://github.com/MPaloze/Telltale.
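
To make the two reported metrics concrete, the following is a minimal sketch of how they could be computed from a detector's output. It is not taken from the TellTale code base. The function name `detection_metrics` and the toy labels are hypothetical, and the sketch assumes "detection accuracy" means the fraction of truly poisoned samples that are flagged, which is one reading of the parenthetical definition in the abstract.

```python
from typing import Sequence, Tuple


def detection_metrics(
    is_poisoned: Sequence[bool], flagged: Sequence[bool]
) -> Tuple[float, float]:
    """Return (detection_accuracy, false_positive_rate).

    Assumption: detection accuracy is the share of poisoned samples
    that the detector flags; the false positive rate is the share of
    benign samples that the detector wrongly flags as poisoned.
    """
    if len(is_poisoned) != len(flagged):
        raise ValueError("label and flag sequences must have equal length")

    # Poisoned samples that were correctly flagged.
    true_pos = sum(1 for p, f in zip(is_poisoned, flagged) if p and f)
    # Benign samples that were wrongly flagged.
    false_pos = sum(1 for p, f in zip(is_poisoned, flagged) if not p and f)

    n_poisoned = sum(1 for p in is_poisoned if p)
    n_benign = len(is_poisoned) - n_poisoned

    # Guard against empty classes, e.g. a fully clean dataset.
    detection_accuracy = true_pos / n_poisoned if n_poisoned else float("nan")
    false_positive_rate = false_pos / n_benign if n_benign else float("nan")
    return detection_accuracy, false_positive_rate


# Toy data (hypothetical): 4 poisoned and 6 benign samples.
labels = [True, True, True, True, False, False, False, False, False, False]
flags = [True, True, True, False, False, False, False, False, False, True]
acc, fpr = detection_metrics(labels, flags)
print(f"detection accuracy = {acc:.2%}, false positive rate = {fpr:.2%}")
```

On a fully clean dataset there are no poisoned samples, so the first metric is undefined (the sketch returns NaN) and only the false positive rate is meaningful, which is the quantity at stake in the clean-dataset comparison with ASSET.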
