Haotian Zhu (Nanjing University of Science and Technology), Shuchao Pang (Nanjing University of Science and Technology), Zhigang Lu (Western Sydney University), Yongbin Zhou (Nanjing University of Science and Technology), Minhui Xue (CSIRO's Data61)

Text-to-image diffusion model's fine-tuning technology allows people to easily generate a large number of customized photos using limited identity images. Although this technology is easy to use, its misuse could lead to violations of personal portraits and privacy, with false information and harmful content potentially causing further harm to individuals. Several methods have been proposed to protect faces from customization via adding protective noise to user images by disrupting the fine-tuned models.

Unfortunately, simple pre-processing techniques like JPEG compression, a normal pre-processing operation performed by modern social networks, can easily erase the protective effects of existing methods. To counter JPEG compression and other potential pre-processing, we propose GAP-Diff, a framework of underline{G}enerating data with underline{A}dversarial underline{P}erturbations for text-to-image underline{Diff}usion models using unsupervised learning-based optimization, including three functional modules. Specifically, our framework learns robust representations against JPEG compression by backpropagating gradient information through a pre-processing simulation module while learning adversarial characteristics for disrupting fine-tuned text-to-image diffusion models. Furthermore, we achieve an adversarial mapping from clean images to protected images by designing adversarial losses against these fine-tuning methods and JPEG compression, with stronger protective noises within milliseconds. Facial benchmark experiments, compared to state-of-the-art protective methods, demonstrate that GAP-Diff significantly enhances the resistance of protective noise to JPEG compression, thereby better safeguarding user privacy and copyrights in the digital world.

View More Papers

Speak Up, I’m Listening: Extracting Speech from Zero-Permission VR...

Derin Cayir (Florida International University), Reham Mohamed Aburas (American University of Sharjah), Riccardo Lazzeretti (Sapienza University of Rome), Marco Angelini (Link Campus University of Rome), Abbas Acar (Florida International University), Mauro Conti (University of Padua), Z. Berkay Celik (Purdue University), Selcuk Uluagac (Florida International University)

Read More

ASGARD: Protecting On-Device Deep Neural Networks with Virtualization-Based Trusted...

Myungsuk Moon (Yonsei University), Minhee Kim (Yonsei University), Joonkyo Jung (Yonsei University), Dokyung Song (Yonsei University)

Read More

Victim-Centred Abuse Investigations and Defenses for Social Media Platforms

Zaid Hakami (Florida International University and Jazan University), Ashfaq Ali Shafin (Florida International University), Peter J. Clarke (Florida International University), Niki Pissinou (Florida International University), and Bogdan Carbunar (Florida International University)

Read More

I know what you MEME! Understanding and Detecting Harmful...

Yong Zhuang (Wuhan University), Keyan Guo (University at Buffalo), Juan Wang (Wuhan University), Yiheng Jing (Wuhan University), Xiaoyang Xu (Wuhan University), Wenzhe Yi (Wuhan University), Mengda Yang (Wuhan University), Bo Zhao (Wuhan University), Hongxin Hu (University at Buffalo)

Read More