Guangke Chen (Pengcheng Laboratory), Yedi Zhang (National University of Singapore), Fu Song (Key Laboratory of System Software (Chinese Academy of Sciences) and State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences; Nanjing Institute of Software Technology), Ting Wang (Stony Brook University), Xiaoning Du (Monash University), Yang Liu (Nanyang Technological University)

Singing voice conversion (SVC) automates song covers by converting a source singer's singing voice into a new singing voice that keeps the lyrics and melody of the source but sounds as if it were sung by a target singer, given some of that singer's singing voices. However, it raises serious concerns about copyright and civil rights infringements. We propose SongBsAb, the first proactive approach to tackle SVC-based illegal song covers. SongBsAb adds perturbations to singing voices before they are released, so that when they are used, the SVC process is disrupted and produces unexpected singing voices. Perturbations are carefully crafted to (1) provide dual prevention, i.e., prevent the singing voice from being used as either the source or the target singing voice in SVC, via a gender-transformation loss and a high/low hierarchy multi-target loss, respectively; and (2) be harmless, i.e., have no side effect on the enjoyment of protected songs, by refining a psychoacoustic model-based loss with the backing track as an additional masker, an accompanying element unique to singing voices compared with ordinary speech. We also adopt a frame-level interaction reduction-based loss and an encoder ensemble to enhance the transferability of SongBsAb to unknown SVC models. We demonstrate the prevention effectiveness, harmlessness, and robustness of SongBsAb on five diverse and promising SVC models, using both English and Chinese datasets and both objective and human study-based subjective metrics. Our work fosters an emerging research direction for mitigating illegal automated song covers.
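The general idea of adding a bounded perturbation before release can be illustrated with a toy sketch. This is not SongBsAb's method: the gender-transformation, multi-target, psychoacoustic, and interaction-reduction losses are not reproduced here. The sketch assumes a hypothetical linear "encoder" `W`, a hypothetical helper `protect`, and illustrative values for `eps`, `step`, and `iters`. It uses a generic projected sign-gradient ascent to push the encoder output of the perturbed signal away from that of the original while keeping every sample change within `eps`.

```python
import numpy as np

def protect(x, W, eps=0.01, step=0.002, iters=50, seed=0):
    """Return x + delta with max |delta| <= eps, pushing W @ (x + delta) away from W @ x.

    Toy stand-in only: W is a hypothetical linear encoder, not a real SVC model.
    """
    rng = np.random.default_rng(seed)
    # Random start: the gradient of the embedding distance is zero at delta = 0.
    delta = rng.uniform(-eps, eps, size=x.shape)
    for _ in range(iters):
        # Gradient of ||W @ delta||^2 with respect to delta is 2 * W.T @ W @ delta.
        grad = 2.0 * W.T @ (W @ delta)
        # Signed ascent step, then projection back into the L-infinity ball.
        delta = np.clip(delta + step * np.sign(grad), -eps, eps)
    return x + delta

rng = np.random.default_rng(1)
x = rng.standard_normal(256) * 0.1   # stand-in for a short waveform
W = rng.standard_normal((16, 256))   # stand-in for a linear "encoder"
x_protected = protect(x, W)

# Largest per-sample change and the resulting shift in the toy embedding.
max_change = float(np.max(np.abs(x_protected - x)))
embedding_shift = float(np.linalg.norm(W @ x_protected - W @ x))
```

In this toy setting the fixed `eps` bound plays the role of a crude imperceptibility constraint. The abstract instead describes a psychoacoustic model-based loss that uses the backing track as an additional masker, which this sketch does not model.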
