Jiawen Zhang (Zhejiang University), Xinpeng Yang (Zhejiang University), Lipeng He (University of Waterloo), Kejia Chen (Zhejiang University), Wen-jie Lu (Zhejiang University), Yinghao Wang (Zhejiang University), Xiaoyang Hou (Zhejiang University), Jian Liu (Zhejiang University), Kui Ren (Zhejiang University), Xiaohu Yang (Zhejiang University)

Secure transformer inference has emerged as a prominent research topic following the proliferation of ChatGPT. Existing solutions are typically interactive, involving substantial communication load and numerous interaction rounds between the client and the server.

In this paper, we propose NEXUS, the first non-interactive protocol for secure transformer inference. The protocol requires the client to engage in just one round of communication with the server during the whole inference process: submitting an encrypted input and receiving an encrypted result.
NEXUS introduces several novel primitives, including SIMD ciphertext compression/decompression, SIMD slot folding, and secure Argmax, which enable it to significantly surpass the state-of-the-art in communication while maintaining comparable runtime. Specifically, it reduces bandwidth consumption by 372.5$times$ compared to BOLT (Oakland~'24) and 53.6$times$ compared to Bumblebee (NDSS~'25). Furthermore, its non-interactive property allows for optimal hardware acceleration, with the GPU version achieving a 42.3$times$ speedup in runtime. This enables NEXUS to run inference on a BERT-based model in just 37.3 seconds, consuming only 164~MB of bandwidth.

View More Papers

Automated Mass Malware Factory: The Convergence of Piggybacking and...

Heng Li (Huazhong University of Science and Technology), Zhiyuan Yao (Huazhong University of Science and Technology), Bang Wu (Huazhong University of Science and Technology), Cuiying Gao (Huazhong University of Science and Technology), Teng Xu (Huazhong University of Science and Technology), Wei Yuan (Huazhong University of Science and Technology), Xiapu Luo (The Hong Kong Polytechnic University)

Read More

LeakLess: Selective Data Protection against Memory Leakage Attacks for...

Maryam Rostamipoor (Stony Brook University), Seyedhamed Ghavamnia (University of Connecticut), Michalis Polychronakis (Stony Brook University)

Read More

Towards Understanding Unsafe Video Generation

Yan Pang (University of Virginia), Aiping Xiong (Penn State University), Yang Zhang (CISPA Helmholtz Center for Information Security), Tianhao Wang (University of Virginia)

Read More

Probe-Me-Not: Protecting Pre-trained Encoders from Malicious Probing

Ruyi Ding (Northeastern University), Tong Zhou (Northeastern University), Lili Su (Northeastern University), Aidong Adam Ding (Northeastern University), Xiaolin Xu (Northeastern University), Yunsi Fei (Northeastern University)

Read More