Jiawen Zhang (Zhejiang University), Xinpeng Yang (Zhejiang University), Lipeng He (University of Waterloo), Kejia Chen (Zhejiang University), Wen-jie Lu (Zhejiang University), Yinghao Wang (Zhejiang University), Xiaoyang Hou (Zhejiang University), Jian Liu (Zhejiang University), Kui Ren (Zhejiang University), Xiaohu Yang (Zhejiang University)

Secure transformer inference has emerged as a prominent research topic following the proliferation of ChatGPT. Existing solutions are typically interactive, involving substantial communication load and numerous interaction rounds between the client and the server.

In this paper, we propose NEXUS, the first non-interactive protocol for secure transformer inference. The protocol requires the client to engage in just one round of communication with the server during the whole inference process: submitting an encrypted input and receiving an encrypted result.
NEXUS introduces several novel primitives, including SIMD ciphertext compression/decompression, SIMD slot folding, and secure Argmax, which enable it to significantly surpass the state-of-the-art in communication while maintaining comparable runtime. Specifically, it reduces bandwidth consumption by 372.5$times$ compared to BOLT (Oakland~'24) and 53.6$times$ compared to Bumblebee (NDSS~'25). Furthermore, its non-interactive property allows for optimal hardware acceleration, with the GPU version achieving a 42.3$times$ speedup in runtime. This enables NEXUS to run inference on a BERT-based model in just 37.3 seconds, consuming only 164~MB of bandwidth.

View More Papers

Horcrux: Synthesize, Split, Shift and Stay Alive; Preventing Channel...

Anqi Tian (Institute of Software, Chinese Academy of Sciences; School of Computer Science and Technology, University of Chinese Academy of Sciences), Peifang Ni (Institute of Software, Chinese Academy of Sciences; Zhongguancun Laboratory, Beijing, P.R.China), Yingzi Gao (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Jing Xu (Institute of Software, Chinese…

Read More

AI-Assisted RF Fingerprinting for Identification of User Devices in...

Aishwarya Jawne (Center for Connected Autonomy & AI, Florida Atlantic University), Georgios Sklivanitis (Center for Connected Autonomy & AI, Florida Atlantic University), Dimitris A. Pados (Center for Connected Autonomy & AI, Florida Atlantic University), Elizabeth Serena Bentley (Air Force Research Laboratory)

Read More

SHAFT: Secure, Handy, Accurate and Fast Transformer Inference

Andes Y. L. Kei (Chinese University of Hong Kong), Sherman S. M. Chow (Chinese University of Hong Kong)

Read More

NDSS Symposium 2025 Welcome and Opening Remarks

General Chairs: David Balenson, USC Information Sciences Institute and Heng Yin, University of California, Riverside Program Chairs: Christina Pöpper, New York University Abu Dhabi and Hamed Okhravi, MIT Lincoln Laboratory Artifact Evaluation Chairs: Daniele Cono D’Elia, Sapienza University and Mathy Vanhoef, KU Leuven

Read More