Jiawen Zhang (Zhejiang University), Xinpeng Yang (Zhejiang University), Lipeng He (University of Waterloo), Kejia Chen (Zhejiang University), Wen-jie Lu (Zhejiang University), Yinghao Wang (Zhejiang University), Xiaoyang Hou (Zhejiang University), Jian Liu (Zhejiang University), Kui Ren (Zhejiang University), Xiaohu Yang (Zhejiang University)

Secure transformer inference has emerged as a prominent research topic following the proliferation of ChatGPT. Existing solutions are typically interactive, involving substantial communication load and numerous interaction rounds between the client and the server.

In this paper, we propose NEXUS, the first non-interactive protocol for secure transformer inference. The protocol requires the client to engage in just one round of communication with the server during the whole inference process: submitting an encrypted input and receiving an encrypted result.
NEXUS introduces several novel primitives, including SIMD ciphertext compression/decompression, SIMD slot folding, and secure Argmax, which enable it to significantly surpass the state-of-the-art in communication while maintaining comparable runtime. Specifically, it reduces bandwidth consumption by 372.5× compared to BOLT (Oakland '24) and 53.6× compared to Bumblebee (NDSS '25). Furthermore, its non-interactive property allows for optimal hardware acceleration, with the GPU version achieving a 42.3× speedup in runtime. This enables NEXUS to run inference on a BERT-based model in just 37.3 seconds, consuming only 164 MB of bandwidth.
