Jiawen Zhang (Zhejiang University), Xinpeng Yang (Zhejiang University), Lipeng He (University of Waterloo), Kejia Chen (Zhejiang University), Wen-jie Lu (Zhejiang University), Yinghao Wang (Zhejiang University), Xiaoyang Hou (Zhejiang University), Jian Liu (Zhejiang University), Kui Ren (Zhejiang University), Xiaohu Yang (Zhejiang University)

Secure transformer inference has emerged as a prominent research topic following the proliferation of ChatGPT. Existing solutions are typically interactive, involving substantial communication load and numerous interaction rounds between the client and the server.

In this paper, we propose NEXUS, the first non-interactive protocol for secure transformer inference. The protocol requires the client to engage in just one round of communication with the server during the whole inference process: submitting an encrypted input and receiving an encrypted result.
NEXUS introduces several novel primitives, including SIMD ciphertext compression/decompression, SIMD slot folding, and secure Argmax, which enable it to significantly surpass the state-of-the-art in communication while maintaining comparable runtime. Specifically, it reduces bandwidth consumption by 372.5× compared to BOLT (Oakland '24) and 53.6× compared to Bumblebee (NDSS '25). Furthermore, its non-interactive property allows for optimal hardware acceleration, with the GPU version achieving a 42.3× speedup in runtime. This enables NEXUS to run inference on a BERT-based model in just 37.3 seconds, consuming only 164 MB of bandwidth.
