Qiao Zhang (Old Dominion University), Chunsheng Xin (Old Dominion University), Hongyi Wu (Old Dominion University)

Machine Learning as a Service (MLaaS) is enabling a wide range of smart applications on end devices. However, privacy still remains a fundamental challenge. The schemes that exploit Homomorphic Encryption (HE)-based linear computations and Garbled Circuit (GC)-based nonlinear computations have demonstrated superior performance to enable privacy-preserved MLaaS. Nevertheless, there is still a significant gap in the computation speed. Our investigation has found that the HE-based linear computation dominates the total computation time for state-of-the-art deep neural networks. Furthermore, the most time-consuming component of the HE-based linear computation is a series of Permutation (Perm) operations that are imperative for dot product and convolution in privacy-preserved MLaaS. This work focuses on a deep optimization of the HE-based linear computations to minimize the Perm operations, thus substantially reducing the overall computation time. To this end, we propose GALA: Greedy computAtion for Linear Algebra in privacy-preserved neural networks, which views the HE-based linear computation as a series of Homomorphic Add, Mult and Perm operations and chooses the least expensive operation in each linear computation step to reduce the overall cost. GALA makes the following contributions: (1) It introduces a row-wise weight matrix encoding and combines the share generation that is needed for the GC-based nonlinear computation, to reduce the Perm operations for the dot product; (2) It designs a firstAdd-second-Perm approach (named kernel grouping) to reduce Perm operations for convolution. As such, GALA efficiently reduces the cost for the HE-based linear computation, which is a critical building block in almost all of the recent frameworks for privacy-preserved neural networks, including GAZELLE (Usenix Security’18), DELPHI (Usenix Security’20), and CrypTFlow2 (CCS’20). With its deep optimization of the HE-based linear computation, GALA can be a plug-and-play module integrated into these systems to further boost their efficiency. Our experiments show that it achieves a significant speedup up to 700× for the dot product and 14× for the convolution computation under different data dimensions. Meanwhile, GALA demonstrates an encouraging runtime boost by 2.5×, 2.7×, 3.2×, 8.3×, 7.7×, and 7.5× over GAZELLE and 6.5×, 6×, 5.7×, 4.5×, 4.2×, and 4.1× over CrypTFlow2, on AlexNet, VGG, ResNet-18, ResNet-50, ResNet-101, and ResNet-152, respectively.

View More Papers

The Bluetooth CYBORG: Analysis of the Full Human-Machine Passkey...

Michael Troncoso (Naval Postgraduate School), Britta Hale (Naval Postgraduate School)

Read More

Demo #5: Securing Heavy Vehicle Diagnostics

Jeremy Daily, David Nnaji, and Ben Ettlinger (Colorado State University)

Read More

Rosita: Towards Automatic Elimination of Power-Analysis Leakage in Ciphers

Madura A. Shelton (University of Adelaide), Niels Samwel (Radboud University), Lejla Batina (Radboud University), Francesco Regazzoni (University of Amsterdam and ALaRI – USI), Markus Wagner (University of Adelaide), Yuval Yarom (University of Adelaide and Data61)

Read More

Why Do Programmers Do What They Do? A Theory...

Lavanya Sajwan, James Noble, Craig Anslow (Victoria University of Wellington), Robert Biddle (Carleton University)

Read More