Qiao Zhang (Old Dominion University), Chunsheng Xin (Old Dominion University), Hongyi Wu (Old Dominion University)

Machine Learning as a Service (MLaaS) is enabling a wide range of smart applications on end devices. However, privacy remains a fundamental challenge. Schemes that combine Homomorphic Encryption (HE)-based linear computations with Garbled Circuit (GC)-based nonlinear computations have demonstrated superior performance in enabling privacy-preserved MLaaS. Nevertheless, there is still a significant gap in computation speed. Our investigation has found that the HE-based linear computation dominates the total computation time for state-of-the-art deep neural networks. Furthermore, the most time-consuming component of the HE-based linear computation is a series of Permutation (Perm) operations that are imperative for dot product and convolution in privacy-preserved MLaaS. This work focuses on a deep optimization of the HE-based linear computations to minimize the Perm operations, thus substantially reducing the overall computation time. To this end, we propose GALA: Greedy computAtion for Linear Algebra in privacy-preserved neural networks, which views the HE-based linear computation as a series of Homomorphic Add, Mult and Perm operations and chooses the least expensive operation in each linear computation step to reduce the overall cost. GALA makes the following contributions: (1) It introduces a row-wise weight matrix encoding and combines it with the share generation that is needed for the GC-based nonlinear computation, to reduce the Perm operations for the dot product; (2) It designs a first-Add-second-Perm approach (named kernel grouping) to reduce Perm operations for convolution. As such, GALA efficiently reduces the cost of the HE-based linear computation, which is a critical building block in almost all of the recent frameworks for privacy-preserved neural networks, including GAZELLE (Usenix Security’18), DELPHI (Usenix Security’20), and CrypTFlow2 (CCS’20).
With its deep optimization of the HE-based linear computation, GALA can be a plug-and-play module integrated into these systems to further boost their efficiency. Our experiments show that it achieves a significant speedup of up to 700× for the dot product and 14× for the convolution computation under different data dimensions. Meanwhile, GALA demonstrates an encouraging runtime boost of 2.5×, 2.7×, 3.2×, 8.3×, 7.7×, and 7.5× over GAZELLE and 6.5×, 6×, 5.7×, 4.5×, 4.2×, and 4.1× over CrypTFlow2, on AlexNet, VGG, ResNet-18, ResNet-50, ResNet-101, and ResNet-152, respectively.
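To make the role of Perm operations concrete, the following is a minimal plain-Python sketch of a textbook rotate-and-sum matrix-vector product over simulated ciphertext slots. It is not GALA's algorithm and performs no encryption; the function names `rotate` and `naive_matvec` and the operation counters are hypothetical, introduced only to show how the number of Perm operations grows in a naive baseline when each output needs its own rotate-and-sum pass.

```python
def rotate(slots, k):
    """Simulate a homomorphic Perm: cyclic left rotation of the slots by k."""
    k %= len(slots)
    return slots[k:] + slots[:k]


def naive_matvec(weights, x):
    """Rotate-and-sum matrix-vector product on simulated (unencrypted) slots.

    Assumes len(x) is a power of two and every row of `weights` has len(x)
    entries. Returns the output vector and a tally of Mult/Add/Perm steps.
    """
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "slot count must be a power of two"
    counts = {"mult": 0, "add": 0, "perm": 0}
    result = []
    for row in weights:
        # One slot-wise Mult between the packed input and one weight row.
        acc = [w * v for w, v in zip(row, x)]
        counts["mult"] += 1
        # log2(n) rotate-and-add rounds fold all slots into every position.
        step = n // 2
        while step >= 1:
            acc = [a + b for a, b in zip(acc, rotate(acc, step))]
            counts["perm"] += 1
            counts["add"] += 1
            step //= 2
        result.append(acc[0])
    return result, counts


if __name__ == "__main__":
    W = [[1, 2, 3, 4], [5, 6, 7, 8]]
    y, ops = naive_matvec(W, [1, 1, 1, 1])
    print(y, ops)
```

In this baseline, a matrix with n_o rows and n_i columns costs n_o · log2(n_i) simulated Perm operations, which is the kind of per-output rotation cost that the abstract says GALA aims to reduce.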
