Yutong Ye (Institute of Software, Chinese Academy of Sciences & Zhongguancun Laboratory, Beijing, P.R. China), Tianhao Wang (University of Virginia), Min Zhang (Institute of Software, Chinese Academy of Sciences), Dengguo Feng (Institute of Software, Chinese Academy of Sciences)
This paper investigates the fundamental estimation problem in local differential privacy (LDP). We categorize existing estimation methods into two approaches. The unbiased estimation approach often gives unreasonable results under LDP (negative estimates, or estimates whose sum does not equal the total number of participating users) because of the large amount of noise that LDP adds. The maximum likelihood estimation (MLE)-based approach gives reasonable results but often suffers from overfitting. To address this challenge, we propose a reduction framework inspired by Gaussian mixture models (GMM). We adapt the reduction framework to LDP estimation by transforming the estimation problem into a density estimation problem for a mixture model. By merging the smallest-weight component of this mixture model, the expectation-maximization (EM) algorithm converges faster and produces a more robust distribution estimate. We show that this framework offers a general and efficient way of modeling various LDP protocols. Through extensive evaluations, we demonstrate the superiority of our approach in mean estimation, categorical distribution estimation, and numerical distribution estimation.
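To make the two baseline approaches concrete, the following is a minimal sketch and not the reduction framework proposed in the paper. It assumes generalized randomized response (GRR) as the LDP protocol, which the abstract does not name, and all function names (`grr_perturb`, `unbiased_estimate`, `em_estimate`) are hypothetical. It contrasts the unbiased estimator, which can return negative counts, with a plain EM-based MLE, which stays non-negative and sums to the number of users.

```python
import math
import random


def grr_probs(k, eps):
    """GRR probabilities: p = keep the true value, q = report one specific other value."""
    denom = math.exp(eps) + k - 1
    return math.exp(eps) / denom, 1.0 / denom


def grr_perturb(value, k, eps, rng):
    """Report the true value with probability p, else a uniformly chosen different value."""
    p, _ = grr_probs(k, eps)
    if rng.random() < p:
        return value
    other = rng.randrange(k - 1)          # pick among the k - 1 other values
    return other if other < value else other + 1


def unbiased_estimate(counts, eps):
    """Invert the expected report counts; entries may be negative."""
    k, n = len(counts), sum(counts)
    p, q = grr_probs(k, eps)
    return [(c - n * q) / (p - q) for c in counts]


def em_estimate(counts, eps, iters=200):
    """Plain EM for the MLE of the input distribution (no component merging)."""
    k, n = len(counts), sum(counts)
    p, q = grr_probs(k, eps)
    theta = [1.0 / k] * k                 # uniform initialization (an assumption)
    for _ in range(iters):
        new_theta = [0.0] * k
        for y, c in enumerate(counts):
            if c == 0:
                continue
            # Probability of observing report y under the current theta.
            denom = sum(theta[x] * (p if x == y else q) for x in range(k))
            # E-step: posterior over true values x given report y, weighted by count.
            for x in range(k):
                new_theta[x] += c * theta[x] * (p if x == y else q) / denom
        theta = [t / n for t in new_theta]  # M-step: renormalize
    return [n * t for t in theta]


# Illustrative data only: a skewed population over k = 8 values, eps = 0.5.
rng = random.Random(0)
k, eps = 8, 0.5
true_values = [0] * 1500 + [1] * 400 + [2] * 100
reports = [grr_perturb(v, k, eps, rng) for v in true_values]
counts = [reports.count(v) for v in range(k)]

print("unbiased:", [round(e, 1) for e in unbiased_estimate(counts, eps)])
print("EM (MLE):", [round(e, 1) for e in em_estimate(counts, eps)])
```

With a small privacy budget, the unbiased estimates for the empty categories are typically scattered around zero, so some fall below it, while the EM estimates are constrained to a valid distribution. For GRR specifically the unbiased estimates still sum to the number of users; the sum mismatch mentioned in the abstract arises with other protocols.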