Xiaochen Li (University of Virginia), Zhan Qin (Zhejiang University), Kui Ren (Zhejiang University), Chen Gong (University of Virginia), Shuya Feng (University of Connecticut), Yuan Hong (University of Connecticut), Tianhao Wang (University of Virginia)
The research on tasks involving differentially private data stream releases has traditionally centered around real-time scenarios. However, not all data streams inherently demand real-time releases, and achieving such releases is challenging due to network latency and processing constraints in practical settings. We delve into the advantages of introducing a delay time in stream releases. Concentrating on the event-level privacy setting, we discover that incorporating a delay can overcome limitations faced by current approaches, thereby unlocking substantial potential for improving accuracy.
Building on these insights, we developed a framework for data stream releases that allows for delays. Capitalizing on data similarity and relative order characteristics, we devised two optimization strategies, group-based and order-based optimizations, to aid in reducing the added noise and post-processing of noisy data. Additionally, we introduce a novel sensitivity truncation mechanism, significantly further reducing the amount of introduced noise. Our comprehensive experimental results demonstrate that, on a data stream of length $18,319$, allowing a delay of $10$ timestamps enables the proposed approaches to achieve a remarkable up to a $30times$ improvement in accuracy compared to baseline methods.
Our code is open-sourced.