He Shuang (University of Toronto), Lianying Zhao (Carleton University and University of Toronto), David Lie (University of Toronto)

Web tracking harms user privacy. As a result, the
use of tracker detection and blocking tools is a common practice
among Internet users. However, no such tool can be perfect,
and thus there is a trade-off between causing breakage (by
unintentionally blocking some required functionality) and
failing to block some trackers. State-of-the-art tools usually rely
on user reports and developer effort to detect breakage, which
broadly has two causes: 1) misidentifying non-trackers as
trackers, and 2) blocking mixed trackers, which blend tracking
with functional components.

We propose incorporating a machine learning-based breakage
detector into the tracker detection pipeline to automatically
avoid misidentification of functional resources. For both tracker
detection and breakage detection, we propose using differential
features that can more clearly elucidate the differences caused by
blocking a request. We designed and implemented a prototype of
our proposed approach, Duumviri, for non-mixed trackers. We
then adapt it to automatically identify mixed trackers, drawing
differential features at partial-request granularity.
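
As a rough illustration of the idea of differential features, the sketch below loads nothing itself; it only compares two summaries of the same page, one taken with a request allowed and one with it blocked, and reports the per-field difference. The `PageObservation` fields and the `differential_features` helper are hypothetical names chosen for this example and are not taken from Duumviri, whose actual feature set is not described here.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PageObservation:
    """Hypothetical summary of one page load (illustrative fields only)."""
    network_requests: int
    dom_nodes: int
    visible_images: int
    cookies_set: int
    console_errors: int


def differential_features(baseline: PageObservation,
                          blocked: PageObservation) -> dict[str, int]:
    """Return blocked-minus-baseline deltas for every observed field."""
    return {
        f"delta_{f.name}": getattr(blocked, f.name) - getattr(baseline, f.name)
        for f in fields(PageObservation)
    }


# Made-up numbers: the same page loaded with and without one request blocked.
baseline = PageObservation(network_requests=120, dom_nodes=2400,
                           visible_images=35, cookies_set=14, console_errors=0)
blocked = PageObservation(network_requests=112, dom_nodes=2400,
                          visible_images=35, cookies_set=9, console_errors=0)

deltas = differential_features(baseline, blocked)
for name, value in deltas.items():
    print(f"{name}: {value}")
```

In a pipeline of the kind the abstract describes, vectors like `deltas` would presumably be fed to the tracker and breakage classifiers; that step is omitted here because it depends on details the abstract does not give.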

In the case of non-mixed trackers, evaluating Duumviri on 15K
pages shows that it can replicate the labels of the human-generated
filter list EasyPrivacy with an accuracy of 97.44%. Through a
manual analysis, we find that Duumviri can identify previously
unreported trackers and its breakage detector can identify overly
strict EasyPrivacy rules that cause breakage. In the case of mixed
trackers, Duumviri is the first automated mixed tracker detector,
and achieves a lower-bound accuracy of 74.19%. Duumviri has
enabled us to detect and confirm 22 previously unreported unique
trackers and 26 unique mixed trackers.
