Faysal Hossain Shezan (University of Virginia), Zihao Su (University of Virginia), Mingqing Kang (Johns Hopkins University), Nicholas Phair (University of Virginia), Patrick William Thomas (University of Virginia), Michelangelo van Dam (in2it), Yinzhi Cao (Johns Hopkins University), Yuan Tian (UCLA)
WordPress, a well-known content management system (CMS), provides so-called plugins to augment default functionalities. One challenging problem of deploying WordPress plugins is that they may collect and process user data, such as Personal Identifiable Information (PII), which is usually regulated by laws such as General Data Protection Regulation (GDPR). To the best of our knowledge, no prior works have studied GDPR compliance in WordPress plugins, which often involve multiple program languages, such as PHP, JavaScript, HTML, and SQL.
In this paper, we design CHKPLUG, the first automated GDPR checker of WordPress plugins for their compliance with GDPR articles related to PII. The key to CHKPLUG is to match WordPress plugin behavior with GDPR articles using graph queries to a novel cross-language code property graph (CCPG). Specifically, the CCPG models both inline language integration (such as PHP and HTML) and key-value-related connection (such as HTML and JavaScript). CHKPLUG reports a GDPR violation if certain patterns are found in the CCPG.
We evaluated CHKPLUG with human-annotated WordPress plugins. Our evaluation shows that CHKPLUG achieves good performance with 98.8% TNR (True Negative Rate) and 89.3% TPR (True Positive Rate) in checking whether a certain WordPress plugin complies with GDPR. To investigate the current surface of the marketplace, we perform a measurement analysis which shows that 368 plugins violate data deletion regulations, meaning plugins do not provide any functionalities to erase user information from the website.