Nyteisha Bookert, Mohd Anwar (North Carolina Agricultural and Technical State University)
Patient-generated health data is growing at an unprecedented rate due to advancing technologies (e.g., the Internet of Medical Things (IoMT), 5G, artificial intelligence) and increased consumer transactions. This influx of data has enabled life-altering solutions, but it has also created significant privacy challenges. A central theme in mitigating these risks is promoting transparency and notifying stakeholders of data practices through privacy policies. Natural language privacy policies, however, have several limitations: they are lengthy, difficult for users to understand, and may contain conflicting requirements. Yet they remain the de facto standard for informing users of privacy practices and of how organizations follow privacy regulations. We developed an automated process to evaluate the appropriateness of combining machine learning and custom named entity recognition techniques to extract IoMT-relevant privacy factors from the privacy policies of IoMT devices. Using this process, we automatically analyzed a corpus of policies and specifications to extract privacy-related information for IoMT devices. Based on this natural language analysis, we provide fine-grained annotations that can reduce the manual, tedious work of policy analysis and aid privacy engineers and policy makers in developing suitable privacy policies.
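To make the idea of extracting privacy factors as labeled entities concrete, the following is a minimal sketch in Python. It is not the pipeline described above: it uses a dictionary-based (gazetteer) tagger as a simplified stand-in for a trained custom named entity recognition model. The label names (DATA_TYPE, THIRD_PARTY, RETENTION), the phrase lists, and the function names are all hypothetical and chosen only for illustration.

```python
import re
from typing import Dict, List, NamedTuple, Pattern

# Hypothetical labels and phrases, invented for this illustration.
# A real system would learn entities from annotated policies instead.
GAZETTEER: Dict[str, List[str]] = {
    "DATA_TYPE": ["heart rate", "blood glucose", "location data", "email address"],
    "THIRD_PARTY": ["advertisers", "service providers", "insurers"],
    "RETENTION": ["retain", "retained", "retention period"],
}


class Entity(NamedTuple):
    start: int   # character offset where the match begins
    end: int     # character offset just past the match
    label: str   # privacy-factor label from the gazetteer
    text: str    # matched surface text


def build_patterns(gazetteer: Dict[str, List[str]]) -> Dict[str, Pattern[str]]:
    """Compile one case-insensitive, whole-word regex per label."""
    patterns = {}
    for label, phrases in gazetteer.items():
        # Try longer phrases first so "retention period" is preferred
        # over a shorter phrase that starts at the same position.
        ordered = sorted(phrases, key=len, reverse=True)
        alternation = "|".join(re.escape(p) for p in ordered)
        patterns[label] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return patterns


def tag_entities(text: str, patterns: Dict[str, Pattern[str]]) -> List[Entity]:
    """Return all gazetteer matches in `text`, ordered by position."""
    found = []
    for label, pattern in patterns.items():
        for match in pattern.finditer(text):
            found.append(Entity(match.start(), match.end(), label, match.group(0)))
    return sorted(found)


if __name__ == "__main__":
    policy = ("We collect your heart rate and location data and share it "
              "with service providers. Data is retained for two years.")
    for entity in tag_entities(policy, build_patterns(GAZETTEER)):
        print(entity.label, repr(entity.text), entity.start, entity.end)
```

A fixed phrase list cannot generalize to unseen wording, which is one reason a learned recognizer is attractive for this task; the sketch only shows the shape of the output, namely character spans with privacy-factor labels that can serve as fine-grained annotations.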