Shridatt Sugrim (Rutgers University), Can Liu (Rutgers University), Meghan McLean (Rutgers University), Janne Lindqvist (Rutgers University)

Research has produced many types of authentication systems that use machine learning. However, there is no consistent approach for reporting performance metrics and the reported metrics are inadequate. In this work, we show that several of the common metrics used for reporting performance, such as maximum accuracy (ACC), equal error rate (EER) and area under the ROC curve (AUROC), are inherently flawed. These common metrics hide the details of the inherent trade-offs a system must make when implemented. Our findings show that current metrics give no insight into how system performance degrades outside the ideal conditions in which they were designed. We argue that adequate performance reporting must be provided to enable meaningful evaluation and that current, commonly used approaches fail in this regard. We present the unnormalized frequency count of scores (FCS) to demonstrate the mathematical underpinnings that lead to these failures and show how they can be avoided. The FCS can be used to augment the performance reporting to enable comparison across systems in a visual way. When reported with the Receiver Operating Characteristics curve (ROC), these two metrics provide a solution to the limitations of currently reported metrics. Finally, we show how to use the FCS and ROC metrics to evaluate and compare different authentication systems.

View More Papers

maTLS: How to Make TLS middlebox-aware?

Hyunwoo Lee (Seoul National University), Zach Smith (University of Luxembourg), Junghwan Lim (Seoul National University), Gyeongjae Choi (Seoul National University), Selin Chun (Seoul National University), Taejoong Chung (Rochester Institute of Technology), Ted "Taekyoung" Kwon (Seoul National University)

Read More

Tranco: A Research-Oriented Top Sites Ranking Hardened Against Manipulation

Victor Le Pochat (imec-DistriNet, KU Leuven), Tom Van Goethem (imec-DistriNet, KU Leuven), Samaneh Tajalizadehkhoob (Delft University of Technology), Maciej Korczyński (Grenoble Alps University), Wouter Joosen (imec-DistriNet, KU Leuven)

Read More

Measuring the Facebook Advertising Ecosystem

Athanasios Andreou (EURECOM), Márcio Silva (UFMG), Fabrício Benevenuto (UFMG), Oana Goga (Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG), Patrick Loiseau (Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG & MPI-SWS), Alan Mislove (Northeastern University)

Read More

NAUTILUS: Fishing for Deep Bugs with Grammars

Cornelius Aschermann (Ruhr-Universität Bochum), Tommaso Frassetto (Technische Universität Darmstadt), Thorsten Holz (Ruhr-Universität Bochum), Patrick Jauernig (Technische Universität Darmstadt), Ahmad-Reza Sadeghi (Technische Universität Darmstadt), Daniel Teuchert (Ruhr-Universität Bochum)

Read More