Jiayun Xu (Singapore Management University), Yingjiu Li (University of Oregon), Robert H. Deng (Singapore Management University)

A common problem in machine learning-based malware detection is that training data may contain noisy labels and it is challenging to make the training data noise-free at a large scale. To address this problem, we propose a generic framework to reduce the noise level of training data for the training of any machine learning-based Android malware detection. Our framework makes use of all intermediate states of two identical deep learning classification models during their training with a given noisy training dataset and generate a noise-detection feature vector for each input sample. Our framework then applies a set of outlier detection algorithms on all noise-detection feature vectors to reduce the noise level of the given training data before feeding it to any machine learning based Android malware detection approach. In our experiments with three different Android malware detection approaches, our framework can detect significant portions of wrong labels in different training datasets at different noise ratios, and improve the performance of Android malware detection approaches.

View More Papers

Hunting the Haunter — Efficient Relational Symbolic Execution for...

Lesly-Ann Daniel (CEA, List, France), Sébastien Bardin (CEA, List, France), Tamara Rezk (Inria, France)

Read More

Trust the Crowd: Wireless Witnessing to Detect Attacks on...

Kai Jansen (Ruhr University Bochum), Liang Niu (New York University), Nian Xue (New York University), Ivan Martinovic (University of Oxford), Christina Pöpper (New York University Abu Dhabi)

Read More

WINNIE : Fuzzing Windows Applications with Harness Synthesis and...

Jinho Jung (Georgia Institute of Technology), Stephen Tong (Georgia Institute of Technology), Hong Hu (Pennsylvania State University), Jungwon Lim (Georgia Institute of Technology), Yonghwi Jin (Georgia Institute of Technology), Taesoo Kim (Georgia Institute of Technology)

Read More

ALchemist: Fusing Application and Audit Logs for Precise Attack...

Le Yu (Purdue University), Shiqing Ma (Rutgers University), Zhuo Zhang (Purdue University), Guanhong Tao (Purdue University), Xiangyu Zhang (Purdue University), Dongyan Xu (Purdue University), Vincent E. Urias (Sandia National Laboratories), Han Wei Lin (Sandia National Laboratories), Gabriela Ciocarlie (SRI International), Vinod Yegneswaran (SRI International), Ashish Gehani (SRI International)

Read More