Miaoqian Lin (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Kai Chen (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Yi Yang (Institute of Information Engineering, Chinese Academy of…

Modern software often provides diverse APIs to facilitate development. Certain APIs, when used, can affect variables and require post-handling, such as error checks and resource releases. Developers should adhere to their usage specifications when using these APIs. Failure to do so can cause serious security threats, such as memory corruption and system crashes. Detecting such misuse depends on comprehensive API specifications, as violations of these specifications indicate API misuse. Previous studies have proposed extracting API specifications from various artifacts, including API documentation, usage patterns, and bug patches. However, these artifacts are frequently incomplete or unavailable for many APIs. As a result, the lack of specifications for uncovered APIs causes many false negatives in bug detection.

In this paper, we introduce the idea of API Specification Propagation, which suggests that API specifications propagate through hierarchical API call chains. In particular, modern software often adopts a hierarchical API design, where high-level APIs build on low-level ones. When high-level APIs wrap low-level ones, they may inherit the corresponding specifications. Based on this idea, we present APISpecGen, which uses known specifications as seeds and performs bidirectional propagation analysis to generate specifications for new APIs. Specifically, given the seed specifications, APISpecGen infers which APIs the specifications might propagate to or originate from. To further generate specifications for the inferred APIs, APISpecGen combines API usage and validates them using data-flow analysis based on the seed specifications. Besides, APISpecGen iteratively uses the generated specifications as new seeds to cover more APIs. For efficient and accurate analysis, APISpecGen focuses only on code relevant to the specifications, ignoring irrelevant semantics. We implemented APISpecGen and evaluated it for specification generation and API misuse detection. With 6 specifications as seeds, APISpecGen generated 7332 specifications. Most of the generated specifications could not be covered by state-of-the-art work due to the quality of their sources. With the generated specifications, APISpecGen detected 186 new bugs in the Linux kernel, 113 of them have been confirmed by the developers, with 8 CVEs assigned.

View More Papers

Automated Mass Malware Factory: The Convergence of Piggybacking and...

Heng Li (Huazhong University of Science and Technology), Zhiyuan Yao (Huazhong University of Science and Technology), Bang Wu (Huazhong University of Science and Technology), Cuiying Gao (Huazhong University of Science and Technology), Teng Xu (Huazhong University of Science and Technology), Wei Yuan (Huazhong University of Science and Technology), Xiapu Luo (The Hong Kong Polytechnic University)

Read More

Transparency or Information Overload? Evaluating Users’ Comprehension and Perceptions...

Xiaoyuan Wu (Carnegie Mellon University), Lydia Hu (Carnegie Mellon University), Eric Zeng (Carnegie Mellon University), Hana Habib (Carnegie Mellon University), Lujo Bauer (Carnegie Mellon University)

Read More

Do (Not) Follow the White Rabbit: Challenging the Myth...

Soheil Khodayari (CISPA Helmholtz Center for Information Security), Kai Glauber (Saarland University), Giancarlo Pellegrino (CISPA Helmholtz Center for Information Security)

Read More

AlphaDog: No-Box Camouflage Attacks via Alpha Channel Oversight

Qi Xia (University of Texas at San Antonio), Qian Chen (University of Texas at San Antonio)

Read More