Miaoqian Lin (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Kai Chen (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Yi Yang (Institute of Information Engineering, Chinese Academy of…

Modern software often provides diverse APIs to facilitate development. Certain APIs, when used, can affect variables and require post-handling, such as error checks and resource releases. Developers should adhere to their usage specifications when using these APIs. Failure to do so can cause serious security threats, such as memory corruption and system crashes. Detecting such misuse depends on comprehensive API specifications, as violations of these specifications indicate API misuse. Previous studies have proposed extracting API specifications from various artifacts, including API documentation, usage patterns, and bug patches. However, these artifacts are frequently incomplete or unavailable for many APIs. As a result, the lack of specifications for uncovered APIs causes many false negatives in bug detection.

In this paper, we introduce the idea of API Specification Propagation, which suggests that API specifications propagate through hierarchical API call chains. In particular, modern software often adopts a hierarchical API design, where high-level APIs build on low-level ones. When high-level APIs wrap low-level ones, they may inherit the corresponding specifications. Based on this idea, we present APISpecGen, which uses known specifications as seeds and performs bidirectional propagation analysis to generate specifications for new APIs. Specifically, given the seed specifications, APISpecGen infers which APIs the specifications might propagate to or originate from. To further generate specifications for the inferred APIs, APISpecGen combines API usage and validates them using data-flow analysis based on the seed specifications. Besides, APISpecGen iteratively uses the generated specifications as new seeds to cover more APIs. For efficient and accurate analysis, APISpecGen focuses only on code relevant to the specifications, ignoring irrelevant semantics. We implemented APISpecGen and evaluated it for specification generation and API misuse detection. With 6 specifications as seeds, APISpecGen generated 7332 specifications. Most of the generated specifications could not be covered by state-of-the-art work due to the quality of their sources. With the generated specifications, APISpecGen detected 186 new bugs in the Linux kernel, 113 of them have been confirmed by the developers, with 8 CVEs assigned.

View More Papers

Transparency or Information Overload? Evaluating Users’ Comprehension and Perceptions...

Xiaoyuan Wu (Carnegie Mellon University), Lydia Hu (Carnegie Mellon University), Eric Zeng (Carnegie Mellon University), Hana Habib (Carnegie Mellon University), Lujo Bauer (Carnegie Mellon University)

Read More

Formally Verifying the Newest Versions of the GNSS-centric TESLA...

Ioana Boureanu, Stephan Wesemeyer (Surrey Centre for Cyber Security, University of Surrey)

Read More

Enhancing Security in Third-Party Library Reuse – Comprehensive Detection...

Shangzhi Xu (The University of New South Wales), Jialiang Dong (The University of New South Wales), Weiting Cai (Delft University of Technology), Juanru Li (Feiyu Tech), Arash Shaghaghi (The University of New South Wales), Nan Sun (The University of New South Wales), Siqi Ma (The University of New South Wales)

Read More

PQConnect: Automated Post-Quantum End-to-End Tunnels

Daniel J. Bernstein (University of Illinois at Chicago and Academia Sinica), Tanja Lange (Eindhoven University of Technology amd Academia Sinica), Jonathan Levin (Academia Sinica and Eindhoven University of Technology), Bo-Yin Yang (Academia Sinica)

Read More