NDSS Symposium 2020 Programme
Sunday, 23 February
Monday, 24 February
-
Paul Forney, CISSP-ISSAP, CSSLP, CCSP, GREM
-
Taekjin Lee (KAIST, ETRI), Seongil Wi (KAIST), Suyoung Lee (KAIST), Sooel Son (KAIST)
An Unrestricted File Upload (UFU) vulnerability is a critical security threat that enables an adversary to upload her choice of a forged file to a target web server. This bug evolves into an Unrestricted Executable File Upload (UEFU) vulnerability when the adversary is able to conduct remote code execution of the uploaded file via triggering its URL. We design and implement FUSE, the first penetration testing tool designed to discover UFU and UEFU vulnerabilities in server-side PHP web applications. The goal of FUSE is to generate upload requests; each request becomes an exploit payload that triggers a UFU or UEFU vulnerability. However, this approach entails two technical challenges: (1) it should generate an upload request that bypasses all content-filtering checks present in a target web application; and (2) it should preserve the execution semantic of the resulting uploaded file. We address these technical challenges by mutating standard upload requests with carefully designed mutation operations that enable the bypassing of content- filtering checks and do not tamper with the execution of uploaded files. FUSE discovered 30 previously unreported UEFU vulnerabilities, including 15 CVEs from 33 real-world web applications, thereby demonstrating its efficacy in finding code execution bugs via file uploads.
-
Takuya Watanabe (NTT), Eitaro Shioji (NTT), Mitsuaki Akiyama (NTT), Tatsuya Mori (Waseda University, NICT, and RIKEN AIP)
Intermediary web services such as web proxies, web translators, and web archives have become pervasive as a means to enhance the openness of the web. These services aim to remove the intrinsic obstacles to web access; i.e., access blocking, language barriers, and missing web pages. In this study, we refer to these services as web rehosting services and make the first exploration of their security flaws. The web rehosting services use a single domain name to rehost several websites that have distinct domain names; this characteristic makes web rehosting services intrinsically vulnerable to violating the same origin policy if not operated carefully. Based on the intrinsic vulnerability of web rehosting services, we demonstrate that an attacker can perform five different types of attacks that target users who make use of web rehosting services: persistent man-in-the-middle attack, abusing privileges to access various resources, stealing credentials, stealing browser history, and session hijacking/injection. Our extensive analysis of 21 popular web rehosting services, which have more than 200 million visits per day, revealed that these attacks are feasible. In response to this observation, we provide effective countermeasures against each type of attack.
-
Giada Stivala (CISPA Helmholtz Center for Information Security), Giancarlo Pellegrino (CISPA Helmholtz Center for Information Security)
Social media has become a primary mean of content and information sharing, thanks to its speed and simplicity. In this scenario, link previews play the important role of giving a meaningful first glance to users, summarizing the content of the shared webpage within its title, description and image. In our work, we analyzed the preview-rendering process, observing how it is possible to misuse it to obtain benign-looking previews for malicious links. Concrete use-case of this research field is phishing and spam spread, considering targeted attacks in addition to large-scale campaigns.
We designed a set of experiments for 20 social media platforms including social networks and instant messenger applications and found out how most of the platforms follow their own preview design and format, sometimes providing partial information. Four of these platforms allow preview crafting so as to hide the malicious target even to a tech-savvy user, and we found that it is possible to create misleading previews for the remaining 16 platforms when an attacker can register their own domain. We also observe how 18 social media platforms do not employ active nor passive countermeasures against the spread of known malicious links or software, and that existing cross-checks on malicious URLs can be bypassed through client- and server-side redirections. To conclude, we suggest seven recommendations covering the spectrum of our findings, to improve the overall preview-rendering mechanism and increase users' overall trust in social media platforms.
-
Avinash Sudhodanan (IMDEA Software Institute), Soheil Khodayari (CISPA Helmholtz Center for Information Security), Juan Caballero (IMDEA Software Institute)
In a Cross-Origin State Inference (COSI) attack, an attacker convinces a victim into visiting an attack web page, which leverages the cross-origin interaction features of the victim’s web browser to infer the victim’s state at a target web site. Multiple instances of COSI attacks have been found in the past under different names such as login detection or access detection attacks. But, those attacks only consider two states (e.g., logged in or not) and focus on a specific browser leak method (or XS-Leak).
This work shows that mounting more complex COSI attacks such as deanonymizing the owner of an account, determining if the victim owns sensitive content, and determining the victim’s account type often requires considering more than two states. Furthermore, robust attacks require supporting a variety of browsers since the victim’s browser cannot be predicted apriori. To address these issues, we present a novel approach to identify and build complex COSI attacks that differentiate more than
two states and support multiple browsers by combining multiple attack vectors, possibly using different XS-Leaks. To enable our approach, we introduce the concept of a COSI attack class. We propose two novel techniques to generalize existing COSI attack instances into COSI attack classes and to discover new COSI attack classes. We systematically study existing attacks and apply our techniques to them, identifying 40 COSI attack classes. As part of this process, we discover a novel XS-Leak based on window.postMessage. We implement our approach into Basta-COSI, a tool to find COSI attacks in a target web site. We apply Basta-COSI to test four stand-alone web applications and 58 popular web sites, finding COSI attacks against each of them. -
Soroush Karami (University of Illinois at Chicago), Panagiotis Ilia (University of Illinois at Chicago), Konstantinos Solomos (University of Illinois at Chicago), Jason Polakis (University of Illinois at Chicago)
With users becoming increasingly privacy-aware and browser vendors incorporating anti-tracking mechanisms, browser fingerprinting has garnered significant attention. Accordingly, prior work has proposed techniques for identifying browser extensions and using them as part of a device's fingerprint. While previous studies have demonstrated how extensions can be detected through their web accessible resources, there exists a significant gap regarding techniques that indirectly detect extensions through behavioral artifacts. In fact, no prior study has demonstrated that this can be done in an automated fashion. In this paper, we bridge this gap by presenting the first fully automated creation and detection of behavior-based extension fingerprints. We also introduce two novel fingerprinting techniques that monitor extensions' communication patterns, namely outgoing HTTP requests and intra-browser message exchanges. These techniques comprise the core of Carnus, a modular system for the static and dynamic analysis of extensions, which we use to create the largest set of extension fingerprints to date. We leverage our dataset of 29,428 detectable extensions to conduct a comprehensive investigation of extension fingerprinting in realistic settings and demonstrate the practicality of our attack. Our experimental evaluation against a state-of-the-art countermeasure confirms the robustness of our techniques as 87.92% of our behavior-based fingerprints remain effective.
Subsequently, we aim to explore the true extent of the privacy threat that extension fingerprinting poses to users, and present a novel study on the feasibility of inference attacks that reveal private and sensitive user information based on the functionality and nature of their extensions. We first collect over 1.44 million public user reviews of our detectable extensions, which provide a unique macroscopic view of the browser extension ecosystem and enable a more precise evaluation of the discriminatory power of extensions as well as a new deanonymization vector. We also automatically categorize extensions based on the developers' descriptions and identify those that can lead to the inference of personal data (religion, medical issues, etc.). Overall, our research sheds light on previously unexplored dimensions of the privacy threats of extension fingerprinting and highlights the need for more effective countermeasures that can prevent our attacks.
-
Sergej Schumilo (Ruhr-Universität Bochum), Cornelius Aschermann (Ruhr-Universität Bochum), Ali Abbasi (Ruhr-Universität Bochum), Simon Wörner (Ruhr-Universität Bochum), Thorsten Holz (Ruhr-Universität Bochum)
Applying modern fuzzers to novel targets is often a very lucrative venture. Hypervisors are part of a very critical code base: compromising them could allow an attacker to compromise the whole cloud infrastructure of any cloud provider. In this paper, we build a novel fuzzer that aims explicitly at testing modern hypervisors.
Our high throughput fuzzer design for long running interactive targets allows us to fuzz a large number of hypervisors, both open source, and proprietary. In contrast to one-dimensional fuzzers such as AFL, HYPER-CUBE can interact with any number of interfaces in any order.
Our evaluation shows that we can find more bugs (over 2x) and coverage (as much as 2x) than state of the art hypervisor fuzzers. Additionally, in most cases, we were able to do so using multiple orders of magnitude less time than comparable fuzzers. HYPER-CUBE was also able to rediscover a set of well-known vulnerabilities for hypervisors, such as VENOM, in less than five minutes. In total, HYPER-CUBE found 54 novel bugs, and so far we obtained 37 CVEs.
Our evaluation results demonstrates that next generation coverage-guided fuzzers should incorporate a higher-throughput design for long running targets such as hypervisors.
-
Kyungtae Kim (Purdue University), Dae R. Jeong (KAIST), Chung Hwan Kim (NEC Labs America), Yeongjin Jang (Oregon State University), Insik Shin (KAIST), Byoungyoung Lee (Seoul National University)
Hybrid fuzzing, combining symbolic execution and fuzzing, is a promising approach for vulnerability discovery because each approach can complement the other. However, we observe that applying hybrid fuzzing to kernel testing is challenging because the following unique characteristics of the kernel make a naive adoption of hybrid fuzzing inefficient: 1) having many implicit control transfers determined by syscall arguments, 2) controlling and matching internal system state via system calls, and 3) inferring nested argument type for invoking system calls. Failure to handling such challenges will render both fuzzing and symbolic execution inefficient, and thereby, will result in an inefficient hybrid fuzzing. Although these challenges are essential to both fuzzing and symbolic execution, however, to the best of our knowledge, existing kernel testing approaches either naively use each technique separately without handling such challenges or imprecisely handle a part of challenges only by static analysis.
To this end, this paper proposes HFL, which not only combines fuzzing with symbolic execution for hybrid fuzzing but also addresses kernel-specific fuzzing challenges via three distinct features: 1) converting implicit control transfers to explicit transfers, 2) inferring system call sequence to build a consistent system state, and 3) identifying nested arguments types of system calls. As a result, HFL found 24 previously unknown vulnerabilities in recent Linux kernels. Additionally, HFL achieves 14% higher code coverage than Syzkaller, and over S2E/TriforceAFL, achieving even eight times better coverage, using the same amount of resource (CPU, time, etc.). Regarding vulnerability discovery performance, HFL found 13 known vulnerabilities more than three times faster than Syzkaller.
-
William Blair (Boston University), Andrea Mambretti (Northeastern University), Sajjad Arshad (Northeastern University), Michael Weissbacher (Northeastern University), William Robertson (Northeastern University), Engin Kirda (Northeastern University), Manuel Egele (Boston University)
Fifteen billion devices run Java and many of them are connected to the Internet. As this ecosystem continues to grow, it remains an important task to discover the unknown security threats these devices face. Fuzz testing repeatedly runs software on random inputs in order to trigger unexpected program behaviors, such as crashes or timeouts, and has historically revealed serious security vulnerabilities. Contemporary fuzz testing techniques focus on identifying memory corruption vulnerabilities that allow adversaries to achieve remote code execution. Meanwhile, algorithmic complexity (AC) vulnerabilities, which are a common attack vector for denial-of-service attacks, remain an understudied threat.
In this paper, we present HotFuzz, a framework for automatically discovering AC vulnerabilities in Java libraries. HotFuzz uses micro-fuzzing, a genetic algorithm that evolves arbitrary Java objects in order to trigger the worst-case performance for a method under test. We define Small Recursive Instantiation (SRI) which provides seed inputs to micro-fuzzing represented as Java objects. After micro-fuzzing, HotFuzz synthesizes test cases that triggered AC vulnerabilities into Java programs and monitors their execution in order to reproduce vulnerabilities outside the analysis framework. HotFuzz outputs those programs that exhibit high CPU utilization as witnesses for AC vulnerabilities in a Java library.
We evaluate HotFuzz over the Java Runtime Environment (JRE), the 100 most popular Java libraries on Maven, and challenges contained in the DARPA Space and Time Analysis for Cyber-Security (STAC) program. We compare the effectiveness of using seed inputs derived using SRI against using empty values. In this evaluation, we verified known AC vulnerabilities, discovered previously unknown AC vulnerabilities that we responsibly reported to vendors, and received confirmation from both IBM and Oracle. Our results demonstrate micro-fuzzing finds AC vulnerabilities in real-world software, and that micro-fuzzing with SRI derived seed inputs complements using empty seed inputs.
-
Yanhao Wang (Institute of Software, Chinese Academy of Sciences), Xiangkun Jia (Pennsylvania State University), Yuwei Liu (Institute of Software, Chinese Academy of Sciences), Kyle Zeng (Arizona State University), Tiffany Bao (Arizona State University), Dinghao Wu (Pennsylvania State University), Purui Su (Institute of Software, Chinese Academy of Sciences)
Coverage-based fuzzing has been actively studied and widely adopted for finding vulnerabilities in real-world software applications. With code coverage, such as statement coverage and transition coverage, as the guidance of input mutation, coverage-based fuzzing can generate inputs that cover more code and thus find more vulnerabilities without prerequisite information such as input format. Current coverage-based fuzzing tools treat covered code equally. All inputs that contribute to new statements or transitions are kept for future mutation no matter what the statements or transitions are and how much they impact security. Although this design is reasonable from the perspective of software testing, which aims to full code coverage, it is inefficient for vulnerability discovery since that 1) current techniques are still inadequate to reach full coverage within a reasonable amount of time, and that 2) we always want to discover vulnerabilities early so that it can be patched promptly. Even worse, due to the non-discriminative code coverage treatment, current fuzzing tools suffer from recent anti-fuzzing techniques and become much less effective in finding real-world vulnerabilities.
To resolve the issue, we propose coverage accounting, an innovative approach that evaluates code coverage by security impacts. Based on the proposed metrics, we design a new scheme to prioritize fuzzing inputs and develop TortoiseFuzz, a greybox fuzzer for memory corruption vulnerabilities. We evaluated TortoiseFuzz on 30 real-world applications and compared it with 5 state-of-the-art greybox and hybrid fuzzers (AFL, AFLFast, FairFuzz, QSYM, and Angora). TortoiseFuzz outperformed all greybox fuzzers and most hybrid fuzzers. It also had comparative results for other hybrid fuzzers yet consumed much fewer resources. Additionally, TortoiseFuzz found 18 new real-world vulnerabilities and has got 8 new CVEs so far. We will open source TortoiseFuzz to foster future research.
-
Sergey Frolov (University of Colorado Boulder), Jack Wampler (University of Colorado Boulder), Eric Wustrow (University of Colorado Boulder)
Censorship circumvention proxies have to resist active probing attempts, where censors connect to suspected servers and attempt to communicate using known proxy protocols. If the server responds in a way that reveals it is a proxy, the censor can block it with minimal collateral risk to other non-proxy services. Censors such as the Great Firewall of China have previously been observed using basic forms of this technique to find and block proxy servers as soon as they are used. In response, circumventors have created new “probe-resistant” proxy protocols, including obfs4, Shadowsocks, and Lampshade, that attempt to prevent censors from discovering them. These proxies require knowledge of a secret in order to use, and the servers remain silent when probed by a censor that doesn’t have the secret in an attempt to make it more difficult for censors to detect them.
In this paper, we identify ways that censors can still distinguish such probe-resistant proxies from other innocuous hosts on the Internet, despite their design. We discover unique TCP behaviors of five probe-resistant protocols used in popular circumvention software that could allow censors to effectively confirm suspected proxies with minimal false positives. We evaluate and analyze our attacks on hundreds of thousands of servers collected from a 10 Gbps university ISP vantage point over several days as well as active scanning using ZMap. We find that our attacks are able to efficiently identify proxy servers with only a handful of probing connections, with negligible false positives. Using our datasets, we also suggest defenses to these attacks that make it harder for censors to distinguish proxies from other common servers, and we work with proxy developers to implement these changes in several popular circumvention tools.
-
Reethika Ramesh (University of Michigan), Ram Sundara Raman (University of Michgan), Matthew Bernhard (University of Michigan), Victor Ongkowijaya (University of Michigan), Leonid Evdokimov (Independent), Anne Edmundson (Independent), Steven Sprecher (University of Michigan), Muhammad Ikram (Macquarie University), Roya Ensafi (University of Michigan)
Until now, censorship research has largely focused on highly centralized networks that rely on government-run technical choke-points, such as the Great Firewall of China. Although it was previously thought to be prohibitively difficult, large-scale censorship in decentralized networks are on the rise. Our in-depth investigation of the mechanisms underlying decentralized information control in Russia shows that such large-scale censorship can be achieved in decentralized networks through inexpensive commodity equipment. This new form of information control presents a host of problems for censorship measurement, including difficulty identifying censored content, requiring measurements from diverse perspectives, and variegated censorship mechanisms that require significant effort to identify in a robust manner.
By working with activists on the ground in Russia, we obtained five leaked blocklists signed by Roskomnadzor, the Russian government’s federal service for mass communications, along with seven years of historical blocklist data. This authoritative list contains domains, IPs, and subnets that ISPs have been required to block since November 1st, 2012. We used the blocklist from April 24 2019, that contains 132,798 domains, 324,695 IPs, and 39 subnets, to collect active measurement data from residential, data center and infrastructural vantage points. Our vantage points span 408 unique ASes that control ~ 65% of Russian IP address space.
Our findings suggest that data centers block differently from the residential ISPs both in quantity and in method of blocking, resulting in different experiences of the Internet for residential network perspectives and data center perspectives. As expected, residential vantage points experience high levels of censorship. While we observe a range of blocking techniques, such as TCP/IP blocking, DNS manipulation, or keyword based filtering, we find that residential ISPs are more likely to inject blockpages with explicit notices to users when censorship is enforced. Russia’s censorship architecture is a blueprint, and perhaps a forewarning of what and how national censorship policies could be implemented in many other countries that have similarly diverse ISP ecosystems to Russia’s. Understanding decentralized control will be key to continuing to preserve Internet freedom for years to come.
-
Ram Sundara Raman (University of Michigan), Adrian Stoll (University of Michigan), Jakub Dalek (Citizen Lab, University of Toronto), Reethika Ramesh (University of Michigan), Will Scott (Independent), Roya Ensafi (University of Michigan)
Content filtering technologies are often used for Internet censorship, but even as these technologies have become cheaper and easier to deploy, the censorship measurement community lacks a systematic approach to monitor their proliferation. Past research has focused on a handful of specific filtering technologies, each of which required cumbersome manual detective work to identify. Researchers and policymakers require a more comprehensive picture of the state and evolution of censorship based on content filtering in order to establish effective policies that protect Internet freedom.
In this work, we present FilterMap, a novel framework that can scalably monitor content filtering technologies based on their blockpages. FilterMap first compiles in-network and new remote censorship measurement techniques to gather blockpages from filter deployments. We then show how the observed blockpages can be clustered, generating signatures for longitudinal tracking. FilterMap outputs a map of regions of address space in which the same blockpages appear (corresponding to filter deployments), and each unique blockpage is manually verified to avoid false positives.
By collecting and analyzing more than 379 million measurements from 45,000 vantage points against more than 18,000 sensitive test domains, we are able to identify filter deployments associated with 90 vendors and actors and observe filtering in 103 countries. We detect the use of commercial filtering technologies for censorship in 36 out of 48 countries labeled as 'Not Free' or 'Partly Free' by the Freedom House ''Freedom on the Net'' report. The unrestricted transfer of content filtering technologies have led to high availability, low cost, and highly effective filtering techniques becoming easier to deploy and harder to circumvent. Identifying these filtering deployments highlights policy and corporate social responsibility issues, and adds accountability to filter manufacturers. Our continued publication of FilterMap data will help the international community track the scope, scale and evolution of content-based censorship.
-
Zhongjie Wang (University of California, Riverside), Shitong Zhu (University of California, Riverside), Yue Cao (University of California, Riverside), Zhiyun Qian (University of California, Riverside), Chengyu Song (University of California, Riverside), Srikanth V. Krishnamurthy (University of California, Riverside), Kevin S. Chan (U.S. Army Research Lab), Tracy D. Braun (U.S. Army Research Lab)
A key characteristic of commonly deployed deep packet inspection (DPI) systems is that they implement a simpli- fied state machine of the network stack that often differs from that of the end hosts. The discrepancies between the two state machines have been exploited to bypass such DPI middleboxes. However, most prior approaches to do so rely on manually crafted adversarial packets, which not only is labor-intensive but may not work well across a plurality of DPI-based middleboxes. Our goal in this work is to develop an automated way to craft such candidate packets, targeting TCP implementations in particular. Our approach to achieve this goal hinges on the key insight that while the TCP state machines of DPI implementations are obscure, those of the end hosts are well established. Thus, in our system SYMTCP, using symbolic execution, we systematically explore the TCP implementation of an end host, identifying candidate packets that can reach critical points in the code (e.g., which causes the packets to be accepted or dropped/ignored); such automatically identified packets are then fed through the DPI middlebox to determine if a discrepancy is induced and the middlebox can be bypassed. We find that our approach is extremely effective. It can generate tens of thousands of candidate adversarial packets in less than an hour. When evaluating against multiple state-of-the-art DPI middleboxes such as Zeek and Snort, as well as a state-level censorship firewall, Great Firewall of China, we identify not only previously known evasion strategies, but also novel ones that were never previously reported (e.g., involving urgent pointer). The system can extend easily to test other combinations of operating systems and DPI middleboxes, and serve as a valuable testing tool of future DPIs’ robustness against evasion attempts.
-
Milad Nasr (University of Massachusetts Amherst), Hadi Zolfaghari (University of Massachusetts Amherst), Amir Houmansadr (University of Massachusetts Amherst), Amirhossein Ghafari (University of Massachusetts Amherst)
Existing censorship circumvention systems fail to offer reliable circumvention without sacrificing their users' QoS and privacy, or undertaking high costs of operation. We have designed and implemented a censorship circumvention system, called SwarmProxy (anonymized name), whose goal is to offer emph{effective censorship circumvention} to a large body of censored users, with emph{high QoS}, emph{low costs of operation}, and emph{adjustable privacy protection}. Towards this, we have made several key decisions in designing our system.
First, we argue that circumvention systems should not bundle strong privacy protections (like anonymity) with censorship circumvention. Additional privacy properties should be offered as optional features to the users of circumvention users, which can be enabled by specific users or on specific connections (perhaps by trading off QoS).
Second, we combine various state-of-the-art circumvention techniques (such as using censored clients to proxy circumvention traffic for other censored clients, using volunteer NATed proxies, and leveraging CDN hosting) to make SwarmProxy significantly resistant to blocking, while keeping its cost of operation small ($0.001 per censored client per month).
We have built and deployed SwarmProxy as a fully operational system with end-user GUI software for major operating systems. Our system has been in beta release for over a year with hundreds of users from major censoring countries testing it on a daily basis.
A key part of SwarmProxy's design is using non-censored Internet users to run volunteer proxies to help censored users. We have performed the first user study on the willingness of typical Internet users in helping circumvention operators.
We have used the findings of our user study in the design of SwarmProxy to encourage wide adoption by volunteers; particularly, our GUI software offers high transparency, control, and safety to the volunteers.
-
Yanzi Zhu (UC Santa Barbara), Zhujun Xiao (University of Chicago), Yuxin Chen (University of Chicago), Zhijing Li (UC Santa Barbara), Max Liu (University of Chicago), Ben Y. Zhao (University of Chicago), Heather Zheng (University of Chicago)
Wireless devices are everywhere, constantly bombarding us with transmissions across a wide range of RF frequencies. Many of these invisible transmissions reflect off our bodies, carrying off information about our location, movement, and other physiological properties. While a boon to professionals with carefully calibrated instruments, they may also be revealing our physical
status to potential attackers nearby.Our work demonstrates a new set of silent reconnaissance attacks that leverages the presence of commodity WiFi devices to track users inside private homes and offices, without compromising any WiFi network, data packets, or devices. We show that just by sniffing existing WiFi signals, an
adversary can accurately detect and track movements of users inside a building. This is made possible by our new signal model that links together human motion near WiFi transmitters and variance of multipath signal propagation seen by the attacker sniffer outside of the property.
These attacks are cheap, highly effective, and difficult to detect. We implement
the attack using a single commodity smartphone, and deploy it in 11 real-world offices and residential apartments, and show it is highly effective. Finally, we evaluate potential defenses, and
propose a practical and effective defense based on AP signal obfuscation. -
Tao Chen (City University of Hong Kong), Longfei Shangguan (Microsoft), Zhenjiang Li (City University of Hong Kong), Kyle Jamieson (Princeton University)
This paper presents Metamorph, a system that generates imperceptible audio that can survive over-the-air transmission to attack the neural network of a speech recognition system. The key challenge stems from how to ensure the added perturbation of the original audio in advance at the sender side is immune to unknown signal distortions during the transmission process. Our empirical study reveals that signal distortion is mainly due to device and channel frequency selectivity but with different characteristics. This brings a chance to capture and further pre-code this impact to generate adversarial examples that are robust to the over-the-air transmission. We leverage this opportunity in Metamorph and obtain an initial perturbation that captures the core distortion's impact from only a small set of prior measurements, and then take advantage of a domain adaptation algorithm to refine the perturbation to further improve the attack distance and reliability. Moreover, we consider also reducing human perceptibility of the added perturbation. Evaluation achieves a high attack success rate (95%) over the attack distance of up to 6 m. Within a moderate distance, e.g., 3 m, Metamorph maintains a high success rate (98%), yet can be further adapted to largely improve the audio quality, confirmed by a human perceptibility study.
-
Qiben Yan (Michigan State University), Kehai Liu (Chinese Academy of Sciences), Qin Zhou (University of Nebraska-Lincoln), Hanqing Guo (Michigan State University), Ning Zhang (Washington University in St. Louis)
With recent advances in artificial intelligence and natural language processing, voice has become a primary method for human-computer interaction, which has enabled game-changing new technologies in both commercial sector such as Siri, Alexa or Google Assistant and the military sector in voice-controlled naval warships. Recently, researchers have demonstrated that these voice assistant systems are susceptible to signal injection from voice commands at the inaudible frequency. To date, most of the existing work focus primarily on delivering a single command via line-of-sight ultrasound speaker and extending the range of this attack via speaker array. However, sound waves also propagate through other materials where vibration is possible. In this work, we aim to understand the characteristics of this new genre of attack in the context of different transmission media. Furthermore, by leveraging the unique properties of acoustic transmission in solid materials, we design a new attack called SurfingAttack that will allow multiple rounds of interactions with the voice-controlled device over a longer distance and without the need to be in line-of-sight, resulting in minimal change to the physical environment. This has greatly elevated the potential risk of inaudible sound attack, enabling many new attack scenarios, such as hijacking a mobile Short Message Service (SMS) passcode, making ghost fraud calls without owners' knowledge, etc. To accomplish SurfingAttack, we have solved several major challenges. First, the signal has been specially designed to allow omni-directional transmission for performing effective attacks over a solid medium. Second, the new attack enables two-way communication without alerting the legitimate user at the scene, which is challenging since the device is designed to interact with human in physical proximity rather than sensors. To mitigate this newly discovered threat, we also provide discussions and experimental results on potential countermeasures to defend against this new threat.
-
Rahmadi Trimananda (University of California, Irvine), Janus Varmarken (University of California, Irvine), Athina Markopoulou (University of California, Irvine), Brian Demsky (University of California, Irvine)
Smart home devices are vulnerable to passive inference attacks based on network traffic, even in the presence of encryption. In this paper, we present PINGPONG, a tool that can automatically extract packet-level signatures for device events (e.g., light bulb turning ON/OFF) from network traffic. We evaluated PINGPONG on popular smart home devices ranging from smart plugs and thermostats to cameras, voice-activated devices, and smart TVs. We were able to: (1) automatically extract previously unknown signatures that consist of simple sequences of packet lengths and directions; (2) use those signatures to detect the devices or specific events with an average recall of more than 97%; (3) show that the signatures are unique among hundreds of millions of packets of real world network traffic; (4) show that our methodology is also applicable to publicly available datasets; and (5) demonstrate its robustness in different settings: events triggered by local and remote smartphones, as well as by home automation systems.
-
Zhongjie Ba (Zhejiang University and McGill University), Tianhang Zheng (University of Toronto), Xinyu Zhang (Zhejiang University), Zhan Qin (Zhejiang University), Baochun Li (University of Toronto), Xue Liu (McGill University), Kui Ren (Zhejiang University)
Motion sensors on current smartphones have been exploited for audio eavesdropping due to their sensitivity to vibrations. However, this threat is considered low-risk because of two widely acknowledged limitations: First, unlike microphones, motion sensors can only pick up speech signals traveling through a solid medium. Thus the only feasible setup reported previously is to use a smartphone gyroscope to eavesdrop on a loudspeaker placed on the same table. The second limitation comes from a common sense that these sensors can only pick up a narrow band (85-100Hz) of speech signals due to a sampling ceiling of 200Hz. In this paper, we revisit the threat of motion sensors to speech privacy and propose AccelEve, a new side-channel attack that employs a smartphone's accelerometer to eavesdrop on the speaker in the same smartphone. Specifically, it utilizes the accelerometer measurements to recognize the speech emitted by the speaker and to reconstruct the corresponding audio signals. In contrast to previous works, our setup allows the speech signals to always produce strong responses in accelerometer measurements through the shared motherboard, which successfully addresses the first limitation and allows this kind of attacks to penetrate into real-life scenarios. Regarding the sampling rate limitation, contrary to the widely-held belief, we observe up to 500Hz sampling rates in recent smartphones, which almost covers the entire fundamental frequency band (85-255Hz) of adult speech. On top of these pivotal observations, we propose a novel deep learning based system that learns to recognize and reconstruct speech information from the spectrogram representation of acceleration signals. This system employs adaptive optimization on deep neural networks with skip connections using robust and generalizable losses to achieve robust recognition and reconstruction performance. Extensive evaluations demonstrate the effectiveness and high accuracy of our attack under various settings.
-
Haohuang Wen (The Ohio State University), Qingchuan Zhao (The Ohio State University), Qi Alfred Chen (University of California, Irvine), Zhiqiang Lin (The Ohio State University)
In modern automobiles, CAN bus commands are necessary for a wide range of functionalities such as diagnosis, security monitoring, and recently autonomous driving. However, their specifications are developed privately by car manufacturers, and today the most effective way of revealing the proprietary CAN bus commands is to reverse engineer (e.g., dynamic test) with real cars, which is time consuming, costly, and error-prone. In this paper, we propose a cost-effective (no real car needed) and automatic (no human intervention required) approach for reverse engineering CAN bus commands using just car companion mobile apps. To achieve high effectiveness, we design a new technique to uncover the syntactics of CAN bus commands with backward slicing and dynamic forced execution, and a novel program-based algorithm to uncover the semantics of CAN bus commands by leveraging code-level semantics clues. We have implemented a prototype for both Android and iOS platforms, and tested it with all free car companion apps (253 in total) from both Google Play and Apple App Store. Among these apps, CANHUNTER discovered 182,619 syntactically unique CAN bus commands with 86% of them revealed with semantics, covering 360 car models from 21 car manufactures. We have also evaluated their correctness (both syntactics and semantics) using public resources, cross-platform and cross-app validation, and also real-car testing, in which 70% of all the uncovered commands are validated. We observe no inconsistency in cross-platform and cross-app validation, and only discover 3 false positives (among the 241 manually validated CAN bus commands) in semantics recovery from public resources and real-car testing.
-
Imani N. Sherman (University of Florida), Jasmine D. Bowers (University of Florida), Keith McNamara Jr. (University of Florida), Juan E. Gilbert (University of Florida), Jaime Ruiz (University of Florida), Patrick Traynor (University of Florida)
Robocalls are inundating phone users. These automated calls allow for attackers to reach massive audiences with scams ranging from credential hijacking to unnecessary IT support in a largely untraceable fashion. In response, many applications have been developed to alert mobile phone users of incoming robocalls. However, how well these applications communicate risk with their users is not well understood. In this paper, we identify common real-time security indicators used in the most popular anti-robocall applications. Using focus groups and user testing, we first identify which of these indicators most effectively alert users of danger. We then demonstrate that the most powerful indicators can reduce the likelihood that users will answer such calls by as much as 43%. Unfortunately, our evaluation also shows that attackers can eliminate the gains provided by such indicators using a small amount of target-specific information (e.g., a known phone number). In so doing, we demonstrate that anti-robocall indicators could benefit from significantly increased attention from the research community.
-
Faysal Hossain Shezan (University of Virginia), Kaiming Cheng (University of Virginia), Zhen Zhang (Johns Hopkins University), Yinzhi Cao (Johns Hopkins University), Yuan Tian (University of Virginia)
Permission-based access control enables users to manage and control their sensitive data for third-party applications. In an ideal scenario, third-party application includes enough details to illustrate the usage of such data, while the reality is that many descriptions of third-party applications are vague about their security or privacy activities. As a result, users are left with insufficient details when granting sensitive data to these applications.
Prior works, such as WHYPER and AutoCog, have addressed the aforementioned problem via a so-called permission correlation system. Such a system correlates third-party applications' description with their requested permissions and determines an application as overprivileged if a mismatch is found. However, although prior works are successful on their own platforms, such as Android eco-system, they are not directly applicable to new platforms, such as Chrome extensions and IFTTT, without extensive data labeling and parameter tuning.
In this paper, we design, implement, and evaluate a novel system, called TKPERM, which transfers knowledges of permission correlation systems across platforms. Our key idea is that these varied platforms with different use cases---like smartphones, IoTs, and desktop browsers---are all user-facing and thus allow the knowledges to be transferrable across platforms. Particularly, we adopt a greedy selection algorithm that picks the best source domains to transfer to the target permission on a new platform.
TKPERM achieves 90.02% overall F1 score after transfer, which is 12.62% higher than the one of a model trained directly on the target domain without transfer. Particularly, TKPERM has 91.83% F1 score on IFTTT, 89.13% F1 score on Chrome-Extension, and 89.1% F1 score on SmartThings. TKPERM also successfully identified many real-world overprivileged applications, such as a gaming hub requesting location permissions without legitimate use.
-
Thijs van Ede (University of Twente), Riccardo Bortolameotti (Bitdefender), Andrea Continella (UC Santa Barbara), Jingjing Ren (Northeastern University), Daniel J. Dubois (Northeastern University), Martina Lindorfer (TU Wien), David Choffnes (Northeastern University), Maarten van Steen (University of Twente), Andreas Peter (University of Twente)
Mobile-application fingerprinting of network traffic is a valuable tool for many security solutions as it provides insights into the apps active on a network.
Unfortunately, existing techniques require prior knowledge of apps to be able to recognize them.
However, mobile environments are constantly evolving, i.e., apps are regularly installed, updated, and uninstalled.
Therefore, it is infeasible for existing fingerprinting approaches to cover all apps that may appear on a network.
Moreover, most mobile traffic is encrypted, shows similarities with other apps, e.g., due to common libraries or the use of content delivery networks, and depends on user input, further complicating the fingerprinting process.As a solution, we propose FlowPrint, an unsupervised approach for creating mobile app fingerprints from (encrypted) network traffic.
We automatically find temporal correlations among destination-related features of network traffic and use these correlations to generate app fingerprints.
As this approach is unsupervised, we are able to fingerprint previously unseen apps, something that existing techniques fail to achieve.
We evaluate our approach for both Android and iOS in the setting of app recognition where we achieve an accuracy of 89.2%, outperforming state-of-the-art solutions by 39.0%.
In addition, we show that our approach can detect previously unseen apps with a precision of 93.5%, detecting 72.3% of apps within the first five minutes of communication.
-
George Bissias (University of Massachusetts Amherst), Brian N. Levine (University of Massachusetts Amherst)
Blockchain systems are designed to produce blocks at a constant average rate. The most popular systems currently employ a Proof of Work (PoW) algorithm as a means of creating these blocks. An unfortunate limitation of all deployed PoW blockchain systems is that the time between blocks has high variance. For example, Bitcoin produces, on average, one block every 10 minutes. However, 5% of the time, Bitcoin's inter-block time is at least 40 minutes.
In this paper, we show that high variance is at the root of fundamental attacks on PoW blockchains. We propose an alternative process for PoW-based block discovery that results in an inter-block time with significantly lower variance. Our algorithm, called _Bobtail_, generalizes the current algorithm, which uses a single PoW sample, to one that incorporates $k$ samples. We show that the variance of inter-block times decreases as $k$ increases. Bobtail significantly thwarts doublespend and selfish mining attacks. For example, for Bitcoin and Ethereum, a doublespending attacker with 40% of the mining power will succeed with 53% probability when the merchant sets up an embargo of 1 block; however, when $kgeq40$, the probability of success for the same attacker falls to less than 1%. Similarly, for Bitcoin and Ethereum currently, a selfish miner with 49% of the mining power will claim about 95% of blocks; however, when $kgeq20$, the same miner will find that selfish mining is less successful than honest mining. We also investigate attacks newly made possible by Bobtail and show how they can be defeated. The primary costs of our approach are larger blocks and increased network traffic.
-
Vasilios Mavroudis (University College London), Karl Wüst (ETH Zurich), Aritra Dhar (ETH Zurich), Kari Kostiainen (ETH Zurich), Srdjan Capkun (ETH Zurich)
Permissionless blockchains offer many advantages but also have significant limitations including high latency. This prevents their use in important scenarios such as retail payments, where merchants should approve payments fast. Prior works have attempted to mitigate this problem by moving transactions off the chain. However, such Layer-2 solutions have their own problems: payment channels require a separate deposit towards each merchant and thus significant locked-in funds from customers; payment hubs require very large operator deposits that depend on the number of customers; and side-chains require trusted validators.
In this paper, we propose Snappy, a novel solution that enables recipients, like merchants, to safely accept fast payments. In Snappy, all payments are on the chain, while small customer collaterals and moderate merchant collaterals act as payment guarantees. Besides receiving payments, merchants also act as statekeepers who collectively track and approve incoming payments using majority voting. In case of a double-spending attack, the victim merchant can recover lost funds either from the collateral of the malicious customer or a colluding statekeeper (merchant). Snappy overcomes the main problems of previous solutions: a single customer collateral can be used to shop with many merchants; merchant collaterals are independent of the number of customers; and validators do not have to be trusted. Our Ethereum prototype shows that safe, fast (<2 seconds) and cheap payments are possible on existing blockchains.
-
Parinya Ekparinya (University of Sydney), Vincent Gramoli (University of Sydney and CSIRO-Data61), Guillaume Jourjon (CSIRO-Data61)
The vulnerability of traditional blockchains have been demonstrated at multiple occasions. Various companies are now moving towards Proof-of-Authority (PoA) blockchains with more conventional Byzantine fault tolerance, where a known set of n permissioned sealers, among which no more than t are Byzantine, seal blocks that include user transactions. Despite their wide adoption, these protocols were not proved correct.
In this paper, we present the Cloning Attack against the two mostly deployed PoA implementations of Ethereum, namely Aura and Clique. The Cloning Attack consists of one sealer cloning its pair of public-private keys into two distinct Ethereum instances that communicate with distinct groups of sealers. To identify their vulnerabilities, we first specify the corresponding algorithms. We then deploy one testnet for each protocol and demonstrate the success of the attack with only one Byzantine sealer. Finally, we propose counter-measures that prevent an adversary from double spending and introduce the necessary number of sealers needed to decide a block depending on n and t for both Aura and Clique to be safe.
-
Daniel Perez (Imperial College London), Benjamin Livshits (Imperial College London, UCL Centre for Blockchain Technologies, and Brave Software)
Metering is an approach developed to assign cost to smart contract execution in blockchain systems such as Ethereum. This paper presents a detailed investigation of the metering approach based on emph{gas} taken by the Ethereum blockchain. We discover a number of discrepancies in the metering model such as significant inconsistencies in the pricing of the instructions. We further demonstrate that there is very little correlation between the gas and resources such as CPU and memory. We find that the main reason for this is that the gas price is dominated by the amount of emph{storage} that is used.
Based on the observations above, we present a new type of DoS attack we call~emph{Resource Exhaustion Attack}, which uses these imperfections to generate low-throughput contracts. Using this method, we show that we are able to generate contracts with a throughput on average 50 times slower than typical contracts. These contracts can be used to prevent nodes with lower hardware capacity from participating in the network, thereby artificially reducing the level of centralization the network can deliver.
-
Venkat Arun (Massachusetts Institute of Technology), Aniket Kate (Purdue University), Deepak Garg (Max Planck Institute for Software Systems), Peter Druschel (Max Planck Institute for Software Systems), Bobby Bhattacharjee (University of Maryland)
For fear of retribution, the victim of a crime may be willing to report it only if other victims of the same perpetrator also step forward. Common examples include identifying oneself as the victim of sexual harassment by a person in a position of authority or accusing an influential politician, an authoritarian government or ones own employer of corruption. To handle such situations, legal literature has proposed the concept of an allegation escrow, a neutral third-party that collects allegations anonymously, matches allegations against each other, and de-anonymizes allegers only after de-anonymity thresholds (in terms of number of co-allegers), pre-specified by the allegers, are reached.
An allegation escrow can be realized as a single trusted third party; however, this party must be trusted to keep the identity of the alleger and content of the allegation private. To address this problem, this paper introduces Secure Allegation Escrows (SAE, pronounced ``say''). A SAE is a group of parties with independent interests and motives, acting jointly as an escrow for collecting allegations from individuals, matching the allegations, and de-anonymizing the allegations when designated thresholds are reached. By design, SAEs provide a very strong property: No less than a majority of parties constituting a SAE can de-anonymize or disclose the content of an allegation without a sufficient number of matching allegations (even in collusion with any number of other allegers). Once a sufficient number of matching allegations exist, all parties can simultaneously disclose the allegation with a verifiable proof of the allegers' identities. We describe how SAEs can be constructed using a novel authentication protocol and a novel allegation matching and bucketing algorithm, provide formal proofs of the security of our constructions, and provide an evaluation of a prototype implementation, demonstrating feasibility in practice.
Tuesday, 25 February
-
Jiahao Cao (Tsinghua University; George Mason University), Renjie Xie (Tsinghua University), Kun Sun (George Mason University), Qi Li (Tsinghua University), Guofei Gu (Texas A&M University), Mingwei Xu (Tsinghua University)
Software-Defined Networking (SDN) greatly meets the need in industry for programmable, agile, and dynamic networks by deploying diversified SDN applications on a centralized controller. However, SDN application ecosystem inevitably introduces new security threats since compromised or malicious applications can significantly disrupt network operations. A number of effective security enhancement systems have been developed to defend against potential attacks from SDN applications, including data provenance systems to protect applications from being poisoned by malicious applications, rule conflict detection systems to prevent data packets from bypassing network security policies, and application isolation systems to prevent applications from corrupting controllers. In this paper, we identify a new design flaw on flow rule installation in SDN, and this vulnerability can be exploited by malicious applications to launch effective attacks bypassing existing defense systems. We discover that SDN systems do not check the inconsistency between the buffer ID and match fields when an application attempts to install flow rules, so that a malicious application can manipulate the buffer ID to hijack buffered packets even though the installed flow rule from the application does not match the packet with that buffer ID. We name this new vulnerability as *buffered packet hijacking*, which can be exploited to launch attacks that disrupt all three SDN layers, namely, application layer, data plane layer, and control layer. First, by modifying buffered packets and resending them to controllers, a malicious application can poison other applications. Second, by manipulating forwarding behaviors of buffered packets, a malicious application can not only disrupt TCP connections of flows but also make flows bypass network security policies. Third, by copying massive buffered packets to controllers, a malicious application can saturate the bandwidth of the SDN control channel and computing resources. We demonstrate the feasibility and effectiveness of these attacks with both theoretical analysis and experiments in a real SDN testbed. Finally, we develop a lightweight defense system that can be readily deployed in existing SDN controllers as a patch.
-
Benjamin E. Ujcich (University of Illinois at Urbana-Champaign), Samuel Jero (MIT Lincoln Laboratory), Richard Skowyra (MIT Lincoln Laboratory), Steven R. Gomez (MIT Lincoln Laboratory), Adam Bates (University of Illinois at Urbana-Champaign), William H. Sanders (University of Illinois at Urbana-Champaign), Hamed Okhravi (MIT Lincoln Laboratory)
Software-defined networking (SDN) achieves a programmable control plane through the use of logically centralized, event-driven controllers and through network applications (apps) that extend the controllers’ functionality. As control plane decisions are often based on the data plane, it is possible for carefully-crafted malicious data plane inputs to direct the control plane towards unwanted states that bypass network security restrictions (i.e., cross-plane attacks). Unfortunately, due to the complex interplay between controllers, apps, and data plane inputs, at present it is difficult to systematically identify and analyze these cross-plane vulnerabilities.
We present EventScope, a vulnerability detection tool that automatically analyzes SDN control plane event usage, discovers candidate vulnerabilities based on missing event handling routines, and validates vulnerabilities based on data plane effects. To accurately detect missing event handlers without ground truth or developer aid, we cluster apps according to similar event usage and mark inconsistencies as candidates. We create an event flow graph to observe a global view of events and control flows within the control plane and use it to validate vulnerabilities that affect the data plane. We applied EventScope to the ONOS SDN controller and uncovered 14 new vulnerabilities.
-
Jonghoon Kwon (ETH), Taeho Lee (ETH), Claude Hähni (ETH), Adrian Perrig (ETH)
Network isolation is a critical modern Internet service. To date, network operators have created a logical network of distributed systems to provide communication isolation between different parties. However, the current network isolation is limited in scalability and flexibility. It limits the number of virtual networks and it only supports isolation at host (or virtual-machine) granularity. In this paper, we introduce Scalable Virtual Local Area Networking (SVLAN) that scales to a large number of distributed systems and offers improved flexibility in providing secure network isolation. With the notion of destination-driven reachability and packet-carrying forwarding state, SVLAN not only offers communication isolation but isolation can be specified at different granularities, e.g., per-application or per-process. Our proof-of-concept SVLAN implementation demonstrates its feasibility and practicality for real-world applications.
Saltaformaggio
-
Naif Saleh Almakhdhub (Purdue University and King Saud University), Abraham A. Clements (Sandia National Laboratories), Saurabh Bagchi (Purdue University), Mathias Payer (EPFL)
Embedded systems are deployed in security critical environments and have become a prominent target for remote attacks. Microcontroller-based systems (MCUS) are particularly vulnerable due to a combination of limited resources and low level programming which leads to bugs. Since MCUS are often a part of larger systems, vulnerabilities may jeopardize not just the security of the device itself but that of other systems as well. For example, exploiting a WiFi System on Chip (SoC) allows an attacker to hijack the smart phone's application processor.
Control-flow hijacking targeting the backward edge (e.g., Return-Oriented Programming--ROP) remains a threat for MCUS. Current defenses are either susceptible to ROP-style attacks or require special hardware such as a Trusted Execution Environment (TEE) that is not commonly available on MCUS.
We present µRAI, a compiler-based mitigation to emph{prevent} control-flow hijacking attacks targeting backward edges by enforcing the emph{Return Address Integrity (RAI)} property on MCUS. µRAI does not require any additional hardware such as TEE, making it applicable to the wide majority of MCUS. To achieve this, µRAI introduces a technique that moves return addresses from writable memory, to readable and executable memory. It re-purposes a single general purpose register that is never spilled, and uses it to resolve the correct return location. We evaluate against the different control-flow hijacking attacks scenarios targeting return addresses (e.g., arbitrary write), and demonstrate how µRAI prevents them all. Moreover, our evaluation shows that µRAI enforces its protection with negligible overhead.
-
Taemin Park (University of California, Irvine), Karel Dhondt (imec-DistriNet, KU Leuven), David Gens (University of California, Irvine), Yeoul Na (University of California, Irvine), Stijn Volckaert (imec-DistriNet, KU Leuven), Michael Franz (University of California, Irvine, USA)
Data-only attacks against dynamic scripting environments have become common. Web browsers and other modern applications embed scripting engines to support interactive content. The scripting engines optimize performance via just-in-time compilation. Since applications are increasingly hardened against code-reuse attacks, adversaries are looking to achieve code execution or elevate privileges by corrupting sensitive data like the intermediate representation of optimizing JIT compilers. This has inspired numerous defenses for just-in-time compilers.
Our paper demonstrates that securing JIT compilation is not sufficient. First, we present a proof-of-concept data-only attack against a recent version of Mozilla’s SpiderMonkey JIT in which the attacker only corrupts heap objects to successfully issue a system call from within bytecode execution at run time. Previous work assumed that bytecode execution is safe by construction since interpreters only allow a narrow set of benign instructions and bytecode is always checked for validity before execution. We show that this does not prevent malicious code execution in practice. Second, we design a novel defense, dubbed NoJITsu to protect complex, real-world scripting engines from data-only attacks against interpreted code. The key idea behind our defense is to allow fine-grained memory access control by analyzing, identifying, isolating, and protecting individual memory regions focusing on their role in code generation at any point in the JavaScript engine. For this we combine automated analysis and instrumentation, compartmentalization, and Intel’s Memory-Protection Keys to secure SpiderMonkey against previous and our new attack. We implement and thoroughly test our implementation using a number of real-world scenarios as well as standard benchmarks. We show that NoJITsu successfully thwarts code-reuse as well as data-only attacks against any part of the scripting engine while offering a modest run-time overhead of only 5%.
-
Ting Chen (University of Electronic Science and Technology of China), Rong Cao (University of Electronic Science and Technology of China), Ting Li (University of Electronic Science and Technology of China), Xiapu Luo (The Hong Kong Polytechnic University), Guofei Gu (Texas A&M University), Yufei Zhang (University of Electronic Science and Technology of China), Zhou Liao (University…
Smart contracts have become lucrative and profitable targets for attackers because they can hold a great amount of money. Although there are already many studies to discover the vulnerabilities in smart contracts, they can neither guarantee discovering all vulnerabilities nor protect the deployed smart ontracts against the attacks, because they rely on offline analysis. Recently, a few online protection approaches appeared but they only focus on specific attacks and cannot be easily extended to defend against other attacks. Developing a new online protection system for smart contracts from scratch is time-consuming and requires being familiar with the internals of smart contract runtime, thus making it difficult to quickly implement and deploy mechanisms to defend against new attacks.
In this paper, we propose a novel generic runtime protection framework named SPA for smart contracts on any blockchains that support Ethereum virtual machine (EVM). SPA distinguishes itself from existing online protection approaches through its capability, efficiency, and compatibility. First, SPA empowers users to easily develop and deploy protection apps for defending against various attacks by separating the information collection, attack detection and reaction with layered design. At the higher layer, SPA provides unified interfaces to develop protection apps gainst various attacks. At the lower layer, SPA instruments EVM to collect all primitive information necessary to detect various attacks and constructs 11 kinds of structural information for the ease of developing protection apps.
Based on SPA, users can develop new rotection apps in a few lines of code without modifying EVM and easily deploy them to the blockchain. Second, SPA is efficient, because we design on-demand information retrieval to reduce the overhead of information collection and adopt dynamic linking to eliminate the overhead of inter-process communication (IPC). It allows users to develop protection apps by using any programming languages that can generate dynamic link libraries (DLLs). Third, since more and more blockchains adopt EVM as smart contract runtime, SPA can be easily migrated to such blockchains without modifying the protection apps. Based on SPA, we develop 8 protection apps to defend against the attacks exploiting major vulnerabilities in smart contracts, and integrate SPA (including all protection apps) into 3 popular blockchains: Ethereum, Expanse and Wanchain. The extensive experimental results demonstrate the effectiveness and efficiency of SPA and our protection apps.
-
Victor Le Pochat (imec-DistriNet, KU Leuven), Tim Van hamme (imec-DistriNet, KU Leuven), Sourena Maroofi (Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG), Tom Van Goethem (imec-DistriNet, KU Leuven), Davy Preuveneers (imec-DistriNet, KU Leuven), Andrzej Duda (Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG), Wouter Joosen (imec-DistriNet, KU Leuven), Maciej Korczyński (Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG)
In 2016, law enforcement dismantled the infrastructure of the Avalanche bulletproof hosting service, the largest takedown of a cybercrime operation so far. The malware families supported by Avalanche use Domain Generation Algorithms (DGAs) to generate random domain names for controlling their botnets. The takedown proactively targets these presumably malicious domains; however, as coincidental collisions with legitimate domains are possible, investigators must first classify domains to prevent undesirable harm to website owners and botnet victims.
The constraints of this real-world takedown (proactive decisions without access to malware activity, no bulk patterns and no active connections) mean that approaches from the state of the art cannot be applied. The problem of classifying thousands of registered DGA domain names therefore required an extensive, painstaking manual effort by law enforcement investigators. To significantly reduce this effort without compromising correctness, we develop a model that automates the classification. Through a synergetic approach, we achieve an accuracy of 97.6% with ground truth from the 2017 and 2018 Avalanche takedowns; for the 2019 takedown, this translates into a reduction of 76.9% in manual investigation effort. Furthermore, we interpret the model to provide investigators with insights into how benign and malicious domains differ in behavior, which features and data sources are most important, and how the model can be applied according to the practical requirements of a real-world takedown.
-
Tao Wang (Hong Kong University of Science and Technology)
Tor is an anonymity network that allows clients to browse web pages privately, but loading web pages with Tor is slow. To analyze how the browser loads web pages, we examine their resource trees using our new browser logging and simulation tool, BLAST. We find that the time it takes to load a web page with Tor is almost entirely determined by the number of round trips incurred, not its bandwidth, and Tor Browser incurs unnecessary round trips. Resources sit in the browser queue excessively waiting for the TCP, TLS or ALPN handshakes, each of which takes a separate round trip. We show that increasing resource loading capacity with larger pipelines and even HTTP/2 do not decrease load time because they do not save round trips.
We set out to minimize round trips with a number of protocol and browser improvements, including TCP Fast Open, optimistic data, zero-RTT TLS. We also recommend the use of databases to assist the client with redirection, identifying HTTP/2 servers, and prefetching. All of these features are designed to cut down on the number of round trips incurred in loading web pages. To evaluate these proposed improvements, we create a simulation tool and validate that it is highly accurate in predicting mean page load times. We use the simulator to analyze these features and it predicts that they will decrease the mean page load time by 61% in total over HTTP/2. Our large improvement to user experience comes at trivial cost to the Tor network.
-
Sandra Siby (EPFL), Marc Juarez (University of Southern California), Claudia Diaz (imec-COSIC KU Leuven), Narseo Vallina-Rodriguez (IMDEA Networks Institute), Carmela Troncoso (EPFL)
Virtually every connection to an Internet service is preceded by a DNS lookup which is performed without any traffic-level protection, thus enabling manipulation, redirection, surveillance, and censorship. To address these issues, large organizations such as Google and Cloudflare are deploying recently standardized protocols that encrypt DNS traffic between end users and recursive resolvers such as DNS-over-TLS (DoT) and DNS-over-HTTPS (DoH). In this paper, we examine whether encrypting DNS traffic can protect users from traffic analysis-based monitoring and censoring. We propose a novel feature set to perform the attacks, as those used to attack HTTPS or Tor traffic are not suitable for DNS’ characteristics. We show that traffic analysis enables the identification of domains with high accuracy in closed and open world settings, using 124 times less data than attacks on HTTPS flows. We find that factors such as location, resolver, platform, or client do mitigate the attacks performance but they are far from completely stopping them. Our results indicate that DNS-based censorship is still possible on encrypted DNS traffic. In fact, we demonstrate that the standardized padding schemes are not effective. Yet, Tor — which does not effectively mitigate traffic analysis attacks on web traffic— is a good defense against DoH traffic analysis.
-
Teemu Rytilahti (Ruhr University Bochum), Thorsten Holz (Ruhr University Bochum)
Typical port scanning approaches do not achieve a full coverage of all devices connected to the Internet as not all devices are directly reachable via a public (IPv4) address: due to IP address space exhaustion, firewalls, and many other reasons, an end-to-end connectivity is not achieved in today’s Internet anymore. Especially Network Address Translation (NAT) is widely deployed in practice and it has the side effect of “hiding” devices from being scanned. Some protocols, however, require end-to-end connectivity to function properly and hence several methods were developed in the past to enable crossing network borders.
In this paper, we explore how an attacker can take advantage of such application-layer middlebox protocols to access devices hidden behind these gateways. More specifically, we investigate different methods for identifying such devices by (ab)using legitimate protocol features. We categorize the available protocols into two classes: First, there are persistent protocols that are typically port forwarding based. Such protocols are used to allow local network devices to open and forward external ports to them. Second, there are non-persistent protocols that are typically proxy-based to route packets between network edges, such as HTTP and SOCKS proxies. We perform a comprehensive, Internet-wide analysis to obtain an accurate overview of how prevalent and widespread such protocols are in practice. Our results indicate that hundreds of thousands of hosts are vulnerable for different types of attacks, e. g., we detect over 400.000 hosts that are likely vulnerable for attacks involving the UPnP IGD protocol. More worrisome, we find empirical evidence that attackers are already actively exploiting such protocols in the wild to access devices located behind NAT gateways. Amongst other findings, we discover that at least 24 % of all open Internet proxies are misconfigured to allow accessing hosts on non-routable addresses.
-
Ben Gras (Vrije Universiteit Amsterdam, Intel Corporation), Cristiano Giuffrida (Vrije Universiteit Amsterdam), Michael Kurth (Vrije Universiteit Amsterdam), Herbert Bos (Vrije Universiteit Amsterdam), Kaveh Razavi (Vrije Universiteit Amsterdam)
The past decade has seen a plethora of side channel attacks on various CPU components. Each new attack typically follows a whitebox analysis approach, which involves (i) identifying a specific shared CPU component, (ii) reversing its behavior on a specific microarchitecture, and (iii) surgically exploiting such knowledge to leak information (e.g., by actively evicting shared entries to monitor victim accesses). This approach requires a deep understanding of the target component, obtained by lengthy reverse engineering which needs to be repeated for each new component and each microarchitecture. It also does not allow for attacking shared resources that are unknown.
In this paper, we present ABSynthe, a system that takes a target program and a microarchitecture as inputs and automatically synthesizes new side channels. The key insight is that by limiting ourselves to (typically on-core) contention-based side channels, we can treat the target CPU microarchitecture as a black box, enabling automation. To make ABSynthe possible, we have automatically generated leakage maps for a variety of x86_64 microarchitectures. These leakage maps show a complex picture and justify a black box approach to finding the best sequence of instructions to cause information to leak from a software target. This target is also treated and analyzed as a blackbox, to find secret-dependent branches. To recover the secret information using the optimized sequence of instructions, ABSynthe relies on a recurrent neural network to craft practical side-channel attacks. Our evaluation, somewhat counter-intuitively, shows that ABSynthe can synthesize better attacks by exploiting contention on multiple components at the same time compared to state of the art contention-based attacks that focus on a single component. Concretely, the automation made possible by ABSynthe allows us to synthesize cross-thread attacks in different settings and for a variety of microarchitectures and cryptographic software targets, in both native and virtualized environments.
We present results for Intel, AMD and ARM microarchitetures, and 4 different cryptographic targets. As an example, ABSynthe can recover a full 256-bit EdDSA from just a single trace capture with 100% success rate on Intel.
-
Qinhan Tan (Zhejiang University), Zhihua Zeng (Zhejiang University), Kai Bu (Zhejiang University), Kui Ren (Zhejiang University)
Cache conflicts due to deterministic memory-to-cache mapping have long been exploited to leak sensitive information such as secret keys. While randomized mapping is fully investigated for L1 caches, it still remains unresolved about how to secure a much larger last-level cache (LLC). Recent solutions periodically change the mapping strategy to disrupt the crafting of conflicted addresses, which is a critical attack procedure to exploit cache conflicts. Remapping, however, increases both miss rate and access latency. We present PhantomCache for securing an LLC with remapping-free randomized mapping. We propose a localized randomization technique to bound randomized mapping of a memory address within only a limited number of cache sets. The small randomization space offers fast set search over an LLC in a memory access. The intrinsic randomness still suffices to obfuscate conflicts and disrupt efficient exploitation of conflicted addresses. We evaluate PhantomCache against an attacker exploring the state-of-the-art attack with linear-complexity. To secure an 8-bank 16~MB 16-way LLC, PhantomCache confines randomization space of an address within 8 sets and brings only 0.5% performance degradation and 0.5% storage overhead per cache line, which are 3x and 9x more efficient than the state-of-the-art solutions. Moreover, PhantomCache is solely an architectural solution and requires no software change.
-
Saeid Tizpaz-Niari (University of Colorado Boulder), Pavol Černý (TU Wien), Ashutosh Trivedi (University of Colorado Boulder)
Information leaks through side channels are a pervasive problem, even in security-critical applications. Functional side channels arise when an attacker knows that a secret value of a server stays fixed for a certain time. Then, the attacker can observe the server executions on a sequence of different public inputs, each paired with the same secret input. Thus for each
secret, the attacker observes a (partial) function from public inputs to execution time, for instance, and she can compare these functions for different secrets.First, we introduce a notion of noninterference for functional side channels. We focus on the case of noisy observations, where we demonstrate with examples that there is a practical functional
side channel in programs that would be deemed information-leak-free or be underestimated using the standard definition. Second, we develop a framework and techniques for debugging programs for functional side channels. We extend evolutionary fuzzing techniques to generate inputs that exploit functional dependencies of response times on public inputs. We adapt existing results and algorithms in functional data analysis (such as functional clustering) to model the functions and discover the existence of side channels. We use a functional extension of standard
decision tree learning to pinpoint the code fragments causing a side channel if there is one.We empirically evaluate the performance of our tool FUCHSIA on a series of micro-benchmarks, as well as on realistic Java programs. On the set of micro-benchmark, we show that FUCHSIA
outperforms the state-of-the-art techniques in detecting side channel classes. On the realistic programs, we show the scalability of FUCHSIA in analyzing functional side channels in Java programs with thousands of methods. In addition, we show the usefulness of FUCHSIA in finding (and locating in code) side channels including a zero-day vulnerability in Open Java Development Kit and another Java web server vulnerability that was since fixed by the original developers. -
Shivam Bhasin (Nanyang Technological University), Anupam Chattopadhyay (Nanyang Technological University), Annelie Heuser (Univ Rennes, Inria, CNRS, IRISA), Dirmanto Jap (Nanyang Technological University), Stjepan Picek (Delft University of Technology), Ritu Ranjan Shrivastwa (Secure-IC)
Profiled side-channel attacks represent a practical threat to digital devices, thereby having the potential to disrupt the foundation of e-commerce, Internet-of-Things (IoT), and smart cities. In the profiled side-channel attack, adversary gains knowledge about the target device by getting access to a cloned device. Though these two devices are different in real-world scenarios, yet, unfortunately, a large part of research works simplifies the setting by using only a single device for both profiling and attacking. There, the portability issue is conveniently ignored to ease the experimental procedure. In parallel to the above developments, machine learning techniques are used in recent literature demonstrating excellent performance in profiled side-channel attacks. Again, unfortunately, the portability is neglected.
In this paper, we consider realistic side-channel scenarios and commonly used machine learning techniques to evaluate the influence of portability on the efficacy of an attack. Our experimental results show that portability plays an important role and should not be disregarded as it contributes to a significant overestimate of the attack efficiency, which can easily be an order of magnitude size. After establishing the importance of portability, we propose a new model called the Multiple Device Model (MDM) that formally incorporates the device to device variation during a profiled side-channel attack. We show through experimental studies, how machine learning and MDM significantly enhances the capacity for practical side-channel attacks.
More precisely, we demonstrate how MDM can improve the performance of an attack by an order of magnitude, completely negating the influence of portability.
-
Kyungho Joo (Korea University), Wonsuk Choi (Korea University), Dong Hoon Lee (Korea University)
Recently, the traditional way to unlock car doors has been replaced with a keyless entry system which proves more convenient for automobile owners. When a driver with a key fob is in vicinity of the vehicle, doors automatically unlock on user command. However, unfortunately, it has been known that these keyless entry systems are vulnerable to signal-relaying attacks. While it is evident that automobile manufacturers incorporate preventative methods to secure these keyless entry systems, a range of attacks continue to occur. Relayed signals fit into the valid packets that are verified as legitimate, and this makes it is difficult to distinguish a legitimate request for doors to be unlocked from malicious signals. In response to this vulnerability, this paper presents an RF-fingerprinting method (coined “HOld the DOoR”, HODOR) to detect attacks on keyless entry systems, which is the first attempt to exploit RF-fingerprint technique in automotive domain. HODOR is designed as a sub-authentication system that supports existing authentication systems for keyless entry systems and does not require any modification of the main system to perform. Through a series of experiments, the results demonstrate that HODOR competently and reliably detects attacks on keyless entry systems. HODOR achieves both an average false positive rate (FPR) of 0.27% with a false negative rate (FNR) of 0% for the detection of simulated attacks corresponding to the current issue on keyless entry car theft. Furthermore, HODOR was also observed under environmental factors: temperature variation, non-line-of-sight (NLoS) conditions and battery aging. HODOR yields a false positive rate of 1.32% for the identification of a legitimated key fob which is even under NLoS condition. Based on the experimental results, it is expected that HODOR will provide a secure service for keyless entry systems, while remaining convenient.
-
Menghao Zhang (Tsinghua University), Guanyu Li (Tsinghua University), Shicheng Wang (Tsinghua University), Chang Liu (Tsinghua University), Ang Chen (Rice University), Hongxin Hu (Clemson University), Guofei Gu (Texas A&M University), Qi Li (Tsinghua University), Mingwei Xu (Tsinghua University), Jianping Wu (Tsinghua University)
Distributed Denial-of-Service (DDoS) attacks have become a critical threat to the Internet. Due to the increasing number of vulnerable Internet of Things (IoT) devices, attackers can easily compromise a large set of nodes and launch highvolume DDoS attacks from the botnets. State-of-the-art DDoS defenses, however, have not caught up with the fast development of the attacks. Middlebox-based defenses can achieve high performance with specialized hardware; however, these defenses incur a high cost, and deploying new defenses typically requires a device upgrade. On the other hand, software-based defenses are highly flexible, but software-based packet processing leads to high performance overheads. In this paper, we propose Poseidon, a system that addresses these limitations in today’s DDoS defenses. It leverages emerging programmable switches, which can be reconfigured in the field without additional hardware upgrade. Users of Poseidon can specify their defense strategies in a modular fashion in the form of a set of defense primitives; this can be further customized easily for each network and extended to include new defenses. Poseidon then maps the defense primitives to run on programmable switches—and when necessary, on server software—for effective defense. When attacks change, Poseidon can reconfigure the underlying defense primitives to respond to the new attack patterns. Evaluations using our prototype demonstrate that Poseidon can effectively defend against highvolume attacks, easily support customization of defense strategies, and adapt to dynamic attacks with low overheads.
-
Marcel Kneib (Robert Bosch GmbH), Oleg Schell (Bosch Engineering GmbH), Christopher Huth (Robert Bosch GmbH)
In vehicles, internal Electronic Control Units (ECUs) are increasingly prone to adversarial exploitation over wireless connections due to ongoing digitalization. Controlling an ECU allows an adversary to send messages to the internal vehicle bus and thereby to control various vehicle functions. Access to the Controller Area Network (CAN), the most widely used bus technology, is especially severe as it controls brakes and steering. However, state of the art receivers are not able to identify the sender of a frame. Retrofitting frame authenticity, e.g. through Message Authentication Codes (MACs), is only possible to a limited extent due to reduced bandwidth, low payload and limited computational resources. To address this problem, observation in analog differences of the CAN signal was proposed to determine the actual sender. These prior approaches, which exhibit good identification rates in some cases, require high sample rates and a high computational effort. With EASI we significantly reduce the required resources and at the same time show increased identification rates of 99.98% by having no false positives in a prototype structure and two series production vehicles. In comparison to the most lightweight approach so far, we have reduced the memory footprint and the computational requirements by a factor of 168 and 142, respectively. In addition, we show the feasibility of EASI and thus for the first time that sender identification is realizable using comprehensive signal characteristics on resource-constrained platforms. Due to the lightweight design, we achieved a classification in under 100,$mu$s with a training time of 2.61 seconds. We also showed the ability to adapt the system to incremental changes during operation. Since cost effectiveness is of utmost importance in the automotive industry due to high production volumes, the achieved improvements are significant and necessary to realize sender identification.
-
Sivaramakrishnan Ramanathan (University of Southern California/Information Sciences Institute), Jelena Mirkovic (University of Southern California/Information Sciences Institute), Minlan Yu (Harvard University)
IP address blacklists are a useful source of information about repeat attackers. Such information can be used to prioritize which traffic to divert for deeper inspection (e.g., repeat offender traffic), or which traffic to serve first (e.g., traffic from sources that are not blacklisted). But blacklists also suffer from overspecialization – each list is geared towards a specific purpose – and they may be inaccurate due to misclassification or stale information. We propose BLAG, a system that evaluates and aggregates multiple blacklists feeds, producing a more useful, accurate and timely master blacklist, tailored to the specific customer network. BLAG uses a sample of the legitimate sources of the customer network’s inbound traffic to evaluate the accuracy of each blacklist over regions of address space. It then leverages recommendation systems to select the most accurate information to aggregate into its master blacklist. Finally, BLAG identifies portions of the master blacklist that can be expanded into larger address regions (e.g. /24 prefixes) to uncover more malicious addresses with minimum collateral damage. Our evaluation of 157 blacklists of various attack types and three ground-truth datasets shows that BLAG achieves high specificity up to 99%, improves recall by up to 114 times compared to competing approaches, and detects attacks up to 13.7 days faster, which makes it a promising approach for blacklist generation.
-
Hui Lin (University of Nevada, Reno), Jianing Zhuang (University of Nevada, Reno), Yih-Chun Hu (University of Illinois, Urbana-Champaign), Huayu Zhou (University of Nevada, Reno)
Reconnaissance is critical for adversaries to prepare attacks causing physical damage in industrial control systems (ICS) like smart power grids. Disrupting the reconnaissance is challenging. The state-of-the-art moving target defense (MTD) techniques based on mimicking and simulating system behaviors do not consider the physical infrastructure of power grids and can be easily identified.
To overcome those challenges, we propose physical function virtualization (PFV) that ``hooks'' network interactions with real physical devices and uses them to build lightweight virtual nodes following the actual implementation of network stacks, system invariants, and physical state variations of real devices. On top of PFV, we propose DefRec, a defense mechanism that significantly increases the reconnaissance efforts for adversaries to obtain the knowledge of power grids' cyber-physical infrastructures. By randomizing communications and crafting decoy data for the virtual physical nodes, DefRec can mislead adversaries into designing damage-free attacks. We implement PFV and DefRec in the ONOS network operating system and evaluate them in a cyber-physical testbed, which uses real devices from different vendors and HP physical switches to simulate six power grids. The experiment results show that with negligible overhead, PFV can accurately follow the behavior of real devices. DefRec can significantly delay passive attacks for at least five months and isolate proactive attacks with less than $10^{-30}$ false negatives.
-
Laura Blackstone (Brown University), Seny Kamara (Brown University), Tarik Moataz (Brown University)
Encrypted search algorithms (ESA) are cryptographic algorithms that support search over encrypted data. ESAs can be designed with various primitives including searchable/structured symmetric encryption (SSE/STE) and oblivious RAM (ORAM). Leakage abuse attacks attempt to recover client queries using knowledge of the client's data. An important parameter for any leakage-abuse attack is its known-data rate; that is, the fraction of client data that must be known to the adversary.
In this work, we revisit leakage abuse attacks in several ways. We first highlight the practical limitations and assumptions underlying the often-cited IKK (Islam et al. NDSS '12) and Count (Cash et al., CCS '15) attacks. We then design four new leakage-abuse attacks that rely on much weaker assumptions. Three of these attacks are volumetric in the sense that they only exploit leakage related to document sizes. In particular, this means that they work not only on SSE/STE-based ESAs but also against ORAM-based solutions. We also introduce two volumetric injection attack which use adversarial file additions to recover queries even from ORAM-based solutions. As far as we know, these are the first attacks of their kind.
We evaluated all our attacks empirically and considered many experimental settings including different data collections, query selectivities, known-data rates, query space size and composition. From our experiments, we observed that the only setting that resulted in reasonable recovery rates under practical assumptions was the case of high-selectivity queries with a leakage profile that includes the response identity pattern (i.e., the identifiers of the matching documents) and the volume pattern (i.e., the size of the matching documents). All other attack scenarios either failed or relied on unrealistic assumptions (e.g., very high known-data rates). For this specific setting, we propose several suggestions and countermeasures including the use of schemes like PBS (Kamara et al, CRYPTO '18), VLH/AVLH (Kamara and Moataz, Eurocrypt '19), or the use of padding techniques like the ones recently proposed by Bost and Fouque (Bost and Fouque, IACR ePrint 2017/1060).
-
Weikeng Chen (UC Berkeley), Raluca Ada Popa (UC Berkeley)
File-sharing systems like Dropbox offer insufficient privacy because a compromised server can see the file contents in the clear. Although encryption can hide such contents from the servers, metadata leakage remains significant. The goal of our work is to develop a file-sharing system that hides metadata---including user identities and file access patterns.
Metal is the first file-sharing system that hides such metadata from malicious users and that has a latency of only a few seconds. The core of Metal consists of a new two-server multi-user oblivious RAM (ORAM) scheme, which is secure against malicious users, a metadata hiding access control protocol, and a capability sharing protocol.
Compared with the state-of-the-art malicious-user file-sharing scheme PIR-MCORAM (Maffei et al.'17), which does not hide user identities, Metal hides the user identities and is 500x faster (in terms of amortized latency) or 10^5x faster (in terms of worst-case latency).
-
Thang Hoang (University of South Florida), Jorge Guajardo (Robert Bosch Research and Technology Center), Attila Yavuz (University of South Florida)
Oblivious Random Access Machine (ORAM) allows a client to hide the access pattern and thus, offers a strong level of privacy for data outsourcing. An ideal ORAM scheme is expected to offer desirable properties such as low client bandwidth, low server computation overhead and the ability to compute over encrypted data. S3ORAM (CCS’17), is a very efficient active ORAM scheme, which takes advantage of secret sharing to provide ideal properties for data outsourcing such as low client bandwidth, low server computation and low delay. Despite its merits, S3ORAM only offers security in the semi-honest setting.
In practice, it is likely that an ORAM protocol will have to operate in the presence of malicious adversaries who might deviate from the protocol to compromise the client privacy.
In this paper, we propose MACAO, a new multi-server ORAM framework, which offers integrity, access pattern obliviousness against active adversaries, and the ability to perform secure computation over the accessed data. MACAO harnesses authenticated secret sharing techniques and tree-ORAM paradigm to achieve low client communication, efficient server computation, and low storage overhead at the same time. We fully implemented MACAO and conducted extensive experiments in real cloud platforms (Amazon EC2) to validate the performance of MACAO compared with the state-of-the-art. Our results indicate that MACAO can achieve comparable performance to S3ORAM while offering security against malicious adversaries. Our MACAO is a suitable candidate for integration into distributed file systems with encrypted computation capabilities towards enabling a full-fledged oblivious data outsourcing infrastructure. We will open-source MACAO for broad testing and adaptations.
-
Hamid Mozaffari (University of Massachusetts Amherst), Amir Houmansadr (University of Massachusetts Amherst)
Private information retrieval (PIR) enables clients to query and retrieve data from untrusted servers without the untrusted servers learning which data was retrieved.
In this paper, we present a new class of multi-server PIR protocols, which we call emph{heterogeneous PIR (HPIR)}. In such multi-server PIR protocols, the computation and communication overheads imposed on the PIR servers are non-uniform, i.e., some servers handle higher computation/communication burdens than the others. This enables heterogeneous PIR protocols to be suitable for a range of new PIR applications.
What enables us to enforce such heterogeneity is a unique PIR-tailored secret sharing algorithm that we leverage in building our PIR protocol.
We have implemented our HPIR protocol and evaluated its performance in comparison with regular PIR protocols. Our evaluations demonstrate that a querying client can trade off the computation and communication loads of the (heterogeneous) PIR servers by adjusting some parameters. For example in a two server scenario with a heterogeneity degree of $4/1$, to retrieve a $456$KB file from a $0.2$GB database, the rich (i.e., resourceful) PIR server will do $1.1$ seconds worth of computation compared to $0.3$ seconds by the poor (resource-constrained) PIR server; this is while each of the servers would do the same $1$ seconds of computation in a homogeneous settings. Also, for this given example, our HPIR protocol will impose $912$KB communication bandwidth on the rich server compared to $228$KB on the poor server (by contrast to $456$KB overhead on each of the servers for a traditional homogeneous design).
-
Ioannis Demertzis (University of Maryland), Javad Ghareh Chamani (Hong Kong University of Science and Technology & Sharif University of Technology), Dimitrios Papadopoulos (Hong Kong University of Science and Technology), Charalampos Papamanthou (University of Maryland)
We study the problem of dynamic searchable encryption (DSE) with forward-and-backward privacy. Many DSE schemes have been proposed recently but the most efficient ones have one limitation: they require maintaining an operation counter for each unique keyword, either stored locally at the client or accessed obliviously (e.g., with an oblivious map) at the server, during every operation. We propose three new schemes that overcome the above limitation and achieve constant permanent client storage with improved performance, both asymptotically and experimentally, compared to prior state-of-the-art works. In particular, our first two schemes adopt a "static-to-dynamic" transformation which eliminates the need for oblivious accesses during searches. Due to this, they are the first practical schemes with minimal client storage and non-interactive search. Our third scheme is the first quasi-optimal forward-and-backward DSE scheme with only a logarithmic overhead for retrieving the query result (independently of previous deletions). While it does require an oblivious access during search in order to keep permanent client storage minimal, its practical performance is up to four orders of magnitude better than the best existing scheme with quasi-optimal search.
-
Jared M. Smith (University of Tennessee, Knoxville), Kyle Birkeland (University of Tennessee, Knoxville), Tyler McDaniel (University of Tennessee, Knoxville), Max Schuchard (University of Tennessee, Knoxville)
The security of the Internet's routing infrastructure has underpinned much of the past two decades of distributed systems security research. However, the converse is increasingly true. Routing and path decisions are now important for the security properties of systems built on top of the Internet. In particular, BGP poisoning leverages the de facto routing protocol between Autonomous Systems (ASes) to maneuver the return paths of upstream networks onto previously unusable, new paths. These new paths can be used to avoid congestion, censors, geo-political boundaries, or any feature of the topology which can be expressed at an AS-level. Given the increase in use of BGP poisoning as a security primitive for security systems, we set out to evaluate the feasibility of poisoning in practice, going beyond simulation.
To that end, using a multi-country and multi-router Internet-scale measurement infrastructure, we capture and analyze over 1,400 instances of BGP poisoning across thousands of ASes as a mechanism to maneuver return paths of traffic. We analyze in detail the performance of steering paths, the graph-theoretic aspects of available paths, and re-evaluate simulated systems with this data. We find that the real-world evidence does not completely support the findings from simulated systems published in the literature. We also analyze filtering of BGP poisoning across types of ASes and ISP working groups. We explore the connectivity concerns when poisoning by reproducing a decade old experiment to uncover the current state of an Internet triple the size. We build predictive models for understanding an ASes vulnerability to poisoning. Finally, an exhaustive measurement of an upper bound on the maximum path length of the Internet is presented, detailing how recent and future security research should react to ASes leveraging poisoning with long paths. In total, our results and analysis attempt to expose the real-world impact of BGP poisoning on past and future security research.
-
David Rupprecht (Ruhr University Bochum), Katharina Kohls (Ruhr University Bochum), Thorsten Holz (Ruhr University Bochum), Christina Poepper (NYU Abu Dhabi)
Long Term Evolution (LTE/4G) establishes mutual authentication with a provably secure Authentication and Key Agreement protocol on layer three of the network stack. Permanent integrity protection of the control plane safeguards the traffic against manipulations. However, missing integrity protection of the user plane still allows an adversary to manipulate and redirect IP packets, as recently demonstrated.
In this work, we introduce a novel cross-layer attack that exploits the existing vulnerability on layer two and extends it with an attack mechanism on layer three. More precisely, we take advantage of the default IP stack behavior of operating systems, which allows an active attacker to impersonate a user towards the network and vice versa; we name these attacks IMP4GT (IMPersonation attacks in 4G neTworks). In contrast to a simple redirection attack as demonstrated in prior work, our attack dramatically extends the possible attack scenarios and thus emphasizes the need for user plane integrity protection in mobile communication standards. The results of our work imply that providers can no longer rely on mutual authentication for billing, access control, and legal prosecution. On the other side, users are exposed to any incoming IP connection as an adversary can bypass the provider's firewall. To demonstrate the practical impact of our attack, we conduct two IMP4GT attack variants in a commercial network, which---for the first time---completely break the mutual authentication aim of LTE on the user plane in a real-world setting.
-
Alireza Bahramali (University of Massachusetts Amherst), Amir Houmansadr (University of Massachusetts Amherst), Ramin Soltani (University of Massachusetts Amherst), Dennis Goeckel (University of Massachusetts Amherst), Don Towsley (University of Massachusetts Amherst)
Instant Messaging (IM) applications like Telegram, Signal, and WhatsApp have become extremely popular in recent years. Unfortunately, such IM services have been the target of continuous governmental surveillance and censorship, as these services are home to public and private communication channels on socially and politically sensitive topics. To protect their clients, popular IM services deploy state-of-the-art encryption mechanisms. In this paper, we show that despite the use of advanced encryption, popular IM applications leak sensitive information about their clients to adversaries who merely monitor their encrypted IM traffic, with no need for leveraging any software vulnerabilities of IM applications. Specifically, we devise traffic analysis attacks that enable an adversary to identify administrators as well as members of target IM channels (e.g., forums) with high accuracies. We believe that our study demonstrates a significant, real-world threat to the users of such services given the increasing attempts by oppressive governments at cracking down controversial IM channels.
We demonstrate the practicality of our traffic analysis attacks through extensive experiments on real-world IM communications. We show that standard countermeasure techniques such as adding cover traffic can degrade the effectiveness of the attacks we introduce in this paper. We hope that our study urges IM providers to integrate effective traffic obfuscation countermeasures into their software. In the meantime, we have designed and deployed an open-source, publicly available countermeasure system, called IMProxy, that can be used by IM clients with no need for any support from IM providers. We have demonstrated the effectiveness of IMProxy through experiments.
-
Run Guo (Tsinghua University), Weizhong Li (Tsinghua University), Baojun Liu (Tsinghua University), Shuang Hao (University of Texas at Dallas), Jia Zhang (Tsinghua University), Haixin Duan (Tsinghua University), Kaiwen Sheng (Tsinghua University), Jianjun Chen (ICSI), Ying Liu (Tsinghua University)
Content Delivery Network (CDN) improves the websites' accessing performance and availability with its globally distributed network infrastructures, which contributes to the flourish of CDN-powered websites on the Internet. As CDN-powered websites are normally operating important businesses or critical services, the attackers are mostly interested to take down these high-value websites, achieving severe damage with maximum influence. As the CDN absorbs distributed attacking traffic with its massive bandwidth resources, CDN vendors have always claimed that they provide effective DoS protection for the CDN-powered websites.
However, we reveal that, implementation or protocol weaknesses in the CDN's forwarding mechanism can be exploited to break the CDN protection. By sending crafted but legal requests, an attacker can launch an efficient DoS attack against the website Origin behind.
In particular, we present three CDN threats in this study.
Through abusing the CDN's HTTP/2 request converting behavior and HTTP pre-POST behavior, an attacker can saturate the CDN-Origin bandwidth and exhaust the Origin's connection limits.
What is more concerning is that, some CDN vendors only use a small set of traffic forwarding IPs with lower IP-churning ratio to establish connections with the Origin. This characteristic provides a great opportunity for an attacker to effectively degrade the website's global availability, by just cutting off specific CDN-Origin connections.In this work, we examine the CDN's request-forwarding behaviors across six well-known CDN vendors, and we perform real-world experiments to evaluate the severity of the threats. As the threats are caused by the CDN vendor's poor trade-offs between usability and security, we discuss the possible mitigations, and we receive positive feedback after responsible disclosure to related CDN vendors.
-
Yue Duan (Cornell University), Xuezixiang Li (UC Riverside), Jinghan Wang (UC Riverside), Heng Yin (UC Riverside)
Binary diffing analysis quantitatively measures the differences between two given binaries and produces fine-grained basic block matching. It has been widely used to enable different kinds of critical security analysis. However, all existing program analysis and machine learning based techniques suffer from low accuracy, poor scalability, coarse granularity, or require extensive labeled training data to function. In this paper, we propose an unsupervised program-wide code representation learning technique to solve the problem. We rely on both the code semantic information and the program-wide control flow information to generate block embeddings. Furthermore, we propose a k-hop greedy matching algorithm to find the optimal diffing results using the generated block embeddings. We implement a prototype called DeepBinDiff and evaluate its effectiveness and efficiency with large number of binaries. The results show that our tool could outperform the state-of-the-art binary diffing tools by a large margin for both cross-version and cross-optimization level diffing. A case study for OpenSSL using real-world vulnerabilities further demonstrates the usefulness of our system.
-
Qiushi Wu (University of Minnesota), Yang He (University of Minnesota), Stephen McCamant (University of Minnesota), Kangjie Lu (University of Minnesota)
A bug is a vulnerability if it has security impacts when triggered. Determining the security impacts of a bug is important to both defenders and attackers. Maintainers of large software systems are bombarded with numerous bug reports and proposed patches, with missing or unreliable information about their impact. Determining which few bugs are vulnerabilities is difficult, and bugs that a maintainer believes do not have security impact will be de-prioritized or even ignored. On the other hand, a public report of a bug with a security impact is a powerful first step towards exploitation. Adversaries may exploit such bugs to launch devastating attacks if defenders do not fix them promptly. Common practice is for maintainers to assess the security impacts of bugs manually, but the scaling and reliability challenges of manual analysis lead to missed vulnerabilities.
We propose an automated approach, Sid, to determine the security impacts for a bug given a patch, so that maintainers can effectively prioritize applying the patch to the affected programs. The insight behind Sid is that both the effect of a patch (either submitted or applied) and security-rule violations (e.g., out-of-bound access) can be modeled as constraints that can be automatically solved. Sid incorporates rule comparison, using under-constrained symbolic execution of a patch to determine the security impacts of an un-applied patch. Sid can further automatically classify vulnerabilities based on their security impacts. We have implemented Sid and applied it to bug patches of the Linux kernel and matching CVE-assigned vulnerabilities to evaluate its precision and recall. We optimized Sid to reduce false positives, and our evaluation shows that, from 66K recent commits, Sid detected 227 security bugs with at least 243 security impacts at a 97% precision rate. Critically, 197 of them were not reported as vulnerabilities before, leading to delayed or ignored patching in derivative programs. Even worse, 21 of them are still unpatched in the latest Android kernel. Once exploited, they can cause critical security impacts to Android devices. The evaluation results confirm that Sid's approach is effective and accurate in automatically determining security impacts for a massive stream of bug patches.
Wednesday, 26 February
-
Xueyuan Han (Harvard University), Thomas Pasquier (University of Bristol), Adam Bates (University of Illinois at Urbana-Champaign), James Mickens (Harvard University), Margo Seltzer (University of British Columbia)
Advanced Persistent Threats (APTs) are difficult to detect due to their “low-and-slow” attack patterns and frequent use of zero-day exploits. We present UNICORN, an anomaly-based APT detector that effectively leverages data provenance analysis. From modeling to detection, UNICORN tailors its design specifically for the unique characteristics of APTs. Through extensive yet time-efficient graph analysis, UNICORN explores provenance graphs that provide rich contextual and historical information to identify stealthy anomalous activities without pre-defined attack signatures. Using a graph sketching technique, it summarizes long-running system execution with space efficiency to combat slow-acting attacks that take place over a long time span. UNICORN further improves its detection capability using a novel modeling approach to understand long-term behavior as the system evolves. Our evaluation shows that UNICORN outperforms an existing state-of-the-art APT detection system and detects real-life APT scenarios with high accuracy.
-
Riccardo Paccagnella (University of Illinois at Urbana–Champaign), Pubali Datta (University of Illinois at Urbana–Champaign), Wajih Ul Hassan (University of Illinois at Urbana–Champaign), Adam Bates (University of Illinois at Urbana–Champaign), Christopher W. Fletcher (University of Illinois at Urbana–Champaign), Andrew Miller (University of Illinois at Urbana–Champaign), Dave Tian (Purdue University)
System auditing is a central concern when investigating and responding to security incidents. Unfortunately, attackers regularly engage in anti-forensic activities after a break-in, covering their tracks from the system logs in order to frustrate the efforts of investigators. While a variety of tamper-evident logging solutions have appeared throughout the industry and the literature, these techniques do not meet the operational and scalability requirements of system-layer audit frameworks.
In this work, we introduce Custos, a practical framework for the detection of tampering in system logs. Custos consists of a tamper-evident logging layer and a decentralized auditing protocol. The former enables the verification of log integrity with minimal changes to the underlying logging framework, while the latter enables near real-time detection of log integrity violations within an enterprise-class network. Custos is made practical by the observation that we can decouple the costs of cryptographic log commitments from the act of creating and storing log events, without trading off security, leveraging features of off-the-shelf trusted execution environments. Supporting over one million events per second, we show that Custos' tamper-evident logging protocol is three orders of magnitude (1000×) faster than prior solutions and incurs only between 2% and 7% runtime overhead over insecure logging on intensive workloads. Further, we show that Custos' auditing protocol can detect violations in near real-time even in the presence of a powerful distributed adversary and with minimal (3%) network overhead. Our case study on a real-world APT attack scenario demonstrates that Custos forces anti-forensic attackers into a "lose-lose" situation, where they can either be covert and not tamper with logs (which can be used for forensics), or erase logs but then be detected by Custos.
-
Qi Wang (University of Illinois Urbana-Champaign), Wajih Ul Hassan (University of Illinois Urbana-Champaign), Ding Li (NEC Laboratories America, Inc.), Kangkook Jee (University of Texas at Dallas), Xiao Yu (NEC Laboratories America, Inc.), Kexuan Zou (University Of Illinois Urbana-Champaign), Junghwan Rhee (NEC Laboratories America, Inc.), Zhengzhang Chen (NEC Laboratories America, Inc.), Wei Cheng (NEC Laboratories America,…
To subvert recent advances in perimeter and host security, the attacker community has developed and employed various attack vectors to make a malware much more stealthy than before to penetrate the target system and prolong its presence. The advanced malware, or stealthy malware, impersonates or abuses benign applications and legitimate system tools to minimize its footprints in the target system. One example of such stealthy malware is fileless malware, which resides its malicious logic mostly in the memory of well-trusted processes. It is difficult for traditional detection tools, such as malware scanners, to detect it, as the malware normally does not expose its malicious payload in a file and hides its malicious behaviors among the benign behaviors of the processes.
In this paper, we present PROVDETECTOR, a provenance-based approach for detecting stealthy malware. The intuition behind PROVDETECTOR is that although a stealthy malware may impersonate or abuse a benign process, it still exposes its malicious behaviors in the OS (operating system) level provenance. Based on this intuition, PROVDETECTOR first employs a novel selection algorithm to identify possibly malicious parts in the OS level provenance data of a process. Then, it applies a neural embedding and machine learning pipeline to automatically detect any behavior that deviates significantly from normal behaviors. We evaluate our approach on a large provenance dataset from an enterprise network and demonstrate that it achieves very high detection performance (an average F1 score of 0.974) of stealthy malware. Further, we conduct thorough interpretability studies to understand the internals of the learned machine learning models.
-
Wajih Ul Hassan (University of Illinois Urbana-Champaign), Mohammad A. Noureddine (University of Illinois Urbana-Champaign), Pubali Datta (University of Illinois Urbana-Champaign), Adam Bates (University of Illinois Urbana-Champaign)
Recent advances in causality analysis have enabled investigators to trace multi-stage attacks using whole- system provenance graphs. Based on system-layer audit logs (e.g., syscalls), these approaches omit vital sources of application context (e.g., email addresses, HTTP response codes) that can found in higher layers of the system. Although this information is often essential to understanding attack behaviors, incorporating this evidence into causal analysis engines is difficult due to the semantic gap that exists between system layers.
To address this shortcoming, we propose the notion of universal provenance, which encodes all forensically-relevant causal dependencies regardless of their layer of origin. To transparently realize this vision on commodity systems, we present ωLOG (“Omega Log”), a provenance tracking mechanism that bridges the semantic gap between system and application logging contexts. ωLOG analyzes program binaries to identify and model application-layer logging behaviors, enabling application events to be accurately reconciled with system-layer accesses. ωLOG then intercepts applications’ runtime logging activities and grafts those events onto the system-layer provenance graph, allowing investigators to reason more precisely about the nature of attacks. We demonstrate that ωLOG is widely-applicable to existing software projects and can transparently facilitate execution partitioning of dependency graphs without any training or developer intervention. Evaluation on real-world attack scenarios shows that universal provenance graphs are concise and rich with semantic information as compared to the state-of-the-art, with 12% average runtime overhead.
-
Harsh Chaudhari (Indian Institute of Science, Bangalore), Rahul Rachuri (Aarhus University, Denmark), Ajith Suresh (Indian Institute of Science, Bangalore)
Machine learning has started to be deployed in fields such as healthcare and finance, which involves dealing with a lot of sensitive data. This propelled the need for and growth of privacy-preserving machine learning. We propose an efficient four-party protocol (4PC) that outperforms the state-of-the-art of Gordon et al. (ASIACRYPT 2018) and showcase its applications on three of the most widely-known machine learning algorithms -- Linear Regression, Logistic Regression, and Neural Networks.
We propose an efficient mixed-world framework (Trident) in the offline-online paradigm to switch between the Arithmetic, Boolean, and Garbled worlds. Our framework operates in 4PC honest majority setting over rings and is instantiated in a server-aided setting for machine learning, where the data is secretly shared among the servers. In addition, we propose conversions especially relevant to privacy-preserving machine learning. We outperform the current state-of-the-art ABY3 (for three parties), in terms of both rounds as well as communication complexity.
The highlights of our framework include using a minimal number of expensive circuits overall as compared to ABY3. This can be seen in our technique for truncation, which does not affect the online cost of multiplication and removes the need for any circuits in the offline phase. Our B2A conversion has an improvement of $mathbf{7} times$ in rounds and $mathbf{18} times$ in the communication complexity. In addition to these, all of the special conversions for machine learning, for eg. Secure Comparison, achieve constant round complexity. These massive improvements are primarily due to the advantage of having an additional third honest party available in our setting.
The practicality of our framework is argued through improvements in the benchmarking of the aforementioned algorithms when compared with ABY3. All the protocols are implemented over a 64-bit ring in both LAN and WAN setting. Our improvements go up to $mathbf{187} times$ for the training phase and $mathbf{158} times$ for the prediction phase, considering LAN and WAN together.
-
Jonas Böhler (SAP Security Research), Florian Kerschbaum (University of Waterloo)
In distributed private learning, e.g., data analysis, machine learning, and enterprise benchmarking, it is commonplace for two parties with confidential data sets to compute statistics over their combined data. The median is an important robust statistical method used in enterprise benchmarking, e.g., companies compare typical employee salaries, insurance companies use median life expectancy to adjust insurance premiums, banks compare credit scores of their customers, and financial regulators estimate risks based on loan exposures.
The exact median can be computed securely, however, it leaks information about the private data. To protect the data sets, we securely compute a differentially private median over the joint data set via the exponential mechanism. The exponential mechanism has a runtime linear in the data universe size and efficiently sampling it is non-trivial. Local differential privacy, where each user shares locally perturbed data with an untrusted server, is often used in private learning but does not provide the same utility as the central model, where noise is only applied once by a trusted server.
We present an efficient secure computation of a differentially private median of the union of two large, confidential data sets. Our protocol has a runtime sublinear in the size of the data universe and utility like the central model without a trusted third party. We use dynamic programming with a static, i.e., data-independent, access pattern, achieving low complexity of the secure computation circuit. We provide a comprehensive evaluation with a large real-world data set with a practical runtime of less than 5 seconds for millions of records even with large network delay of 80ms.
-
Honggang Yu (University of Florida), Kaichen Yang (University of Florida), Teng Zhang (University of Central Florida), Yun-Yun Tsai (National Tsing Hua University), Tsung-Yi Ho (National Tsing Hua University), Yier Jin (University of Florida)
Cloud-based Machine Learning as a Service (MLaaS) is gradually gaining acceptance as a reliable solution to various real-life scenarios. These services typically utilize Deep Neural Networks (DNNs) to perform classification and detection tasks and are accessed through Application Programming Interfaces (APIs). Unfortunately, it is possible for an adversary to steal models from cloud-based platforms, even with black-box constraints, by repeatedly querying the public prediction API with malicious inputs. In this paper, we introduce an effective and efficient black-box attack methodology that extracts largescale DNN models from cloud-based platforms with near-perfect performance. In comparison to existing attack methods, we significantly reduce the number of queries required to steal the target model by incorporating several novel algorithms, including active learning, transfer learning, and adversarial attacks. During our experimental evaluations, we validate our proposed model for conducting theft attacks on various commercialized MLaaS platforms including two Microsoft Custom Vision APIs (Microsoft Traffic Recognition API and Microsoft Flower Recognition API), the Face++ Emotion Recognition API, the IBM Watson Visual Recognition API, Google AutoML API, and the Clarifai Safe for Work (NSFW) API. Our results demonstrate that the proposed method can easily reveal/steal large-scale DNN models from these cloud platforms. Further, the proposed attack method can also be used to accurately evaluates the robustness of DNN based MLaaS image classifiers against theft attacks.
-
Arpita Patra (Indian Institute of Science, Bangalore), Ajith Suresh (Indian Institute of Science, Bangalore)
Machine learning tools have illustrated their potential in many significant sectors such as healthcare and finance, to aide in deriving useful inferences. The sensitive and confidential nature of the data, in such sectors, raise natural concerns for the privacy of data. This motivated the area of Privacy-preserving Machine Learning (PPML) where privacy of the data is guaranteed. Typically, ML techniques require large computing power, which leads clients with limited infrastructure to rely on the method of Secure Outsourced Computation (SOC). In SOC setting, the computation is outsourced to a set of specialized and powerful cloud servers and the service is availed on a pay-per-use basis. In this work, we explore PPML techniques in the SOC setting for widely used ML algorithms-- Linear Regression, Logistic Regression, and Neural Networks.
We propose, BLAZE, a blazing fast PPML framework in the three server setting tolerating one malicious corruption over a ring ($Z_{2^ell}$). BLAZE achieves the stronger guarantee of fairness (all honest servers get the output whenever the corrupt server obtains the same). Leveraging an *input-independent* preprocessing phase, BLAZE has a fast input-dependent online phase relying on efficient PPML primitives such as: (i) A dot product protocol for which the communication in the online phase is *independent* of the vector size, the first of its kind in the three server setting; (ii) A method for truncation that shuns evaluating expensive circuit for Ripple Carry Adders (RCA) and achieves a constant round complexity. This improves over the truncation method of ABY3 (Mohassel et al., CCS 2018) that uses RCA and consumes a round complexity that is of the order of the depth of RCA (which is same as the underlying ring size); (iii) Secure Comparison protocol that requires only one round and a communication of $mathbf{3}$ ring elements in the online phase as opposed to the solution of ASTRA (Chaudhari et al., CCSW 2019), which requires three rounds and a communication of $mathbf{6}$ ring elements.
An extensive benchmarking of BLAZE for the aforementioned ML algorithms over a 64-bit ring in both WAN and LAN settings shows massive improvements over ABY3. Concretely, we observe improvements up to $mathbf{333times}$ for Linear Regression, $mathbf{146 times}$ for Logistic Regression and $mathbf{301times}$ for Neural Networks over WAN. Similarly, we show improvements up to $mathbf{2610times}$ for Linear Regression, $mathbf{820times}$ for Logistic Regression and $mathbf{303times}$ for Neural Networks over LAN.
-
Alessandro Mantovani (EURECOM), Simone Aonzo (University of Genoa), Xabier Ugarte-Pedrero (Cisco Systems), Alessio Merlo (University of Genoa), Davide Balzarotti (EURECOM)
An open research problem on malware analysis is how to statically distinguish between packed and non-packed executables. This has an impact on antivirus software and malware analysis systems, which may need to apply different heuristics or to resort to more costly code emulation solutions to deal with the presence of potential packing routines. It can also affect the results of many research studies in which the authors adopt algorithms that are specifically designed for packed or non-packed binaries.
Therefore, a wrong answer to the question emph{``is this executable packed?''} can make the difference between malware evasion and detection. It has long been known that packing and entropy are strongly correlated, often leading to the wrong assumption that a low entropy score implies that an executable is NOT packed. Exceptions to this rule exist, but they have always been considered as one-off cases, with a negligible impact on any large scale experiment. However, if such assumption might have been acceptable in the past, our experiments show that this is not the case anymore as an increasing and remarkable number of packed malware samples implement proper schemes to keep their entropy low. In this paper, we empirically investigate and measure this problem by analyzing a dataset of 50K low-entropy Windows malware samples.
Our tests show that, despite all samples have a low entropy value, over 30% of them adopt some form of runtime packing. We then extended our analysis beyond the pure entropy, by considering all static features that have been proposed so far to identify packed code. Again, our tests show that even a state of the art machine learning classifier is unable to conclude whether a low-entropy sample is packed or not by relying only on features extracted with static analysis.
-
Hojjat Aghakhani (University of California, Santa Barbara), Fabio Gritti (University of California, Santa Barbara), Francesco Mecca (Università degli Studi di Torino), Martina Lindorfer (TU Wien), Stefano Ortolani (Lastline Inc.), Davide Balzarotti (Eurecom), Giovanni Vigna (University of California, Santa Barbara), Christopher Kruegel (University of California, Santa Barbara)
Machine learning techniques are widely used in addition to signatures and heuristics to increase the detection rate of anti-malware software, as they automate the creation of detection models, making it possible to handle an ever-increasing number of new malware samples. In order to foil the analysis of anti-malware systems and evade detection, malware uses packing and other forms of obfuscation. However, few realize that benign applications use packing and obfuscation as well, to protect intellectual property and prevent license abuse.
In this paper, we study how machine learning based on static analysis features operates on packed samples. Malware researchers have often assumed that packing would prevent machine learning techniques from building effective classifiers. However, both industry and academia have published results that show that machine-learning-based classifiers can achieve good detection rates, leading many experts to think that classifiers are simply detecting the fact that a sample is packed, as packing is more prevalent in malicious samples. We show that, different from what is commonly assumed, packers do preserve some information when packing programs that is “useful” for malware classification. However, this information does not necessarily capture the sample’s behavior. We demonstrate that the signals extracted from packed executables are not rich enough for machine-learning-based models to (1) generalize their knowledge to operate on unseen packers, and (2) be robust against adversarial examples. We also show that a naïve application of machine learning techniques results in a substantial number of false positives, which, in turn, might have resulted in incorrect labeling of ground-truth data used in past work.
-
Runqing Yang (Zhejiang University), Shiqing Ma (Rutgers University), Haitao Xu (Arizona State University), Xiangyu Zhang (Purdue University), Yan Chen (Northwestern University)
Existing attack investigation solutions for GUI applications suffer from a few limitations such as inaccuracy (because of the dependence explosion problem), requiring instrumentation, and providing very low visibility. Such limitations have hindered their widespread and practical deployment. In this paper, we present UIScope, a novel accurate, instrumentation-free, and visible attack investigation system for GUI applications. The core idea of UIScope is to perform causality analysis on both UI elements/events which represent users' perspective and low-level system events which provide detailed information of what happens under the hood, and then correlate system events with UI events to provide high accuracy and visibility. Long running processes are partitioned to individual UI transitions, to which low-level system events are attributed, making the results accurate. The produced graphs contain (causally related) UI elements with which users are very familiar, making them easily accessible. We deployed UIScope on 7 machines for a week, and also utilized UIScope to conduct an investigation of 6 real-world attacks. Our evaluation shows that compared to existing works, UIScope introduces negligible overhead (less than 1% runtime overhead and 3.05 MB event logs per hour on average) while UIScope can precisely identify attack provenance while offering users thorough visibility into the attack context.
-
Shiqing Luo (Georgia State University), Anh Nguyen (Georgia State University), Chen Song (San Diego State University), Feng Lin (Zhejiang University), Wenyao Xu (SUNY Buffalo), Zhisheng Yan (Georgia State University)
The increasing popularity of virtual reality (VR) in a wide spectrum of applications has generated sensitive personal data such as medical records and credit card information. While protecting such data from unauthorized access is critical, directly applying traditional authentication methods (e.g., PIN) through new VR input modalities such as remote controllers and head navigation would cause security issues. The authentication action can be purposefully observed by attackers to infer the authentication input. Unlike any other mobile devices, VR presents immersive experience via a head-mounted display (HMD) that fully covers users' eye area without public exposure. Leveraging this feature, we explore human visual system (HVS) as a novel biometric authentication tailored for VR platforms. While previous works used eye globe movement (gaze) to authenticate smartphones or PCs, they suffer from a high error rate and low stability since eye gaze is highly dependent on cognitive states. In this paper, we explore the HVS as a whole to consider not just the eye globe movement but also the eyelid, extraocular muscles, cells, and surrounding nerves in the HVS. Exploring HVS biostructure and unique HVS features triggered by immersive VR content can enhance authentication stability. To this end, we present OcuLock, an HVS-based system for reliable and unobservable VR HMD authentication. OcuLock is empowered by an electrooculography (EOG) based HVS sensing framework and a record-comparison driven authentication scheme. Experiments through 70 subjects show that OcuLock is resistant against common types of attacks such as impersonation attack and statistical attack with Equal Error Rates as low as 3.55% and 4.97% respectively. More importantly, OcuLock maintains a stable performance over a 2-month period and is preferred by users when compared to other potential approaches.
-
Benjamin Zi Hao Zhao (University of New South Wales and Data61 CSIRO), Hassan Jameel Asghar (Macquarie University and Data61 CSIRO), Mohamed Ali Kaafar (Macquarie University and Data61 CSIRO)
We assess the security of machine learning based biometric authentication systems against an attacker who submits uniform random inputs, either as feature vectors or raw inputs, in order to find an emph{accepting sample} of a target user. The average false positive rate (FPR) of the system, i.e., the rate at which an impostor is incorrectly accepted as the legitimate user, may be interpreted as a measure of the success probability of such an attack. However, we show that the success rate is often higher than the FPR. In particular, for one reconstructed biometric system with an average FPR of 0.03, the success rate was as high as 0.78. This has implications for the security of the system, as an attacker with only the knowledge of the length of the feature space can impersonate the user with less than 2 attempts on average. We provide detailed analysis of why the attack is successful, and validate our results using four different biometric modalities and four different machine learning classifiers. Finally, we propose mitigation techniques that render such attacks ineffective, with little to no effect on the accuracy of the system.
-
Zhenfeng Zhang (Chinese Academy of Sciences, University of Chinese Academy of Sciences, and The Joint Academy of Blockchain Innovation), Yuchen Wang (Chinese Academy of Sciences and University of Chinese Academy of Sciences), Kang Yang (State Key Laboratory of Cryptology)
Shared credential is currently the most widespread form of end user authentication with its convenience, but it is also criticized for being vulnerable to credential database theft and phishing attacks. While several alternative mechanisms are proposed to offer strong authentication with cryptographic challenge-response protocols, they are cumbersome to use due to the need of tamper-resistant hardware modules at user end.
In this paper, we propose the first strong authentication mechanism without the reliance on tamper-resistant hardware at user end. A user authenticates with a password-based credential via generating designated-verifiable authentication tokens. Our scheme is resistant to offline dictionary attacks in spite that the attacker can steal the password-protected credentials, and thus can be implemented for general-purpose device.
More specifically, we first introduce and formalize the notion of Password-Based Credential (PBC), which models the resistance of offline attacks and the unforageability of authentication tokens even if attackers can see authentication tokens and capture password-wrapped credentials of honest users. We then present a highly-efficient construction of PBC using a “randomize-then-prove” approach, and prove its security. The construction doesn’t involve bilinear-pairings, and can be implemented with common cryptographic libraries for many platforms. We also present a technique to transform the PBC scheme to be publicly-verifiable, and present an application of PBC in federated identity systems to provide holder-of-key assertion mechanisms. Compared with current certificate-based approaches, it is more convenient and user-friendly, and can be used with the federation systems that employs privacy-preserving measures (e.g., Sign-in with Apple).
We also implement the PBC scheme and evaluate its performance for different applications over various network environment. When PBC is used as a strong authentication mechanism for end users, it saves 26%-36% of time than the approach based on ECDSA with a tamper-resistant hardware module. As for its application in federation, it could even save more time when the user proves its possession of key to a Relying Party.
-
Matthew Smith (University of Oxford), Martin Strohmeier (University of Oxford), Jonathan Harman (Vrije Universiteit Amsterdam), Vincent Lenders (armasuisse Science and Technology), Ivan Martinovic (University of Oxford)
Many wireless communications systems found in aircraft lack standard security mechanisms, leaving them fundamentally vulnerable to attack. With affordable software-defined radios available, a novel threat has emerged, allowing a wide range of attackers to easily interfere with wireless avionic systems. Whilst these vulnerabilities are known, concrete attacks that exploit them are still novel and not yet well understood. This is true in particular with regards to their kinetic impact on the handling of the attacked aircraft and consequently its safety. To investigate this, we invited 30 Airbus A320 type-rated pilots to fly simulator scenarios in which they were subjected to attacks on their avionics. We implement and analyze novel wireless attacks on three safety-related systems: Traffic Collision Avoidance System (TCAS), Ground Proximity Warning System (GPWS) and the Instrument Landing System (ILS). We found that all three analyzed attack scenarios created significant control impact and cost of disruption through turnarounds, avoidance manoeuvres, and diversions. They further increased workload, distrust in the affected system, and in 38% of cases caused the attacked safety system to be switched off entirely. All pilots felt the scenarios were useful, with 93.3% feeling that specific simulator training for wireless attacks could be valuable.
-
Peter Ney (University of Washington), Luis Ceze (University of Washington), Tadayoshi Kohno (University of Washington)
Here, we evaluate the security of a consumer-facing, third party genetic analysis service, called GEDmatch, that specializes in genetic genealogy: a field that uses genetic data to identify relatives. GEDmatch is one of the most prominent third-party genetic genealogy services due to its size (over 1 million genetic data files) and the large role it now plays in criminal investigations. In this work, we focus on security risks particular to genetic genealogy, namely relative matching queries -- the algorithms used to identify genetic relatives -- and the resulting relative predictions. We experimentally demonstrate that GEDmatch is vulnerable to a number of attacks by an adversary that only uploads normally formatted genetic data files and runs relative matching queries. Using a small number of specifically designed files and queries, an attacker can extract a large percentage of the genetic markers from other users; 92% of markers can be extracted with 98% accuracy, including hundreds of medically sensitive markers. We also find that an adversary can construct genetic data files that falsely appear like relatives to other samples in the database; in certain situations, these false relatives can be used to make the de-identification of genetic data more difficult. These vulnerabilities exist because of particular design choices meant to improve functionality. However, our results show how security and the goals of genetic genealogy can come in conflict. We conclude with a discussion of the broader impact of these results to the entire consumer genetic testing community and provide recommendations for genetic genealogy services.
-
Sebastian Roth (CISPA Helmholtz Center for Information Security), Timothy Barron (Stony Brook University), Stefano Calzavara (Università Ca' Foscari Venezia), Nick Nikiforakis (Stony Brook University), Ben Stock (CISPA Helmholtz Center for Information Security)
The Content Security Policy (CSP) mechanism was developed as a mitigation against script injection attacks in 2010. In this paper, we leverage the unique vantage point of the Internet Archive to conduct a historical and longitudinal analysis of how CSP deployment has evolved for a set of 10,000 highly ranked domains. In doing so, we document the long-term struggle site operators face when trying to roll out CSP for content restriction and highlight that even seemingly secure whitelists can be bypassed through expired or typo domains. Next to these new insights, we also shed light on the usage of CSP for other use cases, in particular, TLS enforcement and framing control. Here, we find that CSP can be easily deployed to fit those security scenarios, but both lack wide-spread adoption. Specifically, while the underspecified and thus inconsistently implemented X-Frame-Options header is increasingly used on the Web, CSP's well-specified and secure alternative cannot keep up. To understand the reasons behind this, we run a notification campaign and subsequent survey, concluding that operators have often experienced the complexity of CSP (and given up), utterly unaware of the easy-to-deploy components of CSP. Hence, we find the complexity of secure, yet functional content restriction gives CSP a bad reputation, resulting in operators not leveraging its potential to secure a site against the non-original attack vectors.
-
Peng Wang (Indiana University Bloomington), Xiaojing Liao (Indiana University Bloomington), Yue Qin (Indiana University Bloomington), XiaoFeng Wang (Indiana University Bloomington)
E-commerce miscreants heavily rely on instant messaging (IM) to promote their illicit businesses and coordinate their operations. The threat intelligence provided by IM communication, therefore, becomes invaluable for understanding and mitigating the threats of e-commerce frauds. However, such information is hard to get since it is usually shared only through one-on-one conversations with the criminals. In this paper, we present the first chatbot, called Aubrey, to actively collect such intelligence through autonomous chats with real-world e-commerce miscreants. Our approach leverages the question-driven conversation pattern of small-time workers, who seek from e-commerce fraudsters jobs and/or attack resources, to model the interaction process as a finite state machine, thereby enabling an autonomous conversation. Aubrey successfully chatted with 470 real-world e-commerce miscreants and gathered a large amount of fraud-related artifact, including 40 SIM gateways, 323K fraud phone numbers, and previously-unknown attack toolkits, etc. Further, the conversations reveal the supply chain of e-commerce fraudulent activities on the deep web and the complicated relations (e.g., complicity and reselling) among miscreant roles.
-
Rock Stevens (University of Maryland), Josiah Dykstra (Independent Security Researcher), Wendy Knox Everette (Leviathan Security Group), James Chapman (Independent Security Researcher), Garrett Bladow (Dragos), Alexander Farmer (Independent Security Researcher), Kevin Halliday (University of Maryland), Michelle L. Mazurek (University of Maryland)
Digital security compliance programs and policies serve as powerful tools for protecting organizations' intellectual property, sensitive resources, customers, and employees through mandated security controls. Organizations place a significant emphasis on compliance and often conflate high compliance audit scores with strong security; however, no compliance standard has been systemically evaluated for security concerns that may exist even within fully-compliant organizations. In this study, we describe our approach for auditing three exemplar compliance standards that affect nearly every person within the United States: standards for federal tax information, credit card transactions, and the electric grid. We partner with organizations that use these standards to validate our findings within enterprise environments and provide first-hand narratives describing impact.
We find that when compliance standards are used literally as checklists --- a common occurrence, as confirmed by compliance experts --- their technical controls and processes are not always sufficient. Security concerns can exist even with perfect compliance. We identified 148 issues of varying severity across three standards; our expert partners assessed 49 of these issues and validated that 36 were present in their own environments and 10 could plausibly occur elsewhere. We also discovered that no clearly-defined process exists for reporting security concerns associated with compliance standards; we report on our varying levels of success in responsibly disclosing our findings and influencing revisions to the affected standards. Overall, our results suggest that auditing compliance standards can provide valuable benefits to the security posture of compliant organizations.
-
Trevor Smith (Brigham Young University), Luke Dickenson (Brigham Young University), Kent Seamons (Brigham Young University)
Current revocation strategies have numerous issues that prevent their widespread adoption and use, including scalability, privacy, and new infrastructure requirements. Consequently, revocation is often ignored, leaving clients vulnerable to man-in-the-middle attacks.
This paper presents Let's Revoke, a scalable global revocation strategy that addresses the concerns of current revocation checking. Let's Revoke introduces a new unique identifier to each certificate that serves as an index to a dynamically-sized bit vector containing revocation status information. The bit vector approach enables significantly more efficient revocation checking for both clients and certificate authorities. We compare Let's Revoke to existing revocation schemes and show that it requires less storage and network bandwidth than other systems, including those that only cover a fraction of the global certificate space. We further demonstrate through simulations that Let's Revoke scales linearly up to ten billion certificates, even during mass revocation events.
-
Dimitrios Sikeridis (The University of New Mexico), Panos Kampanakis (Cisco Systems), Michael Devetsikiotis (The University of New Mexico)
The potential development of large-scale quantum computers is raising concerns among IT and security research professionals due to their ability to solve (elliptic curve) discrete logarithm and integer factorization problems in polynomial time. There is, therefore, a threat to public-key cryptography as all the currently used algorithms would be deemed insecure in a post-quantum (PQ) setting. In response, the National Institute of Standards and Technology (NIST) has initiated a process to standardize quantum-resistant crypto algorithms, focusing primarily on their security guarantees. Since PQ algorithms present significant differences over classical ones, their overall assessment should not be performed out-of-context. This work presents a detailed performance evaluation of the NIST signature algorithm candidates and investigates the imposed latency on TLS 1.3 connection establishment under realistic network conditions. In addition, we investigate their impact on the achievable TLS session throughput of a server and analyze the trade-off between lengthier PQ signatures, and computationally heavier PQ cryptographic operations for idle and heavily loaded servers. Our results demonstrate that the adoption of at least two PQ signature algorithms would indeed be viable for time-sensitive applications over TLS with little additional overhead over current signature algorithms. Also, we argue that more of the NIST PQ candidates can effectively be used for less time-sensitive applications, and provide an in-depth discussion on the integration of PQ authentication in encrypted tunneling protocols, along with the related challenges, and alternatives. Finally, we propose and evaluate the combination of different PQ signature algorithms across the same certificate chain in TLS. Results show a reduction of the TLS handshake time and a significant increase of a server's TLS tunnel connection rate over the alternative of the chain using a single PQ signature scheme.
-
Tomas Hlavacek (Fraunhofer SIT), Italo Cunha (Universidade Federal de Minas Gerais), Yossi Gilad (Hebrew University of Jerusalem), Amir Herzberg (University of Connecticut), Ethan Katz-Bassett (Columbia University), Michael Schapira (Hebrew University of Jerusalem), Haya Shulman (Fraunhofer SIT)
BGP is a gaping security hole in today's Internet, as evidenced by numerous Internet outages and blackouts, repeated traffic hijacking, and surveillance incidents. Yet, despite Herculean efforts, ubiquitous deployment of the Resource Public Key Infrastructure (RPKI), designed to protect against prefix hijacking attacks, remains distant, due to RPKI's manual and error-prone certification process. We argue that deploying origin authentication at scale requires substituting the standard requirement of certifying legal ownership of IP address blocks with the goal of certifying de facto ownership. We show that settling for de facto ownership is sufficient for protecting against hazardous prefix hijacking and can be accomplished without requiring any changes to today's routing infrastructure. We present APKI, a readily deployable system that automatically certifies de facto ownership and generates the appropriate BGP-path-filtering rules at routers. We evaluate APKI's security and deployability via live experiments on the Internet using a prototype implementation of APKI and through simulations on empirically-derived datasets. To facilitate the reproducibility of our results, we open source our prototype, simulator, and measurement analysis code.
-
Giuseppe Ateniese (Stevens Institute of Technology), Long Chen (New Jersey Institute of Technology), Mohammard Etemad (Stevens Institute of Technology), Qiang Tang (New Jersey Institute of Technology)
A high-quality outsourced storage service is crucial for many existing applications. For example, hospitals and data centers need to guarantee the availability of their systems to perform routine daily activities. Such a system should protect users against downtime and ensure data availability over time. Continuous data availability is a critical property to measure the quality of an outsourced storage service, which implies that outsourced data is continuously available to the server during the entire storage period. We formally study the Proof of Storage-Time (PoSt), the notion initially proposed in the Filecoin whitepaper, which enables a verifier to audit the continuous data availability of an outsourced storage service. We provide a formal security model of PoSt and generic constructions that are proven secure under our definition. Moreover, our concrete instantiation can yield a PoSt protocol with an extremely efficient verification: a single hash computation to verify a proof of size around 200 bits. This makes our scheme applicable even in the decentralized storage marketplace enabled by blockchain.
-
Yuan Xiao (The Ohio State University), Yinqian Zhang (The Ohio State University), Radu Teodorescu (The Ohio State University)
SPEculative Execution side Channel Hardware (SPEECH) Vulnerabilities have enabled the notorious Meltdown, Spectre, and L1 terminal fault (L1TF) attacks. While a number of studies have reported different variants of SPEECH vulnerabilities, they are still not well understood. This is primarily due to the lack of information about microprocessor implementation details that impact the timing and order of various micro-architectural events. Moreover, to date, there is no systematic approach to quantitatively measure SPEECH vulnerabilities on commodity processors.
This paper introduces SPEECHMINER, a software framework for exploring and measuring SPEECH vulnerabilities in an automated manner. SPEECHMINER empirically establishes the link between a novel two-phase fault handling model and the exploitability and speculation windows of SPEECH vulnerabilities. It enables testing of a comprehensive list of exception-triggering instructions under the same software framework, which leverages covert-channel techniques and differential tests to gain visibility into the micro-architectural state changes. We evaluated SPEECHMINER on 9 different processor types, examined 21 potential vulnerability variants, confirmed various known attacks, and identified several new variants.
-
Aritra Dhar (ETH Zurich), Enis Ulqinaku (ETH Zurich), Kari Kostiainen (ETH Zurich), Srdjan Capkun (ETH Zurich)
Security and safety-critical remote applications such as e-voting, online banking, industrial control systems and medical devices rely upon user interaction that is typically performed through web applications. Trusted path to such remote systems is critical in the presence of an attacker that controls the computer that the user operates. Such an attacker can observe and modify any IO data without being detected by the user or the server. We investigate the security of previous research proposals and observe several drawbacks that make them vulnerable to attacks. Based on these observations we identify novel requirements for secure IO operation in the presence of a compromised host.
As a solution, we propose ProtectIOn, a system that ensures IO integrity using a trusted low-TCB device that sits between the attacker-controlled host and the IO devices. ProtectIOn intercepts the display signal and user inputs from the keyboard and mouse, and overlays secure UI on top of the HDMI frames generated by the untrusted host. The guiding design principles of ProtectIOn are that (i) integrity of user input and output cannot be considered separately, (ii) all user input modalities need to be protected simultaneously, and (iii) integrity protection should not rely on error prone user tasks like checking the presence of security indicators. By following these guidelines, ProtectIOn achieves strong protection for IO integrity. We also propose an extension of ProtectIOn for IO confidentiality and implement a plug-and-play prototype and evaluate its performance.
-
Michael Schwarz (Graz University of Technology), Moritz Lipp (Graz University of Technology), Claudio Canella (Graz University of Technology), Robert Schilling (Graz University of Technology and Know-Center GmbH), Florian Kargl (Graz University of Technology), Daniel Gruss (Graz University of Technology)
Out-of-order execution and speculative execution are among the biggest contributors to performance and efficiency of modern processors. However, they are inconsiderate, leaking secret data during the transient execution of instructions. Many solutions and hardware fixes have been proposed for mitigating transient-execution attacks. However, they either do not eliminate the leakage entirely or introduce unacceptable performance penalties.
In this paper, we propose ConTExT, a Considerate Transient Execution Technique. ConTExT is a minimal and fully backwards compatible architecture change. The basic idea of ConTExT is that secrets can enter registers, but not transiently leave them. ConTExT transforms Spectre from a problem that cannot be solved purely in software, to a problem that is not easy to solve, but solvable in software. For this, ConTExT requires minimal modifications of applications, compilers, operating systems, and the hardware. ConTExT offers full protection for secrets in memory and secrets in registers. With ConTExT-light, we propose a software-only solution of ConTExT for existing commodity CPUs protecting secrets in memory. We evaluate the security and performance of ConTExT. Even when over-approximating, we observe no performance overhead for unprotected code and data, and an overhead of 71.14% for security-critical applications, which is below the overhead of currently recommended state-of-the-art mitigation strategies.
-
Yang Zhang (CISPA Helmholtz Center for Information Security), Mathias Humbert (armasuisse Science and Technology), Bartlomiej Surma (CISPA Helmholtz Center for Information Security), Praveen Manoharan (CISPA Helmholtz Center for Information Security), Jilles Vreeken (CISPA Helmholtz Center for Information Security), Michael Backes (CISPA Helmholtz Center for Information Security)
Social graphs derived from online social interactions contain a wealth of information that is nowadays extensively used by both industry and academia. However, as social graphs contain sensitive information, they need to be properly anonymized before release. Most of the existing graph anonymization mechanisms rely on the perturbation of the original graph’s edge set. In this paper, we identify a fundamental weakness of these mechanisms: They neglect the strong structural proximity between friends in social graphs, thus add implausible fake edges for anonymization.
To exploit this weakness, we first propose a metric to quantify an edge’s plausibility by relying on graph embedding. Extensive experiments on three real-life social network datasets demonstrate that our plausibility metric can very effectively differentiate fake edges from original edges with AUC values above 0.95 in most of the cases. We then rely on a Gaussian mixture model to automatically derive the threshold on the edge plausibility values to determine whether an edge is fake, which enables us to recover to a large extent the original graph from the anonymized graph. Then, we demonstrate that our graph recovery attack jeopardizes the privacy guarantees provided by the considered graph anonymization mechanisms.
To mitigate this vulnerability, we propose a method to generate fake yet plausible edges given the graph structure and incorporate it into the existing anonymization mechanisms. Our evaluation demonstrates that the enhanced mechanisms decrease the chances of graph recovery, reduce the success of graph de-anonymization (up to 30%), and provide even better utility than the existing anonymization mechanisms. -
Jairo Giraldo (University of Utah), Alvaro Cardenas (UC Santa Cruz), Murat Kantarcioglu (UT Dallas), Jonathan Katz (George Mason University)
Differential Privacy has emerged in the last decade as a powerful tool to protect sensitive information. Similarly, the last decade has seen a growing interest in adversarial classification, where an attacker knows a classifier is trying to detect anomalies and the adversary attempts to design examples meant to mislead this classification.
Differential privacy and adversarial classification have been studied separately in the past. In this paper, we study the problem of how a strategic attacker can leverage differential privacy to inject false data in a system, and then we propose countermeasures against these novel attacks. We show the impact of our attacks and defenses in a real-world traffic estimation system and in a smart metering system.
-
Tianhao Wang (Purdue University), Milan Lopuhaä-Zwakenberg (Eindhoven University of Technology), Zitao Li (Purdue University), Boris Skoric (Eindhoven University of Technology), Ninghui Li (Purdue University)
Local Differential Privacy (LDP) protects user privacy from the data collector. LDP protocols have been increasingly deployed in the industry. A basic building block is frequency oracle (FO) protocols, which estimate frequencies of values. While several FO protocols have been proposed, the design goal does not lead to optimal results for answering many queries. In this paper, we show that adding post-processing steps to FO protocols by exploiting the knowledge that all individual frequencies should be non-negative and they sum up to one can lead to significantly better accuracy for a wide range of tasks, including frequencies of individual values, frequencies of the most frequent values, and frequencies of subsets of values. We consider 10 different methods that exploit this knowledge differently. We establish theoretical relationships between some of them and conducted extensive experimental evaluations to understand which methods should be used for different query tasks.
-
Ren Ding (Georgia Institute of Technology), Hong Hu (Georgia Institute of Technology), Wen Xu (Georgia Institute of Technology), Taesoo Kim (Georgia Institute of Technology)
Software vendors collect crash reports from end-users to assist debugging and testing of their products. However, crash reports may contain user’s private information, like names and passwords, rendering users hesitated to share the crash report with developers. We need a mechanism to protect user’s privacy from crash reports on the client-side, and meanwhile, keep sufficient information to support server-side debugging.
In this paper, we propose the DESENSITIZATION technique that generates privacy-aware and attack-preserving crash reports from crashed processes. Our tool uses lightweight methods to identify bug- and attack-related data from the memory, and removes other data to protect user’s privacy. Since the desensitized memory has more null bytes, we store crash reports in spare files to save the network bandwidth and the server-side storage. We prototype DESENSITIZATION and apply it to a large number of crashes from several real-world programs, like browser and JavaScript engine. The result shows that our DESENSITIZATION technique can eliminate 80.9% of non-zero bytes from coredumps, and 49.0% from minidumps. The desensitized crash report can be 50.5% smaller than the original size, which significantly saves resources for report submission and storage. Our DESENSITIZATION technique is a push-button solution for the privacy-aware crash report.