NDSS Symposium 2019 Programme
Sunday, 24 February
Monday, 25 February
-
Cyber defense has evolved into more than just a physical and electronic approach to keeping information and infrastructure out of harm's way. How will we meet the challenge as technological advancement creates a world where an adversary has more opportunity to break into our framework of order?
Biography
Dr. Deborah Frincke leads the Research Directorate of the National Security Agency (NSA), arguably the largest in-house research organization in the U.S. Intelligence Community. Under her direction, Research creates breakthroughs in cybersecurity, mathematics, physical science, analytics, and other disciplines; these are a critical part of positioning NSA to meet current and future challenges. As Research Director, Dr. Frincke also serves as the NSA Science Advisor and the NSA Innovation Champion. Prior to becoming the Research Director, Dr. Frincke spent a year as Commandant of the National Cryptologic School and NSA Training Director, where she established the first NSA Cyber College and launched the GenCyber Program. Prior to that assignment, Dr. Frincke was Research Deputy Director. Before joining NSA in 2011, Dr. Frincke had a threefold career encompassing academia (reaching the rank of Full Professor at the University of Idaho), serving as Chief Scientist for Cybersecurity at the Department of Energy's Pacific Northwest National Laboratory, and launching a successful cybersecurity startup company. Dr. Frincke earned her doctoral and master's degrees in computer science from the University of California, Davis. She is a Senior Member of the Institute of Electrical and Electronics Engineers (IEEE) and has received numerous awards, including the Meritorious Presidential Rank Award in 2016.
-
Ferdinand Brasser (Technische Universität Darmstadt), David Gens (Technische Universität Darmstadt), Patrick Jauernig (Technische Universität Darmstadt), Ahmad-Reza Sadeghi (Technische Universität Darmstadt), Emmanuel Stapf (Technische Universität Darmstadt)
ARM TrustZone is one of the most widely deployed security architectures providing Trusted Execution Environments (TEEs). Unfortunately, its usage and potential benefits for application developers and end users are largely limited due to restricted deployment policies imposed by device vendors. This restriction is enforced because every Trusted App (TA) increases the TEE's attack surface: any vulnerable or malicious TA can compromise the system's security. Hence, deploying a TA requires mutual trust between device vendor and application developer, incurring high costs for both. Vendors work around this by offering interfaces to selected TEE functionalities; however, these are not sufficient to securely implement advanced mobile services like banking. Extensive discussion of Intel's SGX technology in academia and industry has unveiled the demand for unrestricted use of TEEs, yet no comparable security architecture for mobile devices exists to this day.
We propose SANCTUARY, the first security architecture which allows unconstrained use of TEEs in the TrustZone ecosystem. SANCTUARY enables execution of security-sensitive apps within strongly isolated compartments in TrustZone's normal world, comparable to SGX's user-space enclaves. In particular, we leverage TrustZone's versatile Address-Space Controller, available in current ARM System-on-Chip reference designs, to enforce two-way hardware-level isolation: (i) security-sensitive apps are shielded against a compromised normal-world OS, while (ii) the system is also protected from potentially malicious apps in isolated compartments. Moreover, moving security-sensitive apps from TrustZone's secure world to isolated compartments minimizes the TEE's attack surface. Thus, mutual trust relationships between device vendors and developers become obsolete: the full potential of TEEs can be leveraged.
We demonstrate practicality and real-world benefits of SANCTUARY by thoroughly evaluating our prototype on a HiKey 960 development board with microbenchmarks and a use case for one-time password generation in two-factor authentication.
-
Min Hong Yun (Rice University), Lin Zhong (Rice University)
Many mobile and embedded apps possess sensitive data, or secrets. Trusting the operating system (OS), they often keep their secrets in memory. Recent incidents have shown that memory is not necessarily secure, because the OS can be compromised due to inevitable vulnerabilities resulting from its sheer size and complexity. Existing solutions protect sensitive data against an untrusted OS by running app logic in the Secure world, a Trusted Execution Environment (TEE) supported by the ARM TrustZone technology. Because app logic increases the attack surface of the TEE, these solutions do not work for third-party apps.
This work aims to support third-party apps without growing the attack surface, significant development effort, or performance overhead. Our solution, called Ginseng, protects sensitive data by allocating it to registers at compile time and encrypting it at runtime before it enters memory, due to function calls, exceptions, or a lack of physical registers. Ginseng does not run any app logic in the TEE and only requires minor markups to support existing apps. We report a prototype implementation based on LLVM, ARM Trusted Firmware (ATF), and the HiKey board. We evaluate it with both microbenchmarks and real-world secret-holding apps.
Our evaluation shows Ginseng efficiently protects sensitive data with low engineering effort. For example, a Ginseng-enabled web server, Nginx, protects the TLS master key with no measurable overhead. We find Ginseng's overhead is proportional to how often sensitive data in registers have to be encrypted and decrypted, i.e., spilling and restoring sensitive data on a function call or under high register pressure. As a result, Ginseng is most suited to protecting small sensitive data, like a password or social security number.
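To illustrate the spill-and-restore idea described above, consider the following conceptual Python sketch. It is only an analogy of what Ginseng's compiler pass arranges: the key name, the hash-based stream cipher, and the spill/restore helpers are all illustrative, and in Ginseng the key material never leaves the TEE.

    import os, hashlib

    KEY = os.urandom(32)  # stand-in; Ginseng keeps key material in the secure world

    def _keystream(nonce, n):
        # Hash-based stream cipher, purely for illustration.
        out, ctr = b"", 0
        while len(out) < n:
            out += hashlib.sha256(KEY + nonce + ctr.to_bytes(4, "big")).digest()
            ctr += 1
        return out[:n]

    def spill(value):
        """Encrypt a sensitive register value before it is written to memory
        (e.g., on a function call or under register pressure)."""
        nonce = os.urandom(16)
        return nonce, bytes(a ^ b for a, b in zip(value, _keystream(nonce, len(value))))

    def restore(nonce, ct):
        """Decrypt when the value is loaded back into a register."""
        return bytes(a ^ b for a, b in zip(ct, _keystream(nonce, len(ct))))

    secret = b"tls-master-key"
    blob = spill(secret)              # only ciphertext ever reaches RAM
    assert restore(*blob) == secret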
-
Marius Steffens (CISPA Helmholtz Center for Information Security), Christian Rossow (CISPA Helmholtz Center for Information Security), Martin Johns (TU Braunschweig), Ben Stock (CISPA Helmholtz Center for Information Security)
The Web has become highly interactive and an important driver for modern life, enabling information retrieval, social exchange, and online shopping. From the security perspective, Cross-Site Scripting (XSS) is one of the most nefarious attacks against Web clients. Research has long focused on three categories of XSS: Reflected, Persistent, and DOM-based XSS. In this paper, we argue that our community must consider at least four important classes of XSS, and present the first systematic study of the threat of Persistent Client-Side XSS, caused by the insecure use of client-side storage. While the existence of this class has been acknowledged, especially by the non-academic community like OWASP, prior works have either only found such flaws as side effects of other analyses or focused on a limited set of applications to analyze. Therefore, the community lacks in-depth knowledge about the actual prevalence of Persistent Client-Side XSS in the wild.
To close this research gap, we leverage taint tracking to identify suspicious flows from client-side persistent storage (Web Storage, cookies) to dangerous sinks (HTML, JavaScript, and script.src).
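The paper's analysis uses a taint-aware browser; a toy static approximation in Python conveys the storage-to-sink idea (the regular expressions and sink list below are crude simplifications, not the paper's methodology):

    import re

    SOURCES = r"(localStorage\.getItem|sessionStorage\.getItem|document\.cookie)"
    SINKS = {
        "html": r"\.innerHTML\s*=\s*(\w+)",
        "js": r"eval\(\s*(\w+)\s*\)",
        "script.src": r"\.src\s*=\s*(\w+)",
    }

    def suspicious_flows(js):
        """Report variables assigned from persistent storage that later reach
        a dangerous sink (a crude, flow-insensitive approximation)."""
        tainted = {m.group(1) for m in re.finditer(
            r"(?:var|let|const)\s+(\w+)\s*=\s*" + SOURCES, js)}
        return [(sink, m.group(1))
                for sink, pat in SINKS.items()
                for m in re.finditer(pat, js) if m.group(1) in tainted]

    page = """
    let theme = localStorage.getItem('theme');
    container.innerHTML = theme;
    """
    print(suspicious_flows(page))  # [('html', 'theme')]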
We discuss two attacker models capable of injecting malicious payloads into storage, i.e., a Network Attacker capable of *temporarily* hijacking HTTP communication (e.g., in a public WiFi), and a Web Attacker who can leverage flows into storage or an existing reflected XSS flaw to persist their payload. With our taint-aware browser and these models in mind, we study the prevalence of Persistent Client-Side XSS in the Alexa Top 5,000 domains.
We find that more than 8% of them have unfiltered data flows from persistent storage to a dangerous sink, which showcases the developers' inherent trust in the integrity of storage content. Even worse, if we only consider sites that make use of data originating from storage, 21% of the sites are vulnerable. For those sites with vulnerable flows from storage to sink, we find that at least 70% are directly exploitable by our attacker models. Finally, investigating the vulnerable flows originating from storage allows us to categorize them into four disjoint categories and propose appropriate mitigations.
-
Panagiotis Papadopoulos (FORTH-ICS, Greece), Panagiotis Ilia (FORTH-ICS), Michalis Polychronakis (Stony Brook University, USA), Evangelos P. Markatos (FORTH-ICS, Greece), Sotiris Ioannidis (FORTH-ICS, Greece), Giorgos Vasiliadis (FORTH-ICS, Greece)
The proliferation of web applications has essentially transformed modern browsers into small but powerful operating systems. Upon visiting a website, user devices run implicitly trusted script code, the execution of which is confined within the browser to prevent any interference with the user’s system. Recent JavaScript APIs, however, provide advanced capabilities that not only enable feature-rich web applications, but also allow attackers to perform malicious operations despite the confined nature of JavaScript code execution.
In this paper, we demonstrate the powerful capabilities that modern browser APIs provide to attackers by presenting MarioNet: a framework that allows a remote malicious entity to control a visitor's browser and abuse its resources for unwanted computation or harmful operations, such as cryptocurrency mining, password-cracking, and DDoS. MarioNet relies solely on already available HTML5 APIs, without requiring the installation of any additional software. In contrast to previous browser-based botnets, the persistence and stealthiness characteristics of MarioNet allow the malicious computations to continue in the background of the browser even after the user closes the window or tab of the initially visited malicious website. We present the design, implementation, and evaluation of our prototype system, which is compatible with all major browsers, and discuss potential defense strategies to counter the threat of such persistent in-browser attacks. Our main goal is to raise awareness about this new class of attacks, and inform the design of future browser APIs so that they provide a more secure client-side environment for web applications.
-
Abdallah Dawoud (CISPA Helmholtz Center i.G.), Sven Bugiel (CISPA Helmholtz Center i.G.)
We present DroidCap, a retrofit of Android's central Binder IPC mechanism that changes how permissions are represented and managed in the system. In DroidCap, permissions are per-process Binder object capabilities. DroidCap's design removes Android's UID-based ambient authority and allows the delegation of capabilities between processes to create least-privileged protection domains efficiently. With DroidCap, we show that object capabilities as the underlying access-control model integrate naturally and backward-compatibly into Android's stock permission model and application management. Thus, our Binder capabilities provide app developers with a new path to gradually adopting app compartmentalization, which we showcase with two popular examples from the literature: privilege-separated advertisement libraries and least-privileged app components.
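The object-capability model that DroidCap retrofits onto Binder can be pictured in a few lines of Python; the class and method names are invented for illustration, and real Binder capabilities are kernel-mediated rather than language-level objects:

    class Capability:
        """Toy Binder-style object capability: an unforgeable reference that
        grants a fixed set of operations and can only be narrowed on delegation."""
        def __init__(self, resource, ops):
            self._resource, self._ops = resource, frozenset(ops)
        def invoke(self, op, *args):
            if op not in self._ops:
                raise PermissionError(f"capability does not grant {op}")
            return getattr(self._resource, op)(*args)
        def delegate(self, ops):
            return Capability(self._resource, self._ops & frozenset(ops))

    class Camera:
        def take_picture(self): return "jpeg-bytes"
        def record_video(self): return "mp4-bytes"

    app = Capability(Camera(), {"take_picture", "record_video"})
    ad_lib = app.delegate({"take_picture"})   # least-privileged ad-library process
    ad_lib.invoke("take_picture")             # allowed
    # ad_lib.invoke("record_video")           # would raise PermissionError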
-
Meng Luo (Stony Brook University), Pierre Laperdrix (Stony Brook University), Nima Honarmand (Stony Brook University), Nick Nikiforakis (Stony Brook University)
Recent market share statistics show that mobile device traffic has overtaken that of traditional desktop computers. Users spend an increasing amount of time on their smartphones and tablets, while the web continues to be the platform of choice for delivering new applications to users. In this environment, it is necessary for web applications to utilize all the tools at their disposal to protect mobile users against popular web application attacks.
In this paper, we perform the first study of the support of popular web-application security mechanisms (such as the Content-Security Policy, HTTP Strict Transport Security, and Referrer Policy) across mobile browsers. We design 395 individual tests covering 8 different security mechanisms, and utilize them to evaluate the security-mechanism support in the 20 most popular browser families on Android. Moreover, by collecting and testing browser versions from the last seven years, we evaluate a total of 351 unique browser versions against the aforementioned tests, collecting more than 138K test results. By analyzing these results, we find that, although mobile browsers generally support more security mechanisms over time, not all browsers evolve in the same way. We discover popular browsers, with millions of downloads, which do not support the majority of the tested mechanisms, and identify design choices, followed by the majority of browsers, which leave hundreds of popular websites open to clickjacking attacks. Moreover, we discover the presence of multi-year vulnerability windows between the time when popular websites start utilizing a security mechanism and when mobile browsers enforce it. Our findings highlight the need for continuous security testing of mobile web browsers, as well as server-side frameworks which can adapt to the level of security that each browser can guarantee.
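One such individual test can be pictured as a tiny harness: serve a page whose Content-Security-Policy forbids inline scripts, and observe whether a beacon fires anyway. This Python sketch is an illustrative reconstruction of the test idea, not the authors' harness:

    from http.server import BaseHTTPRequestHandler, HTTPServer

    TEST_PAGE = b"<!doctype html><script>fetch('/executed');</script>"

    class CSPTest(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/executed":
                # The inline script ran despite the policy: CSP not enforced.
                print("browser FAILED the CSP test")
                self.send_response(204)
                self.end_headers()
                return
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            # A compliant browser must refuse to run the inline script below.
            self.send_header("Content-Security-Policy", "script-src 'none'")
            self.end_headers()
            self.wfile.write(TEST_PAGE)

    # Point the browser under test at http://<host>:8000/
    HTTPServer(("", 8000), CSPTest).serve_forever()
-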
Qingchuan Zhao (The Ohio State University), Chaoshun Zuo (The Ohio State University), Giancarlo Pellegrino (CISPA, Saarland University; Stanford University), Zhiqiang Lin (The Ohio State University)
Increasingly, mobile application-based ride-hailing services have become a very popular means of transportation. Due to the business logic they handle, these services also contain a wealth of privacy-sensitive information such as GPS locations, car plates, driver licenses, and payment data. Unlike many mobile applications that have only one type of user, ride-hailing services face two types of users: riders and drivers. While most efforts have focused on the rider's privacy, little has been done to protect drivers. To raise awareness of drivers' privacy issues, in this paper we perform the first systematic study of drivers' sensitive data leakage in ride-hailing services. More specifically, we select 20 popular ride-hailing apps, including Uber and Lyft, and focus on one particular feature, namely the nearby cars feature. Surprisingly, our experimental results show that large-scale data harvesting of drivers is possible for all of the ride-hailing services we studied. In particular, attackers can determine with high precision the driver's privacy-sensitive information, including their most-visited addresses (e.g., home) and daily driving behaviors. Meanwhile, attackers can also infer sensitive information about the business operations and performance of ride-hailing services, such as the number of rides, utilization of cars, and geographic presence. In addition to presenting the attacks, we also shed light on the countermeasures the service providers could take to protect drivers' sensitive information.
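The data-harvesting attack boils down to sweeping a nearby-cars API across a grid of coordinates and linking repeated driver sightings. The endpoint URL, parameters, and response fields below are hypothetical stand-ins; each app in the study exposes its own undocumented variant:

    import itertools, time
    import requests  # pip install requests

    # Hypothetical endpoint and fields, for illustration only.
    API = "https://api.example-rideshare.com/v1/nearby"

    def frange(a, b, step):
        while a < b:
            yield round(a, 5)
            a += step

    def sweep(lat0, lat1, lon0, lon1, step=0.01):
        drivers = {}
        for lat, lon in itertools.product(frange(lat0, lat1, step),
                                          frange(lon0, lon1, step)):
            for car in requests.get(API, params={"lat": lat, "lng": lon}).json():
                # Repeated sightings of the same driver_id reveal trajectories,
                # frequently visited places, and working hours.
                drivers.setdefault(car["driver_id"], []).append(
                    (time.time(), car["lat"], car["lng"]))
            time.sleep(1)  # the paper finds rate limiting is often weak or absent
        return drivers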
-
Kostas Drakonakis (FORTH, Greece), Panagiotis Ilia (FORTH, Greece), Sotiris Ioannidis (FORTH, Greece), Jason Polakis (University of Illinois at Chicago, USA)
The exposure of location data constitutes a significant privacy risk to users as it can lead to de-anonymization, the inference of sensitive information, and even physical threats. In this paper we present LPAuditor, a tool that conducts a comprehensive evaluation of the privacy loss caused by public location metadata. First, we demonstrate how our system can pinpoint users’ key locations at an unprecedented granularity by identifying their actual postal addresses. Our evaluation on Twitter data highlights the effectiveness of our techniques which outperform prior approaches by 18.9%-91.6% for homes and 8.7%-21.8% for workplaces. Next we present a novel exploration of automated private information inference that uncovers “sensitive” locations that users have visited (pertaining to health, religion, and sex/nightlife). We find that location metadata can provide additional context to tweets and thus lead to the exposure of private information that might not match the users’ intentions.
We further explore the mismatch between user actions and information exposure and find that older versions of the official Twitter apps follow a privacy-invasive policy of including precise GPS coordinates in the metadata of tweets that users have geotagged at a coarse-grained level (e.g., city). The implications of this exposure are further exacerbated by our finding that users are considerably privacy-cautious with regard to exposing precise location data. When users can explicitly select what location data is published, there is a 94.6% reduction in tweets with GPS coordinates. As part of current efforts to give users more control over their data, LPAuditor can be adopted by major services and offered as an auditing tool that informs users about sensitive information they (indirectly) expose through location metadata.
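At its core, pinpointing key locations starts from spatial clustering of the GPS coordinates found in tweet metadata. A minimal sketch of that first step (LPAuditor itself adds temporal analysis and postal-address resolution on top):

    from collections import Counter

    def key_locations(points, precision=3):
        """Cluster (lat, lon) pairs from tweet metadata into ~100 m grid cells
        and return the most-visited cells: candidate home/work locations."""
        cells = Counter((round(lat, precision), round(lon, precision))
                        for lat, lon in points)
        return cells.most_common(3)

    tweets = [(40.7581, -73.9854), (40.7582, -73.9854), (40.7306, -73.9866)]
    print(key_locations(tweets))  # the (40.758, -73.985) cell dominates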
-
Victor Le Pochat (imec-DistriNet, KU Leuven), Tom Van Goethem (imec-DistriNet, KU Leuven), Samaneh Tajalizadehkhoob (Delft University of Technology), Maciej Korczyński (Grenoble Alps University), Wouter Joosen (imec-DistriNet, KU Leuven)
In order to evaluate the prevalence of security and privacy practices on a representative sample of the Web, researchers rely on website popularity rankings such as the Alexa list. While the validity and representativeness of these rankings are rarely questioned, our findings show the contrary: we show for four main rankings how their inherent properties (similarity, stability, representativeness, responsiveness and benignness) affect their composition and therefore potentially skew the conclusions made in studies. Moreover, we find that it is trivial for an adversary to manipulate the composition of these lists. We are the first to empirically validate that the ranks of domains in each of the lists are easily altered, in the case of Alexa through as little as a single HTTP request. This allows adversaries to manipulate rankings on a large scale and insert malicious domains into whitelists or bend the outcome of research studies to their will. To overcome the limitations of such rankings, we propose improvements to reduce the fluctuations in list composition and guarantee better defenses against manipulation. To allow the research community to work with reliable and reproducible rankings, we provide Tranco, an improved ranking that we offer through an online service available at https://tranco-list.eu.
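The stabilizing idea behind such an improved ranking can be sketched as rank averaging across providers and days; this is a minimal illustration, and Tranco itself supports several configurable combination methods:

    from collections import defaultdict

    def combine(lists, default_rank=1_000_001):
        """lists: rankings (ordered domain lists) from several providers/days.
        Averaging ranks smooths day-to-day fluctuations and makes single-list
        manipulation less effective; absent domains get a penalty rank."""
        scores = defaultdict(list)
        domains = set().union(*map(set, lists))
        for ranking in lists:
            pos = {d: i + 1 for i, d in enumerate(ranking)}
            for d in domains:
                scores[d].append(pos.get(d, default_rank))
        avg = {d: sum(r) / len(r) for d, r in scores.items()}
        return sorted(avg, key=avg.get)

    print(combine([["a.com", "b.com"], ["b.com", "c.com"]]))
    # ['b.com', 'a.com', 'c.com']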
-
Michael Schwarz (Graz University of Technology), Florian Lackner (Graz University of Technology), Daniel Gruss (Graz University of Technology)
Today, more and more web browsers and extensions provide anonymity features to hide user details. Primarily used to evade tracking by websites and advertisements, these features are also used by criminals to prevent identification. Thus, not only tracking companies but also law-enforcement agencies have an interest in finding flaws which break these anonymity features. For instance, for targeted exploitation using zero days, it is essential to have as much information about the target as possible. A failed exploitation attempt, e.g., due to a wrongly guessed operating system, can burn the zero-day, effectively costing the attacker money. Also for side-channel attacks, it is of the utmost importance to know certain aspects of the victim's hardware configuration, e.g., the instruction-set architecture. Moreover, knowledge about specific environmental properties, such as the operating system, allows crafting more plausible dialogues for phishing attacks.
In this paper, we present a fully automated approach to find subtle differences in browser engines caused by the environment. Furthermore, we present two new side-channel attacks on browser engines to detect the instruction-set architecture and the used memory allocator. Using these differences, we can deduce information about the system, both about the software as well as the hardware. As a result, we can not only ease the creation of fingerprints, but we also gain the advantage of having a more precise picture for targeted exploitation. Our approach automates the cumbersome manual search for such differences. We collect all data available to the JavaScript engine and build templates from these properties. If a property of such a template stays the same on one system but differs on a different system, we have found an environment-dependent property.
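The template-building step can be pictured as follows: keep only properties that are stable across runs on the same system, then report those whose values differ between systems. The property names in this sketch are illustrative:

    def environment_dependent(runs_by_system):
        """runs_by_system: {system: [run, ...]}, each run a dict of JS-visible
        property -> observed value. A property belongs in the template if it
        is stable across runs on one system yet differs between systems."""
        stable = {}
        for system, runs in runs_by_system.items():
            stable[system] = {k: runs[0][k] for k in runs[0]
                              if all(r.get(k) == runs[0][k] for r in runs)}
        shared = set.intersection(*(set(s) for s in stable.values()))
        return {k for k in shared
                if len({s[k] for s in stable.values()}) > 1}

    linux = [{"maxTouchPoints": 0, "hardwareConcurrency": 8}] * 3
    windows = [{"maxTouchPoints": 10, "hardwareConcurrency": 8}] * 3
    print(environment_dependent({"linux": linux, "windows": windows}))
    # {'maxTouchPoints'}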
We found environment-dependent properties in Firefox, Chrome, Edge, and mobile Tor, allowing us to reveal the underlying operating system, CPU architecture, used privacy-enhancing plugins, as well as the exact browser version. We stress that our method should be used in the development of browsers and privacy extensions to automatically find such flaws in the implementation.
-
Alexander Sjösten (Chalmers University of Technology), Steven Van Acker (Chalmers University of Technology), Pablo Picazo-Sanchez (Chalmers University of Technology), Andrei Sabelfeld (Chalmers University of Technology)
Browser extensions enable rich experience for the users of today's web. Being deployed with elevated privileges, extensions are given the power to overrule web pages. As a result, web pages often seek to detect the installed extensions, sometimes for benign adoption of their behavior but sometimes as part of privacy-violating user fingerprinting.
Researchers have studied a class of attacks that allow detecting extensions by probing for Web Accessible Resources (WARs) via URLs that include public extension IDs. Realizing the privacy risks associated with WARs, Firefox has recently moved to randomize a browser extension's ID, prompting the Chrome team to plan for following the same path. However, rather than mitigating the issue, the randomized IDs can in fact exacerbate the extension detection problem, enabling attackers to use a randomized ID as a reliable fingerprint of a user.
We study a class of extension revelation attacks, where extensions reveal themselves by injecting their code on web pages. We demonstrate how a combination of revelation and probing can uniquely identify 90% of all extensions injecting content, in spite of a randomization scheme. We perform a series of large-scale studies to estimate possible implications of both classes of attacks. As a countermeasure, we propose a browser-based mechanism that enables control over which extensions are loaded on which web pages, and present a proof-of-concept implementation which blocks both classes of attacks.
-
Hyunwoo Lee (Seoul National University), Zach Smith (University of Luxembourg), Junghwan Lim (Seoul National University), Gyeongjae Choi (Seoul National University), Selin Chun (Seoul National University), Taejoong Chung (Rochester Institute of Technology), Ted "Taekyoung" Kwon (Seoul National University)
Middleboxes (MBs) are widely deployed to enhance security and performance in networking. However, as communication over TLS becomes increasingly common, the end-to-end channel model of TLS undermines the efficacy of MBs. Existing solutions, such as 'split TLS', which intercepts TLS sessions, often introduce significant security risks by installing a custom root certificate or sharing a private key. Many studies have confirmed the vulnerabilities of combining TLS with MBs, including certificate validation failures, unwanted content modification, and the use of obsolete ciphersuites.
To address these issues, we introduce an MB-aware TLS protocol, dubbed maTLS, that allows MBs to participate in TLS sessions in a visible and accountable fashion. Every participating MB splits a session into two segments with its own security parameters, in collaboration with the two endpoints. The session nevertheless remains secure, as the maTLS protocol is designed to achieve authentication of MBs, auditing of MBs' operations, and verification of the security parameters of segments. We carry out testbed-based experiments to show that maTLS achieves these security goals with marginal overhead. We also prove the security model of maTLS using Tamarin, a security verification tool.
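The endpoint-side verification of segment parameters can be pictured with a small sketch; the field names and thresholds are illustrative, not the maTLS wire format:

    MIN_VERSION = 0x0303  # TLS 1.2
    WEAK_CIPHERS = {"TLS_RSA_WITH_RC4_128_SHA", "TLS_RSA_WITH_3DES_EDE_CBC_SHA"}

    def audit_segments(segments):
        """segments: one dict per hop between middleboxes. The client checks
        every hop's parameters instead of blindly trusting split TLS."""
        for seg in segments:
            if seg["version"] < MIN_VERSION:
                raise ValueError(f'{seg["mb"]}: obsolete TLS version')
            if seg["cipher"] in WEAK_CIPHERS:
                raise ValueError(f'{seg["mb"]}: weak ciphersuite')
        return True

    audit_segments([
        {"mb": "corp-proxy", "version": 0x0304, "cipher": "TLS_AES_128_GCM_SHA256"},
        {"mb": "cdn-edge", "version": 0x0303,
         "cipher": "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"},
    ])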
-
Maria Apostolaki (ETH Zurich), Gian Marti (ETH Zurich), Jan Müller (ETH Zurich), Laurent Vanbever (ETH Zurich)
Internet routing attacks remain practically effective today, as existing countermeasures either fail to provide protection guarantees or are not easily deployable. Blockchain systems are particularly vulnerable to such attacks as they rely on Internet-wide communications to reach consensus. In particular, Bitcoin, the most widely-used cryptocurrency, can be split in half by any AS-level adversary using BGP hijacking.
In this paper, we present SABRE, a secure and scalable Bitcoin relay network which relays blocks worldwide through a set of connections that are resilient to routing attacks. SABRE runs alongside the existing peer-to-peer network and is easily deployable. As a critical system, SABRE's design is highly resilient and can efficiently handle high bandwidth loads, including Denial-of-Service attacks.
We built SABRE around two key technical insights. First, we leverage fundamental properties of inter-domain routing (BGP) policies to host relay nodes: (i) in networks that are inherently protected against routing attacks; and (ii) on paths that are economically-preferred by the majority of Bitcoin clients. These properties are generic and can be used to protect other Blockchain-based systems. Second, we leverage the fact that relaying blocks is communication-heavy, not computation-heavy. This enables us to offload most of the relay operations to programmable network hardware (using the P4 programming language). Thanks to this hardware/software co-design, SABRE nodes operate seamlessly under high load while mitigating the effects of malicious clients.
We present a complete implementation of SABRE together with an extensive evaluation. Our results demonstrate that SABRE is effective at securing Bitcoin against routing attacks, even with deployments of as few as 6 nodes.
-
Bingsheng Zhang (Lancaster University), Roman Oliynykov (IOHK Ltd.), Hamed Balogun (Lancaster University)
A treasury system is a community-controlled and decentralized collaborative decision-making mechanism for sustainable funding of blockchain development and maintenance. During each treasury period, project proposals are submitted, discussed, and voted on; top-ranked projects are funded from the treasury. The Dash governance system is a real-world example of such a system. In this work, we provide, for the first time, a rigorous study of the treasury system. We model, design, and implement a provably secure treasury system that is compatible with most existing blockchain infrastructures, such as Bitcoin and Ethereum. More specifically, the proposed treasury system supports liquid democracy/delegative voting for better collaborative intelligence: stakeholders can either vote directly on the proposed projects or delegate their votes to experts. Its core component is a distributed, universally composable, secure end-to-end verifiable voting protocol. The integrity of the treasury voting decisions is guaranteed even when all the voting committee members are corrupted. To further improve efficiency, we propose the world's first honest-verifier zero-knowledge proof for unit vector encryption with logarithmic-size communication. This partial result may be of independent interest to other cryptographic protocols. A pilot system is implemented in Scala over the Scorex 2.0 framework, and its benchmark results indicate that the proposed system can support tens of thousands of treasury participants with high efficiency.
-
David Derler (DFINITY), Kai Samelin (TÜV Rheinland i-sec GmbH), Daniel Slamanig (AIT Austrian Institute of Technology), Christoph Striecks (AIT Austrian Institute of Technology)
Blockchain technologies have recently received a considerable amount of attention. While the initial focus was mainly on the use of blockchains in the context of cryptocurrencies such as Bitcoin, application scenarios now go far beyond this. Most blockchains have the property that once some object, e.g., a block or a transaction, has been registered for inclusion in the blockchain, it is persisted and there are no means to modify it again. While this is an essential feature of most blockchain scenarios, it is still often desirable, and at times even legally required, to allow for breaking this immutability in a controlled way.
Only recently, Ateniese et al. (EuroS&P 2017) proposed an elegant solution to this problem at the block level. The authors replace standard hash functions with so-called chameleon-hashes (Krawczyk and Rabin, NDSS 2000). While their work seems to offer a suitable solution to the problem of controlled re-writing of blockchains, their approach is too coarse-grained, in that it only offers an all-or-nothing solution. We revisit this idea and introduce the novel concept of policy-based chameleon-hashes (PBCH). PBCHs generalize the notion of chameleon-hashes by giving the party computing a hash the ability to associate access policies with the generated hashes. Anyone who possesses enough privileges to satisfy the policy can then find arbitrary collisions for a given hash. We then apply this concept to transaction-level rewriting within blockchains, and thus support fine-grained and controlled modifiability of blockchain objects.
Besides modeling PBCHs, we present a generic construction of PBCHs (using a strengthened version of chameleon-hashes with ephemeral trapdoors which we also introduce), rigorously prove its security, and instantiate it with efficient building blocks. We report first implementation results.
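The underlying primitive is easy to demonstrate. Below is a toy Krawczyk-Rabin-style chameleon hash in Python (tiny parameters for readability; PBCHs additionally gate the trapdoor behind an access policy, which this sketch omits):

    import secrets

    # Toy group: p = 2q + 1 with q prime; g generates the order-q subgroup.
    # Real deployments use groups of >= 2048 bits.
    p, q, g = 2039, 1019, 4

    x = secrets.randbelow(q - 1) + 1   # trapdoor
    h = pow(g, x, p)                   # public hash key

    def ch(m, r):
        """Chameleon hash CH(m, r) = g^m * h^r mod p."""
        return (pow(g, m % q, p) * pow(h, r, p)) % p

    def collide(m, r, m_new):
        """Trapdoor holders solve m + x*r = m_new + x*r' (mod q) for r'."""
        return (r + (m - m_new) * pow(x, -1, q)) % q

    m, r = 1234, secrets.randbelow(q)
    r_new = collide(m, r, 5678)
    assert ch(m, r) == ch(5678, r_new)   # rewritten content, unchanged hash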
-
Sourav Das (Department of Computer Science and Engineering, Indian Institute of Technology Delhi), Vinay Joseph Ribeiro (Department of Computer Science and Engineering, Indian Institute of Technology Delhi), Abhijeet Anand (Department of Computer Science and Engineering, Indian Institute of Technology Delhi)
One major shortcoming of permissionless blockchains such as Bitcoin and Ethereum is that they are unsuitable for running Computationally Intensive smart Contracts (CICs). This prevents such blockchains from running Machine Learning algorithms, Zero-Knowledge proofs, and other tasks that may need non-trivial computation.
In this paper, we present YODA, which is to the best of our knowledge the first solution for efficient computation of CICs in permissionless blockchains with guarantees for a threat model with both Byzantine and selfish nodes. YODA selects one or more execution sets (ES) via Sortition to execute a particular CIC off-chain. One key innovation is the MultI-Round Adaptive Consensus using Likelihood Estimation (MiRACLE) algorithm based on sequential hypothesis testing. MiRACLE allows the execution sets to be small thus making YODA efficient while ensuring correct CIC execution with high probability. It adapts the number of ES sets automatically depending on the concentration of Byzantine nodes in the system and is optimal in terms of the expected number of ES sets used in certain scenarios. Through a suite of economic incentives and technical mechanisms such as the novel Randomness Inserted Contract Execution (RICE) algorithm, we force selfish nodes to behave honestly. We also prove that the honest behavior of selfish nodes is an approximate Nash Equilibrium. We present the system design and details of YODA and prove the security properties of MiRACLE and RICE. Our prototype implementation built on top of Ethereum demonstrates the ability of YODA to run CICs with orders of magnitude higher gas per unit time as well as total gas requirements than Ethereum currently supports. It also demonstrates the low overheads of RICE.
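MiRACLE's sequential hypothesis testing is in the spirit of Wald's SPRT: keep sampling execution sets until the accumulated likelihood ratio crosses a decision bound. The sketch below is textbook SPRT with illustrative parameters, not the exact MiRACLE update rule:

    from math import log

    def sprt(votes, p_honest=0.9, p_byz=0.5, alpha=0.05, beta=0.05):
        """Accept the off-chain result once the log-likelihood ratio between
        'execution sets agree because the result is correct' (prob p_honest)
        and 'agreement is coincidental/corrupted' (prob p_byz) crosses a bound."""
        upper, lower = log((1 - beta) / alpha), log(beta / (1 - alpha))
        llr = 0.0
        for n, agrees in enumerate(votes, 1):
            llr += log(p_honest / p_byz) if agrees else log((1 - p_honest) / (1 - p_byz))
            if llr >= upper:
                return "accept result", n
            if llr <= lower:
                return "reject, sample more sets", n
        return "undecided", len(votes)

    print(sprt([True] * 8))  # ('accept result', 6): few sets suffice when nodes agree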
-
Gabriel Kaptchuk (Johns Hopkins University), Matthew Green (Johns Hopkins University), Ian Miers (Cornell Tech)
In this work we investigate the problem of achieving secure computation by combining stateless trusted devices with public ledgers. We consider a hybrid paradigm in which a client-side device (such as a co-processor or trusted enclave) performs secure computation, while interacting with a public ledger via a possibly malicious host computer. We explore both the constructive and potentially destructive implications of such systems. We first show that this combination allows for the construction of stateful interactive functionalities (including general computation) even when the device has no persistent storage; this allows us to build sophisticated applications using inexpensive trusted hardware or even pure cryptographic obfuscation techniques. We further show how to use this paradigm to achieve censorship-resistant communication with a network, even when network communications are mediated by a potentially malicious host. Finally we describe a number of practical applications that can be achieved today. These include the synchronization of private smart contracts; rate limited mandatory logging; strong encrypted backups from weak passwords; enforcing fairness in multi-party computation; and destructive applications such as autonomous ransomware, which allows for payments without an online party.
-
Eihal Alowaisheq (Indiana University, King Saud University), Peng Wang (Indiana University), Sumayah Alrwais (King Saud University), Xiaojing Liao (Indiana University), XiaoFeng Wang (Indiana University), Tasneem Alowaisheq (Indiana University, King Saud University), Xianghang Mi (Indiana University), Siyuan Tang (Indiana University), Baojun Liu (Tsinghua University)
Take-down operations aim to disrupt cybercrime involving malicious domains. In the past decade, many successful take-down operations have been reported, including those against the Conficker worm, and most recently, against VPNFilter. Although it plays an important role in fighting cybercrime, the domain take-down procedure is still surprisingly opaque. There seems to be no in-depth understanding about how the take-down operation works and whether there is due diligence to ensure its security and reliability.
In this paper, we report the first systematic study on domain take-down. Our study was made possible by a large collection of data, including various sinkhole feeds and blacklists, passive DNS data spanning six years, and historical WHOIS information. Over these datasets, we built a unique methodology that extensively used various reverse lookups and other data analysis techniques to address the challenges in identifying taken-down domains, sinkhole operators, and take-down durations. Applying the methodology to the data, we discovered over 620K taken-down domains and conducted a longitudinal analysis of the take-down process, thus facilitating a better understanding of the operation and its weaknesses. We found that more than 14% of the domains taken down over the past ten months have been released back to the domain market, and that some of the released domains have been repurchased by malicious actors before being captured and seized again, either by the same or by different sinkholes. In addition, we showed that the misconfiguration of DNS records corresponding to sinkholed domains allowed us to hijack a domain that was seized by the FBI. Further, we found that expired sinkholes have caused the transfer of around 30K taken-down domains whose traffic is now under the control of new owners.
-
Orcun Cetin (Delft University of Technology), Carlos Gañán (Delft University of Technology), Lisette Altena (Delft University of Technology), Takahiro Kasama (National Institute of Information and Communications Technology), Daisuke Inoue (National Institute of Information and Communications Technology), Kazuki Tamiya (Yokohama National University), Ying Tie (Yokohama National University), Katsunari Yoshioka (Yokohama National University), Michel van Eeten (Delft University of Technology)
With the rise of IoT botnets, the remediation of infected devices has become a critical task. As over 87% of these devices reside in broadband networks, this task will fall primarily to consumers and the Internet Service Providers. We present the first empirical study of IoT malware cleanup in the wild -- more specifically, of removing Mirai infections in the network of a medium-sized ISP. To measure remediation rates, we combine data from an observational study and a randomized controlled trial involving 220 consumers who suffered a Mirai infection together with data from honeypots and darknets. We find that quarantining and notifying infected customers via a walled garden, a best practice from ISP botnet mitigation for conventional malware, remediates 92% of the infections within 14 days. Email-only notifications have no observable impact compared to a control group where no notifications were sent. We also measure surprisingly high natural remediation rates of 58-74% for this control group and for two reference networks where users were also not notified. Even more surprising, reinfection rates are low. Only 5% of the customers who remediated suffered another infection in the five months after our first study. This stands in contrast to our lab tests, which observed reinfection of real IoT devices within minutes -- a discrepancy for which we explore various different possible explanations, but find no satisfactory answer. We gather data on customer experiences and actions via 76 phone interviews and the communications logs of the ISP. Remediation succeeds even though many users are operating from the wrong mental model -- e.g., they run anti-virus software on their PC to solve the infection of an IoT device. While quarantining infected devices is clearly highly effective, future work will have to resolve several remaining mysteries. Furthermore, it will be hard to scale up the walled garden solution because of the weak incentives of the ISPs.
-
Stephen Herwig (University of Maryland), Katura Harvey (University of Maryland, Max Planck Institute for Software Systems (MPI-SWS)), George Hughey (University of Maryland), Richard Roberts (University of Maryland, Max Planck Institute for Software Systems (MPI-SWS)), Dave Levin (University of Maryland)
The Internet of Things (IoT) introduces an unprecedented diversity and ubiquity to networked computing. It also introduces new attack surfaces that are a boon to attackers. The recent Mirai botnet showed the potential and power of a collection of compromised IoT devices. A new botnet, known as Hajime, targets many of the same devices as Mirai, but differs considerably in its design and operation. Hajime uses a public peer-to-peer system as its command and control infrastructure, and regularly introduces new exploits, thereby increasing its resilience.
We show that Hajime’s distributed design makes it a valuable tool for better understanding IoT botnets. For instance, Hajime cleanly separates its bots into different peer groups depending on their underlying hardware architecture. Through detailed measurement—active scanning of Hajime’s peer-to-peer infrastructure and passive, longitudinal collection of root DNS backscatter traffic—we show that Hajime can be used as a lens into how IoT botnets operate, what kinds of devices they compromise, and what countries are more (or less) susceptible. Our results show that there are more compromised IoT devices than previously reported; that these devices use an assortment of CPU architectures, the popularity of which varies widely by country; that churn is high among IoT devices; and that new exploits can quickly and drastically increase the size and power of IoT botnets. Our code and data are available to assist future efforts to measure and mitigate the growing threat of IoT botnets.
-
Suphannee Sivakorn (Columbia University), Kangkook Jee (NEC Labs America), Yixin Sun (Princeton University), Lauri Korts-Pärn (Cyber Defense Institute), Zhichun Li (NEC Labs America), Cristian Lumezanu (NEC Labs America), Zhenyu Wu (NEC Labs America), Lu-An Tang (NEC Labs America), Ding Li (NEC Labs America)
Modern malware and cyber attacks depend heavily on DNS services to make their campaigns reliable and difficult to track. Monitoring network DNS activities and blocking suspicious domains have proven to be effective techniques in countering such attacks. However, recent successful campaigns reveal that attackers adapt by using seemingly benign domains and public web storage services to hide malicious activity. Moreover, the recent support for encrypted DNS queries provides attackers with easier means to hide malicious traffic from network-based DNS monitoring.
We propose PDNS, an end-point DNS monitoring system based on DNS sensors deployed at each host in a network, along with a centralized backend analysis server. To detect such attacks, PDNS expands the monitored DNS activity context and examines the process context that triggered that activity. Specifically, each deployed PDNS sensor matches the domain name and IP address related to a DNS query with the process ID, binary signature, loaded DLLs, and code-signing information of the program that initiated it. We evaluate PDNS on a DNS activity dataset collected from 126 enterprise hosts and with data from multiple malware sources. Using machine learning classifiers, including a deep neural network, PDNS outperforms most previous works with high detection accuracy: a true positive rate of 98.55% and a low false positive rate of 0.03%.
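A PDNS-style sensor record can be pictured as a join between a DNS event and process context. The sketch below uses the third-party psutil package and invented field names; the real sensor also records loaded DLLs and code-signing information:

    import hashlib
    import psutil  # pip install psutil

    def annotate(dns_event, pid):
        """Join a DNS query record with the context of the process that
        issued it, in the spirit of a PDNS sensor."""
        proc = psutil.Process(pid)
        with open(proc.exe(), "rb") as f:
            binary_sha256 = hashlib.sha256(f.read()).hexdigest()
        return {
            "domain": dns_event["domain"],
            "resolved_ip": dns_event["ip"],
            "process": proc.name(),
            "binary_sha256": binary_sha256,  # feature for the backend classifier
        }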
-
Jack Wampler (University of Colorado Boulder), Ian Martiny (University of Colorado Boulder), Eric Wustrow (University of Colorado Boulder)
Recently, the Spectre and Meltdown attacks revealed serious vulnerabilities in modern CPU designs, allowing an attacker to exfiltrate data from sensitive programs. These vulnerabilities take advantage of speculative execution to coerce a processor to perform computation that would otherwise not occur, leaking the resulting information via side channels to an attacker.
In this paper, we extend these ideas in a different direction, and leverage speculative execution in order to hide malware from both static and dynamic analysis. Using this technique, critical portions of a malicious program's computation can be shielded from view, such that even a debugger following an instruction-level trace of the program cannot tell how its results were computed.
We introduce ExSpectre, which compiles arbitrary malicious code into a seemingly-benign payload binary. When a separate trigger program runs on the same machine, it mistrains the CPU's branch predictor, causing the payload program to speculatively execute its malicious payload, which communicates speculative results back to the rest of the payload program to change its real-world behavior.
We study the extent and types of execution that can be performed speculatively, and demonstrate several computations that can be performed covertly. In particular, within speculative execution we are able to decrypt memory using AES-NI instructions at over 11 kbps. Building on this, we decrypt and interpret a custom virtual machine language to perform arbitrary computation and system calls in the real world. We demonstrate this with a proof-of-concept dial back shell, which takes only a few milliseconds to execute after the trigger is issued. We also show how our corresponding trigger program can be a pre-existing benign application already running on the system, and demonstrate this concept with OpenSSL driven remotely by the attacker as a trigger program.
ExSpectre demonstrates a new kind of malware that evades existing reverse engineering and binary analysis techniques. Because its true functionality is contained in seemingly unreachable dead code, and its control flow driven externally by potentially any other program running at the same time, ExSpectre poses a novel threat to state-of-the-art malware analysis techniques.
-
Ahmed Salem (CISPA Helmholtz Center for Information Security), Yang Zhang (CISPA Helmholtz Center for Information Security), Mathias Humbert (Swiss Data Science Center, ETH Zurich/EPFL), Pascal Berrang (CISPA Helmholtz Center for Information Security), Mario Fritz (CISPA Helmholtz Center for Information Security), Michael Backes (CISPA Helmholtz Center for Information Security)
Machine learning (ML) has become a core component of many real-world applications, and training data is a key factor that drives current progress. This huge success has led Internet companies to deploy machine learning as a service (MLaaS). Recently, the first membership inference attack showed that extraction of information about the training set is possible in such MLaaS settings, which has severe security and privacy implications.
However, the early demonstrations of the feasibility of such attacks relied on many assumptions about the adversary, such as using multiple so-called shadow models, knowledge of the target model structure, and having a dataset from the same distribution as the target model's training data. We relax all these key assumptions, thereby showing that such attacks are very broadly applicable at low cost and consequently pose a more severe risk than previously thought. We present the most comprehensive study so far on this emerging and developing threat, using eight diverse datasets that show the viability of the proposed attacks across domains.
In addition, we propose the first effective defense mechanisms against this broader class of membership inference attacks, mechanisms that maintain a high level of utility of the ML model.
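One of the cheapest relaxations can be illustrated with a data-independent confidence threshold: models tend to be more confident on their own training members. A minimal sketch follows (the threshold value is illustrative, and the paper's stronger attacks use learned attack models):

    import numpy as np

    def membership_guess(posteriors, threshold=0.95):
        """Guess 'member' when the model's top posterior is unusually high:
        models are typically more confident on their own training data."""
        return np.asarray(posteriors).max(axis=1) >= threshold

    # toy posteriors returned by an MLaaS classifier for two query points
    print(membership_guess([[0.98, 0.01, 0.01],     # likely a training member
                            [0.40, 0.35, 0.25]]))   # likely a non-member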
-
Inken Hagestedt (CISPA Helmholtz Center for Information Security), Yang Zhang (CISPA Helmholtz Center for Information Security), Mathias Humbert (Swiss Data Science Center, ETH Zurich/EPFL), Pascal Berrang (CISPA Helmholtz Center for Information Security), Haixu Tang (Indiana University Bloomington), XiaoFeng Wang (Indiana University Bloomington), Michael Backes (CISPA Helmholtz Center for Information Security)
The advancement of molecular profiling techniques fuels biomedical research with a deluge of data. To facilitate data sharing, the Global Alliance for Genomics and Health established the Beacon system, a search engine designed to help researchers find datasets of interest. While the current Beacon system only supports genomic data, other types of biomedical data, such as DNA methylation, are also essential for advancing our understanding in the field. In this paper, we propose the first Beacon system for DNA methylation data sharing: MBeacon. As the current genomic Beacon is vulnerable to privacy attacks, such as membership inference, and DNA methylation data is highly sensitive, we take a privacy-by-design approach to construct MBeacon. First, we demonstrate the privacy threat, by proposing a membership inference attack tailored specifically to unprotected methylation Beacons. Our experimental results show that 100 queries are sufficient to achieve a successful attack with AUC (area under the ROC curve) above 0.9. To remedy this situation, we propose a novel differential privacy mechanism, namely SVT^2, which is the core component of MBeacon. Extensive experiments over multiple datasets show that SVT^2 can successfully mitigate membership privacy risks without significantly harming utility. We further implement a fully functional prototype of MBeacon which we make available to the research community.
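For intuition, the classic Sparse Vector Technique that SVT^2 builds on answers threshold queries under a differential-privacy budget; the sketch below is the textbook mechanism, not the paper's tailored SVT^2 variant:

    import numpy as np

    def sparse_vector(queries, threshold, epsilon, c=1):
        """Answer 'above threshold?' for a stream of queries, halting after c
        positive answers, under an epsilon differential-privacy budget."""
        rho = np.random.laplace(scale=2 * c / epsilon)     # noise on the threshold
        answers, hits = [], 0
        for q in queries:
            nu = np.random.laplace(scale=4 * c / epsilon)  # noise per query
            if q + nu >= threshold + rho:
                answers.append(True)
                hits += 1
                if hits >= c:
                    break
            else:
                answers.append(False)
        return answers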
-
Shasha Li (University of California Riverside), Ajaya Neupane (University of California Riverside), Sujoy Paul (University of California Riverside), Chengyu Song (University of California Riverside), Srikanth V. Krishnamurthy (University of California Riverside), Amit K. Roy Chowdhury (University of California Riverside), Ananthram Swami (United States Army Research Laboratory)
Recent research has demonstrated the brittleness of machine learning systems to adversarial perturbations. However, these studies have been mostly limited to perturbations on images and, more generally, to classification tasks that do not deal with real-time stream inputs. In this paper we ask: "Are adversarial perturbations that cause misclassification in real-time video classification systems possible, and if so, what properties must they satisfy?" Real-time video classification systems are used in surveillance, smart vehicles, and smart elderly care, and thus misclassification could be particularly harmful (e.g., a mishap at an elderly care facility may be missed). Video classification systems take video clips as inputs, and these clip boundaries are not deterministic. We show that perturbations that do not take the indeterminism of the clip boundaries into account do not achieve high attack success rates. We propose novel approaches for generating 3D adversarial perturbations (perturbation clips) that exploit recent advances in generative models to not only overcome this key challenge but also provide stealth. In particular, our most potent 3D adversarial perturbations cause targeted activities in video streams to be misclassified with rates over 80%. At the same time, they also ensure that the perturbations leave other (untargeted) activities largely unaffected, making them extremely stealthy. Finally, we also derive a single-frame (2D) perturbation that can be applied to every frame in a video stream, and which in many cases achieves extremely high misclassification rates.
-
Shiqing Ma (Purdue University), Yingqi Liu (Purdue University), Guanhong Tao (Purdue University), Wen-Chuan Lee (Purdue University), Xiangyu Zhang (Purdue University)
Deep Neural Networks (DNN) are vulnerable to adversarial samples that are generated by perturbing correctly classified inputs to cause DNN models to misbehave (e.g., misclassification). This can potentially lead to disastrous consequences, especially in security-sensitive applications. Existing defense and detection techniques work well for specific attacks under various assumptions (e.g., the set of possible attacks is known beforehand). However, they are not sufficiently general to protect against a broader range of attacks. In this paper, we analyze the internals of DNN models under various attacks and identify two common exploitation channels: the provenance channel and the activation value distribution channel. We then propose a novel technique to extract DNN invariants and use them to perform runtime adversarial sample detection. Our experimental results with 11 different kinds of attacks on popular datasets, including ImageNet, and 13 models show that our technique can effectively detect all these attacks (over 90% accuracy) with limited false positives. We also compare it with three state-of-the-art techniques, including the Local Intrinsic Dimensionality (LID) based method, denoiser-based methods (i.e., MagNet and HGD), and the prediction-inconsistency-based approach (i.e., feature squeezing). Our experiments show promising results.
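A drastically simplified version of a value invariant on one layer's activations: fit the benign activation distribution, then flag inputs whose activations fall far outside it. This is only a caricature of the per-layer models the paper derives:

    import numpy as np

    def fit_invariant(benign_activations):
        """Record mean and spread of each neuron's activation on benign data."""
        acts = np.asarray(benign_activations)
        return acts.mean(axis=0), acts.std(axis=0) + 1e-8

    def violates(invariant, activation, z=6.0):
        """Flag an input whose activations stray far outside the benign envelope."""
        mu, sigma = invariant
        return bool(np.any(np.abs((activation - mu) / sigma) > z))

    benign = np.random.normal(0, 1, size=(10000, 512))   # one layer's activations
    inv = fit_invariant(benign)
    adversarial = np.zeros(512)
    adversarial[7] = 42.0                                # out-of-distribution spike
    print(violates(inv, benign[0]), violates(inv, adversarial))  # False True
-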
Jinfeng Li (Zhejiang University), Shouling Ji (Zhejiang University), Tianyu Du (Zhejiang University), Bo Li (University of California, Berkeley), Ting Wang (Lehigh University)
Deep Learning-based Text Understanding (DLTU) is the backbone technique behind various applications, including question answering, machine translation, and text classification. Despite its tremendous popularity, the security vulnerabilities of DLTU are still largely unknown, which is highly concerning given its increasing use in security-sensitive applications such as user sentiment analysis and toxic content detection. In this paper, we show that DLTU is inherently vulnerable to adversarial text attacks, in which maliciously crafted text triggers target DLTU systems and services to misbehave. Specifically, we present TextBugger, a general attack framework for generating adversarial text. TextBugger differs from prior work in significant ways: (i) effective -- it outperforms state-of-the-art attacks in terms of attack success rate; (ii) evasive -- it preserves the utility of benign text, with 94.9% of the adversarial text correctly recognized by human readers; and (iii) efficient -- it generates adversarial text with computational complexity sub-linear in the text length. We empirically evaluate TextBugger on a set of real-world DLTU systems and services used for sentiment analysis and toxic content detection, demonstrating its effectiveness, evasiveness, and efficiency. For instance, TextBugger achieves a 100% success rate on the IMDB dataset against Amazon AWS Comprehend within 4.61 seconds while preserving 97% semantic similarity. We further discuss possible defense mechanisms to mitigate such attacks and the adversary's potential countermeasures, which together point to promising directions for further research.
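The bug-generation-plus-greedy-search core of such an attack fits in a short sketch. The bug types mirror the paper's character-level operations (insert, delete, swap, visually similar substitution); the confidence callable is a stand-in for any target classifier:

    def candidate_bugs(word):
        """Character-level bug types: insert a space, delete a character,
        swap adjacent characters, substitute a visually similar character."""
        homoglyphs = {"o": "0", "i": "1", "a": "@", "e": "3"}
        cands = {word[:2] + " " + word[2:]}
        if len(word) > 3:
            cands.add(word[:1] + word[2:])
            cands.add(word[0] + word[2] + word[1] + word[3:])
        for ch, sub in homoglyphs.items():
            if ch in word:
                cands.add(word.replace(ch, sub, 1))
        return cands - {word}

    def attack(sentence, confidence):
        """Greedily apply the bug that most reduces the classifier's confidence."""
        words = sentence.split()
        best = sentence
        for i, w in enumerate(words):
            for bug in candidate_bugs(w):
                trial = " ".join(words[:i] + [bug] + words[i + 1:])
                if confidence(trial) < confidence(best):
                    best = trial
        return best

    def toy_confidence(text):       # stand-in for a real model's toxicity score
        return 1.0 if "idiot" in text else 0.2

    print(attack("you are an idiot", toy_confidence))  # e.g. 'you are an 1diot'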
-
Luis Vargas (University of Florida), Logan Blue (University of Florida), Vanessa Frost (University of Florida), Christopher Patton (University of Florida), Nolen Scaife (University of Florida), Kevin R.B. Butler (University of Florida), Patrick Traynor (University of Florida)
Modern hospital systems are complex environments that rely on high interconnectivity with the larger Internet. With this connectivity comes a vast attack surface. Security researchers have expended considerable effort to characterize the risks posed to medical devices (e.g., pacemakers and insulin pumps). However, there has been no systematic, ecosystem-wide analysis of a modern hospital system to date, perhaps due to the challenges of collecting and analyzing sensitive healthcare data. Hospital traffic requires special considerations because healthcare data may contain private information or may come from safety-critical devices in charge of patient care. We describe the process of obtaining the network data in a safe and ethical manner in order to help expand future research in this field. We present an analysis of the network-enabled devices the hospital uses for its daily operations, conducted without posing any harm to the hospital's environment. We perform a Digital Healthcare-Associated Infection (D-HAI) analysis of the hospital ecosystem, assessing a major multi-campus healthcare system over a period of six months. As part of the D-HAI analysis, we characterize DNS requests and TLS/SSL communications to better understand the threats faced within the hospital environment without disturbing the operational network. Contrary to past assumptions, we find that medical devices have minimal exposure to the external Internet, but that medical support devices (e.g., servers, computer terminals) essential for daily hospital operations are much more exposed. While much of this communication appears to be benign, we discover evidence of insecure and broken cryptography, misconfigured devices, and potential botnet activity. Analyzing the network ecosystem in which they operate gives us insight into the weaknesses and misconfigurations hospitals need to address to ensure the safety and privacy of patients.
-
Platon Kotzias (IMDEA Software Institute, Universidad Politécnica de Madrid), Leyla Bilge (Symantec Research Labs), Pierre-Antoine Vervier (Symantec Research Labs), Juan Caballero (IMDEA Software Institute)
Enterprises own a significant fraction of the hosts connected to the Internet and possess valuable assets, such as financial data and intellectual property, which may be targeted by attackers.
They suffer attacks that exploit unpatched hosts and install malware, resulting in breaches that may cost millions in damages. Despite the scale of this phenomenon, the threat and vulnerability landscape of enterprises remains under-studied. The security posture of enterprises remains unclear, and it's unknown whether enterprises are indeed more secure than consumer hosts.
To address these questions, we perform the largest and longest enterprise security study up to date. Our data covers nearly 3 years and is collected from 28K enterprises, belonging to 67 industries, which own 82M hosts and 73M public-facing servers.Our measurements comprise of two parts: an analysis of the threat landscape and an analysis of the enterprise vulnerability patching behavior.
The threat landscape analysis first classifies low reputation files observed in enterprise hosts into families. Then, it measures, among others, that 91%--97% of the enterprises, 13%--41% of the enterprise hosts, encountered at least one malware or PUP file over the length of our study;
that enterprises encounter malware much more often than PUP; and that some industries like banks and consumer finances are doing notoriously better, achieving significantly lower malware and PUP encounter rates than the most-affected industries.
The vulnerability analysis examines the patching of 12 client-side and 112 server-side applications in enterprise hosts and servers. It measures, among other things, that it takes over 6 months on average to patch 90% of the population across all vulnerabilities in the 12 client-side applications; that enterprise computers are faster to patch vulnerabilities than consumer hosts; and that the patching of server applications is much worse than the patching of client-side applications.
-
Wajih Ul Hassan (NEC Laboratories America, Inc.; University of Illinois at Urbana–Champaign), Shengjian Guo (Virginia Tech), Ding Li (NEC Laboratories America, Inc.), Zhengzhang Chen (NEC Laboratories America, Inc.), Kangkook Jee (NEC Laboratories America, Inc.), Zhichun Li (NEC Laboratories America, Inc.), Adam Bates (University of Illinois at Urbana–Champaign)
Large enterprises are increasingly relying on threat detection software (e.g., Intrusion Detection Systems) to spot suspicious activities. This software generates alerts that must be investigated by cyber analysts to determine whether they represent true attacks. Unfortunately, in practice, there are more alerts than cyber analysts can properly investigate. This leads to a “threat alert fatigue” or information overload problem, where cyber analysts miss true attack alerts in the noise of false alarms.
In this paper, we present NoDoze to combat this challenge using contextual and historical information about generated threat alerts in an enterprise. NoDoze first generates a causal dependency graph of an alert event. Then, it assigns an anomaly score to each event in the dependency graph based on the frequency with which related events have happened before in the enterprise. NoDoze then propagates those scores along the edges of the graph using a novel network diffusion algorithm and generates a subgraph with an aggregate anomaly score, which is used to triage alerts. Evaluation on our dataset of 364 threat alerts shows that NoDoze decreases the volume of false alarms by 86%, saving more than 90 hours of analysts’ time that would have been required to investigate those false alarms. Furthermore, the dependency graphs NoDoze generates for true alerts are two orders of magnitude smaller than those generated by traditional tools, without sacrificing the vital information needed for the investigation. Our system has a low average runtime overhead and can be deployed with any threat detection software.
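The scoring-and-propagation step lends itself to a compact illustration. The sketch below is our simplified reconstruction, not the authors' code: it assumes the dependency graph is a networkx DAG of events and that freq maps each event to its historical frequency; the paper's actual diffusion algorithm is more involved.

```python
import networkx as nx  # assumption: the causal dependency graph is a DAG

def triage_score(graph: nx.DiGraph, freq: dict, decay: float = 0.7) -> float:
    # Rare events (low historical frequency) receive high anomaly scores.
    scores = {n: 1.0 / (1.0 + freq.get(n, 0)) for n in graph.nodes}
    # Diffuse scores along dependency edges so that anomalous ancestors
    # raise the suspicion of the events they caused.
    for node in nx.topological_sort(graph):
        for pred in graph.predecessors(node):
            propagated = decay * scores[pred] + (1 - decay) * scores[node]
            scores[node] = max(scores[node], propagated)
    # Aggregate score of the subgraph, used to rank (triage) the alert.
    return sum(scores.values()) / max(graph.number_of_nodes(), 1)
```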
-
Sergey Frolov (University of Colorado Boulder), Eric Wustrow (University of Colorado Boulder)
TLS, the Transport Layer Security protocol, has quickly become the most popular protocol on the Internet, already used to load over 70% of web pages in Mozilla Firefox. Due to its ubiquity, TLS is also a popular protocol for censorship circumvention tools, including Tor and Signal, among others.
However, the wide range of features supported in TLS makes it possible to distinguish implementations from one another by what set of cipher suites, elliptic curves, signature algorithms, and other extensions they support. Already, censors have used deep packet inspection (DPI) to identify and block popular circumvention tools based on the fingerprint of their TLS implementation.
In response, many circumvention tools have attempted to mimic popular TLS implementations such as browsers, but this technique has several challenges. First, it is burdensome to keep up with the rapidly changing browser TLS implementations and to know which fingerprints would be good candidates to mimic. Second, TLS implementations can be difficult to mimic correctly, as they offer many features that may not be supported by the relatively lightweight libraries used in typical circumvention tools. Finally, dependency changes and updates to the underlying libraries can silently change an application’s TLS fingerprint, making the fingerprint difficult for tools to control.
In this paper, we collect and analyze real-world TLS traffic from over 11.8 billion TLS connections over 9 months to identify a wide range of TLS client implementations actually used on the Internet. We use our data to analyze the TLS implementations of several popular censorship circumvention tools, including Lantern, Psiphon, Signal, Outline, TapDance, and Tor (Snowflake and meek). We find that many of these tools use TLS configurations that are easily distinguishable from the real-world traffic they attempt to mimic, even when these tools have put effort into parroting popular TLS implementations.
To address this problem, we have developed a library, uTLS, that enables tool maintainers to automatically mimic other popular TLS implementations. Using our real-world traffic dataset, we identify many popular TLS implementations that uTLS can correctly mimic, and we describe ways our tool can more flexibly adapt to the dynamic TLS ecosystem with minimal manual effort.
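To see why such fingerprinting works, consider a JA3-style hash over ClientHello parameters (our illustration; the paper's fingerprints cover the full extension contents, not just the identifiers):

```python
import hashlib

def client_hello_fingerprint(version, ciphers, extensions, curves, point_fmts):
    # Any difference in the advertised feature lists changes the hash,
    # which is exactly what a DPI-based censor keys on.
    parts = [str(version)] + ["-".join(map(str, xs))
                              for xs in (ciphers, extensions, curves, point_fmts)]
    return hashlib.md5(",".join(parts).encode()).hexdigest()

# A circumvention tool stands out whenever its fingerprint is rare in (or
# absent from) the distribution of fingerprints seen in real-world traffic;
# a uTLS-style library makes the tool emit a popular fingerprint instead.
```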
-
Katharina Kohls (Ruhr-University Bochum), Kai Jansen (Ruhr-University Bochum), David Rupprecht (Ruhr-University Bochum), Thorsten Holz (Ruhr-University Bochum), Christina Pöpper (New York University Abu Dhabi)
Traffic-analysis attacks are a persisting threat for Tor users. When censors or law enforcement agencies try to identify users, they conduct traffic-confirmation attacks and monitor encrypted transmissions to extract metadata; in combination with routing attacks, these attacks become sufficiently powerful to de-anonymize users. While traffic-analysis attacks are hard to detect and expensive to counter in practice, geographical avoidance provides an option to reject circuits that might be routed through an untrusted area. Unfortunately, recently proposed solutions introduce severe security issues through imprudent design decisions.
In this paper, we approach geographical avoidance starting from a thorough assessment of its challenges. These challenges serve as the foundation for the design of an empirical avoidance concept that considers actual transmission characteristics for justified decisions. Furthermore, we address the problem of untrusted or opaque ground-truth information that hinders a reliable assessment of circuits. Taking these features into account, we conduct an empirical simulation study and compare the performance of our novel avoidance concept with existing approaches. Our results show that our concept rejects 22% fewer circuits than existing systems, which reduces the collateral damage of overly restrictive avoidance decisions. In a second evaluation step, we extend our initial system concept and implement the prototype MultilateraTor. This prototype is the first to satisfy the requirements of a practical deployment, as it maintains Tor’s original level of security, provides reasonable performance, and overcomes the fundamental security flaws of existing systems.
Tuesday, 26 February
-
Dokyung Song (University of California, Irvine), Felicitas Hetzelt (Technical University of Berlin), Dipanjan Das (University of California, Santa Barbara), Chad Spensky (University of California, Santa Barbara), Yeoul Na (University of California, Irvine), Stijn Volckaert (University of California, Irvine and KU Leuven), Giovanni Vigna (University of California, Santa Barbara), Christopher Kruegel (University of California, Santa Barbara),…
The OS kernel is an attractive target for remote attackers. If compromised, the kernel gives adversaries full system access, including the ability to install rootkits, extract sensitive information, and perform other malicious actions, all while evading detection. Most of the kernel's attack surface is situated along the system call boundary. Ongoing kernel protection efforts have focused primarily on securing this boundary; several capable analysis and fuzzing frameworks have been developed for this purpose.
However, there are additional paths to kernel compromise that do not involve system calls, as demonstrated by several recent exploits. For example, by compromising the firmware of a peripheral device such as a Wi-Fi chipset and subsequently sending malicious inputs from the Wi-Fi chipset to the Wi-Fi driver, adversaries have been able to gain control over the kernel without invoking a single system call. Unfortunately, there are currently no practical probing and fuzzing frameworks that can help developers find and fix such vulnerabilities occurring along the hardware-OS boundary.
We present PeriScope, a Linux kernel-based probing framework that enables fine-grained analysis of device-driver interactions. PeriScope hooks into the kernel's page-fault handling mechanism to either passively monitor and log traffic between device drivers and their corresponding hardware, or mutate the data stream on the fly using a fuzzing component, PeriFuzz, thus mimicking an active adversarial attack. PeriFuzz accurately models the capabilities of an attacker on peripheral devices, exposing different classes of bugs including, but not limited to, memory corruption bugs and double-fetch bugs. To demonstrate the risk that peripheral devices pose, as well as the value of our framework, we evaluated PeriFuzz on the Wi-Fi drivers of two popular chipset vendors, discovering 15 unique vulnerabilities, 9 of which were previously unknown.
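A device-side attacker can be approximated in a few lines. The sketch below is our own illustration of the fuzzing mode, under the assumption that driver reads of device-mapped memory are interposed; it is not the PeriScope implementation:

```python
import random

class FuzzedDeviceRegion:
    """Interpose on driver reads of device-owned memory and mutate the
    fetched bytes, so two reads of the same offset can disagree; this is
    the adversarial-device behavior that exposes double-fetch bugs."""

    def __init__(self, backing: bytes, mutate_prob: float = 0.05):
        self.backing = bytearray(backing)
        self.mutate_prob = mutate_prob

    def read(self, offset: int, length: int) -> bytes:
        data = bytearray(self.backing[offset:offset + length])
        for i in range(len(data)):
            if random.random() < self.mutate_prob:
                data[i] = random.randrange(256)  # adversarial response
        return bytes(data)
```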
-
Cornelius Aschermann (Ruhr-Universität Bochum), Sergej Schumilo (Ruhr-Universität Bochum), Tim Blazytko (Ruhr-Universität Bochum), Robert Gawlik (Ruhr-Universität Bochum), Thorsten Holz (Ruhr-Universität Bochum)
Automated software testing based on fuzzing has experienced a revival in recent years. Feedback-driven fuzzing in particular has become well known for its ability to efficiently perform randomized testing with limited input corpora.
Despite a lot of progress, two common roadblocks remain: magic numbers and (nested) checksums. Computationally expensive methods such as taint tracking and symbolic execution are typically used to overcome them. Unfortunately, such methods often require access to source code, a rather precise description of the environment (e.g., behavior of library calls or the underlying OS), or the exact semantics of the platform's instruction set. In this paper, we introduce a lightweight, yet very effective alternative to taint tracking and symbolic execution to facilitate and optimize state-of-the-art feedback fuzzing that easily scales to large binary applications and unknown environments.
We observe that during the execution of a given program, parts of the input often end up directly (i.e., nearly unmodified) in the program state. This input-to-state correspondence can be exploited to create a robust method to overcome common fuzzing roadblocks in a highly effective and efficient manner.
Our prototype implementation, called REDQUEEN, is able to solve magic bytes and (nested) checksum tests automatically for a given binary executable. Additionally, we show that our techniques outperform various state-of-the-art tools on a wide variety of targets across different privilege levels (kernel-space and userland) with no platform-specific code. REDQUEEN is the first method to find more than 100% of the bugs planted in LAVA-M across all targets. Furthermore, we were able to discover 65 new bugs and obtained 16 CVEs in multiple programs and OS kernel drivers. Finally, our evaluation demonstrates that REDQUEEN is fast, widely applicable and outperforms concurrent approaches by up to three orders of magnitude.
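The essence of input-to-state correspondence fits in a short sketch (ours, not the REDQUEEN code): given the operand pairs observed at comparison instructions during one execution, splice the expected operand into the input wherever the other operand appears verbatim.

```python
def i2s_mutations(test_input: bytes, observed_compares) -> list:
    candidates = []
    for lhs, rhs in observed_compares:   # operand pairs seen at cmp sites
        pos = test_input.find(lhs)
        if pos != -1:                    # input bytes reached the cmp unmodified
            candidates.append(test_input[:pos] + rhs +
                              test_input[pos + len(lhs):])
    return candidates

# Example: running on b"AAAAAAAA" and observing cmp(b"AAAA", b"MAGK")
# yields the candidate b"MAGKAAAA", which passes the magic-number check.
```
-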
Cornelius Aschermann (Ruhr-Universität Bochum), Tommaso Frassetto (Technische Universität Darmstadt), Thorsten Holz (Ruhr-Universität Bochum), Patrick Jauernig (Technische Universität Darmstadt), Ahmad-Reza Sadeghi (Technische Universität Darmstadt), Daniel Teuchert (Ruhr-Universität Bochum)
Fuzzing is a well-known method for efficiently identifying bugs in programs.
Unfortunately, when fuzzing targets that require highly structured inputs, such as interpreters, many fuzzing methods struggle to pass the syntax checks. More specifically, interpreters often process inputs in multiple stages: first syntactic, then semantic correctness is checked. Only if both checks pass does the interpreted code get executed. This prevents fuzzers from executing “deeper” (and hence potentially more interesting) code. Two valid inputs that exercise different features of the target typically differ by too many mutations for simple mutation-based fuzzers to bridge: small changes like bit flips usually only lead to the execution of error paths in the parsing engine.
So-called grammar fuzzers are able to pass the syntax checks by using context-free grammars. Separately, coverage feedback can significantly increase the efficiency of fuzzing engines, and it is commonly used in state-of-the-art mutational fuzzers that do not use grammars. Yet, grammar fuzzers do not make use of code coverage, i.e., they do not know whether an input triggers new functionality or not. In this paper, we propose NAUTILUS, a method to efficiently fuzz programs that require highly structured inputs by combining the use of grammars with the use of code coverage feedback.
This allows us to recombine aspects of interesting inputs that were learned individually, and to dramatically increase the probability that any generated input will be accepted by the parser.
We implemented a proof-of-concept fuzzer that we tested on multiple targets, including ChakraCore (the JavaScript engine of Microsoft Edge), PHP, mruby, and Lua.
NAUTILUS identified multiple bugs in all of the targets: Seven in mruby, three in PHP, two in ChakraCore, and one in Lua.
Reporting these bugs earned bug bounties totaling 2,600 USD, and 6 CVEs were assigned.
Our experiments show that combining context-free grammars and feedback-driven fuzzing significantly outperforms state-of-the-art approaches, improving on American Fuzzy Lop (AFL) by an order of magnitude and on grammar fuzzers by more than a factor of two when measuring code coverage.
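A miniature version of the combination can make the idea concrete. The sketch below is our toy illustration, assuming a run_target callback that executes the target and returns the set of covered edges; NAUTILUS itself additionally recombines and mutates the derivation trees of interesting inputs:

```python
import random

GRAMMAR = {  # toy stand-in for a real interpreter grammar
    "STMT": [["var x = ", "EXPR", ";"], ["print(", "EXPR", ");"]],
    "EXPR": [["1"], ["x"], ["EXPR", " + ", "EXPR"]],
}

def generate(symbol="STMT", depth=6):
    if symbol not in GRAMMAR:
        return symbol                      # terminal
    rules = GRAMMAR[symbol] if depth > 0 else GRAMMAR[symbol][:1]
    return "".join(generate(s, depth - 1) for s in random.choice(rules))

def fuzz(run_target, iterations=10000):
    corpus, covered = [], set()
    for _ in range(iterations):
        inp = generate()                   # always syntactically valid
        new_edges = run_target(inp) - covered
        if new_edges:                      # coverage feedback decides
            corpus.append(inp)             # which inputs are kept
            covered |= new_edges
    return corpus
```
-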
Sze Yiu Chau (Purdue University), Moosa Yahyazadeh (The University of Iowa), Omar Chowdhury (The University of Iowa), Aniket Kate (Purdue University), Ninghui Li (Purdue University)
We discuss how symbolic execution can be used to not only find low-level errors but also analyze the semantic correctness of protocol implementations. To avoid manually crafting test cases, we propose a strategy of meta-level search, which leverages constraints stemming from the input formats to automatically generate concolic test cases. Additionally, to aid root-cause analysis, we develop constraint provenance tracking (CPT), a mechanism that associates atomic sub-formulas of path constraints with their corresponding source-level origins. We demonstrate the power of symbolic analysis with a case study on PKCS#1 v1.5 signature verification. Leveraging meta-level search and CPT, we analyzed 15 recent open-source implementations using symbolic execution and found semantic flaws in 6 of them. Further analysis of these flaws showed that 4 implementations are susceptible to new variants of the Bleichenbacher low-exponent RSA signature forgery. One implementation suffers from potential denial-of-service attacks with purposefully crafted signatures. All our findings have been responsibly shared with the affected vendors. Among the flaws discovered, 6 new CVEs have been assigned to the immediately exploitable ones.
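For context, the flaws in question stem from verifiers that parse the decrypted signature block leniently. A hedged sketch of the robust alternative, which the paper's findings motivate, is to recompute the single valid encoding and compare byte-for-byte (shown here for SHA-256, following RFC 8017; this is our illustration, not any analyzed implementation):

```python
import hashlib

# DER-encoded DigestInfo prefix for SHA-256, a fixed constant from PKCS#1.
SHA256_PREFIX = bytes.fromhex("3031300d060960864801650304020105000420")

def verify_pkcs1_v15_sha256(sig: int, e: int, n: int, message: bytes) -> bool:
    k = (n.bit_length() + 7) // 8
    em = pow(sig, e, n).to_bytes(k, "big")           # s^e mod n
    t = SHA256_PREFIX + hashlib.sha256(message).digest()
    expected = b"\x00\x01" + b"\xff" * (k - len(t) - 3) + b"\x00" + t
    # Full-block comparison; lenient parsers that tolerate short padding or
    # trailing garbage are what enable low-exponent forgeries.
    return em == expected
```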
-
Lei Zhao (Wuhan University), Yue Duan (University of California, Riverside), Heng Yin (University of California, Riverside), Jifeng Xuan (Wuhan University)
Hybrid fuzzing, which combines fuzzing and concolic execution, has become an advanced technique for software vulnerability detection. Based on the observation that fuzzing and concolic execution are complementary in nature, state-of-the-art hybrid fuzzing systems deploy “demand launch” and “optimal switch” strategies. Although these ideas sound intriguing, we point out several fundamental limitations in them, due to oversimplified assumptions. We then propose a novel “discriminative dispatch” strategy to better utilize the capability of concolic execution. We design a novel Monte Carlo based probabilistic path prioritization model to quantify each path's difficulty and prioritize them for concolic execution. This model treats fuzzing as a random sampling process. It calculates each path's probability based on the sampling information. Finally, our model prioritizes and assigns the most difficult paths to concolic execution. We implement a prototype system, DigFuzz, and evaluate it with two representative datasets. Results show that the concolic execution in DigFuzz outperforms that in Driller, a state-of-the-art hybrid fuzzing system, in every major aspect. In particular, the concolic execution in DigFuzz contributes to discovering more vulnerabilities (12 vs. 5) and producing more code coverage (18.9% vs. 3.8%) on the CQE dataset than the concolic execution in Driller.
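The probabilistic model can be sketched compactly. Below is our own simplification, assuming hits counts how often fuzzing executions covered each basic block and that a path is given as (parent block, taken branch) pairs; the lowest-probability uncovered paths would be dispatched to concolic execution first:

```python
def branch_probability(hits: dict, parent, branch) -> float:
    # P(branch | parent), estimated from random fuzzing samples; branches
    # never observed get a minimal count of 1 (rare, not impossible).
    return max(hits.get(branch, 0), 1) / max(hits.get(parent, 1), 1)

def path_probability(hits: dict, path) -> float:
    # A path's probability is the product of its branch probabilities;
    # the smaller it is, the harder the path is for random fuzzing.
    p = 1.0
    for parent, branch in path:
        p *= branch_probability(hits, parent, branch)
    return p
```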
-
Athanasios Andreou (EURECOM), Márcio Silva (UFMG), Fabrício Benevenuto (UFMG), Oana Goga (Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG), Patrick Loiseau (Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG & MPI-SWS), Alan Mislove (Northeastern University)
The Facebook advertising platform has been subject to a number of controversies in the past years regarding privacy violations, lack of transparency, as well as its capacity to be used by dishonest actors for discrimination or propaganda. In this study, we aim to provide a better understanding of the Facebook advertising ecosystem, focusing on how it is being used by advertisers. We first analyze the set of advertisers and then investigate how those advertisers are targeting users and customizing ads via the platform. Our analysis is based on the data we collected from over 600 real-world users via a browser extension that collects the ads our users receive when they browse their Facebook timeline, as well as the explanations for why users received these ads.
Our results reveal that users are targeted by a wide range of advertisers (e.g., from popular to niche advertisers); that a non-negligible fraction of advertisers are part of potentially sensitive categories such as news and politics, health or religion; that a significant number of advertisers employ targeting strategies that could be either invasive or opaque; and that many advertisers use a variety of targeting parameters and ad texts. Overall, our work emphasizes the need for better mechanisms to audit ads and advertisers in social media and provides an overview of the platform usage that can help move towards such mechanisms.
-
Martin Degeling (Ruhr-Universität Bochum), Christine Utz (Ruhr-Universität Bochum), Christopher Lentzsch (Ruhr-Universität Bochum), Henry Hosseini (Ruhr-Universität Bochum), Florian Schaub (University of Michigan), Thorsten Holz (Ruhr-Universität Bochum)
The European Union’s General Data Protection Regulation (GDPR) went into effect on May 25, 2018. Its privacy regulations apply to any service and company collecting or processing personal data in Europe. Many companies had to adjust their data handling processes, consent forms, and privacy policies to comply with the GDPR’s transparency requirements. We monitored this rare event by analyzing changes on popular websites in all 28 member states of the European Union. For each country, we periodically examined its 500 most popular websites – 6,579 in total – for the presence of and updates to their privacy policy between December 2017 and October 2018. While many websites already had privacy policies, we find that in some countries up to 15.7% of websites added new privacy policies by May 25, 2018, resulting in 84.5% of websites having privacy policies. 72.6% of websites with existing privacy policies updated them close to that date. After May, this positive development slowed down noticeably. Most visibly, 62.1% of websites in Europe now display cookie consent notices, 16% more than in January 2018. These notices inform users about a site’s cookie use and user tracking practices. We categorized all observed cookie consent notices and evaluated 28 common implementations with respect to their technical realization of cookie consent. Our analysis shows that core web security mechanisms such as the same-origin policy pose problems for the implementation of consent according to GDPR rules, and that opting out of third-party cookies requires the third party to cooperate. Overall, we conclude that the web became more transparent at the time the GDPR came into force, but there is still a lack of both functional and usable mechanisms for users to consent to or deny processing of their personal data on the Internet.
-
Michael Meli (North Carolina State University), Matthew R. McNiece (Cisco Systems and North Carolina State University), Bradley Reaves (North Carolina State University)
GitHub and similar platforms have made public collaborative development of software commonplace. However, a problem arises when this public code must manage authentication secrets, such as API keys or cryptographic secrets. These secrets must be kept private for security, yet common development practices like adding these secrets to code make accidental leakage frequent. In this paper, we present the first large-scale and longitudinal analysis of secret leakage on GitHub. We examine billions of files collected using two complementary approaches: a nearly six-month scan of real-time public GitHub commits and a public snapshot covering 13% of open-source repositories. We focus on private key files and 11 high-impact platforms with distinctive API key formats. This focus allows us to develop conservative detection techniques that we manually and automatically evaluate to ensure accurate results. We find that not only is secret leakage pervasive, affecting over 100,000 repositories, but that thousands of new, unique secrets are leaked every day. We also use our data to explore possible root causes of leakage and to evaluate potential mitigation strategies. This work shows that secret leakage on public repository platforms is rampant and far from a solved problem, placing developers and services at persistent risk of compromise and abuse.
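The detection strategy hinges on key formats being distinctive enough to match conservatively. A minimal sketch, with a few well-known patterns standing in for the paper's full set:

```python
import re

# Illustrative high-signal patterns; real scanners use many more formats.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "google_api_key":    re.compile(r"\bAIza[0-9A-Za-z\-_]{35}\b"),
    "rsa_private_key":   re.compile(r"-----BEGIN RSA PRIVATE KEY-----"),
}

def scan_blob(text: str):
    """Return (kind, match) pairs found in one file's contents. Patterns
    this distinctive keep false positives low, which is what makes a
    conservative, large-scale scan feasible."""
    return [(kind, m.group(0))
            for kind, rx in PATTERNS.items() for m in rx.finditer(text)]
```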
-
Amit Klein (Bar Ilan University), Benny Pinkas (Bar Ilan University)
We describe a novel user tracking technique that is based on assigning statistically unique DNS records per user. This new tracking technique is unique in being able to distinguish between machines that have identical hardware and software, and track users even if they use “privacy mode” browsing, or use multiple browsers (on the same machine).
The technique overcomes issues related to the caching of DNS answers in resolvers, and utilizes per-device caching of DNS answers at the client. We experimentally demonstrate that it covers the technologies used by a very large fraction of Internet users (in terms of browsers, operating systems, and DNS resolution platforms).
Our technique can typically track users for up to a day, and therefore works best when combined with other, narrower yet longer-lived techniques such as regular cookies; we briefly explain how to combine such techniques. We suggest mitigations to this tracking technique but note that it is not easily mitigated. There are possible workarounds, yet none is without setup, performance, or convenience overhead. A complete mitigation requires software modifications in both browsers and resolver software.
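As a rough illustration of the underlying idea (our reading of the abstract, with hypothetical names; the paper's actual construction differs in detail): the tracker's DNS server answers each first-time query randomly, the device's DNS cache pins that answer, and the cached combination across a handful of hostnames becomes a statistically unique, per-device identifier.

```python
import random

TRACKER_HOSTS = [f"t{i}.tracker.example" for i in range(8)]    # hypothetical
ANSWER_POOL = [f"198.51.100.{i}" for i in range(1, 17)]        # tracker IPs

def device_fingerprint(device_cache: dict) -> tuple:
    for host in TRACKER_HOSTS:
        if host not in device_cache:          # first resolution: random answer
            device_cache[host] = random.choice(ANSWER_POOL)
    # 16^8 possible combinations: statistically unique per device, shared
    # across browsers and private-mode sessions that use the same cache.
    return tuple(device_cache[h] for h in TRACKER_HOSTS)
```
-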
Muhammad Ahmad Bashir (Northeastern University), Umar Farooq (LUMS Pakistan), Maryam Shahid (LUMS Pakistan), Muhammad Fareed Zaffar (LUMS Pakistan), Christo Wilson (Northeastern University)
Widely reported privacy issues concerning major online advertising platforms (e.g., Facebook) have heightened concerns among users about the data that is collected about them. However, while we have a comprehensive understanding of who collects data on users, as well as how tracking is implemented, there is still a significant gap in our understanding: what information do advertisers actually infer about users, and is this information accurate?
In this study, we leverage Ad Preference Managers (APMs) as a lens through which to address this gap. APMs are transparency tools offered by some advertising platforms that allow users to see the interest profiles that are constructed about them. We recruited 220 participants to install an IRB-approved browser extension that collected their interest profiles from four APMs (Google, Facebook, Oracle BlueKai, and Nielsen eXelate), as well as behavioral and survey data. We use this data to analyze the size and correctness of interest profiles, compare their composition across the four platforms, and investigate the origins of the data underlying these profiles.
-
A. Theodore Markettos (University of Cambridge), Colin Rothwell (University of Cambridge), Brett F. Gutstein (Rice University), Allison Pearce (University of Cambridge), Peter G. Neumann (SRI International), Simon W. Moore (University of Cambridge), Robert N. M. Watson (University of Cambridge)
Direct Memory Access (DMA) attacks have been known for many years: DMA-enabled I/O peripherals have complete access to the state of a computer and can fully compromise it, including by reading and writing all of system memory.
With the popularity of Thunderbolt 3 over USB Type-C and smart internal devices, opportunities for these attacks to be performed casually with only seconds of physical access to a computer have greatly broadened. In response, commodity hardware and operating-system (OS) vendors have incorporated support for Input-Output Memory Management Units (IOMMUs), which impose memory protection on DMA, and are widely believed to protect against DMA attacks.
We investigate the state of the art in IOMMU protection across OSes using a novel *I/O security research platform*. We find that current protections fall short when faced with a functional network peripheral that uses its complex interactions with the OS for ill intent, and we demonstrate compromises against macOS, FreeBSD, and Linux, all of which notionally utilize IOMMUs to protect against DMA attackers. Windows uses the IOMMU only in limited cases and remains vulnerable.
Using Thunderclap, an open-source FPGA research platform we built, we explore a number of novel exploit techniques to expose new classes of OS vulnerability. The complex vulnerability space for IOMMU-exposed shared memory available to DMA-enabled peripherals allows attackers to extract private data (sniffing cleartext VPN traffic) and hijack kernel control flow (launching a root shell) in seconds using devices such as USB-C projectors or power adapters.
We have worked closely with OS vendors to remedy these vulnerability classes, and they have now shipped substantial feature improvements and mitigations as a result of our work.
-
Zheng Leong Chua (National University of Singapore), Yanhao Wang (TCA/SKLCS, Institute of Software, Chinese Academy of Sciences), Teodora Baluta (National University of Singapore), Prateek Saxena (National University of Singapore), Zhenkai Liang (National University of Singapore), Purui Su (TCA/SKLCS, Institute of Software, Chinese Academy of Sciences)
Dynamic binary taint analysis has wide applications in the security analysis of commercial-off-the-shelf (COTS) binaries. One of the key challenges in dynamic binary analysis is to specify the taint rules that capture how taint information propagates for each instruction on an architecture. Most existing solutions specify taint rules using a deductive approach, summarizing the rules manually after analyzing the instruction semantics. Intuitively, taint propagation reflects how an instruction's input affects its output and thus can be observed from instruction executions. In this work, we propose an inductive method for taint propagation and develop a universal taint tracking engine that is architecture-agnostic. Our taint engine, TAINTINDUCE, can learn taint rules with minimal architectural knowledge by observing the execution behavior of instructions. To measure its correctness and guide taint rule generation, we define the precise notion of soundness for bit-level taint tracking in this novel setup. In our evaluation, we show that TAINTINDUCE automatically learns rules for 4 widely used architectures: x86, x64, AArch64, and MIPS-I. It can detect vulnerabilities for 24 CVEs in 15 applications on both Linux and Windows over millions of instructions and is comparable with other mature existing tools (TEMU [51], libdft [32], Triton [42]). TAINTINDUCE can be used as a standalone taint engine or to complement existing taint engines for unhandled instructions. Further, it can be used as a cross-referencing tool to uncover bugs in taint engines, emulation implementations and ISA documentation.
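Observation-based rule learning can be illustrated in miniature. Our sketch below assumes an execute callback that runs a single instruction on given input bytes and returns its output bytes; flipping each input bit across random states and recording which output bytes change approximates a taint rule (byte-granular here, bit-granular in the paper):

```python
import os

def infer_taint_rule(execute, in_bytes=4, out_bytes=4, trials=64) -> dict:
    rule = {bit: set() for bit in range(in_bytes * 8)}
    for _ in range(trials):
        base = bytearray(os.urandom(in_bytes))    # random input state
        out0 = execute(bytes(base))
        for bit in range(in_bytes * 8):
            flipped = bytearray(base)
            flipped[bit // 8] ^= 1 << (bit % 8)   # perturb one input bit
            out1 = execute(bytes(flipped))
            for i in range(out_bytes):
                if out0[i] != out1[i]:
                    rule[bit].add(i)              # bit influences output byte i
    return rule
```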
-
Ruian Duan (Georgia Institute of Technology), Ashish Bijlani (Georgia Institute of Technology), Yang Ji (Georgia Institute of Technology), Omar Alrawi (Georgia Institute of Technology), Yiyuan Xiong (Peking University), Moses Ike (Georgia Institute of Technology), Brendan Saltaformaggio (Georgia Institute of Technology), Wenke Lee (Georgia Institute of Technology)
Mobile application developers rely heavily on open-source software (OSS) to offload common functionalities such as the implementation of protocols and media format playback. Over the past years, several vulnerabilities have been found in popular open-source libraries like OpenSSL and FFmpeg. Mobile applications that include such libraries inherit these flaws, which make them vulnerable. Fortunately, the open-source community is responsive and patches are made available within days. However, mobile application developers are often left unaware of these flaws. The App Security Improvement Program (ASIP) is a commendable effort by Google to notify application developers of these flaws, but recent work has shown that many developers do not act on this information.
Our work addresses vulnerable mobile applications through automatic binary patching from source patches provided by the OSS maintainers and without involving the developers. We propose novel techniques to overcome difficult challenges like patching feasibility analysis, source-code-to-binary-code matching, and in-memory patching. Our technique uses a novel variability-aware approach, which we implement as OSSPatcher. We evaluated OSSPatcher with 39 OSS and a collection of 1,000 Android applications using their vulnerable versions. OSSPatcher generated 675 function-level patches that fixed the affected mobile applications without breaking their binary code. Further, we evaluated 10 vulnerabilities with public exploits in popular apps such as Chrome, which OSSPatcher was able to mitigate, thwarting their exploitation.
-
Jangseop Shin (Seoul National University and Inter-University Semiconductor Research Center), Donghyun Kwon (Seoul National University and Inter-University Semiconductor Research Center), Jiwon Seo (Seoul National University and Inter-University Semiconductor Research Center), Yeongpil Cho (Soongsil University), Yunheung Paek (Seoul National University and Inter-University Semiconductor Research Center)
Pointer invalidation has been a popular approach adopted in many recent studies to mitigate use-after-free errors. The approach can be divided largely into two different schemes: explicit invalidation and implicit invalidation. The former aims to eradicate the root cause of use-after-free errors by invalidating every dangling pointer one by one explicitly. In contrast, the latter aims to prevent dangling pointers by freeing an object only if there is no pointer referring to it. A downside of the explicit scheme is that it is expensive, as it demands high-cost algorithms or a large amount of space to maintain an up-to-date list of pointer locations linking to each object at all times. Implicit invalidation is more efficient in that, even without any explicit effort, it can eliminate dangling pointers by leaving objects undeleted until all the links between the objects and their referring pointers vanish by themselves during program execution. However, such an argument only holds if the scheme knows exactly when each link is created and deleted. Reference counting is a traditional method to determine the existence of reference links between objects and pointers. Unfortunately, impeccable reference counting for legacy C/C++ code is very difficult and expensive to achieve in practice, mainly because of the type-unsafe operations in the code. In this paper, we present a solution, called CRCount, to the use-after-free problem in legacy C/C++. For effective and efficient problem solving, CRCount is armed with the pointer footprinting technique that enables us to compute, with high accuracy, the reference count of every object referred to by the pointers in the legacy code. Our experiments demonstrate that CRCount mitigates the use-after-free errors with lower performance and space overheads than the existing pointer invalidation solutions.
-
HyungSeok Han (KAIST), DongHyeon Oh (KAIST), Sang Kil Cha (KAIST)
JavaScript engines are an attractive target for attackers due to their popularity and flexibility in building exploits. Current state-of-the-art fuzzers for finding JavaScript engine vulnerabilities focus mainly on generating syntactically correct test cases based on either a predefined context-free grammar or a trained probabilistic language model. Unfortunately, syntactically correct JavaScript sentences are often semantically invalid at runtime. Furthermore, statically analyzing the semantics of JavaScript code is challenging due to its dynamic nature: JavaScript code is generated at runtime, and JavaScript expressions are dynamically typed. To address this challenge, we propose a novel test case generation algorithm that we call semantics-aware assembly, and implement it in a fuzz testing tool termed CodeAlchemist. Our tool can generate arbitrary JavaScript code snippets that are both semantically and syntactically correct, and it effectively yields test cases that can crash JavaScript engines. We found numerous vulnerabilities in the latest JavaScript engines with CodeAlchemist and reported them to the vendors.
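Semantics-aware assembly can be caricatured in a few lines. In our sketch (not the CodeAlchemist implementation), code bricks carry type preconditions and postconditions, and assembly only appends a brick whose used variables are already defined with the right types:

```python
import random

# Each brick declares which typed variables it uses and which it defines.
BRICKS = [
    {"code": "var a = 1;",       "uses": {},                       "defs": {"a": "num"}},
    {"code": "var s = 'x' + a;", "uses": {"a": "num"},             "defs": {"s": "str"}},
    {"code": "s = s.repeat(a);", "uses": {"a": "num", "s": "str"}, "defs": {}},
]

def assemble(max_len=5) -> str:
    env, out = {}, []
    for _ in range(max_len):
        applicable = [b for b in BRICKS
                      if all(env.get(v) == t for v, t in b["uses"].items())]
        brick = random.choice(applicable)   # never empty: brick 0 needs nothing
        out.append(brick["code"])
        env.update(brick["defs"])
    return "\n".join(out)                   # semantically valid JavaScript
```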
-
Sina Faezi (University of California, Irvine), Sujit Rokka Chhetri (University of California, Irvine), Arnav Vaibhav Malawade (University of California, Irvine), John Charles Chaput (University of California, Irvine), William Grover (University of California, Riverside), Philip Brisk (University of California, Riverside), Mohammad Abdullah Al Faruque (University of California, Irvine)
Synthetic biology is developing into a promising science and engineering field. One of the enabling technologies for this field is the DNA synthesizer. It allows researchers to custom-build sequences of oligonucleotides (short DNA strands) using the nucleobases Adenine (A), Guanine (G), Cytosine (C), and Thymine (T). Incorporating these sequences into organisms can result in improved disease resistance and lifespan for plants, animals, and humans. Hence, many laboratories spend large amounts of capital researching and developing unique sequences of oligonucleotides. However, these DNA synthesizers are fully automated systems with cyber-domain processes and physical-domain components, and as such they may be prone to security breaches like any other computing system. In our work, we present a novel acoustic side-channel attack methodology which can be used on DNA synthesizers to breach their confidentiality and steal valuable oligonucleotide sequences.
Our proposed attack methodology achieves an average accuracy of 88.07% in predicting each base and is able to reconstruct short sequences with 100% accuracy by making fewer than 21 guesses out of 4^15 possibilities. We evaluate our attack against the effects of the microphone's distance from the DNA synthesizer and show that our attack methodology can achieve over 80% accuracy when the microphone is placed as far as 0.7 meters from the DNA synthesizer, despite the presence of common room noise. In addition, we reconstruct DNA sequences to show how effectively an attacker with biomedical-domain knowledge would be able to derive the intended functionality of the sequence using the proposed attack methodology. To the best of our knowledge, this is the first methodology that highlights the possibility of such an attack on systems used to synthesize DNA molecules.
-
Nicolás Rosner (University of California, Santa Barbara), Ismet Burak Kadron (University of California, Santa Barbara), Lucas Bang (Harvey Mudd College), Tevfik Bultan (University of California, Santa Barbara)
We present a black-box, dynamic technique to detect and quantify side-channel information leaks in networked applications that communicate through a TLS-encrypted stream. Given a user-supplied profiling-input suite in which some aspect of the inputs is marked as secret, we run the application over the inputs and capture a collection of variable-length network packet traces. The captured traces give rise to a vast side-channel feature space, including the size and timestamp of each individual packet as well as their aggregations (such as total time, median size, etc.) over every possible subset of packets. Finding the features that leak the most information is a difficult problem.
Our approach addresses this problem in three steps: 1) Global analysis of traces for their alignment and identification of *phases* across traces; 2) Feature extraction using the identified phases; 3) Information leakage quantification and ranking of features via estimation of probability distributions.
We embody this approach in a tool called Profit and experimentally evaluate it on a benchmark of applications from the DARPA STAC program, which were developed to assess the effectiveness of side-channel analysis techniques. Our experimental results demonstrate that, given suitable profiling-input suites, Profit is successful in automatically detecting information-leaking features in applications, and correctly ordering the strength of the leakage for differently-leaking variants of the same application.
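Step 3's ranking can be approximated with a standard mutual-information estimate. The sketch below is our simplification, assuming samples is a list of (secret, feature_value) pairs gathered from the profiling runs, e.g. (secret_class, total_response_bytes):

```python
import math
from collections import Counter

def leakage_bits(samples) -> float:
    joint = Counter(samples)
    secrets = Counter(s for s, _ in samples)
    feats = Counter(f for _, f in samples)
    n = len(samples)
    mi = 0.0
    for (s, f), c in joint.items():
        p_sf = c / n
        mi += p_sf * math.log2(p_sf / ((secrets[s] / n) * (feats[f] / n)))
    return mi   # estimated bits the feature reveals about the secret
```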
-
Daimeng Wang (University of California Riverside), Ajaya Neupane (University of California Riverside), Zhiyun Qian (University of California Riverside), Nael Abu-Ghazaleh (University of California Riverside), Srikanth V. Krishnamurthy (University of California Riverside), Edward J. M. Colbert (Virginia Tech), Paul Yu (U.S. Army Research Lab (ARL))
Operating systems use shared memory to improve performance. However, as shown in recent studies, attackers can exploit CPU cache side-channels associated with shared memory to extract sensitive information. Previously attempted attacks typically only detect the presence of a certain operation and require significant manual analysis to identify and evaluate their effectiveness. Moreover, very few of them target graphics libraries, which are commonly used but difficult to attack. In this paper, we consider the execution time of shared libraries as the side-channel, and showcase a completely automated technique to discover and select exploitable side-channels on shared graphics libraries. In essence, we first collect the cache lines accessed by a victim process during different key presses offline, and then use machine learning to infer the best cache lines (e.g., easily measurable, robust to noise, high information leakage) for a flush-and-reload attack. We are able to discover effective strategies to classify what keys have been pressed. Using this approach, we not only preclude the need for manual analyses of code and traces (the automated system discovered many previously unknown side-channels of the type we are interested in), but also achieve high precision in terms of inferring the sensitive information entered on desktop and Android platforms. We show that our approach infers passwords with lowercase letters and numbers 10,000 to 1,000,000 times faster than random guessing. For a large fraction of PINs consisting of 4 to 6 digits, we are able to infer them within 20 and 80 guesses, respectively. Finally, we suggest ways to mitigate these attacks.
-
Jiyong Yu (UIUC), Lucas Hsiung (UIUC), Mohamad El'Hajj (UIUC), Christopher W. Fletcher (UIUC)
Blocking microarchitectural (digital) side channels is one of the most pressing challenges in hardware security today. Recently, there has been a surge of effort that attempts to block these leakages by writing programs data obliviously. In this model, programs are written to avoid placing sensitive data-dependent pressure on shared resources. Despite recent efforts, however, running data oblivious programs on modern machines today is insecure and low-performance. First, writing programs obliviously assumes certain instructions in today's ISAs will not leak privacy, whereas today's ISAs and hardware provide no such guarantees. Second, writing programs to avoid data-dependent behavior inherently incurs high performance overhead.
This paper tackles both the security and performance aspects of this problem by proposing a Data Oblivious ISA extension (OISA). On the security side, we present ISA design principles to block microarchitectural side channels, and embody these ideas in a concrete ISA capable of safely executing existing data oblivious programs. On the performance side, we design the OISA with support for efficient memory oblivious computation, and with safety features that allow modern hardware optimizations, e.g., out-of-order speculative execution, to remain enabled in the common case.
We provide a complete hardware prototype of our ideas, built on top of the RISC-V out-of-order, speculative BOOM processor, and prove that the OISA can provide the advertised security through a formal analysis of an abstract BOOM-style machine. We evaluate the area overhead of hardware mechanisms needed to support our prototype, and provide performance experiments showing how the OISA speeds up a variety of existing data oblivious codes (including “constant time” cryptography and memory oblivious data structures), in addition to improving their security and portability.
-
Syed Rafiul Hussain (Purdue University), Mitziu Echeverria (University of Iowa), Omar Chowdhury (University of Iowa), Ninghui Li (Purdue University), Elisa Bertino (Purdue University)
The cellular paging (broadcast) protocol strives to balance a cellular device's energy consumption and quality-of-service by allowing the device to *only* periodically poll for pending services in its idle, low-power state. For a given cellular device and serving network, the exact time periods when the device polls for services (called the *paging occasion*) are fixed by design in the 4G/5G cellular protocol. In this paper, we show that the fixed nature of paging occasions can be exploited by an adversary in the vicinity of a victim to associate the victim's soft-identity (e.g., phone number, Twitter handle) with its paging occasion, with only a modest cost, through an attack dubbed ToRPEDO. Consequently, ToRPEDO can enable an adversary to verify a victim's coarse-grained location information, inject fabricated paging messages, and mount denial-of-service attacks. We also demonstrate that, in 4G and 5G, it is plausible for an adversary to retrieve a victim device's persistent identity (i.e., IMSI) with a brute-force IMSI-Cracking attack while using ToRPEDO as an attack sub-step. Our further investigation of 4G paging protocol deployments also identified an *implementation oversight* of several network providers which enables the adversary to launch an attack, named PIERCER, for associating a victim's phone number with its IMSI, subsequently allowing targeted user location tracking. All of our attacks have been validated and evaluated in the wild using commodity hardware and software. We finally discuss potential countermeasures against the presented attacks.
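The attack's starting point, the fixed mapping from a device identity to its paging occasion, can be sketched from the 4G specification (simplified from 3GPP TS 36.304; our illustration, not the paper's tooling):

```python
def paging_frame(imsi: int, T: int = 128, nB: int = 128) -> int:
    # LTE (simplified): the paging frame satisfies
    #   SFN mod T == (T // N) * (UE_ID mod N), with UE_ID = IMSI mod 1024.
    N = min(T, nB)
    return (T // N) * ((imsi % 1024) % N)

def imsi_candidates(observed_pf: int, start: int, count: int = 10**6):
    """Brute-force filter in the spirit of IMSI-Cracking: keep only the
    IMSIs in an operator's range whose paging frame matches the victim's
    observed paging occasion (each observation shrinks the set further)."""
    return [imsi for imsi in range(start, start + count)
            if paging_frame(imsi) == observed_pf]
```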
-
Alberto Sonnino (University College London (UCL)), Mustafa Al-Bassam (University College London (UCL)), Shehar Bano (University College London (UCL)), Sarah Meiklejohn (University College London (UCL)), George Danezis (University College London (UCL))
Coconut is a novel selective disclosure credential scheme supporting distributed threshold issuance, public and private attributes, re-randomization, and multiple unlinkable selective attribute revelations. Coconut integrates with Blockchains to ensure confidentiality, authenticity and availability even when a subset of credential issuing authorities are malicious or offline. We implement and evaluate a generic Coconut smart contract library for Chainspace and Ethereum; and present three applications related to anonymous payments, electronic petitions, and distribution of proxies for censorship resistance.
Coconut uses short and computationally efficient credentials, and our evaluation shows that most Coconut cryptographic primitives take just a few milliseconds on average, with verification taking the longest time (10 milliseconds).
-
Cormac Herley (Microsoft), Stuart Schechter (Unaffiliated)
Online guessing attacks against password servers can be hard to address. Approaches that throttle or block repeated guesses on an account (e.g., three-strikes-type lockout rules) can be effective against depth-first attacks, but are of little help against breadth-first attacks that spread guesses very widely. At large providers with tens or hundreds of millions of accounts, breadth-first attacks offer a way to send millions or even billions of guesses without ever triggering the depth-first defenses. The absence of labels and the non-stationarity of attack traffic make it challenging to apply machine learning techniques.
We show how to accurately estimate the odds that an observation x associated with a request is malicious. Our main assumptions are that successful malicious logins are a small fraction of the total, and that the distribution of x in the legitimate traffic is stationary, or very slowly varying. From these we show how we can estimate the ratio of bad-to-good traffic among any set of requests; how we can then identify subsets of the request data that contain the least (or even no) attack traffic; and how these least-attacked subsets allow us to estimate the distribution of values of x over the legitimate data, and hence calculate the odds ratio. A sensitivity analysis shows that even when we fail to identify a subset with little attack traffic, our odds-ratio estimates are very robust.
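The resulting estimator is essentially a Bayes-rule computation over a two-component mixture. A minimal sketch (ours), assuming p_legit and p_all are density estimates for P(x | good) and P(x), and base_rate is the estimated fraction of malicious requests:

```python
def malicious_odds(x, p_legit, p_all, base_rate: float) -> float:
    # Mixture decomposition: P(x) = (1 - b) P(x|good) + b P(x|bad), so
    # P(x|bad) follows once P(x|good) is known from least-attacked subsets.
    p_bad = max(p_all(x) - (1 - base_rate) * p_legit(x), 0.0) / max(base_rate, 1e-12)
    prior_odds = base_rate / (1 - base_rate)
    likelihood_ratio = p_bad / max(p_legit(x), 1e-12)
    return prior_odds * likelihood_ratio   # posterior odds that x is malicious
```
-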
Shridatt Sugrim (Rutgers University), Can Liu (Rutgers University), Meghan McLean (Rutgers University), Janne Lindqvist (Rutgers University)
Research has produced many types of authentication systems that use machine learning. However, there is no consistent approach for reporting performance metrics and the reported metrics are inadequate. In this work, we show that several of the common metrics used for reporting performance, such as maximum accuracy (ACC), equal error rate (EER) and area under the ROC curve (AUROC), are inherently flawed. These common metrics hide the details of the inherent trade-offs a system must make when implemented. Our findings show that current metrics give no insight into how system performance degrades outside the ideal conditions in which they were designed. We argue that adequate performance reporting must be provided to enable meaningful evaluation and that current, commonly used approaches fail in this regard. We present the unnormalized frequency count of scores (FCS) to demonstrate the mathematical underpinnings that lead to these failures and show how they can be avoided. The FCS can be used to augment the performance reporting to enable comparison across systems in a visual way. When reported with the Receiver Operating Characteristics curve (ROC), these two metrics provide a solution to the limitations of currently reported metrics. Finally, we show how to use the FCS and ROC metrics to evaluate and compare different authentication systems.
-
Jaeho Lee (Rice University), Ang Chen (Rice University), Dan S. Wallach (Rice University)
A good security practice for handling sensitive data, such as passwords, is to overwrite the data buffers with zeros once the data is no longer in use. This protects against attackers who gain a snapshot of a device’s physical memory, whether by in-person physical attacks or by remote attacks like Meltdown and Spectre. This paper looks at unnecessary password retention in Android phones by popular apps, secure password management apps, and even the lockscreen system process. We have performed a comprehensive analysis of the Android framework and a variety of apps, and discovered that passwords can survive in a variety of locations, including UI widgets where users enter their passwords, apps that retain passwords rather than exchange them for tokens, old copies not yet reused by garbage collectors, and buffers in keyboard apps. We have developed solutions that successfully fix these problems with modest code changes.
-
Ke Coby Wang (UNC Chapel Hill), Michael K. Reiter (UNC Chapel Hill)
We present a framework by which websites can coordinate to make it difficult for users to set similar passwords at these websites, in an effort to break the culture of password reuse on the web today.
Though the design of such a framework is fraught with risks to users’ security and privacy, we show that these risks can be effectively mitigated through careful scoping of the goals for such a framework and through principled design. At the core of our framework is a private set-membership-test protocol that enables one website to determine, upon a user setting a password for use at it, whether that user has already set a similar password at another participating website, but with neither side disclosing to the other the password(s) it employs in the protocol. Our framework then layers over this protocol a collection of techniques to mitigate the leakage necessitated by such a test. We verify via probabilistic model checking that these techniques are effective in maintaining account security, and since these mechanisms are consistent with common user experience today, our framework should be unobtrusive to users who do not reuse similar passwords across websites (e.g., due to having adopted a password manager). Through a working implementation of our framework and optimization of its parameters based on insights of how passwords tend to be reused, we show that our design can meet the scalability challenges facing such a service.
-
Cas Cremers (CISPA Helmholtz Center for Information Security), Martin Dehnel-Wild (University of Oxford)
The 5G mobile telephony standards are nearing completion; upon adoption these will be used by billions across the globe. Ensuring the security of 5G communication is of the utmost importance, building trust in a critical component of everyday life and national infrastructure.
We perform a fine-grained formal analysis of 5G’s main authentication and key agreement protocol (5G-AKA), and provide the first models that explicitly consider all parties defined by the protocol specification. Our formal analysis reveals that the security of 5G-AKA critically relies on unstated assumptions on the inner workings of the underlying channels. In practice this means that following the 5G-AKA specification, a provider can easily and ‘correctly’ implement the standard insecurely, leaving the protocol vulnerable to a security-critical race condition. We then provide the first models and analysis considering component and channel compromise in 5G, the results of which further demonstrate the fragility and subtle trust assumptions of the 5G-AKA protocol.
We propose formally verified fixes to the encountered issues, and we have worked with 3GPP to ensure that these fixes are adopted.
-
Mridula Singh (ETH Zurich, Switzerland), Patrick Leu (ETH Zurich, Switzerland), Srdjan Capkun (ETH Zurich, Switzerland)
Physical-layer attacks allow attackers to manipulate (spoof) ranging and positioning. These attacks have had real-world impact, enabling car thefts, unauthorized payments, and manipulation of navigation. UWB impulse radio, standardized within 802.15.4a,f, has emerged as a prominent technique for precise ranging that allows high operating distances despite power constraints by transmitting multi-pulse symbols. The security of UWB ranging (in terms of the attacker's ability to manipulate the measured distance) has been discussed in the literature and has recently also been addressed as part of the emerging 802.15.4z standard. However, all research so far, as well as the security enhancements proposed within this emerging standard, faces one main limitation: either security is achieved through short symbol lengths at the cost of performance (i.e., a limited maximum measurement distance), or longer symbol lengths are used, sacrificing security. We present UWB with pulse reordering (UWB-PR), the first modulation scheme that secures distance measurement between two mutually trusted devices against all physical-layer distance shortening attacks without sacrificing performance, therefore simultaneously enabling extended range and security. We analyze the security of UWB-PR under an attacker that fully controls the communication channel and show that UWB-PR resists such strong attackers. We evaluate UWB-PR within a UWB system built on top of an IEEE 802.15.4 device and show that it achieves distances of up to 93m with 10cm precision (LoS). UWB-PR is, therefore, a good candidate for the extended mode of the new 802.15.4z Low Rate Pulse standard. Finally, UWB-PR shows that secure distance measurement can be built on top of modulation schemes with longer symbol lengths; so far, this was considered insecure.
-
Daniele Antonioli (Singapore University of Technology and Design (SUTD)), Nils Ole Tippenhauer (CISPA), Kasper Rasmussen (University of Oxford)
Google’s Nearby Connections API enables any Android (and Android Things) application to provide proximity-based services to its users, regardless of their network connectivity. The API uses Bluetooth BR/EDR, Bluetooth LE and Wi-Fi to let “nearby” clients (discoverers) and servers (advertisers) connect and exchange different types of payloads. The implementation of the API is proprietary, closed-source and obfuscated. The updates of the API are automatically installed by Google across different versions of Android, without user interaction. Little is known publicly about the security guarantees offered by the API, even though it presents a significant attack surface.
In this work we present the first security analysis of Google’s Nearby Connections API, based on reverse-engineering of its Android implementation. We discover and implement several attacks grouped into two families: connection manipulation attacks (CMA) and range extension attacks (REA). CMA attacks allow an attacker to insert himself as a man-in-the-middle and manipulate connections (even ones unrelated to the API), and to tamper with the victim’s network interface and configuration. REA attacks allow an attacker to tunnel any nearby connection to remote (non-nearby) locations, even between two honest devices. Our attacks are enabled by REarby, a toolkit we developed while reversing the implementation of the API. REarby includes a dynamic binary instrumenter, a packet dissector, and implementations of a custom Nearby Connections client and server.
-
Fenghao Xu (The Chinese University of Hong Kong), Wenrui Diao (Jinan University), Zhou Li (University of California, Irvine), Jiongyi Chen (The Chinese University of Hong Kong), Kehuan Zhang (The Chinese University of Hong Kong)
Bluetooth is a widely used communication technology, especially under the scenarios of mobile computing and Internet of Things. Once paired with a host device, a Bluetooth device then can exchange commands and data, such as voice, keyboard/mouse inputs, network, blood pressure data, and so on, with the host. Due to the sensitivity of such data and commands, some security measures have already been built into the Bluetooth protocol, like authentication, encryption, authorization, etc.
However, according to our studies of the Bluetooth protocol as well as its implementation on the Android system, we find that there are still some design flaws which could lead to serious security consequences. For example, we found that the authentication process for Bluetooth profiles is quite inconsistent and coarse-grained: if a paired device changes its profile, it is automatically trusted, and users are not notified. Also, there is no strict verification of the information provided by the Bluetooth device itself, so a malicious device can deceive a user by changing its name, profile information, and the icon displayed on the screen.
To better understand the problem, we performed a systematic study of Bluetooth profiles and present three attacks demonstrating the feasibility and potential damage of these design flaws. The attacks were implemented on a Raspberry Pi 2 device and evaluated with Android OS versions ranging from 5.1 to the latest 8.1. The results show that adversaries can bypass existing Android protections (e.g., permissions and isolation), launch man-in-the-middle attacks, control victim apps and the system, and steal sensitive information. To mitigate such threats, we propose a new Bluetooth validation mechanism. We implemented a prototype based on the AOSP project and deployed it on a Google Pixel 2 phone for evaluation. Experiments show that our solution can effectively prevent these attacks.
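One plausible shape for such a validation mechanism is sketched below; the class, callback, and policy are our own hypothetical illustration, not the paper's AOSP implementation. A paired device that silently claims a new profile is held back until the user explicitly approves the escalation.

```python
class BluetoothProfileGuard:
    """Toy illustration: require explicit user approval when a paired
    device claims a profile it was not originally authorized for."""
    def __init__(self):
        self.authorized = {}   # device address -> set of approved profiles

    def on_connect(self, address, requested_profiles, ask_user):
        approved = self.authorized.setdefault(address, set())
        new = set(requested_profiles) - approved
        if new and not ask_user(address, new):
            return False       # reject silently escalated profiles
        approved |= new
        return True

guard = BluetoothProfileGuard()
ok = guard.on_connect("AA:BB:CC:DD:EE:FF", {"HID"}, ask_user=lambda a, p: False)
print(ok)  # False: the silently claimed HID (keyboard) profile was rejected
```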
-
Daoyuan Wu (Singapore Management University), Debin Gao (Singapore Management University), Rocky K. C. Chang (The Hong Kong Polytechnic University), En He (China Electronic Technology Cyber Security Co., Ltd.), Eric K. T. Cheng (The Hong Kong Polytechnic University), Robert H. Deng (Singapore Management University)
Open TCP/UDP ports are traditionally used by servers to provide application services, but they are also found in many Android apps. In this paper, we present the first open-port analysis pipeline, covering discovery, diagnosis, and security assessment, to systematically understand open ports in Android apps and their threats. We design and deploy a novel on-device crowdsourcing app and its server-side analytic engine to continuously monitor open ports in the wild. Over a period of ten months, we have collected over 40 million port monitoring records from 3,293 users in 136 countries worldwide, which allow us to observe the actual execution of open ports in 925 popular apps and 725 built-in system apps. The crowdsourcing also gives us a more accurate view of the pervasiveness of open ports in Android apps: 15.3%, much higher than the previous estimate of 6.8%. We also develop a new static diagnostic tool to reveal that 61.8% of the open-port apps are solely due to embedded SDKs, and 20.7% suffer from insecure API usages. Finally, we perform three security assessments of open ports: (i) vulnerability analysis revealing five vulnerability patterns in open ports of popular apps, e.g., Instagram, Samsung Gear, Skype, and the widely-embedded Facebook SDK, (ii) inter-device connectivity measurement in 224 cellular networks and 2,181 WiFi networks through crowdsourced network scans, and (iii) experimental demonstration of effective denial-of-service attacks against mobile open ports.
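As a minimal illustration of the monitoring idea (not the paper's crowdsourcing app, whose pipeline is far richer), the sketch below probes a handful of hypothetical TCP ports on the local device and reports which ones accept connections.

```python
import socket

def probe_open_ports(host="127.0.0.1", ports=(80, 5555, 8080, 8888), timeout=0.2):
    """Toy illustration of open-port discovery: attempt TCP connections
    and report the ports that accept them."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:   # 0 means the connect succeeded
                open_ports.append(port)
    return open_ports

print(probe_open_ports())
```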
-
Z. Berkay Celik (Penn State University), Gang Tan (Penn State University), Patrick McDaniel (Penn State University)
Broadly defined as the Internet of Things (IoT), the growth of commodity devices that integrate physical processes with digital connectivity has changed the way we live, play, and work. To date, the traditional approach to securing IoT has treated devices individually. However, in practice, it has recently been shown that the interactions among devices are often the real cause of safety and security violations. In this paper, we present IoTGuard, a dynamic, policy-based enforcement system for IoT, which protects users from unsafe and insecure device states by monitoring the behavior of IoT and trigger-action platform apps. IoTGuard operates in three phases: (a) a code instrumentor that adds extra logic to an app's source code to collect the app's information at runtime, (b) storage of the apps' information in a dynamic model that represents the runtime execution behavior of the apps, and (c) identification of IoT safety and security policies, and enforcement of the relevant policies on the dynamic model of individual apps or sets of interacting apps. We demonstrate IoTGuard on 20 flawed apps and find that it correctly enforces policy in all 12 policy-violation cases. In addition, we evaluate IoTGuard on 35 SmartThings IoT and 30 IFTTT trigger-action platform market apps executed in a simulated smart home. IoTGuard enforces 11 unique policies and blocks 16 states in six (17.1%) SmartThings and five (16.6%) IFTTT apps. IoTGuard imposes only 17.3% runtime overhead on an app and 19.8% for five interacting apps. Through this effort, we introduce a rigorously grounded system for enforcing correct operation of IoT devices through systematically identified IoT policies, demonstrating the effectiveness and value of monitoring IoT apps with tools such as IoTGuard.
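The enforcement idea can be illustrated with a toy sketch; the class, event format, and the "away-mode" policy below are hypothetical illustrations of ours, not IoTGuard's implementation. It keeps a dynamic model of reported device events and blocks any transition that would violate a registered policy.

```python
class PolicyEnforcer:
    """Toy dynamic model: record runtime events reported by instrumented
    apps and block any transition that violates a safety policy."""
    def __init__(self, policies):
        self.state, self.policies = {}, policies

    def on_event(self, device, action):
        if any(p(self.state, device, action) for p in self.policies):
            return "BLOCKED"
        self.state[device] = action
        return "ALLOWED"

# Hypothetical policy: the front door must not unlock while the home is away.
away_lock = (lambda st, dev, act:
             dev == "front_door" and act == "unlock" and st.get("mode") == "away")

enforcer = PolicyEnforcer([away_lock])
print(enforcer.on_event("mode", "away"))           # ALLOWED
print(enforcer.on_event("front_door", "unlock"))   # BLOCKED
```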
-
Tohid Shekari (ECE, Georgia Tech), Christian Bayens (ECE, Georgia Tech), Morris Cohen (ECE, Georgia Tech), Lukas Graber (ECE, Georgia Tech), Raheem Beyah (ECE, Georgia Tech)
Recently, the number of cyber threats against power systems has increased at an unprecedented rate. For instance, the widespread blackout in the Ukrainian power grid in December 2015 was a wake-up call that modern power systems have numerous vulnerabilities, especially in power substations, which form the backbone of electricity networks. There have been significant efforts among researchers to develop effective intrusion detection systems (IDSs) in order to prevent such attacks or at least reduce their damaging consequences. However, all of the existing techniques require some level of trust from components on the supervisory control and data acquisition (SCADA) network; hence, they are still vulnerable to sophisticated attacks that can compromise the SCADA system completely. This paper presents a radio frequency-based distributed intrusion detection system (RFDIDS) which remains reliable even when the entire SCADA system is considered untrusted. The proposed system uses radio frequency (RF) emissions to monitor power grid substation activities. It utilizes a radio receiver as a diagnostic tool to provide air-gapped, independent, and verifiable information about the radio emissions from substation components, particularly at low frequencies (LF, 0.05-50 kHz, i.e., periods longer than 20 µs). The simulation and experimental results verified that four types of diagnostic information can be extracted from the radio emissions of power system substation circuits: (i) the harmonic content of the circuit current, (ii) the fundamental frequency of the circuit current, (iii) impulsive signals from rapid circuit current changes, and (iv) sferics from global lightning strokes. Each of the first three diagnostics, or a combination of them, can be effectively leveraged to directly detect specific types of power grid attacks, while the last diagnostic is utilized to check the integrity of the receiver's signal, as it is encoded with the quasi-random distribution of global lightning strokes. The simulation and real-world experimental results verified the effectiveness of RFDIDS in protecting the power grid against sophisticated attacks.
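As a rough illustration of the first two diagnostics, the numpy sketch below (parameters and function names are our own, not RFDIDS's pipeline) estimates the magnitudes of the fundamental and its harmonics from a sampled waveform via an FFT.

```python
import numpy as np

def harmonic_content(signal, fs, fundamental=60.0, n_harmonics=5):
    """Toy illustration: estimate magnitudes of the fundamental and its
    harmonics from a sampled current/RF waveform via an FFT."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return [spectrum[np.argmin(np.abs(freqs - k * fundamental))]
            for k in range(1, n_harmonics + 1)]

fs = 10_000                      # 10 kHz sampling, 1 second of data
t = np.arange(fs) / fs
wave = np.sin(2*np.pi*60*t) + 0.2*np.sin(2*np.pi*180*t)  # 60 Hz + 3rd harmonic
print(harmonic_content(wave, fs))
```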
-
Cheng Feng (Imperial College London & Siemens Corporate Technology), Venkata Reddy Palleti (Singapore University of Technology and Design), Aditya Mathur (Singapore University of Technology and Design), Deeph Chana (Imperial College London)
Industrial Control Systems (ICS), consisting of integrated hardware and software components designed to monitor and control a variety of industrial processes, are typically deployed in critical infrastructures such as water treatment plants, power grids and gas pipelines. Unlike in conventional IT systems, deviations from normal operation in ICS have the potential to cause significant physical damage to equipment, the environment and even human life. Actively monitoring invariant rules, which define the physical conditions that must be maintained for the normal operation of an ICS, improves the security and dependability of such systems: anomalous system states can be detected early, allowing timely mitigating actions (such as fault checking or system shutdown) to be taken. Generally, invariant rules are pre-defined by system engineers during the design phase of a given ICS build. However, this manually intensive process is costly, error-prone and, in typically complex systems, sub-optimal. In this paper we propose a novel framework that systematically generates invariant rules from the information contained within ICS operational data logs, using a combination of several machine learning and data mining techniques. The effectiveness of our approach is demonstrated by experiments on two real-world ICS testbeds: a water distribution system and a water treatment plant. We show that sets of invariant rules, far larger than those defined manually, can be successfully derived by our framework, and that they deliver significant improvements in anomaly detection compared with the invariant rules defined by system engineers as well as the commonly used residual error-based anomaly detection model for ICS.
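The kind of rule the framework targets can be pictured with a minimal monitoring sketch; the invariant below is a hypothetical example of ours, not one mined by the paper's framework.

```python
# Hypothetical invariant: "whenever pump P1 is ON, flow meter F1 must
# read above 1.8 in the same snapshot of plant state."
def check_invariants(snapshot, rules):
    """Toy monitor: return the names of invariant rules violated by a
    snapshot of ICS sensor/actuator state."""
    return [name for name, rule in rules.items() if not rule(snapshot)]

rules = {
    "pump_implies_flow": lambda s: (s["P1"] != "ON") or (s["F1"] > 1.8),
}
print(check_invariants({"P1": "ON", "F1": 0.3}, rules))  # ['pump_implies_flow']
```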
-
Tigist Abera (Technische Universität Darmstadt), Raad Bahmani (Technische Universität Darmstadt), Ferdinand Brasser (Technische Universität Darmstadt), Ahmad Ibrahim (Technische Universität Darmstadt), Ahmad-Reza Sadeghi (Technische Universität Darmstadt), Matthias Schunter (Intel Labs)
Networks of autonomous collaborative embedded systems are emerging in many application domains such as vehicular ad-hoc networks, robotic factory workers, search/rescue robots, and delivery and search drones. To perform their collaborative tasks, the involved devices exchange various types of information such as sensor data, status information, and commands. For the correct operation of these complex systems, each device must be able to verify that the data coming from other devices is correct and has not been maliciously altered.
In this paper, we present DIAT, a novel approach that makes it possible to verify the correctness of data by attesting the correct generation as well as processing of that data using control-flow attestation. DIAT enables devices in autonomous collaborative networks to interact securely and efficiently, relying on a minimal TCB. It ensures that the data sent from one device to another is not maliciously changed, neither during transport nor during generation or processing on the originating device. Data exchanged between devices in the network is therefore authenticated along with a proof of integrity of all software involved in its generation and processing. To enable this, the embedded devices’ software is decomposed into simple interacting modules, reducing the amount and complexity of software that needs to be attested, i.e., to only those modules that process the data. As a proof of concept, we implemented and evaluated DIAT on a state-of-the-art flight controller for drones. Furthermore, we evaluated our scheme in a simulation environment to demonstrate its scalability for large-scale systems.
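A minimal sketch of the attestation idea follows, under our own simplifying assumptions: 8-byte block identifiers, a symmetric device key, and a plain hash chain (the paper's scheme is more refined). The prover folds the executed control-flow path into a running digest, authenticates it, and the verifier recomputes the digest over the expected path.

```python
import hashlib
import hmac

def attest_path(basic_blocks, key: bytes) -> str:
    """Toy control-flow attestation: fold the IDs of executed basic blocks
    into a running hash, then authenticate the result with a device key."""
    digest = b"\x00" * 32
    for block_id in basic_blocks:
        digest = hashlib.sha256(digest + block_id.to_bytes(8, "big")).digest()
    return hmac.new(key, digest, hashlib.sha256).hexdigest()

# The verifier recomputes the report over the expected path and compares.
report = attest_path([0x4000, 0x4010, 0x4030], key=b"shared-device-key")
```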
-
Anastasis Keliris (NYU), Michail Maniatakos (NYU Abu Dhabi)
The security of Industrial Control Systems (ICS) has been attracting increased attention over the past years, following the discovery of real threats targeting industrial environments. Despite this attention, automation of the reverse engineering process of ICS binaries for programmable logic controllers remains an open problem, mainly due to the use of proprietary compilers by ICS vendors. Such automation could be a double-edged sword; on the one hand it could accelerate digital forensic investigations and incident response actions, while on the other hand it could enable dynamic generation of malicious ICS payloads. In this work, we propose a structured methodology that automates the reverse engineering process for ICS binaries, taking into account their unique domain-specific characteristics. We apply this methodology to develop the modular Industrial Control Systems Reverse Engineering Framework (ICSREF), and instantiate ICSREF modules for reversing binaries compiled with CODESYS, a widely used software stack and compiler for PLCs. To evaluate our framework, we create a database of samples by collecting real PLC binaries from public code repositories, as well as developing binaries in-house. Our results demonstrate that ICSREF can successfully handle diverse PLC binaries from varied industry sectors, irrespective of the programming language used. Furthermore, we deploy ICSREF on a commercial smartphone which orchestrates and launches a completely automated process-aware attack against a chemical process testbed. This example of dynamic payload generation showcases how ICSREF can enable sophisticated attacks without any prior knowledge.
-
Kimia Tajik (Oregon State University), Akshith Gunasekaran (Oregon State University), Rhea Dutta (Cornell University), Brandon Ellis (Oregon State University), Rakesh B. Bobba (Oregon State University), Mike Rosulek (Oregon State University), Charles V. Wright (Portland State University), Wu-Chi Feng (Portland State University)
In this paper, we motivate the need for image encryption techniques that preserve certain visual features in images and hide all other information, to balance privacy and usability in the context of cloud-based image storage services. In particular, we introduce the concept of ideal or exact Thumbnail-Preserving Encryption (TPE), a special case of format-preserving encryption, and present a concrete construction. In TPE, a ciphertext is itself an image that has the same thumbnail as the plaintext (unencrypted) image, but that provably leaks nothing about the plaintext beyond its thumbnail. We provide a formal security analysis for the construction, and a prototype implementation to demonstrate compatibility with existing services. We also study the ability of users to distinguish between thumbnail images preserved by TPE. Our findings indicate that TPE is an efficient and promising approach to balance usability and privacy concerns for images. Our code and a demo are available at: http://photoencryption.org.
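The flavor of the construction can be conveyed with a toy numpy sketch (ours, not the paper's provably secure cipher): permuting pixels within each thumbnail block leaves every block average, and hence the thumbnail, exactly unchanged; a keyed, invertible permutation would make the within-block arrangement unpredictable.

```python
import numpy as np

def tpe_encrypt(img, block=16, seed=1234):
    """Toy sketch of the thumbnail-preserving idea on a grayscale image:
    permute pixels *within* each thumbnail block. A real construction uses
    a keyed, invertible, format-preserving cipher, not a seeded RNG."""
    rng = np.random.default_rng(seed)   # stand-in for a keyed permutation
    out = img.copy()
    h, w = img.shape
    for y in range(0, h - h % block, block):
        for x in range(0, w - w % block, block):
            tile = out[y:y+block, x:x+block].ravel()
            out[y:y+block, x:x+block] = rng.permutation(tile).reshape(block, block)
    return out

img = (np.arange(64 * 64) % 251).astype(np.uint8).reshape(64, 64)
enc = tpe_encrypt(img)
# Same 4x4 thumbnail: per-block means are unchanged by within-block permutation.
assert np.allclose(img.reshape(4, 16, 4, 16).mean(axis=(1, 3)),
                   enc.reshape(4, 16, 4, 16).mean(axis=(1, 3)))
```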
-
Anrin Chakraborti (Stony Brook University), Radu Sion (Stony Brook University)
ConcurORAM is a parallel, multi-client oblivious RAM (ORAM) that eliminates waiting for concurrent stateless clients and allows overall throughput to scale gracefully, without requiring trusted third party components (proxies) or direct inter-client coordination. A key insight behind ConcurORAM is the fact that, during multi-client data access, only a subset of the concurrently-accessed server-hosted data structures require access privacy guarantees. Everything else can be safely implemented as oblivious data structures that are later synced securely and efficiently during an ORAM “eviction”.
Further, since a major contributor to latency is the eviction (in which client-resident data is reshuffled and reinserted, encrypted, back into the main server database), ConcurORAM also enables multiple concurrent clients to evict asynchronously, in parallel (without compromising consistency), and in the background, without having to block ongoing queries. As a result, throughput scales well with an increasing number of concurrent clients and is not significantly impacted by evictions. For example, about 65 queries per second can be executed in parallel by 30 concurrent clients, a 2x speedup over the state-of-the-art. The query access time for individual clients increases by only 2x when compared to a single-client deployment.
-
Xiaokuan Zhang (The Ohio State University), Jihun Hamm (The Ohio State University), Michael K. Reiter (University of North Carolina at Chapel Hill), Yinqian Zhang (The Ohio State University)
Machine learning empowers traffic-analysis attacks that breach users' privacy from their encrypted traffic. Recent advances in deep learning drastically escalate such threats.
One prominent example demonstrated recently is a traffic-analysis attack against video streaming using convolutional neural networks. In this paper, we explore the adaptation of techniques previously used in the domains of adversarial machine learning and differential privacy to mitigate machine-learning-powered analysis of streaming traffic. Our findings are twofold. First, constructing adversarial samples effectively confounds an adversary with a predetermined classifier, but is less effective when the adversary can adapt to the defense by using alternative classifiers or by training the classifier with adversarial samples. Second, differential-privacy guarantees are very effective against such statistical-inference-based traffic analysis, while remaining agnostic to the machine learning classifiers used by the adversary. We propose two mechanisms for enforcing differential privacy for encrypted streaming traffic, and evaluate their security and utility. Our empirical implementation and evaluation suggest that the proposed statistical privacy approaches are promising solutions in the underlying scenarios.
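As a minimal illustration of the differential-privacy direction (a plain Laplace mechanism of our own; the paper's traffic mechanisms are more refined), one can perturb the per-segment byte counts that fingerprinting classifiers rely on.

```python
import numpy as np

def privatize_bursts(burst_sizes, epsilon=1.0, sensitivity=1_000):
    """Toy sketch: perturb per-segment byte counts of a video stream with
    Laplace noise so a classifier cannot reliably fingerprint the title."""
    xs = np.asarray(burst_sizes, dtype=float)
    noise = np.random.laplace(scale=sensitivity / epsilon, size=len(xs))
    return np.maximum(0, np.round(xs + noise)).astype(int)

print(privatize_bursts([12_000, 48_500, 3_200, 51_000]))
```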
-
Anrin Chakraborti (Stony Brook University), Adam J. Aviv (United States Naval Academy), Seung Geol Choi (United States Naval Academy), Travis Mayberry (United States Naval Academy), Daniel S. Roche (United States Naval Academy), Radu Sion (Stony Brook University)
Oblivious RAM protocols (ORAMs) allow a client to access data on an untrusted storage device without revealing any information about their access pattern to that device. Typically this is accomplished through random shuffling of the data such that the storage device cannot determine where individual blocks are located, resulting in a highly randomized access pattern. Storage devices, however, are typically optimized for sequential access, so the large number of random disk seeks during standard ORAM operation induces a substantial overhead.
In this paper, we introduce rORAM, an ORAM specifically suited for accessing ranges of sequential logical blocks while minimizing the number of random physical disk seeks. rORAM obtains significantly better asymptotic efficiency than prior designs (Asharov et al., ePrint 2017; Demertzis et al., CRYPTO 2018), reducing both the number of seeks and the communication complexity by a multiplicative factor of $O(\log N)$. An rORAM prototype is 30-50x faster than Path ORAM for similar range-query workloads on local HDDs, 30x faster on local SSDs, and 10x faster on network block devices. rORAM's novel disk layout can also speed up standard ORAM constructions, e.g., resulting in a 2x faster Path ORAM variant. Importantly, experiments demonstrate suitability for real world applications: rORAM is up to 5x faster running a file server and up to 11x faster running a range-query intensive video server workload compared to standard Path ORAM.
-
Victor Perrier (Data61, CSIRO and ISAE-SUPAERO), Hassan Jameel Asghar (Macquarie University and Data61, CSIRO), Dali Kaafar (Macquarie University and Data61, CSIRO)
We present a differentially private mechanism to display statistics (e.g., the moving average) of a stream of real-valued observations where the bound on each observation is either too conservative or unknown in advance. This is particularly relevant to scenarios of real-time data monitoring and reporting, e.g., energy data through smart meters. Our focus is on real-world data streams whose distribution is light-tailed, meaning that the tail approaches zero at least as fast as the exponential distribution. For such data streams, individual observations are expected to be concentrated below an unknown threshold. Estimating this threshold from the data can potentially violate privacy as it would reveal particular events tied to individuals. On the other hand, an overly conservative threshold may impact accuracy by adding more noise than necessary. We construct a utility-optimizing differentially private mechanism to release this threshold based on the input stream. Our main advantage over the state-of-the-art algorithms is that the resulting noise added to each observation of the stream is scaled to the threshold instead of a possibly much larger bound, resulting in considerable gain in utility when the difference is significant. Using two real-world datasets, we demonstrate that our mechanism, on average, improves the utility by a factor of 3.5 on the first dataset, and 9 on the other. While our main focus is on continual release of statistics, our mechanism for releasing the threshold can be used in various other applications where a (privacy-preserving) measure of the scale of the input distribution is required.
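A stripped-down sketch of the release step, under our own assumptions about the interface: once a threshold has been released privately, each observation is clipped to it and noised with scale proportional to the threshold rather than to a loose worst-case bound.

```python
import numpy as np

def release_stream(observations, threshold, epsilon):
    """Toy sketch: clip each observation at the (privately estimated)
    threshold and add Laplace noise scaled to that threshold instead of
    to a much larger conservative bound."""
    xs = np.minimum(np.asarray(observations, dtype=float), threshold)
    return xs + np.random.laplace(scale=threshold / epsilon, size=len(xs))

# Smart-meter-style readings concentrated well below the clipping threshold.
print(release_stream([1.2, 0.8, 2.1, 1.5], threshold=5.0, epsilon=1.0))
```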
Wednesday, 27 February
-
Hadi Abdullah (University of Florida), Washington Garcia (University of Florida), Christian Peeters (University of Florida), Patrick Traynor (University of Florida), Kevin R. B. Butler (University of Florida), Joseph Wilson (University of Florida)
Voice Processing Systems (VPSes) are becoming an increasingly popular interface. Such interfaces have been made significantly more accurate through the application of recent advances in machine learning. Adversarial machine learning has similarly advanced and has been used to demonstrate that such systems are vulnerable to the injection of hidden commands: audio obscured by noise that is correctly recognized by a VPS but not by human beings. However, such attacks are often highly dependent on white-box knowledge of a specific machine learning model and limited to specific microphones and speakers, limiting their practicality and their use across different acoustic hardware platforms. In this paper, we break such dependencies and make hidden command attacks more practical through model-agnostic (black-box) attacks, which exploit knowledge of the signal processing algorithms commonly used by VPSes to generate the data fed into machine learning systems. Specifically, we exploit the fact that multiple source audio samples have similar feature vectors when transformed by acoustic feature extraction algorithms (e.g., FFTs). We develop four classes of perturbations that create unintelligible audio and test them against 12 machine learning models, including 7 proprietary models (e.g., the Google Speech API, Bing Speech API, IBM Speech API, and Azure Speaker API), and demonstrate successful attacks against all targets. Moreover, we successfully use our maliciously generated audio samples in multiple hardware configurations, demonstrating effectiveness across both models and real systems. In so doing, we demonstrate that domain-specific knowledge of audio signal processing represents a practical means of generating successful hidden voice command attacks.
-
Lea Schönherr (Ruhr University Bochum), Katharina Kohls (Ruhr University Bochum), Steffen Zeiler (Ruhr University Bochum), Thorsten Holz (Ruhr University Bochum), Dorothea Kolossa (Ruhr University Bochum)
Voice interfaces are becoming widely accepted as input methods for a diverse set of devices. This development is driven by rapid improvements in automatic speech recognition (ASR), which now performs on par with human listening in many tasks. These improvements are based on an ongoing evolution of deep neural networks (DNNs) as the computational core of ASR. However, recent research results show that DNNs are vulnerable to adversarial perturbations, which allow attackers to force the transcription into a malicious output.
In this paper, we introduce a new type of adversarial examples based on *psychoacoustic hiding*. Our attack exploits the characteristics of DNN-based ASR systems, where we extend the original analysis procedure by an additional backpropagation step. We use this backpropagation to learn the degrees of freedom for the adversarial perturbation of the input signal, i.e., we apply a psychoacoustic model and manipulate the acoustic signal below the thresholds of human perception. To further minimize the perceptibility of the perturbations, we use forced alignment to find the best fitting temporal alignment between the original audio sample and the malicious target transcription. These extensions allow us to embed an arbitrary audio input with a malicious voice command that is then transcribed by the ASR system, with the audio signal remaining barely distinguishable from the original signal. In an experimental evaluation, we attack the state-of-the-art speech recognition system *Kaldi* and determine the best performing parameter and analysis setup for different types of input. Our results show that we are successful in up to 98% of cases with a computational effort of less than two minutes for a ten-second audio file. Based on user studies, we found that none of our target transcriptions were audible to human listeners, who still understood the original speech content with unchanged accuracy.
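The perturbation-shaping step can be caricatured in a few lines (our own simplification; the actual attack learns the perturbation via backpropagation through the ASR's analysis procedure): clamp the perturbation's magnitude below a per-bin masking threshold so the change stays under the limits of human perception.

```python
import numpy as np

def shape_perturbation(delta_spec, masking_threshold_db):
    """Toy sketch: clamp an adversarial perturbation's magnitude spectrum
    below a per-bin psychoacoustic masking threshold (given in dB), so
    the perturbation remains inaudible to human listeners."""
    limit = 10.0 ** (np.asarray(masking_threshold_db) / 20.0)
    mag = np.abs(delta_spec)
    scale = np.minimum(1.0, limit / np.maximum(mag, 1e-12))
    return delta_spec * scale

delta = np.random.randn(256)                 # candidate perturbation per bin
shaped = shape_perturbation(delta, masking_threshold_db=-40.0)
```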
-
Ajaya Neupane (University of California Riverside), Nitesh Saxena (University of Alabama at Birmingham), Leanne Hirshfield (Syracuse University), Sarah Elaine Bratt (Syracuse University)
A new generation of scams has emerged that uses voice impersonation to obtain sensitive information, eavesdrop on voice calls and extort money from unsuspecting human users. Research demonstrates that users are susceptible to voice impersonation attacks that exploit current advances in speech synthesis. In this paper, we set out to elicit a deeper understanding of such human-centered “voice hacking” based on a neuro-scientific methodology (thereby corroborating and expanding the traditional behavioral-only approach in significant ways). Specifically, we investigate the *neural underpinnings* of voice security through *functional near-infrared spectroscopy* (fNIRS), a cutting-edge neuroimaging technique that captures neural signals in both temporal and spatial domains. We design and conduct an fNIRS study to pursue a thorough investigation of users’ mental processing related to *speaker legitimacy detection*: whether a voice sample is rendered by the target speaker, a different human speaker or a synthesizer mimicking the speaker. We analyze the neural activity associated with this task as well as the brain areas that may control such activity.
Our key insight is that there may be no statistically significant differences in the way the human brain processes *legitimate speakers vs. synthesized speakers*, whereas clear differences are visible when encountering *legitimate vs. different human speakers*. This finding may help to explain users’ susceptibility to synthesized attacks, as seen from the behavioral self-reported analysis: the impersonated synthesized voices may be *indistinguishable* from the real voices from both behavioral and neural perspectives. In sharp contrast, prior studies showed *subconscious* neural differences in other real vs. fake artifacts (e.g., paintings and websites), despite users failing to note these differences behaviorally. Overall, our work dissects the fundamental neural patterns underlying voice-based insecurity and reveals users’ susceptibility to voice synthesis attacks at a biological level. We believe this is a significant insight for the security community, suggesting that human detection of voice synthesis attacks may not improve over time, especially given that voice synthesis techniques will likely continue to improve, and calling for the design of careful machine-assisted techniques to help humans counter these attacks.
-
Yangyong Zhang (Texas A&M University), Lei Xu (Texas A&M University), Abner Mendoza (Texas A&M University), Guangliang Yang (Texas A&M University), Phakpoom Chinprutthiwong (Texas A&M University), Guofei Gu (Texas A&M University)
Popular Voice Assistant (VA) services such as Amazon Alexa and Google Assistant are now rapidly appifying their platforms to allow a more flexible and diverse voice-controlled service experience. However, the ubiquitous deployment of VA devices and the increasing number of third-party applications have raised security and privacy concerns. While previous works such as hidden voice attacks mostly examine the problems of VA services’ default Automatic Speech Recognition (ASR) component, our work analyzes and evaluates the security of the component succeeding ASR, i.e., Natural Language Understanding (NLU), which performs semantic interpretation (i.e., text-to-intent) after ASR’s acoustic-to-text processing. In particular, we focus on NLU’s Intent Classifier, which is used in customizing machine understanding for third-party VA Applications (or vApps). We find that the semantic inconsistency caused by improper semantic interpretation in an Intent Classifier creates opportunities to breach the integrity of vApp processing when attackers carefully leverage common spoken errors. In this paper, we design the first linguistic-model-guided fuzzing tool, named LipFuzzer, to assess the security of Intent Classifiers and systematically discover potential misinterpretation-prone spoken errors based on vApps’ voice command templates. To guide the fuzzing, we construct adversarial linguistic models with the help of Statistical Relational Learning (SRL) and emerging Natural Language Processing (NLP) techniques. In our evaluation, we verify the effectiveness and accuracy of LipFuzzer. We also use LipFuzzer to evaluate both the Amazon Alexa and Google Assistant vApp platforms, and identify that a large portion of real-world vApps are vulnerable based on our fuzzing results.
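A toy rendition of the template-fuzzing idea follows; the confusion table and function are our own hypothetical stand-ins for the paper's SRL-based linguistic models.

```python
# Hypothetical confusion table of common spoken errors (homophones, slurred forms).
CONFUSIONS = {"capital": ["capitol"], "four": ["for", "fore"], "open": ["opened"]}

def fuzz_command(template):
    """Toy sketch: mutate a vApp voice-command template with likely spoken
    errors; each variant would then be checked against the NLU to see
    whether it still routes to the intended intent."""
    words = template.split()
    variants = []
    for i, w in enumerate(words):
        for alt in CONFUSIONS.get(w, []):
            variants.append(" ".join(words[:i] + [alt] + words[i + 1:]))
    return variants

print(fuzz_command("open capital one"))  # e.g. 'opened capital one', 'open capitol one'
```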
-
Seunghyeon Lee (KAIST, S2W LAB Inc.), Changhoon Yoon (S2W LAB Inc.), Heedo Kang (KAIST), Yeonkeun Kim (KAIST), Yongdae Kim (KAIST), Dongsu Han (KAIST), Sooel Son (KAIST), Seungwon Shin (KAIST, S2W LAB Inc.)
The Dark Web is notorious for being a major distribution channel of harmful content as well as unlawful goods. Perpetrators have also used cryptocurrencies to conduct illicit financial transactions while hiding their identities. The limited coverage and outdated data of the Dark Web in previous studies motivated us to conduct an in-depth investigative study to understand how perpetrators abuse cryptocurrencies on the Dark Web. We designed and implemented MFScope, a new framework which collects Dark Web data, extracts cryptocurrency information, and analyzes its usage characteristics on the Dark Web. Specifically, MFScope collected more than 27 million dark webpages and extracted around 10 million unique cryptocurrency addresses for Bitcoin, Ethereum, and Monero. It then classified their usage to identify trades of illicit goods and traced cryptocurrency money flows to reveal black money operations on the Dark Web. In total, using MFScope we discovered that more than 80% of Bitcoin addresses on the Dark Web were used with malicious intent; their monetary volume was around 180 million USD, and they sent a large sum of their money to several popular cryptocurrency services (e.g., exchange services). Furthermore, we present two real-world unlawful services and demonstrate their Bitcoin transaction traces, which helps in understanding their marketing strategy as well as their black money operations.
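The extraction stage can be pictured with a small sketch (a simple regex of ours for legacy base58 Bitcoin addresses; MFScope's extraction and validation are more thorough and also cover Ethereum and Monero formats).

```python
import re

# Base58 pattern for legacy Bitcoin addresses; a real pipeline would also
# validate the checksum to weed out false positives.
BTC_RE = re.compile(r"\b[13][1-9A-HJ-NP-Za-km-z]{25,34}\b")

def extract_btc_addresses(page_text):
    """Toy sketch of the extraction stage: pull candidate Bitcoin
    addresses out of crawled dark-web page text."""
    return set(BTC_RE.findall(page_text))

sample = "send payment to 1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2 within 24h"
print(extract_btc_addresses(sample))
```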
-
Derek Leung (MIT CSAIL), Adam Suhl (MIT CSAIL), Yossi Gilad (MIT CSAIL), Nickolai Zeldovich (MIT CSAIL)
Decentralized cryptocurrencies rely on participants to keep track of the state of the system in order to verify new transactions. As the number of users and transactions grows, this requirement becomes a significant burden, requiring users to download, verify, and store a large amount of data to participate.
Vault is a new cryptocurrency design based on Algorand that minimizes these storage and bootstrapping costs for participants. Vault’s design is based on Algorand’s proof-of-stake consensus protocol and uses several techniques to achieve its goals. First, Vault decouples the storage of recent transactions from the storage of account balances, which enables Vault to delete old account state. Second, Vault allows sharding state across participants in a way that preserves strong security guarantees. Finally, Vault introduces the notion of stamping certificates, which allow a new client to catch up securely and efficiently in a proof-of-stake system without having to verify every single block.
Experiments with a prototype implementation of Vault’s data structures show that Vault’s design reduces the bandwidth cost of joining the network as a full client by 99.7% compared to Bitcoin and 90.5% compared to Ethereum when downloading a ledger containing 500 million transactions.
-
Michael Rodler (University of Duisburg-Essen), Wenting Li (NEC Laboratories, Germany), Ghassan O. Karame (NEC Laboratories, Germany), Lucas Davi (University of Duisburg-Essen)
Recently, a number of existing blockchain systems have witnessed major bugs and vulnerabilities within smart contracts. Although the literature features a number of proposals for securing smart contracts, these proposals mostly focus on proving the correctness or absence of a certain type of vulnerability within a contract, but cannot protect deployed (legacy) contracts from being exploited.
In this paper, we address this problem in the context of re-entrancy exploits and propose a novel smart contract security technology, dubbed Sereum (Secure Ethereum), which protects existing, deployed contracts against re-entrancy attacks in a backwards-compatible way based on run-time monitoring and validation. Sereum requires neither any modification nor any semantic knowledge of existing contracts. By means of implementation and evaluation using the Ethereum blockchain, we show that Sereum covers the actual execution flow of a smart contract to accurately detect and prevent attacks, with a false positive rate as small as 0.06% and negligible run-time overhead. As a by-product, we develop three advanced re-entrancy attacks to demonstrate the limitations of existing offline vulnerability analysis tools.
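A toy rendition of the run-time idea, under our own simplifications (Sereum's actual checks are taint-based inside a modified EVM): once an external call is outstanding, a write to a storage variable that the pending invocation already read is flagged.

```python
class ReentrancyMonitor:
    """Toy run-time monitor in the spirit of re-entrancy detection: after an
    external call is made, writes to state variables already read in the
    pending invocation are flagged as a possible re-entrancy."""
    def __init__(self):
        self.external_calls = 0
        self.read_set = set()

    def on_sload(self, var):             # contract reads a storage slot
        self.read_set.add(var)

    def on_call(self):                   # external call; callee may re-enter
        self.external_calls += 1

    def on_sstore(self, var):            # contract writes a storage slot
        if self.external_calls and var in self.read_set:
            raise RuntimeError(f"re-entrancy pattern on storage slot {var!r}")

m = ReentrancyMonitor()
m.on_sload("balances[msg.sender]")       # balance checked...
m.on_call()                              # ...ether sent (attacker re-enters)...
m.on_sstore("balances[msg.sender]")      # ...balance updated too late: flagged
```
-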
Giulio Malavolta (Friedrich-Alexander University Erlangen-Nürnberg), Pedro Moreno Sanchez (TU Wien), Clara Schneidewind (TU Wien), Aniket Kate (Purdue University), Matteo Maffei (TU Wien)
Tremendous growth in cryptocurrency usage is exposing the inherent scalability issues with permissionless blockchain technology. Payment-channel networks (PCNs) have emerged as the most widely deployed solution to mitigate the scalability issues, allowing the bulk of payments between two users to be carried out off-chain. Unfortunately, as reported in the literature and further demonstrated in this paper, current PCNs do not provide meaningful security and privacy guarantees [30], [40].
In this work, we study and design secure and privacy-preserving PCNs. We start with a security analysis of existing PCNs, reporting a new attack that applies to all major PCNs, including the Lightning Network, and allows an attacker to steal the fees from honest intermediaries in the same payment path. We then formally define anonymous multi-hop locks (AMHLs), a novel cryptographic primitive that serves as a cornerstone for the design of secure and privacy-preserving PCNs. We present several provably secure cryptographic instantiations that make AMHLs compatible with the vast majority of cryptocurrencies. In particular, we show that (linear) homomorphic one-way functions suffice to construct AMHLs for PCNs supporting a script language (e.g., Ethereum). We also propose a construction based on ECDSA signatures that does not require scripts, thus solving a prominent open problem in the field.
AMHLs constitute a generic primitive whose usefulness goes beyond multi-hop payments in a single PCN, and we show how to realize atomic swaps and interoperable PCNs from this primitive. Finally, our performance evaluation on a commodity machine finds that AMHL operations can be performed in less than 100 milliseconds and require less than 500 bytes of communication overhead, even in the worst case. In fact, after acknowledging our attack, the Lightning Network developers have implemented our ECDSA-based AMHLs into their PCN. This demonstrates the practicality of our approach and its impact on the security, privacy, interoperability, and scalability of today’s cryptocurrencies.
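The script-based instantiation from linear homomorphic one-way functions can be sketched in a few lines; the parameters below are illustrative only, not a cryptographically vetted group choice.

```python
import random

# Toy sketch of a multi-hop lock from a linear homomorphic one-way
# function, here f(x) = g^x mod p (discrete exponentiation).
p = 2**127 - 1          # illustrative Mersenne prime, NOT a secure choice
g = 3

def f(x):
    return pow(g, x, p)

ks = [random.randrange(1, p - 1) for _ in range(4)]       # per-hop secrets
locks = [f(sum(ks[:i + 1])) for i in range(len(ks))]      # lock Y_i for hop i

# Homomorphism: hop i+1's lock is hop i's lock combined with f(k_{i+1}),
# so locks can be set up left-to-right without revealing the secrets.
assert locks[1] == locks[0] * f(ks[1]) % p
# Releasing the full sum at the last hop then lets each hop peel off its
# own k_i right-to-left, completing the multi-hop payment atomically.
```
-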
Xiaofei Bai (School of Computer Science, Fudan University), Jian Gao (School of Computer Science, Fudan University), Chenglong Hu (School of Computer Science, Fudan University), Liang Zhang (School of Computer Science, Fudan University)
Blockchain networks, especially cryptocurrencies, rely heavily on proof-of-work (PoW) systems, often as a basis to distribute rewards. These systems require solving specific puzzles, for which Application Specific Integrated Circuits (ASICs) can be designed for performance or efficiency. Either way, ASICs surpass CPUs and GPUs by orders of magnitude and may harm blockchain networks. Equihash was recently developed to resist ASIC solving through heavy memory usage. Although commercial ASIC solvers exist for its most popular parameter set, such solvers do not work under stronger parameter sets and are considered impossible under optimal parameters. In this paper, we inspect the ASIC resistance of Equihash by constructing a parameter-independent adversarial solver design. We evaluate the design and project at least a 10x efficiency advantage for resourceful adversaries. We contribute to the security community in two ways: (1) by revealing the limitation of Equihash and raising awareness about its algorithmic factors, and (2) by demonstrating that security inspection is practical and useful for PoW systems, serving as a starting point for future research and development.
-
Adil Ahmad (Purdue), Byunggill Joe (KAIST), Yuan Xiao (Ohio State University), Yinqian Zhang (Ohio State University), Insik Shin (KAIST), Byoungyoung Lee (Purdue/SNU)
Program obfuscation is a popular cryptographic construct with a wide range of uses such as IP theft prevention. Although cryptographic solutions for program obfuscation impose impractically high overheads, a recent breakthrough leveraging trusted hardware has shown promise. However, the existing solution is based on special-purpose trusted hardware, restricting its use-cases to a limited few.
In this paper, we first study whether such obfuscation is feasible on commodity trusted hardware, Intel SGX, and we observe that certain important security properties are not afforded by commodity hardware. In particular, we find that existing obfuscation/obliviousness schemes are insecure if directly applied to Intel SGX, primarily due to side-channel limitations. To this end, we present OBFUSCURO, the first system providing program obfuscation using commodity trusted hardware, Intel SGX. The key idea is to leverage ORAM operations to perform secure code execution and data access. Initially, OBFUSCURO transforms the regular program layout into a side-channel-secure and ORAM-compatible layout. Then, OBFUSCURO ensures that its ORAM controller performs data-oblivious accesses in order to protect itself from all memory-based side-channels. Furthermore, OBFUSCURO protects the program from timing attacks by ensuring that it always runs for a pre-configured time interval. Along the way, OBFUSCURO also introduces systematic optimizations such as a register-based ORAM stash. We provide a thorough security analysis of OBFUSCURO along with empirical attack evaluations, showing that OBFUSCURO can protect SGX program execution from being leaked by access-pattern-based and timing-based channels. We also provide detailed performance benchmark results to show the practical aspects of OBFUSCURO.
-
Lianying Zhao (Concordia University), Mohammad Mannan (Concordia University)
Unauthorized data alteration has been a long-standing threat since the emergence of malware. System and application software can be reinstalled and hardware can be replaced, but user data is priceless in many cases. Especially in recent years, ransomware has become high-impact due to its direct monetization model. State-of-the-art defenses are mostly based on known-signature or behavior analysis and, more importantly, require an uncompromised OS kernel. However, malware running with the highest software privileges is a demonstrated reality.
We propose to move from current detection/recovery based mechanisms to data loss prevention, where the focus is on armoring data instead of counteracting malware. Our solution, Inuksuk, relies on today’s Trusted Execution Environments (TEEs), as available on both the CPU and the storage device, to achieve programmable write protection. We back up a copy of user-selected files as write-protected at all times, and subsequent updates are written as new versions securely through the TEE. We implement Inuksuk on Windows 7 and 10, and Linux (Ubuntu); our core design is OS and application agnostic, and incurs no run-time performance penalty for applications. File transfer disruption can be eliminated or alleviated through access modes and customizable update policies (e.g., interval, granularity). For Inuksuk’s adoptability in modern OSes, we have also ported Flicker (EuroSys 2008), a de facto standard tool for in-OS privileged TEE management, to the latest 64-bit Windows.
-
Samuel Weiser (Graz University of Technology), Mario Werner (Graz University of Technology), Ferdinand Brasser (Technische Universität Darmstadt), Maja Malenko (Graz University of Technology), Stefan Mangard (Graz University of Technology), Ahmad-Reza Sadeghi (Technische Universität Darmstadt)
Embedded computing devices are used on a large scale in the emerging internet of things (IoT). However, their wide deployment raises the incentive for attackers to target these devices, as demonstrated by several recent attacks. As IoT devices are built for long service life, means are required to protect sensitive code in the presence of potential vulnerabilities, which might be discovered long after deployment. Tagged memory has been proposed as a mechanism to enforce various fine-grained security policies at runtime. However, none of the existing tagged memory schemes provides efficient and flexible compartmentalization in terms of isolated execution environments.
We present TIMBER-V, a new tagged memory architecture featuring flexible and efficient isolation of code and data on small embedded systems. We overcome several limitations of previous schemes. We augment tag isolation with a memory protection unit to isolate individual processes while maintaining low memory overhead. TIMBER-V significantly reduces the problem of memory fragmentation and improves dynamic reuse of untrusted memory across security boundaries. TIMBER-V enables novel sharing of execution stacks across different security domains, in addition to interleaved heaps. TIMBER-V is compatible with existing code, supports real-time constraints, and is open source. We show the efficiency of TIMBER-V by evaluating our proof-of-concept implementation on the RISC-V simulator.
-
Virgil D. Gligor (Carnegie Mellon University), Maverick S. L. Woo (Carnegie Mellon University)
Root-of-Trust (RoT) establishment ensures either that the state of an untrusted system contains all and only content chosen by a trusted local verifier and that the system code begins execution in that state, or that the verifier discovers the existence of unaccounted-for content. This ensures that programs boot into system states that are free of persistent malware, so an adversary can no longer retain undetected control of one's local system.
We establish RoT *unconditionally*; i.e., without secrets, trusted hardware modules and instructions, or bounds on the adversary's computational power. The specification of a system's chipset and device controllers, and an external source of true random numbers, such as a commercially available quantum RNG, is all that is needed. Our system specifications are those of a concrete Word Random Access Machine (cWRAM) model -- the closest computation model to a real system with a large instruction set.
We define the requirements for RoT establishment and explain their differences from past attestation protocols. Then we introduce a RoT establishment protocol based on a new computation primitive with concrete (non-asymptotic) optimal space-time bounds in adversarial evaluation on the cWRAM. The new primitive is a randomized polynomial, which has $k$-independent uniform coefficients in a prime order field. Its collision properties are stronger than those of a $k$-independent (almost) universal hash function in cWRAM evaluations, and are sufficient to prove existence of malware-free states before RoT is established. Preliminary measurements show that randomized-polynomial performance is practical on commodity hardware even for very large $k$.
To prove the concrete optimality of randomized polynomials, we present a result of independent complexity interest: a Horner-rule program is uniquely optimal whenever the cWRAM execution space and time are simultaneously minimized.
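For intuition, here is a minimal sketch of evaluating such a polynomial by Horner's rule over a prime field; the modulus, degree, and coefficient sampling below are our own illustrative stand-ins for truly $k$-independent coefficients.

```python
import random

def horner_eval(coeffs, x, p):
    """Evaluate a polynomial over GF(p) by Horner's rule, the evaluation
    order whose space-time optimality on cWRAM the paper establishes."""
    acc = 0
    for c in coeffs:                 # highest-degree coefficient first
        acc = (acc * x + c) % p
    return acc

p = (1 << 61) - 1                                   # illustrative prime modulus
k = 8
coeffs = [random.randrange(p) for _ in range(k)]    # stand-in for k-independent coefficients
challenge = random.randrange(p)                     # verifier's random input
print(horner_eval(coeffs, challenge, p))
```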
-
Binghui Wang (Iowa State University), Jinyuan Jia (Iowa State University), Neil Zhenqiang Gong (Iowa State University)
Many security and privacy problems can be modeled as a graph classification problem, where nodes in the graph are classified simultaneously by collective classification. State-of-the-art collective classification methods for such graph-based security and privacy analytics follow this paradigm: assign weights to edges of the graph, iteratively propagate reputation scores of nodes over the weighted graph, and use the final reputation scores to classify nodes in the graph. The key challenge is to assign edge weights such that an edge has a large weight if the two corresponding nodes have the same label, and a small weight otherwise. Although collective classification has been studied and applied to security and privacy problems for more than a decade, how to address this challenge is still an open question. For instance, most existing methods simply set a constant weight on all edges.
In this work, we propose a novel collective classification framework to address this long-standing challenge. We first formulate learning edge weights as an optimization problem, which quantifies the goals about the final reputation scores that we aim to achieve. However, it is computationally hard to solve the optimization problem because the final reputation scores depend on the edge weights in a very complex way. To address the computational challenge, we propose to jointly learn the edge weights and propagate the reputation scores, which is essentially an approximate solution to the optimization problem. We compare our framework with state-of-the-art methods for graph-based security and privacy analytics using four large-scale real-world datasets from various application scenarios such as Sybil detection in social networks, fake review detection in Yelp, and attribute inference attacks. Our results demonstrate that our framework achieves higher accuracies than state-of-the-art methods with an acceptable computational overhead.
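A toy version of the joint learn-and-propagate loop (our own simplification of the idea, not the paper's optimization framework): each round re-derives edge weights from how much the endpoints' current scores agree, then re-propagates the scores over the re-weighted graph.

```python
import numpy as np

def propagate(adj, prior, iters=10):
    """Toy sketch of joint weight learning + propagation: edge weights are
    refreshed each round from the agreement of the endpoints' current
    scores, then reputation scores are re-propagated over the graph."""
    score = prior.copy()
    for _ in range(iters):
        w = (1.0 - np.abs(score[:, None] - score[None, :])) * adj  # agreement -> weight
        denom = np.maximum(w.sum(axis=1), 1e-12)
        score = 0.5 * prior + 0.5 * (w @ score) / denom
    return score

adj = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
prior = np.array([0.9, 0.5, 0.1])   # e.g. known Sybil / unknown / known benign
print(propagate(adj, prior))
```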
-
Milad Nasr (University of Massachusetts Amherst), Sadegh Farhang (Pennsylvania State University), Amir Houmansadr (University of Massachusetts Amherst), Jens Grossklags (Technical University of Munich)
A core technique used by popular proxy-based circumvention systems like Tor is to privately and selectively distribute the IP addresses of circumvention proxies among censored clients to keep them unknown to the censors. In Tor, for instance, such privately shared proxies are known as bridges. A key challenge to this mechanism is the insider attack problem: censoring agents can impersonate benign censored clients in order to learn (and then block) the privately shared circumvention proxies. To minimize the risks of the insider attack threat, in-the-wild circumvention systems like Tor use various proxy assignment mechanisms in order to minimize the risk of proxy enumeration by the censors, while providing access to a large fraction of censored clients.
Unfortunately, existing proxy assignment mechanisms (like the one used by Tor) are based on ad hoc heuristics that offer no theoretical guarantees and are easily evaded in practice. In this paper, we take a systematic approach to the problem of proxy distribution in circumvention systems by establishing a game-theoretic framework. We model the proxy assignment problem as a game between circumvention system operators and the censors, and use game theory to derive the optimal strategies of each of the parties. Using our framework, we derive the best (optimal) proxy assignment mechanism of a circumvention system like Tor in the presence of the strongest censorship adversary who takes her best censorship actions.
We perform extensive simulations to evaluate our optimal proxy assignment algorithm under various adversarial and network settings. We show that the algorithm has superior performance compared to the state of the art, i.e., provides stronger resistance to censorship even against the strongest censorship adversary. Our study establishes a generic framework for optimal proxy assignment that can be applied to various types of circumvention systems and under various threat models. We conclude with lessons and recommendations for the design of proxy-based circumvention systems.
-
Shiqi Shen (National University of Singapore), Shweta Shinde (National University of Singapore), Soundarya Ramesh (National University of Singapore), Abhik Roychoudhury (National University of Singapore), Prateek Saxena (National University of Singapore)
Symbolic execution is a powerful technique for program analysis. However, it has many limitations in practical applicability: the path explosion problem that encumbers scalability, the need for language-specific implementations, the inability to handle complex dependencies, and the limited expressiveness of the theories supported by underlying satisfiability checkers. Often, relationships between variables of interest are not expressible directly as purely symbolic constraints. To this end, we present a new approach, neuro-symbolic execution, which learns an approximation of the relationship between program values of interest as a neural network. We develop a procedure for checking the satisfiability of mixed constraints, involving both symbolic expressions and neural representations. We implement our new approach in a tool called NeuEx as an extension of KLEE, a state-of-the-art dynamic symbolic execution engine. NeuEx finds 33 exploits in a benchmark of 7 programs within 12 hours, an improvement in bug-finding efficacy of 94% over vanilla KLEE. We show that this new approach drives execution down difficult paths on which KLEE and other DSE extensions get stuck, eliminating limitations of purely SMT-based techniques.
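The satisfiability check over mixed constraints can be caricatured as follows; this naive random-search sketch is ours, and NeuEx uses far more effective optimization over the neural constraint.

```python
import numpy as np

def solve_mixed(net, symbolic_ok, dim, tries=10_000):
    """Toy sketch of checking mixed constraints: sample candidate inputs,
    require the symbolic predicate to hold exactly and the learned neural
    constraint to hold approximately (output above a threshold)."""
    for _ in range(tries):
        x = np.random.uniform(-10, 10, size=dim)
        if symbolic_ok(x) and net(x) > 0.5:
            return x                     # satisfying assignment found
    return None                          # gave up (not a proof of UNSAT)

# Hypothetical learned constraint: net approximates "inputs trigger the bug".
net = lambda x: 1.0 / (1.0 + np.exp(-(x[0] + 2 * x[1] - 3)))
sym = lambda x: x[0] >= 0 and x[1] >= 0      # symbolic branch conditions
print(solve_mixed(net, sym, dim=2))
```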
-
Fei Zuo (University of South Carolina), Xiaopeng Li (University of South Carolina), Patrick Young (Temple University), Lannan Luo (University of South Carolina), Qiang Zeng (University of South Carolina), Zhexin Zhang (University of South Carolina)
Binary code analysis allows analyzing binary code without access to the corresponding source code. It is widely used for vulnerability discovery, malware dissection, attack investigation, etc. A binary, after disassembly, is expressed in an assembly language. This inspires us to approach binary analysis by leveraging ideas and techniques from Natural Language Processing (NLP), a rich area focused on processing text in various natural languages. We notice that binary code analysis and NLP share many analogous topics, such as semantics extraction, summarization, and classification. This work utilizes these ideas to address two important code similarity comparison problems: (I) given a pair of basic blocks for different instruction set architectures, determining whether their semantics are similar; and (II) given a piece of code of interest, determining whether it is contained in another piece of assembly code from a different architecture. The solutions to these two problems have many applications, such as cross-architecture code plagiarism detection, malware identification, and vulnerability discovery. Despite the evident importance of Problem I, existing solutions are either inefficient or imprecise. Inspired by Neural Machine Translation (NMT), a new approach that tackles text across natural languages very well, we regard instructions as words and basic blocks as sentences, and propose a novel cross-(assembly)-lingual deep learning approach to solving the first problem, attaining high efficiency and precision. Regarding Problem II, many solutions have been proposed recently at the function level; however, performing cross-architecture code similarity comparison beyond function pairs is a new and more challenging endeavor. Employing our technique for cross-architecture basic-block comparison, we propose an effective solution to Problem II. We implement a prototype system and perform a comprehensive evaluation. A comparison between our approach and existing approaches to Problem I shows that our system outperforms them in terms of accuracy, efficiency and scalability. Case studies using the system demonstrate that our solution to Problem II is effective. Moreover, this research showcases how to apply ideas and techniques from NLP to large-scale binary code analysis.
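The instructions-as-words view can be made concrete with a toy sketch; the embedding table and simple averaging below are our own stand-ins, whereas the paper trains instruction embeddings and an NMT-style network instead.

```python
import numpy as np

def block_embedding(instructions, word_vecs, dim=64):
    """Toy sketch: embed a basic block by averaging (hypothetical)
    per-instruction vectors, treating instructions as words and the
    block as a sentence."""
    vecs = [word_vecs.get(i, np.zeros(dim)) for i in instructions]
    return np.mean(vecs, axis=0)

def similarity(block_a, block_b, word_vecs):
    """Cosine similarity between two basic-block embeddings."""
    a = block_embedding(block_a, word_vecs)
    b = block_embedding(block_b, word_vecs)
    denom = (np.linalg.norm(a) * np.linalg.norm(b)) or 1.0
    return float(a @ b / denom)

rng = np.random.default_rng(0)
vocab = {w: rng.normal(size=64)
         for w in ["mov r0,r1", "add r0,#1", "ldr r2,[r0]"]}
print(similarity(["mov r0,r1", "add r0,#1"],
                 ["mov r0,r1", "ldr r2,[r0]"], vocab))
```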
instruction set architectures, determining whether their semantics is similar or not; and (II) given a piece of code of interest, determining if it is contained in another piece of assembly code from a different architecture. The solutions to these two problems have many applications, such as cross-architecture code plagiarism detection, malware identification, and vulnerability discovery.Despite the evident importance of Problem I, existing solutions are either inefficient or imprecise. Inspired by Neural Machine Translation (NMT), which is a new approach that tackles text across natural languages very well, we regard instructions as words and basic blocks as sentences, and propose a novel cross-(assembly)-lingual deep learning approach to solving the first problem, attaining high efficiency and precision. Regarding Problem II, many solutions have been proposed recently to solve this issue at the function level. However, performing cross-architecture code similarity comparison beyond function pairs is a new and more challenging endeavor. Employing our technique for cross-architecture basic-block comparison, we propose an effective solution to Problem II. We implement a prototype system and perform a comprehensive evaluation. A comparison between our approach and existing approaches to Problem I shows that our system outperforms them in terms of accuracy, efficiency and scalability. And the case studies utilizing the system demonstrate that our solution to Problem II is effective. Moreover, this research showcases how to apply ideas and techniques from NLP to large-scale binary code analysis.