Diogo Barradas (Instituto Superior Técnico, Universidade de Lisboa)

The advent of programmable switches has sparked a general interest in devising new security solutions for high-speed networks. Recently, we introduced FlowLens, a system that leverages programmable switches to efficiently support multi-purpose security network applications based on machine learning algorithms. With FlowLens, network operators are able to program their switches to automatically scan and classify flows with high accuracy for a wide range of scenarios, such as multimedia covert channel detection, website fingerprinting, or botnet traffic identification. To make this possible, FlowLens introduces a new system design that solves a fundamental tension between the need for comprehensive flow information required by machine learning algorithms and the scarcity of hardware resources available in modern programmable switches.

To tackle this tension, we faced several major challenges at the implementation and evaluation levels that have raised the bar in proving the feasibility and effectiveness of our design. First, we identified a substantial gap between the programming environment (based on the P4 programming language) targeting a software-emulated switch and a real-world proprietary switch (e.g., the Barefoot Tofino). This gap forced us to deeply restructure our code and revisit our assumptions underpinning our original flow compression technique. Second, we realized that different machine learning security tasks proposed in the literature had been fine-tuned for their specific application domains. This means that not only do they employ different classification algorithms but even the datasets used and the training processes are different from one another. As such, we had to adopt several strategies to repurpose the classification machinery of previously existing applications to ensure their compatibility with FlowLens. Lastly, the comparison between our compression technique and other related compression techniques was hampered by the lack of accessibility to the latter’s implementation. This forced us to re-implement several of such approaches and to resort to analytical comparisons of their compute, storage, and communication costs.

In this presentation, we discuss in detail how we addressed the above challenges and provide a set of guidelines that may prove useful for future practitioners in the realm of the intersection between network security and machine learning.

Speaker's biography

Diogo Barradas is a Ph.D. candidate in Information Systems and Computer Engineering at Instituto Superior Técnico, Universidade de Lisboa. He received his BSc. (2014) and MSc. (2016) from the same institution. His main research interests include network security and privacy, with particular emphasis on statistical traffic analysis and Internet censorship circumvention. He conducts his research at the Distributed Systems Group at INESC-ID Lisboa.

View More Papers

On Building the Data-Oblivious Virtual Environment

Tushar Jois (Johns Hopkins University), Hyun Bin Lee, Christopher Fletcher, Carl A. Gunter (University of Illinois at Urbana-Champaign)

Read More

Trusted Verification of Over-the-Air (OTA) Secure Software Updates on...

Anway Mukherjee, Ryan Gerdes, and Tam Chantem (Virginia Tech)

Read More

KUBO: Precise and Scalable Detection of User-triggerable Undefined Behavior...

Changming Liu (Northeastern University), Yaohui Chen (Facebook Inc.), Long Lu (Northeastern University)

Read More

When DNS Goes Dark: Understanding Privacy and Shaping Policy...

Vijay k. Gurbani and Cynthia Hood ( Illinois Institute of Technology), Anita Nikolich (University of Illinois), Henning Schulzrinne (Columbia University) and Radu State (University of Luxembourg)

Read More