SquirRL: Automating Attack Analysis on Blockchain Incentive Mechanisms with Deep Reinforcement Learning

Charlie Hou (CMU, IC3), Mingxun Zhou (Peking University), Yan Ji (Cornell Tech, IC3), Phil Daian (Cornell Tech, IC3), Florian Tramèr (Stanford University), Giulia Fanti (CMU, IC3), Ari Juels (Cornell Tech, IC3)

Incentive mechanisms are central to the functionality of permissionless blockchains: they incentivize participants to run and secure the underlying consensus protocol. Designing incentive-compatible incentive mechanisms is notoriously challenging, however. As a result, most public blockchains today use incentive mechanisms whose security properties are poorly understood and largely untested. In this work, we propose SquirRL, a framework for using deep reinforcement learning to analyze attacks on blockchain incentive mechanisms. We demonstrate SquirRL’s power by first recovering known attacks: (1) the optimal selfish mining attack in Bitcoin [52], and (2) the Nash equilibrium in block withholding attacks [16]. We also use SquirRL to obtain several novel empirical results. First, we discover a counterintuitive flaw in the widely used rushing adversary model when applied to multi-agent Markov games with incomplete information. Second, we demonstrate that the optimal selfish mining strategy identified in [52] is actually not a Nash equilibrium in the multi-agent selfish mining setting. In fact, our results suggest (but do not prove) that when more than two competing agents engage in selfish mining, there is no profitable Nash equilibrium. This is consistent with the lack of observed selfish mining in the wild. Third, we find a novel attack on a simplified version of Ethereum’s finalization mechanism, Casper the Friendly Finality Gadget (FFG), that allows a strategic agent to amplify her rewards by up to 30%. Notably, [10] shows that honest voting is a Nash equilibrium in Casper FFG; our attack shows that when Casper FFG is composed with selfish mining, this is no longer the case. Altogether, our experiments demonstrate SquirRL’s flexibility and promise as a framework for studying attack settings that have thus far eluded theoretical and empirical understanding.
