Understanding and Taming Deterministic Model Bit Flip Attacks in Deep Neural Networks

Deep neural networks (DNNs) are widely deployed for a variety of decision-making tasks such as access control, medical diagnostics, and autonomous driving. Compromising DNN models can severely disrupt inference behavior, leading to catastrophic outcomes for security- and safety-sensitive applications. While tremendous effort has been devoted to securing DNNs against external adversaries (e.g., adversarial examples), internal adversaries that tamper with DNN model integrity by exploiting hardware threats (i.e., fault injection attacks) raise unprecedented concerns. This project aims to offer insights into DNN security issues arising from hardware-based fault attacks, and to explore ways to promote the robustness and security of future deep learning systems against such internal adversaries.

This project targets one critical research topic: securing deep learning systems against hardware-based model tampering. Recent advances in hardware fault attacks (e.g., rowhammer) make it possible to deterministically inject faults into DNN models, flipping bits in key DNN parameters such as model weights. Such threats are extremely dangerous, as they can enable an adversary to maliciously manipulate prediction outcomes at inference time. The project seeks to systematically understand the practicality and severity of DNN model bit flip attacks in real systems, and to investigate software- and architecture-level protection techniques that secure DNNs against internal tampering. The study focuses on quantized DNNs, which exhibit higher robustness against model tampering. This project will incorporate the following research efforts: (1) investigate the vulnerability of quantized DNNs to deterministic bit flipping of model weights under various attack objectives; (2) explore algorithmic approaches to enhance the intrinsic robustness of quantized DNN models; and (3) design effective and efficient system- and architecture-level defense mechanisms that comprehensively defeat DNN model bit flip attacks. The project will result in the dissemination of shared data, attack artifacts, algorithms, and tools to the broader hardware security and AI security community.
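
To make the threat concrete, here is a minimal sketch in plain Python (the quantization scale, weight value, and flip_bit helper are all illustrative, not artifacts of this project) showing why a single bit flip in an int8-quantized weight can be so damaging:

    # Illustrative only: effect of a single bit flip on an int8-quantized weight.
    SCALE = 0.05  # hypothetical per-tensor quantization scale

    def flip_bit(value: int, bit: int) -> int:
        """Flip one bit of a signed 8-bit weight (two's complement),
        as a single hardware fault would."""
        raw = (value & 0xFF) ^ (1 << bit)        # view as raw byte, flip the bit
        return raw - 256 if raw >= 128 else raw  # back to the signed int8 range

    w = 93  # stored int8 weight; dequantizes to SCALE * 93 = 4.65
    for bit in (0, 6, 7):  # a low-order bit, a high-magnitude bit, the sign bit
        w_f = flip_bit(w, bit)
        print(f"bit {bit}: {w:4d} -> {w_f:4d} "
              f"(dequantized {SCALE * w:+.2f} -> {SCALE * w_f:+.2f})")

Flipping bit 0 nudges the dequantized weight from +4.65 to +4.60, whereas flipping the sign bit swings it to -1.75. This asymmetry is why a small, carefully targeted chain of flips (as in DeepHammer below) suffices to deplete a network's accuracy.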

Publications


2022

DeepSteal: Advanced Model Extractions Leveraging Efficient Weight Stealing in Memories
Adnan Siraj Rakin+, Md Hafizul Islam Chowdhuryy+, Fan Yao and Deliang Fan (+ Co-first authors)
In IEEE Symposium on Security and Privacy (S&P)
2022

On the Feasibility of Training-time Trojan Attacks Through Hardware-based Faults in Memory
Kunbei Cai, Zhenkai Zhang and Fan Yao
In IEEE International Symposium on Hardware Oriented Security and Trust (HOST)
2022

CLAIRVOYANCE: Exploiting Far-field EM Emanations of GPU to “See” Your DNN Models through Obstacles at a Distance
Sisheng Liang, Zihao Zhan, Fan Yao, Long Cheng and Zhenkai Zhang
In IEEE Workshop on Offensive Technologies (WOOT), co-located with S&P
2022

2021

T-BFA: Targeted Bit-Flip Adversarial Weight Attack
Adnan Siraj Rakin, Zhezhi He, Jingtao Li, Fan Yao, Chaitali Chakrabarti and Deliang Fan
In IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
2021

NMT-Stroke: Diverting Neural Machine Translation through Hardware-based Faults
Kunbei Cai, Md Hafizul Islam Chowdhuryy, Zhenkai Zhang and Fan Yao
In IEEE International Symposium on Secure and Private Execution Environment Design (SEED)
2021

2020

DeepHammer: Depleting the Intelligence of Deep Neural Networks through Targeted Chain of Bit Flips
Fan Yao, Adnan Siraj Rakin and Deliang Fan
In USENIX Security Symposium
2020