Transferable Clean-label Poisoning Attacks on Deep Neural Nets

Chen Zhu*, W. Ronny Huang*, Ali Shafahi, Hengduo Li, Gavin Taylor, Christoph Studer, Tom Goldstein

June 2019


Abstract

Clean-label poisoning attacks inject innocuous-looking (and correctly labeled) poison images into training data, causing a model trained on this data to misclassify a targeted image. We consider transferable poisoning attacks that succeed without access to the victim network’s outputs, architecture, or (in some cases) training data. To achieve this, we propose a new polytope attack in which poison images are designed to surround the targeted image in feature space. We also demonstrate that using Dropout during poison creation enhances the transferability of this attack. We achieve transferable attack success rates of over 50% while poisoning only 1% of the training set.
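The geometric idea of poisons "surrounding" the target in feature space can be illustrated with a small numerical sketch. It assumes, as one reading of the abstract, that "surround" means the target's feature vector lies in or near the convex hull of the poison feature vectors. The function names `project_to_simplex` and `hull_distance`, the step count, and the use of plain NumPy arrays in place of network features are all hypothetical choices made for illustration. This is not the authors' implementation, and it does not craft poison images; it only measures how far a target feature is from the polytope spanned by a given set of poison features.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex
    {c : c >= 0, sum(c) = 1}, using the standard sort-based method."""
    n = v.size
    u = np.sort(v)[::-1]                      # entries in descending order
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, n + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def hull_distance(poison_feats, target_feat, steps=2000):
    """Approximate the distance from target_feat to the convex hull of the
    rows of poison_feats via projected gradient descent on the convex
    combination coefficients c.

    poison_feats: array of shape (k, d), one feature vector per poison.
    target_feat:  array of shape (d,).
    Returns (distance, coefficients).
    """
    A = poison_feats.T                        # shape (d, k)
    k = A.shape[1]
    c = np.full(k, 1.0 / k)                   # start at the centroid
    # Step size 1/L, where L is the largest eigenvalue of A^T A.
    lr = 1.0 / (np.linalg.norm(A, 2) ** 2 + 1e-12)
    for _ in range(steps):
        grad = A.T @ (A @ c - target_feat)    # gradient of 0.5 * ||A c - t||^2
        c = project_to_simplex(c - lr * grad)
    return float(np.linalg.norm(A @ c - target_feat)), c


if __name__ == "__main__":
    # Toy 2-D "features": three poisons forming a triangle.
    poisons = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
    inside, _ = hull_distance(poisons, np.array([0.5, 0.5]))
    outside, _ = hull_distance(poisons, np.array([2.0, 2.0]))
    print(f"target inside the triangle:  distance {inside:.4f}")
    print(f"target outside the triangle: distance {outside:.4f}")
```

In this toy setting, a distance near zero means the target can be written as a convex combination of the poison features, which is the sense of "surrounded" assumed here. In an actual attack the features would come from one or more trained networks rather than hand-picked points.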

Type

Conference paper

Publication

International Conference on Machine Learning (ICML 2019)

* Authors contributed equally