The learning objective of this homework is for you to create a codebase to train and evaluate various deep neural network (DNN) models. You also need to analyze the impact of various factors (that you can control during training) on the final DNN models. You will use this codebase and the trained models to complete the homework assignments (HW 2, 3, and 4) throughout the term.
To begin with, you can choose any deep learning framework that you're already familiar with (e.g., PyTorch, TensorFlow, or ObJAX). If you are not familiar with any of these frameworks, you can start with PyTorch or TensorFlow (> v2.0). These are popular choices, and you can find many tutorials [link] or example code [link] from the Internet.
[Note] I do NOT recommend copying and pasting the sample code found from the Internet. It will be an easy solution for this homework. However, later on, you may have difficulty understanding the attacks and defenses. For example, some attacks (and defenses) require you to know how the deep learning framework computes gradients and how you can manipulate (or control) them.
Root
- models: a dir containing your model definitions.
- reports: a dir where you will include your write-up.
- datasets.py: a Python script containing functions for loading datasets.
- train.py: a Python script for training a model.
- train.sh: a bash-shell script for training multiple models.
- valid.py: a Python script for evaluating a pre-trained model.
...
Note that this is an example code structure; you can find many nice examples from the Internet [example].
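For illustration only, here is a minimal sketch of what train.py could look like in PyTorch. The helpers load_dataset (from datasets.py) and ConvNet (from models/) are hypothetical placeholders for whatever you write, and the hyper-parameters are arbitrary defaults, not required values.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader

from datasets import load_dataset   # hypothetical helper you write in datasets.py
from models import ConvNet          # hypothetical model definition under models/

def train(epochs=10, batch_size=128, lr=0.01, device="cuda"):
    # load the data and wrap the training split in a DataLoader
    train_set, test_set = load_dataset("mnist")
    train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)

    model = ConvNet().to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)

    for epoch in range(epochs):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
        print("epoch {}: loss {:.4f}".format(epoch, loss.item()))

    # store the trained weights so valid.py (and later homeworks) can reload them
    torch.save(model.state_dict(), "models/convnet.pth")

if __name__ == "__main__":
    train()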
Please submit a compressed file (.tar.gz) that contains your code and a write-up as a PDF file. Put your write-up under the reports folder. Your PDF write-up should contain the following things:
The learning objective of this homework is for you to attack the models you built in Homework 1 with white-box adversarial examples. You will also use adversarial training to build robust models. You will then analyze the impact of several factors (that you can control as an attacker or a defender) on the success rate of the attack (or defense). You can start this homework from the codebase you wrote for Homework 1.
In this homework, you will write two new scripts: adv_attack.py and adv_train.py. The rest are the same as Homework 1.
Root
- [New] adv_attack.py: a Python script to run adversarial attacks on a pre-trained model.
- [New] adv_train.py: a Python script for adversarial-training a model.
...
First, implement the PGD attack in adv_attack.py. Here is an example function signature:
def PGD(x, y, model, loss, niter, epsilon, stepsize, randinit, ...)
- x: a clean sample
- y: the label of x
- model: a pre-trained DNN you're attacking
- loss: a loss you will use
- [PGD params.] niter: # of iterations
- [PGD params.] epsilon: l-inf epsilon bound
- [PGD params.] stepsize: the step-size for PGD
- [PGD params.] randinit: start from a random perturbation if set true
// You can add more arguments if required
This PGD function crafts the adversarial example for a sample (x, y) [or a batch of samples]. It takes (x, y), a pre-trained DNN, and the attack parameters, and returns the adversarial example(s) (x', y). Note that you can add more arguments to this function if required. Please use the following attack hyper-parameters as a default:

Then, write your attack evaluation code under if __name__ == "__main__": in the same file. Here, for all the 10k adversarial examples crafted, you will compute the classification accuracy on the DNN model you used. Note that you will observe much lower accuracy than on the clean test-time samples.
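Below is one possible PyTorch sketch of this PGD function. It assumes the inputs live in [0, 1] (i.e., no normalization folded into the data loader), so double-check the projection and clamping against your own pre-processing.

import torch

def PGD(x, y, model, loss, niter, epsilon, stepsize, randinit=True):
    # work on a detached copy so we never modify the clean batch in place
    x_adv = x.clone().detach()
    if randinit:
        # start from a uniformly random point inside the epsilon ball
        x_adv = x_adv + torch.empty_like(x_adv).uniform_(-epsilon, epsilon)
        x_adv = torch.clamp(x_adv, 0.0, 1.0)

    for _ in range(niter):
        x_adv.requires_grad_(True)
        cost = loss(model(x_adv), y)
        grad = torch.autograd.grad(cost, x_adv)[0]

        # ascend the loss, then project back into the l-inf ball around x
        x_adv = x_adv.detach() + stepsize * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)
        x_adv = torch.clamp(x_adv, 0.0, 1.0)

    return x_adv.detach(), y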
Next, copy train.py and name it adv_train.py. We will convert the normal training process into adversarial training. In train.py, we train a model on a batch of clean training samples (in each batch). Instead, you need to craft adversarial examples from each batch of clean samples and train your model on them. Note that this is slightly different from the work by Goodfellow et al.
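As a rough sketch (reusing the PGD function from adv_attack.py and the same training-loop variables as in the train.py sketch), the inner loop of adv_train.py could look like this. The PGD hyper-parameters shown here are placeholders, not the required defaults.

from adv_attack import PGD   # the PGD function sketched above

for x, y in train_loader:
    x, y = x.to(device), y.to(device)

    # craft adversarial examples on the current model state for this batch
    model.eval()   # optional: avoid updating BatchNorm statistics while attacking
    x_adv, y = PGD(x, y, model, criterion,
                   niter=10, epsilon=8/255, stepsize=2/255, randinit=True)  # placeholder values

    # train on the adversarial batch instead of the clean batch
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(x_adv), y)
    loss.backward()
    optimizer.step()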
In addition, store a few of your adversarial examples as .png files. Upload them to one of the image classification demos and see how the predicted labels differ compared to your DNNs.

Please submit a compressed file (.tar.gz) that contains your code and a write-up as a PDF file. Put your write-up under the reports folder. Your PDF write-up should contain the following things:
The learning objective of this homework is for you to perform data poisoning attacks on machine learning models (some of the attacks will require the neural networks trained in Homework 1). You will also test the effectiveness of simple defenses against the poisoning attacks you will implement. You can start this homework from the codebase you wrote in Homework 1.
In this homework, you will write three new scripts: poison_craft.py, poison.py, and poison_remove.py. The rest are the same as Homework 1.
Root
- [New] poison_craft.py: a Python script to craft poisoning samples.
- [New] poison.py: a Python script for training a model on a contaminated dataset.
- [New] poison_remove.py: a Python script for removing suspicious samples from a contaminated dataset.
...
The first attack is random label flipping: contaminate the training set by flipping the labels of X% of the samples in the original training set. For example, you can select 10% of the MNIST-1/7 training samples (~1.7k) and flip their labels from 0 to 1 (or vice versa). Here is an example function you can write in poison_craft.py:
def craft_random_lflip(train_set, ratio):
- train_set: an instance for the training dataset
- ratio : the fraction of samples whose labels will be flipped (a number between 0 and 1)
// You can add more arguments if required
This function constructs a training set in which a ratio fraction of the samples are poisoned. The train_set is an instance of the clean training set, and the ratio is a number between 0 and 1. Note that this is an example of writing a function for crafting poisoned training sets; please feel free to use your own function if that is more convenient.
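A minimal sketch of such a function is below. It assumes a torchvision-style dataset whose labels are exposed as train_set.targets and a binary (0/1) labeling as in MNIST-1/7; adapt it to however your datasets.py represents the data.

import torch

def craft_random_lflip(train_set, ratio):
    # assumes a torchvision-style dataset exposing a mutable .targets attribute
    n_total = len(train_set)
    n_flip = int(ratio * n_total)

    # pick n_flip random indices and flip their binary labels (0 <-> 1)
    idx = torch.randperm(n_total)[:n_flip]
    targets = torch.as_tensor(train_set.targets).clone()
    targets[idx] = 1 - targets[idx]
    train_set.targets = targets

    return train_set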
Next, write poison.py. This script will be mostly the same as train.py; the only difference is that you load the contaminated training set instead of the clean training data. Once loaded, the rest will be the same.

For the clean-label poisoning attack, here is an example function you can write in poison_craft.py:
def craft_clabel_poisons(model, target, bases, niter, lr, beta, ...):
- model : a pre-trained ResNet18
- target: a target sample (a frog)
- bases : a set of base samples (dogs)
- niter : number of optimization iterations
- lr : learning rate for your optimization
- beta : hyper-parameter (refer to the paper)
// You can add more arguments if required
This function crafts clean-label poisons. It takes a model (ResNet18) to extract features for a single target and 100 base samples. It also takes optimization hyper-parameters such as niter, lr, beta, etc. Once the function sufficiently optimizes your poisons, it will return 100 poisons crafted from the bases. Please refer to the author's code, the community implementations, and the original study for reference.
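Assuming the assignment follows the feature-collision ("Poison Frogs") attack of Shafahi et al., a simplified sketch is below. It presumes a hypothetical model.features(x) that returns penultimate-layer features of your ResNet18 and images in [0, 1]; the forward/backward (proximal) split mirrors the structure of the paper's algorithm, but check the author's code for the exact update and stopping rules.

import torch

def craft_clabel_poisons(model, target, bases, niter, lr, beta):
    model.eval()
    with torch.no_grad():
        # model.features() is a hypothetical penultimate-layer feature extractor
        target_feat = model.features(target)   # features of the single target (frog)

    poisons = bases.clone().detach()           # start from the base images (dogs)
    for _ in range(niter):
        poisons.requires_grad_(True)

        # forward step: push the poisons' features toward the target's features
        obj = ((model.features(poisons) - target_feat) ** 2).sum()
        grad = torch.autograd.grad(obj, poisons)[0]
        poisons = poisons.detach() - lr * grad

        # backward (proximal) step: keep the poisons close to the bases in input space
        poisons = (poisons + lr * beta * bases) / (1.0 + lr * beta)
        poisons = torch.clamp(poisons, 0.0, 1.0)

    return poisons.detach()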
Then, train your model on the contaminated dataset with poison.py. This script will be mostly the same as train.py; the only difference is that you load the contaminated training set instead of the clean training data. Once loaded, the rest will be almost the same.

Now let's examine a simple defense. First, construct a small, clean validation set (D_v); you can hold out any samples from the MNIST-1/7 training set. You will use this (D_v) to remove poisons from the training data.
Then, for each training sample D_tr_i, compare the accuracy on (D_v) of a model trained with D_tr_i against one trained without it. If including D_tr_i decreases the accuracy by more than X% (a hyper-parameter of your choice), remove D_tr_i from the training set and continue.
Try different X% values and check how many poisons you removed in each case. You also need to check the accuracy of your model after removing suspicious samples (i.e., you will examine the effectiveness of the RONI defense).
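One way to organize poison_remove.py is sketched below. Here train_model and evaluate are hypothetical wrappers around your existing train.py/valid.py code; retraining once per candidate sample is very slow, so in practice you may want to score groups of samples or a random subset instead.

from torch.utils.data import Subset

def roni_filter(train_set, valid_set, threshold):
    # train_model() and evaluate() are hypothetical wrappers around train.py / valid.py
    # indices of the training samples we currently keep
    kept = list(range(len(train_set)))

    for i in list(kept):
        # accuracy on D_v with and without sample i in the training set
        acc_with = evaluate(train_model(Subset(train_set, kept)), valid_set)
        acc_without = evaluate(
            train_model(Subset(train_set, [j for j in kept if j != i])), valid_set)

        # reject on negative impact: drop sample i if it hurts accuracy by > threshold (X%)
        if acc_without - acc_with > threshold:
            kept.remove(i)

    return Subset(train_set, kept)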
You can also examine this defense against your clean-label poisoning attack; use any successful attack (i.e., choose a target and 100 poisons).

Please submit a compressed file (e.g., .tar.gz or .zip) that contains your code and a write-up as a PDF file. Please do not include datasets and models in your submission. Put your write-up under the reports folder. Your PDF write-up should contain the following things: