Homework 2
For this homework, you will be classifying images from the SuperTuxKart dataset using a convolutional neural network.
This assignment, as with all of the homework assignments, should be completed individually without sharing solutions, models, or ideas with other students. See details at the bottom of this page.
Starter Code
We provide some starter code for this homework here. This zip file contains the following:
- bundle.py, a script you can use to zip up your homework for submission. To submit your homework, run python bundle.py homework <uteid>, which will create a file called <uteid>.zip. This is the file you should submit.
- grader contains a local grader which you can use to evaluate your homework and see what grade you would currently get. Note that the local grader and our grader use different test sets, so this grader is not guaranteed to be 100% accurate.
- homework contains the code you will modify for this assignment.
You can run the local grader at any time by using the following command:
python -m grader homework -v
As in the first homework assignment, we will be using the SuperTuxKart dataset.
This data should be placed at homework2/data/, where homework2 is the
directory contained in the above zip file. Since you already downloaded the
dataset for the first homework, you may want to symlink the data to save
storage space.
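A minimal sketch of the symlink approach, assuming the data from the first homework sits in a sibling directory. The link_dataset helper, the homework1 path, and the example.txt file are hypothetical and only illustrate the idea. The demo runs inside a temporary directory so it does not touch real files.

```python
import os
import tempfile
from pathlib import Path


def link_dataset(existing_data: Path, homework_dir: Path) -> Path:
    """Create homework_dir/data as a symlink pointing at existing_data."""
    target = homework_dir / "data"
    if target.is_symlink() or target.exists():
        # Leave anything that is already in place untouched.
        return target
    os.symlink(existing_data.resolve(), target, target_is_directory=True)
    return target


# Demo with throwaway directories; substitute your real paths.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    hw1_data = root / "homework1" / "data"  # hypothetical location of the old data
    hw2_dir = root / "homework2"
    hw1_data.mkdir(parents=True)
    hw2_dir.mkdir()
    (hw1_data / "example.txt").write_text("placeholder")

    link = link_dataset(hw1_data, hw2_dir)
    is_link = link.is_symlink()
    sees_file = (link / "example.txt").is_file()

print(is_link, sees_file)
```

On some systems (notably Windows) creating symlinks may require extra privileges; copying the data works as a fallback.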
CNN Model (10 pts)
For this section, you will implement the CNNClassifier class in models.py.
As in the first homework, your model should return a (B, 6) tensor
representing the logits of the classes. However, for this assignment your model
should use convolutional layers rather than (or in addition to) fully connected
layers.
Take a look at torch.nn.Conv2d.
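When stacking torch.nn.Conv2d layers, it helps to track how the spatial size of the feature map changes from layer to layer. The sketch below applies the output-size formula from the Conv2d documentation in plain Python. The 64x64 input size and the layer settings are assumptions chosen for illustration, not values taken from the assignment.

```python
def conv2d_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output size along one spatial dimension of a Conv2d layer.

    Implements floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1).
    """
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Hypothetical stack of three layers: (kernel_size, stride, padding).
layers = [(7, 2, 3), (3, 2, 1), (3, 2, 1)]

size = 64  # assumed input height/width
sizes = []
for kernel_size, stride, padding in layers:
    size = conv2d_output_size(size, kernel_size, stride, padding)
    sizes.append(size)

print(sizes)
```

Knowing the final spatial size tells you how many features reach a fully connected layer, unless you average over the spatial dimensions first.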
Logging (30 pts)
In this part, you will use tensorboard to log training data. Inside
homework/acc_logging.py there is a dummy training procedure and two
tb.SummaryWriter objects. Your task is to use those two SummaryWriter
objects to log the training loss at each iteration, the training accuracy at
each epoch, and the validation accuracy at each epoch. A simple example of
tensorboard usage is
import torch.utils.tensorboard as tb
logger = tb.SummaryWriter('cnn')
logger.add_scalar('train/loss', t_loss, 0)
There are more examples of tensorboard usage in the in-class exercises we have been working on this semester.
Note that you should not create your own SummaryWriter object inside
acc_logging.py because that will break the grader. Use the two provided
objects instead.
You can test the logger by running python3 -m homework.acc_logging <log>,
where <log> is the name of some directory. Then start tensorboard with
tensorboard --logdir <log>
to view the results of the logging.
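The sketch below illustrates one way to organize the step counter when logging loss per iteration and accuracy per epoch. It is not the provided acc_logging.py. RecordingWriter is a hypothetical stand-in that only mimics the add_scalar(tag, scalar_value, global_step) call so the example runs without tensorboard. The tag names, the placeholder loss and accuracy values, and the choice to log epoch accuracy at the current global step are all assumptions, so check acc_logging.py for what the grader expects.

```python
class RecordingWriter:
    """Hypothetical stand-in for a SummaryWriter; stores calls in a list."""

    def __init__(self):
        self.records = []

    def add_scalar(self, tag, scalar_value, global_step=None):
        self.records.append((tag, scalar_value, global_step))


def log_dummy_run(train_logger, valid_logger, epochs=2, iterations=3):
    global_step = 0  # counts training iterations across all epochs
    for epoch in range(epochs):
        batch_accuracies = []
        for _ in range(iterations):
            loss = 0.9 ** global_step                # placeholder loss value
            batch_accuracies.append(epoch / epochs)  # placeholder batch accuracy
            # Training loss: one point per iteration.
            train_logger.add_scalar("loss", loss, global_step)
            global_step += 1
        # Training and validation accuracy: one point per epoch.
        train_accuracy = sum(batch_accuracies) / len(batch_accuracies)
        valid_accuracy = epoch / epochs              # placeholder validation accuracy
        train_logger.add_scalar("accuracy", train_accuracy, global_step)
        valid_logger.add_scalar("accuracy", valid_accuracy, global_step)


train_writer = RecordingWriter()
valid_writer = RecordingWriter()
log_dummy_run(train_writer, valid_writer, epochs=2, iterations=3)
print(len(train_writer.records), len(valid_writer.records))
```

Using a single global step for both loss and accuracy keeps the curves aligned on the same horizontal axis in tensorboard.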
Training your CNN (60 pts)
The bulk of this homework is in this section, where we train a CNN to classify
SuperTuxKart images. For this, you will likely want to reuse some of your
training code from the first homework. I would also recommend using the logging
functionality from the previous section to track your runs. This will help you
to tune your hyperparameters. Once your model is trained, you can visualize its
outputs for a few inputs by running python -m homework.viz_prediction <dataset>.
You can use python -m grader homework -v to test your solution. We are
looking for an accuracy of at least 85% for full credit, but once again, our
grader uses a test dataset which is different from the dataset used by the
local grader.
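To make the accuracy target concrete, the sketch below computes classification accuracy from (B, 6) logits in plain Python, assuming the usual definition: the fraction of samples whose highest-scoring class matches the label. The accuracy helper and the example values are hypothetical, and the grader's exact metric is not shown here.

```python
def accuracy(logits, labels):
    """Fraction of rows whose argmax equals the corresponding label."""
    predictions = [max(range(len(row)), key=row.__getitem__) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Four made-up samples with six class scores each.
example_logits = [
    [2.0, 0.1, -1.0, 0.0, 0.3, 0.2],  # predicts class 0
    [0.0, 0.5, 3.1, 0.2, -0.4, 1.0],  # predicts class 2
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],   # predicts class 5
    [1.5, 1.4, 0.0, 0.0, 0.0, 0.0],   # predicts class 0
]
example_labels = [0, 2, 5, 1]

example_accuracy = accuracy(example_logits, example_labels)
print(example_accuracy)
```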
Note that the grader does not apply any transformation to the dataset. If you
want to normalize the input images, do it in models.py.
Grading
The test grader can be run with
python -m grader homework -v
Note that this grader is using a different set of test data than we use for our grader, so there may be some difference between the two. The distributions will be the same so your grade should be similar between the two graders, unless the model is massively overfit to the validation set.
Submission
Once you are ready to submit, create a bundle by running
python bundle.py homework <eid>
then submit the resulting zip file on Canvas. Note that the grader has a maximum file size limit of 20MB. You shouldn’t run into this limit unless your models are much larger than they need to be. You can check that your homework was bundled properly by grading it again with
python -m grader <eid>.zip
Online Grader
The online grading system uses a slightly modified version of Python, so please make sure your code follows these rules:
- Do not use exit or sys.exit as this will likely crash the grader.
- Do not try to access files except for the ones provided with the homework zip file. Writing files is disabled.
- Network access is disabled in the grader. Make sure you are reading your dataset from the data folder rather than from a network connection somewhere.
- Do not fork or spawn new processes.
- print and sys.stdout.write are ignored by the grader.
Honor Code
You should do this homework individually. You are allowed to discuss high-level ideas and general structure with each other, but not specific details about code, architecture, or hyperparameters. You may consult online sources, but don’t copy code directly from any posts you find. Cite any ideas you take from online sources in your code (include the full URL where you found the idea). You may refer to your own solutions or the master solutions to prior homework assignments from this class as well as any iPython notebooks from this class. Do not put your solution in a public place (e.g., a public GitHub repo).
Acknowledgements
This assignment is very lightly modified from one created by Philipp Krähenbühl.