CS-498 Applied Machine Learning
D.A. Forsyth --- 3310 Siebel Center
Office Hours Time: WF 14h00-15h00, Location: 3310 Siebel
Alternative locations may be available
TAs:
- Tanmay Gangwani gangwan2@illinois.edu
- Tiantian Fang tf6@illinois.edu
Important; Important; Important
We are no longer meeting in person. I will release movies,
readings and homeworks to keep the course running. You should have
received email from me about this.
Key points:
- Please fill in this survey. I intend it to be anonymous (but I don't fully understand Google Forms, so...), and it will help me know what you already know. Please don't fill in lots of forms to bias it, etc.
- For each lecture, there will be readings and movies on the web
page. Look below at the table of movies.
- I will be on Zoom 12h30-15h00 CST WF. I will give a brief
introduction to the movies, then be available to discuss the
movies, answer questions on past movies, and explain matters that
seem confusing. I will email out the
link for the meeting before each class.
Grading and cutoffs
The TAs and I have received repeated questions about the precise
location of grade cutoffs. I won't bind myself by answering these, but
here is some general advice: if you've done the homework, worry about
something else. It's my intention that all students who do all the homework
reasonably well will get an A. Experience with the class in the past has
been that about 2/3 of students did so, about 1/4 were in the Bs, and the
rest were scattered because they didn't do their homework.
Four hour vs Three hour version of the course
If you're doing the four hour version of the course, you
must do the extra homework, below. If you don't, you are at peril of
not having done all the homework. It's informative, and not hard.
Course Content, 13 Mar onward
Records of our meetings
Class Date | Brief movie | Chat
25 Mar | Part of my intro to each of the day's movies | chat record
27 Mar | Part of my intro to each of the day's movies; this is ridiculously big (0.4G), sorry, I'll keep trying. | chat record
01 Apr | Part of my intro to each of the day's movies. | chat record
03 Apr | Part of my intro to each of the day's movies, with a fair amount of Q+A on the homeworks, particularly EM (big, sorry!). | chat record
08 Apr | Part of my intro to each of the day's movies, with a fair amount of Q+A on the homeworks, particularly EM (big, sorry!). | No chat to report
10 Apr | Part of my intro to each of the day's movies. I'm asked how to avoid ennui in these very trying times. Answer (which is right): I don't think you can; the best you can do is manage it. As an exercise: would you think well of someone who wasn't troubled by what's going on? | No chat to report
15 Apr | My intro to each of the day's movies, AND an answer to a question about graphical models. | No chat to report
17 Apr | Questions and answers. | No chat to report
22 Apr | Web page summary; questions and answers, AND another question and answers. | No chat to report
24 Apr | Brief update AND Q+A AND more Q+A. | No chat to report
29 Apr | A lecture on image classifiers AND Q+A on variational derivation. | No chat to report
1 May | A lecture on object detectors | No chat to report
6 May | A lecture on TSNE, autoencoders and generative models, with Q+A | No chat to report
Short Movies
Long Movies
Class Date | Readings | Movies
13 Mar | ch. 11 | Bias+Variance; simple model selection; IRLS
13 Mar/25 Mar | | Generalized linear models; Lasso
25 Mar | | More generalized linear models; Lasso; Elastic net; some other stuff which you can ignore
27 Mar, 1 April | 12.1, 12.2 | Boosting and Gradient Boost
3 April | 13.1 | Introductory Markov chains (the chapter reference is wrong - I changed the chapter numbers - it's an old movie)
3 April/8 April | 13.2 and 13.3 | Simulating Markov chains; text models; Hidden Markov Models; dynamic programming
10 April | 13.3 | Learning an HMM from data using EM
15 April | 14.1, 14.2, 14.3 | Some graphical models, structure learning points (last 1/2 hour of the movie! not ideal, but what I have; short movies are better)
17 April | 14.3, 15 | Some more structure learning leading into mean field ideas (not ideal, but what I have; short movies are better)
22 April | 16.1-16.2 | Units, stochastic gradient descent, and building a simple classifier (not ideal, but what I have; short movies are better)
24 April | 16.3-17.1 | Backpropagation, convolutional layers and gradient tricks (not ideal, but what I have; short movies are better)
29 Apr | 17.2, 18 | Image classifiers (same as short movie)
1 May | 18 | Object detectors
6 May | 19 | A lecture on TSNE, autoencoders and generative models, with Q+A
Announcements
Announcements page - check this frequently! (LINK ISN'T BROKEN.)
I will be absent 31 Jan (sorry!). Also absent 21 Feb, aargh!
Also absent 13 Mar (mild signs of illness, so self-isolating, aargh!).
Check the announcements page for movies!
Contact policy
I'm
quite distracted and am focusing on content preparation. Generally,
please do not bring DAF an issue you haven't already raised with a TA.
Questions I've been getting a lot
Getting into the class
In the past, we've been able to admit everyone who wanted to get into
the in-person version of the class after the first rush settled down.
Will this be true this semester? Who knows? Not me. PLEASE do not
come and tell me that you really want to get in, or that your cat died and
its last words were that you should take the class, or something. We're not going
to go over an enrollment of 100. Corollary: if you plan on dropping, do so early; someone else wants your seat.
Can I get in even though I won't be able to come to lecture because I'm doing
something else, but I'll watch the movies? I think this strategy is unwise, but I suppose
it's not really my problem.
Can I audit? The main resource limits on the
physical class are physical seats in the room. We cannot have an
overcrowded room. If physical seats are open, sure (I'm always happy
to have an audience); but please don't take a seat that should be
occupied by someone who is registered.
Important contact advice
A really common question is: how do I do something in R? Usually, I
get the answer to this by searching; I use Google, but you may have a
preferred search engine. If you ask me or a TA this question, and we do
it in front of you successfully, you should feel a little embarrassed,
because you could have done it for yourself. Warning:
we will embarrass you in this way; it's better to do this
sort of thing for yourself.
Office Hours
- DAF: 14h00-15h00 WF
- Tanmay Gangwani Mon, 10h00-11h00; Thur 10h00-11h00 Siebel 0207.
- Tiantian Fang Tue, 15h30-17h30 Siebel 0207.
Piazza
https://piazza.com/class/k62hiyy1jr81jm?cid=8
Evaluation is by: homeworks and a take-home final.
I will shortly post a policy on collaboration and plagiarism.
Homeworks
A total of 8 homeworks (it used to be 11, but the pandemic intervened) will appear here. There will be no final exam
- one homework will be designated a take-home final.
Homework DRAFTS
I will post drafts of the homeworks here, when we're fairly sure what we want you
to do, but are working out submission details, etc.
Submission details:
Please submit a single PDF to Gradescope. It should contain your solutions to all the problems included in the homework. If the homework involves coding, please include the code in the same PDF (at the end, as an appendix). This makes the graders' lives easy. If there's too much code, such that the PDF size exceeds the maximum allowed on Gradescope, submit a separate .zip for the code.
- Homework 1 (here) due 10 Feb 2020. Due date changed via Piazza announcement to 17 Feb 23h59.
- Homework 2 (here) due 24 Feb 2020. Note date change!
- Homework 3(here) due 2 Mar 2020.
- Homework 4(here) due 30 Mar 2020.
- Homework 5 (here) due 13 Apr 2020.
I've created some confusion with this homework,
for which I apologize. In (a) you should observe that
the cluster centers drift together. This happens
because at this scale, all the pixels are essentially
one cluster; the fact that they're not only becomes
apparent when you do (c). So the expected behavior
is: for (a), the centers drift together, and the maps
in (b) are all pretty much the same; for (c) the
centers do not drift, and the maps are rather
different. Apologies: I forgot some details.
AAARGH! I got a constant wrong in
(c). You will find (c) does not drift when you use a
covariance of (1/400)*Identity rather than
(1/10)*Identity. This is equivalent to pixels on a
scale of 0-20. If you've already submitted, no
worries; if you haven't, you'll find the whole thing a
great deal less puzzling if you use (1/400)*Identity.
SORRY! (A sketch of the scaled E-step appears after this list.)
- Homework 6 (here) due 20 Apr 2020.
- Homework 7 (here) due 27 Apr 2020.
AAARGH! I got a constant wrong
here. Use 0.5 rather than 2 for the theta connecting
the hidden state to the observation. If you've
already handed in, no worries; if you haven't, you'll
find the whole thing works a lot better if you do this.
SORRY!
AAARGH! There is a typo on p 363
that may affect your ability to do this homework; it
won't if you thrash through the math, but it'll save
you time to know about it. About halfway down the
page there are two equations. One reads a=e^(thing in
parens) and the other reads b=e^(another thing in
parens). The first should read a=(thing in parens)
and the second should read b=(another thing in parens).
SORRY!
- Homework 8 (here) due end of semester.
- Extra Homework for Four Hour version (here) due end of semester.
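To make the Homework 5 covariance fix concrete, here is a minimal sketch of the scaled E-step. This is NOT the reference solution; the function name estep and the variables X, mu, and pi_k are illustrative, and pixels are assumed scaled to [0, 1]:

    # One E-step of EM with fixed spherical covariance (1/400)*Identity.
    # X: N x d matrix of pixels; mu: K x d matrix of cluster centers;
    # pi_k: length-K vector of mixture weights. Returns N x K responsibilities.
    estep <- function(X, mu, pi_k, cov_scale = 1/400) {
      K <- nrow(mu)
      # log unnormalized responsibility: log(pi_k) - ||x - mu_k||^2 / (2 * cov_scale)
      logw <- sapply(1:K, function(k) {
        d2 <- rowSums(sweep(X, 2, mu[k, ])^2)   # squared distance to center k
        log(pi_k[k]) - d2 / (2 * cov_scale)
      })
      # normalize each row with the log-sum-exp trick for numerical stability
      m <- apply(logw, 1, max)
      w <- exp(logw - m)
      w / rowSums(w)
    }

With cov_scale = 1/10 the exponent varies so slowly across pixels that every center claims every pixel about equally, so the M-step drags the centers together; with 1/400 the responsibilities are much peakier, and the centers stay apart.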
Optional non graded homeworks
As is usual during pandemics, some people have time on their hands and
want to do more, and some have problems to deal with. I've cut down
the set of homeworks to accommodate people with problems to deal
with. If you have time on your hands, and would like to use it
learning more machine learning, here are some good choices of book exercises.
The TAs and I will try to help you with these.
- Regression
- If you want to pick up some of the math, 10.6, 10.7, 10.8 are
good places to start
- If you want to do some practical stuff, 11.2, 11.3 and 11.4
are worth doing
- Boosting
- If you want to pick up some of the math, 12.3 (last one is hard), 12.4 are
good places to start
- If you want to do some practical stuff, try any of the
programming exercises, but you will want to install
XGBoost, with the multithreading option. I found this easy on
Linux, and really quite exciting (but I got it to work) on a Mac.
(A minimal usage sketch appears after this list.)
- HMMs and Graphical Models
- If you want to pick up some of the math, 13.1-3 are
good places to start
- If you want to do some practical stuff, any of the
programming exercises, but 13.6 and 13.7 are a
serious performance
- If you want to pick up some structure learning, 14.2 is a
place to start
- Deep Learning
- You will get tremendous value from being able to do 16.1-16.4
- If you want to do some practical stuff, any one of 16.5-16.7
is a good place to start; all three is likely too much
- I've already set 17.1 and 17.2. Doing 17.3, 17.4 is
helpful.
A really sensible thing to do is to try and make 17.4 work better
by trying other tricks from chapter 18.
- If you have a GPU, etc, 18.2, 18.3, 18.4 are a good idea
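For the XGBoost suggestion in the boosting exercises above, here is a minimal fitting sketch on synthetic data (all settings, including nthread, are illustrative, not recommendations):

    # Gradient-boosted trees with the multithreaded xgboost R package.
    # install.packages("xgboost")   # CRAN binaries usually ship with multithreading
    library(xgboost)
    set.seed(1)
    x <- matrix(rnorm(1000 * 10), ncol = 10)    # 1000 synthetic examples, 10 features
    y <- as.numeric(x[, 1] + rnorm(1000) > 0)   # binary labels
    model <- xgboost(data = x, label = y,
                     nrounds = 50, max_depth = 4, eta = 0.1,
                     objective = "binary:logistic",
                     nthread = 4,               # the multithreading option
                     verbose = 0)
    mean((predict(model, x) > 0.5) == y)        # training accuracy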
Syllabus:
I will start at the beginning of the textbook and
proceed to the end, covering approximately one chapter per week. You'll notice there are 19
substantive chapters and 15 weeks; this is to allow a little
spreading out, but in week N I expect to be close to chapter 19*N/15 (so, for example, around chapter 10 by week 8).
Read the textbook. I wrote it specifically for this
course, AND it's free. I will split time in lecture between sketching
important points described in the text, and solving problems. If you
haven't read the text, this might be quite puzzling.
Required Text:
Applied Machine Learning D.A. Forsyth, Springer, 2019
Important: In the past, people have brought the PDF with
them on mobile devices. I think this is a good idea. Or you could buy a paper copy. The PDF is a
free download from the UIUC library (you have
to be on the intranet to download it, I think).
I'm a video star! (or at least, I have been filmed)
- you can see me here, though you'll need to log in, and it may take a short while after class for a recording to be ready
Current movies
Class Date | PM movie (low-res)
22-Jan | Classification; Nearest neighbors
24-Jan | Naive Bayes; SVMs
29-Jan | Learning theory
31-Jan | High dimensional data; Covariance matrices
5-Feb | Diagonalizing covariance matrices; multivariate normals
7-Feb | Principal components
12-Feb | More PCA; NIPALS
14-Feb | Low rank models
19-Feb | Canonical correlation analysis
21-Feb | We did not meet; movies in lieu on Announcements page
26-Feb | Vector quantization
28-Feb | End VQ; start EM
4-Mar | EM; start regression
6-Mar | Basic least squares regression; R-squared; outliers
11-Mar | Multivariate regression; using leverage, Cook's distance, standardized residuals; bias and variance
13-Mar | We did not meet; movies in lieu on Announcements page
Backup Material:
Probability and Statistics for Computer Science,
D.A. Forsyth
- I can no longer release a PDF, as this has been published. The
moiré effect on the cover picture is the result of my scanner
interacting with a shiny cover.
Code fragments I showed in class:
I've cleaned some of these up a bit, but they're not intended to be production code, etc;
just to show some R tricks. Among other things, these codes contain
known errors!
- A naive Bayes classifier on the Pima Indians dataset; I averaged over 10 test/train splits, and ignored examples with NA values; mainly interesting for simple code tricks. File here.
- A naive Bayes classifier on the Pima Indians dataset; I averaged over 10 test/train splits, but now I used examples with NA values both in train and test; mainly interesting for simple code tricks. File here.
- A naive Bayes classifier on the Pima Indians dataset, using klaR and caret; mainly interesting for simple code tricks. File here.
- An SVM on the Pima Indians dataset, using klaR and caret and SVMLight; mainly interesting for simple code tricks. File here.
- A much more elaborate SVM on the Pima Indians dataset, using klaR and caret and SVMLight. File here.
(A minimal sketch in the same spirit appears after this list.)
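In the same spirit as the fragments above, here is a minimal sketch of a naive Bayes classifier on the Pima Indians data via caret and klaR, averaged over 10 train/test splits. Loading the data from the mlbench package is an assumption (the class used its own files); the fragments above are the reference:

    # Naive Bayes on Pima Indians; accuracy averaged over 10 train/test splits.
    library(caret)     # createDataPartition() and the train() wrapper
    library(klaR)      # supplies the naive Bayes model behind method = "nb"
    library(mlbench)   # assumed data source for this sketch
    data(PimaIndiansDiabetes)
    accs <- replicate(10, {
      idx <- createDataPartition(PimaIndiansDiabetes$diabetes, p = 0.8, list = FALSE)
      train_set <- PimaIndiansDiabetes[idx, ]
      test_set  <- PimaIndiansDiabetes[-idx, ]
      model <- train(diabetes ~ ., data = train_set, method = "nb",
                     trControl = trainControl(method = "none"),
                     tuneGrid = data.frame(fL = 0, usekernel = FALSE, adjust = 1))
      mean(predict(model, test_set) == test_set$diabetes)
    })
    mean(accs)         # average test accuracy over the 10 splits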
R resources: