CS-498 Applied Machine Learning
D.A. Forsyth --- 3310 Siebel Center
Office Hours Time: WF 14h00-15h00, Location: 3310 Siebel
Alternative locations may be available
TAs:
- Tanmay Gangwani gangwan2@illinois.edu
- Tiantian Fang tf6@illinois.edu
Important; Important; Important
We are no longer meeting in person. I will release movies,
readings and homeworks to keep the course running. You should have
received email from me about this.
Key points:
- Please fill in this survey. I intend it to be anonymous (but I don't fully understand Google Forms, so...), and it will help me know what you already know. Please don't fill in lots of forms to bias it, etc.
- For each lecture, there will be readings and movies on the web
page. Look below at the table of movies.
- I will be on Zoom 12h30-15h00 CST WF. I will give a brief
introduction to the movies, then be available to discuss the
movies, answer questions on past movies, and explain matters that
seem confusing. I will email out the
link for the meeting before each class.
Grading and cutoffs
The TAs and I have received repeated questions about the precise
location of grade cutoffs. I won't bind myself by answering these, but
here is some general advice: if you've done the homework, worry about
something else. It's my intention that all students who do all the homework
reasonably well will get an A. Experience with the class in the past has
been that about 2/3 of students did so, about 1/4 were in the Bs, and the
rest were scattered because they didn't do their homework.
Four hour vs Three hour version of the course
If you're doing the four hour version of the course, you
must do the extra homework, below. If you don't, you are at peril of
not having done all the homework. It's informative, and not hard.
Course Content, 13 Mar onward
Records of our meetings
Class Date | Brief movie | Chat
25 Mar | Part of my intro to each of the day's movies | chat record
27 Mar | Part of my intro to each of the day's movies; this is ridiculously big (0.4G), sorry, I'll keep trying. | chat record
01 Apr | Part of my intro to each of the day's movies. | chat record
03 Apr | Part of my intro to each of the day's movies, with a fair amount of Q+A on the homeworks, particularly EM (big, sorry!). | chat record
08 Apr | Part of my intro to each of the day's movies, with a fair amount of Q+A on the homeworks, particularly EM (big, sorry!). | No chat to report
10 Apr | Part of my intro to each of the day's movies. I'm asked how to avoid ennui in these very trying times. Answer (which is right): I don't think you can; the best you can do is manage it. As an exercise: would you think well of someone who wasn't troubled by what's going on? | No chat to report
15 Apr | My intro to each of the day's movies, AND an answer to a question about graphical models. | No chat to report
17 Apr | Questions and answers. | No chat to report
22 Apr | Web page summary; questions and answers, AND another question and answers. | No chat to report
24 Apr | Brief update AND Q+A AND more Q+A. | No chat to report
29 Apr | A lecture on image classifiers AND Q+A on variational derivation. | No chat to report
1 May | A lecture on object detectors | No chat to report
6 May | A lecture on TSNE, autoencoders and generative models, with Q+A | No chat to report
Short Movies
Long Movies
Class Date | Readings | Movies
13 Mar | ch. 11 | Bias+Variance; simple model selection; IRLS
13 Mar/25 Mar | | Generalized linear models; Lasso
25 Mar | | More generalized linear models; Lasso; Elastic net; some other stuff which you can ignore
27 Mar, 1 April | 12.1, 12.2 | Boosting and Gradient Boost
3 April | 13.1 | Introductory Markov chains (the chapter reference is wrong - I changed the chapter numbers - it's an old movie)
3 April/8 April | 13.2 and 13.3 | Simulating Markov chains; text models; Hidden Markov Models; dynamic programming
10 April | 13.3 | Learning an HMM from data using EM
15 April | 14.1, 14.2, 14.3 | Some graphical models, structure learning points (last 1/2 hour of the movie! not ideal, but what I have; short movies are better)
17 April | 14.3, 15 | Some more structure learning leading into mean field ideas (not ideal, but what I have; short movies are better)
22 April | 16.1-16.2 | Units, stochastic gradient descent, and building a simple classifier (not ideal, but what I have; short movies are better)
24 April | 16.3-17.1 | Backpropagation, convolutional layers and gradient tricks (not ideal, but what I have; short movies are better)
29 Apr | 17.2, 18 | Image classifiers (same as short movie)
1 May | 18 | Object detectors
6 May | 19 | A lecture on TSNE, autoencoders and generative models, with Q+A
Announcements
Announcements page - check this frequently! (LINK ISN'T BROKEN.)
I will be absent 31 Jan (sorry!). Also absent 21 Feb, aargh!
Also absent 13 Mar (mild signs of illness, so self-isolating, aargh!).
Check the announcements page for movies!
Contact policy
I'm
quite distracted and am focusing on content preparation. Generally,
please do not bring DAF an issue you haven't already raised with a TA.
Questions I've been getting a lot
Getting into the class
In the past, we've been able to admit everyone who wanted to get into
the in-person version of the class after the first rush settled down.
Will this be true this semester? Who knows? Not me. PLEASE do not
come and tell me that you really want to get in, or that your cat died and
its last words were that you should take the class, or something. We're not going
to go over an enrollment of 100. Corollary: if you plan on dropping, do so early; someone else wants your seat.
Can I get in even though I won't be able to come to lecture because I'm doing
something else, but I'll watch the movies? I think this strategy is unwise, but I suppose
it's not really my problem.
Can I audit? The main resource limits on the
physical class are physical seats in the room. We cannot have an
overcrowded room. If physical seats are open, sure (I'm always happy
to have an audience); but please don't take a seat that should be
occupied by someone who is registered.
Important contact advice
A really common question is: how do I do something in R? Usually, I
get the answer to this by searching; I use Google, but you may have a
preferred search engine. If you ask me or a TA this question, and we do
it in front of you successfully, you should feel a little embarrassed,
because you could have done it for yourself. Warning:
we will embarrass you in this way; it's better to do this
sort of thing for yourself.
Office Hours
- DAF: 14h00-15h00 WF
- Tanmay Gangwani Mon, 10h00-11h00; Thur 10h00-11h00 Siebel 0207.
- Tiantian Fang Tue, 15h30-17h30 Siebel 0207.
Piazza
https://piazza.com/class/k62hiyy1jr81jm?cid=8
Evaluation is by: homeworks and a take-home final.
I will shortly post a policy on collaboration and plagiarism.
Homeworks
A total of 8 homeworks (it used to be 11, but the pandemic intervened) will appear here. There will be no final exam
- one homework will be designated a take-home final.
Homework DRAFTS
I will post drafts of the homeworks here, when we're fairly sure what we want you
to do, but are working out submission details, etc.
Submission details:
Please submit a single PDF to Gradescope. It should contain your solutions to all the problems included in the homework. If the homework involves coding, please include the code in the same PDF (at the end, as an appendix). This makes the graders' lives easy. If there's too much code, such that the PDF size exceeds the maximum allowed on Gradescope, submit a separate .zip for the code.
- Homework 1 (here) due 10 Feb 2020. Due date changed via Piazza announcement to 17 Feb 23h59.
- Homework 2 (here) due 24 Feb 2020. Note date change!
- Homework 3(here) due 2 Mar 2020.
- Homework 4(here) due 30 Mar 2020.
- Homework 5 (here) due 13 Apr 2020.
I've created some confusion with this homework,
for which I apologize. In (a) you should observe that
the cluster centers drift together. This happens
because at this scale, all the pixels are essentially
one cluster; the fact that they're not only becomes
apparent when you do (c). So the expected behavior
is: for (a), the centers drift together, and the maps
in (b) are all pretty much the same; for (c) the
centers do not drift, and the maps are rather
different. Apologies: I forgot some details.
AAARGH! I got a constant wrong in
(c). You will find (c) does not drift when you use a
covariance of (1/400)*Identity rather than
(1/10)*Identity. This is equivalent to pixels on a
scale of 0-20. If you've already submitted, no
worries; if you haven't, you'll find the whole thing a
great deal less puzzling if you use (1/400)*Identity.
SORRY! (A sketch of the scaled E-step appears after this list.)
- Homework 6 (here) due 20 Apr 2020.
- Homework 7 (here) due 27 Apr 2020.
AAARGH! I got a constant wrong
here. Use 0.5 rather than 2 for the theta connecting
the hidden state to the observation. If you've
already handed in, no worries; if you haven't, you'll
find the whole thing works a lot better if you do this.
SORRY!
AAARGH! There is a typo on p 363
that may affect your ability to do this homework; it
won't if you thrash through the math, but it'll save
you time to know about it. About halfway down the
page there are two equations. One reads a=e^(thing in
parens) and the other reads b=e^(another thing in
parens). The first should read a=(thing in parens)
and the second should read b=(another thing in parens).
SORRY!
- Homework 8 (here) due end of semester.
- Extra Homework for Four Hour version (here) due end of semester.
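To make the Homework 5 covariance fix concrete, here is a minimal sketch of the scaled E-step. This is NOT the reference solution; the function name estep and the variables X, mu, and pi_k are illustrative, and pixels are assumed scaled to [0, 1]:

    # One E-step of EM with fixed spherical covariance (1/400)*Identity.
    # X: N x d matrix of pixels; mu: K x d matrix of cluster centers;
    # pi_k: length-K vector of mixture weights. Returns N x K responsibilities.
    estep <- function(X, mu, pi_k, cov_scale = 1/400) {
      K <- nrow(mu)
      # log unnormalized responsibility: log(pi_k) - ||x - mu_k||^2 / (2 * cov_scale)
      logw <- sapply(1:K, function(k) {
        d2 <- rowSums(sweep(X, 2, mu[k, ])^2)   # squared distance to center k
        log(pi_k[k]) - d2 / (2 * cov_scale)
      })
      # normalize each row with the log-sum-exp trick for numerical stability
      m <- apply(logw, 1, max)
      w <- exp(logw - m)
      w / rowSums(w)
    }

With cov_scale = 1/10 the exponent varies so slowly across pixels that every center claims every pixel about equally, so the M-step drags the centers together; with 1/400 the responsibilities are much peakier, and the centers stay apart.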
Optional non graded homeworks
As is usual during pandemics, some people have time on their hands and
want to do more, and some have problems to deal with. I've cut down
the set of homeworks to accommodate people with problems to deal
with. If you have time on your hands, and would like to use it
learning more machine learning, here are some good choices of book exercises.
The TAs and I will try to help you with these.
- Regression
- If you want to pick up some of the math, 10.6, 10.7, 10.8 are
good places to start
- If you want to do some practical stuff, 11.2, 11.3 and 11.4
are worth doing
- Boosting
- If you want to pick up some of the math, 12.3 (last one is hard), 12.4 are
good places to start
- If you want to do some practical stuff, try any of the
programming exercises, but you will want to install
XGBoost, with the multithreading option. I found this easy on
Linux, and really quite exciting (but I got it to work) on a Mac.
(A minimal usage sketch appears after this list.)
- HMMs and Graphical Models
- If you want to pick up some of the math, 13.1-3 are
good places to start
- If you want to do some practical stuff, any of the
programming exercises, but 13.6 and 13.7 are a
serious performance
- If you want to pick up some structure learning, 14.2 is a
place to start
- Deep Learning
- You will get tremendous value from being able to do 16.1-16.4
- If you want to do some practical stuff, any one of 16.5-16.7
is a good place to start; all three is likely too much
- I've already set 17.1 and 17.2. Doing 17.3, 17.4 is
helpful.
A really sensible thing to do is to try and make 17.4 work better
by trying other tricks from chapter 18.
- If you have a GPU, etc, 18.2, 18.3, 18.4 are a good idea
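For the XGBoost suggestion in the boosting exercises above, here is a minimal fitting sketch on synthetic data (all settings, including nthread, are illustrative, not recommendations):

    # Gradient-boosted trees with the multithreaded xgboost R package.
    # install.packages("xgboost")   # CRAN binaries usually ship with multithreading
    library(xgboost)
    set.seed(1)
    x <- matrix(rnorm(1000 * 10), ncol = 10)    # 1000 synthetic examples, 10 features
    y <- as.numeric(x[, 1] + rnorm(1000) > 0)   # binary labels
    model <- xgboost(data = x, label = y,
                     nrounds = 50, max_depth = 4, eta = 0.1,
                     objective = "binary:logistic",
                     nthread = 4,               # the multithreading option
                     verbose = 0)
    mean((predict(model, x) > 0.5) == y)        # training accuracy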
Syllabus:
I will start at the beginning of the textbook and
proceed to the end, covering approximately one chapter per week. You'll notice there are 19
substantive chapters and 15 weeks; this is to allow a little
spreading out, but in week N I expect to be close to chapter 19*N/15 (so, for example, around chapter 10 by week 8).
Read the textbook. I wrote it specifically for this
course, AND it's free. I will split time in lecture between sketching
important points described in the text, and solving problems. If you
haven't read the text, this might be quite puzzling.
Required Text:
Applied Machine Learning D.A. Forsyth, Springer, 2019
Important: In the past, people have brought the PDF with
them on mobile devices. I think this is a good idea. Or you could buy a paper copy. The PDF is a
free download from the UIUC library (you have
to be on the intranet to download it, I think).
I'm a video star! (or at least, I have been filmed)
- you can see me here, though you'll need to log in, and it may take a short while after class for a recording to be ready
Current movies
Class Date | PM movie (low-res)
22-Jan | Classification; Nearest neighbors
24-Jan | Naive Bayes; SVMs
29-Jan | Learning theory
31-Jan | High dimensional data; Covariance matrices
5-Feb | Diagonalizing covariance matrices; multivariate normals
7-Feb | Principal components
12-Feb | More PCA; NIPALS
14-Feb | Low rank models
19-Feb | Canonical correlation analysis
21-Feb | We did not meet; movies in lieu on Announcements page
26-Feb | Vector quantization
28-Feb | End VQ; start EM
4-Mar | EM; start regression
6-Mar | Basic least squares regression; R-squared; outliers
11-Mar | Multivariate regression; using leverage, Cook's distance, standardized residuals; bias and variance
13-Mar | We did not meet; movies in lieu on Announcements page
Backup Material:
Probability and Statistics for Computer Science,
D.A. Forsyth
- I can no longer release a PDF, as this has been published. The
moiré effect on the cover picture is the result of my scanner
interacting with a shiny cover.
Code fragments I showed in class:
I've cleaned some of these up a bit, but they're not intended to be production code, etc;
just to show some R tricks. Among other things, these codes contain
known errors!
- A naive Bayes classifier on the Pima Indians dataset; I averaged over 10 test/train splits, and ignored examples with NA values; mainly interesting for simple code tricks. File here.
- A naive Bayes classifier on the Pima Indians dataset; I averaged over 10 test/train splits, but now I used examples with NA values both in train and test; mainly interesting for simple code tricks. File here.
- A naive Bayes classifier on the Pima Indians dataset, using klaR and caret; mainly interesting for simple code tricks. File here.
- An SVM on the Pima Indians dataset, using klaR and caret and SVMLight; mainly interesting for simple code tricks. File here.
- A much more elaborate SVM on the Pima Indians dataset, using klaR and caret and SVMLight. File here.
(A minimal sketch in the same spirit appears after this list.)
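In the same spirit as the fragments above, here is a minimal sketch of a naive Bayes classifier on the Pima Indians data via caret and klaR, averaged over 10 train/test splits. Loading the data from the mlbench package is an assumption (the class used its own files); the fragments above are the reference:

    # Naive Bayes on Pima Indians; accuracy averaged over 10 train/test splits.
    library(caret)     # createDataPartition() and the train() wrapper
    library(klaR)      # supplies the naive Bayes model behind method = "nb"
    library(mlbench)   # assumed data source for this sketch
    data(PimaIndiansDiabetes)
    accs <- replicate(10, {
      idx <- createDataPartition(PimaIndiansDiabetes$diabetes, p = 0.8, list = FALSE)
      train_set <- PimaIndiansDiabetes[idx, ]
      test_set  <- PimaIndiansDiabetes[-idx, ]
      model <- train(diabetes ~ ., data = train_set, method = "nb",
                     trControl = trainControl(method = "none"),
                     tuneGrid = data.frame(fL = 0, usekernel = FALSE, adjust = 1))
      mean(predict(model, test_set) == test_set$diabetes)
    })
    mean(accs)         # average test accuracy over the 10 splits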
R resources: