CS-498 Probability and Statistics for Computer Scientists

D.A. Forsyth --- 3310 Siebel Center

daf@uiuc.edu, daf@illinois.edu

MWF 1320 DCL, 09h00-09h50

TAs: Michael Sittig (sittig2@illinois.edu) and Nik Spirin (spirin2@illinois.edu)

Office Hours:

DAF: Monday, 10h00-11h00, Wednesday, 10h00-11h00, Friday, 10h00-11h00 (3310 Siebel)

Michael Sittig: Tuesdays, 14h00-15h00, Wednesdays, 17h00-18h00, Siebel Center downstairs TA office for 400-level classes

Nik Spirin: Thursdays, 17h00-18h00, Fridays, 11h00-12h00, Siebel Center downstairs TA office for 400-level classes



Homework 9: Due 8 Dec 2014


  1. Do problems 13.1 and 13.2 at the end of chapter 13 ("Learning to Classify") in the 21 Nov edition of the book.
  2. Obtain the 50K dataset from the UC Irvine Machine Learning data repository (the dataset is this one). Drop all the categorical variables, and use R's nearest neighbor package to build a classifier that predicts whether an adult earns more or less than $50K. Use the package's train-test split machinery to estimate the accuracy of your classifier. The worked example at the start of chapter 15 shows a simple case of this.
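
The steps in problem 2 (drop categorical variables, split into train and test, classify by nearest neighbors, measure accuracy on the held-out part) can be sketched as follows. This is only an illustration of the procedure, written in plain Python rather than with the R package the homework asks for. The function names, the tiny `records` table and its labels are all made up for the example and are not taken from the UC Irvine data. It also assumes unscaled Euclidean distance, which you may want to reconsider on real data.

```python
import math
import random
from collections import Counter


def numeric_columns_only(records):
    """Keep only the columns whose value is a number in every record."""
    keep = [j for j in range(len(records[0]))
            if all(isinstance(r[j], (int, float)) for r in records)]
    return [[r[j] for j in keep] for r in records]


def train_test_split(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle the row indices, then hold out a fraction of them for testing."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(idx) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([rows[i] for i in train_idx], [labels[i] for i in train_idx],
            [rows[i] for i in test_idx], [labels[i] for i in test_idx])


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of held-out points whose predicted label matches the true one."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_y)


# Hypothetical toy rows: (age, workclass, hours per week). Not real data.
records = [(25, "Private", 20), (30, "Private", 25),
           (28, "State-gov", 22), (27, "Private", 24),
           (50, "Private", 60), (55, "Self-emp", 65),
           (52, "Private", 58), (53, "State-gov", 62)]
labels = ["<=50K"] * 4 + [">50K"] * 4

features = numeric_columns_only(records)  # drops the workclass column
tr_x, tr_y, te_x, te_y = train_test_split(features, labels)
print("held-out accuracy:", accuracy(tr_x, tr_y, te_x, te_y, k=3))
```

A single split gives a noisy accuracy estimate, so it is worth repeating the split with several seeds and averaging the results.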



