daf@uiuc.edu, daf@illinois.edu
MWF 1320 DCL, 09h00-09h50
TAs: Saurabh Singh (ss1@illinois.edu) and Daphne Tsatsoulis (tsatsou2@illinois.edu)
Office Hours:
DAF: Monday, 10h00-11h00, Wednesday, 10h00-11h00, Friday, 10h00-11h00 (3310 Siebel)
Saurabh: Monday 14h30-15h30, Wednesday 14h30-15h30 (Siebel Center downstairs TA office for 400 level classes)
Daphne: Monday, 13h00-15h00, Friday, 14h00-15h00 (Siebel Center downstairs TA office for 400 level classes) NOTE NEW HOURS!
This homework can be done in pairs. It is more elaborate than previous homeworks, and is intended to be somewhat open ended. A good submission will include figures like those in the text: examples of the cluster centers; examples of the histograms; examples of the signals; and a class confusion matrix. You can do this homework in relatively few lines of R (my version ran to about 30 lines), but it takes some thought to know which ones.
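A class confusion matrix is quick to build in R with table(). The sketch below is only an illustration: the labels are made up, and the names truth, predicted, confusion and accuracy are placeholders, not part of the assignment.

```r
# Made-up true and predicted class labels, for illustration only.
truth     <- factor(c("a", "a", "b", "b", "c", "c"), levels = c("a", "b", "c"))
predicted <- factor(c("a", "b", "b", "b", "c", "a"), levels = c("a", "b", "c"))

# Rows are the true class, columns are the predicted class.
confusion <- table(truth = truth, predicted = predicted)
print(confusion)

# Fraction classified correctly: the diagonal over the total.
accuracy <- sum(diag(confusion)) / sum(confusion)
print(accuracy)
```

Off-diagonal entries show which classes get mistaken for which, which is usually more informative than the accuracy alone.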
Here's a useful fragment of R, which I'll talk about in class:
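library(class) # provides knn()
setwd('/users/daf/Current/courses/Probcourse/Clustering/RCode')
wdat<-read.csv('turkiye-student-evaluation_R_Specific.csv')
wdm<-wdat[c(1:20),c(7:9)] # choose the questions
# flatten row by row, then cut the 60 values into 5 pieces of 12
wdm2<-matrix(as.vector(t(as.matrix(wdm))), nrow=5, byrow=TRUE)
foo<-kmeans(wdm2, 2) # cluster the pieces around 2 centers
# label each piece with its closest center (1-nearest-neighbour on the centers)
closest<-knn(foo$centers, wdm2, c(1:2))
wdm2<-as.data.frame(wdm2)
wdm2$type<-closest
bar<-subset(wdm2, (wdm2$type==1)) # keep the pieces closest to center 1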
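The fragment depends on a file on my machine, so here is a self-contained sketch of the same steps that runs anywhere. It makes several assumptions. The data is random and stands in for the survey file. The names pieces, fit and counts are placeholders. The final count of pieces per center is one guess at what a histogram might look like here, not a statement of what the homework requires.

```r
library(class)  # provides knn()

set.seed(1)  # kmeans starts from random centers; fix the seed for repeatability

# Made-up stand-in for the survey data: 20 rows, 3 columns of scores from 1 to 5.
wdm <- as.data.frame(matrix(sample(1:5, 60, replace = TRUE), nrow = 20))

# Flatten row by row, then cut the 60 values into 5 pieces of 12 values each.
pieces <- matrix(as.vector(t(as.matrix(wdm))), nrow = 5, byrow = TRUE)

# Cluster the pieces around 2 centers.
fit <- kmeans(pieces, 2)

# 1-nearest-neighbour against the centers: each piece gets the index
# of its closest center.
closest <- knn(fit$centers, pieces, c(1:2))

# Count how many pieces fall to each center.
counts <- table(factor(closest, levels = 1:2))
print(counts)
```

Calling barplot(counts) draws the counts as a bar chart. With real data you would want many more pieces and more than 2 centers; the tiny sizes here only keep the example readable.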