CS-498 Applied Machine Learning - Homework 6
D.A. Forsyth --- 3310 Siebel Center
15:30 - 16:45 OR 3.30 pm-4.45 pm, in old money
1320 Digital Computer Laboratory
Mariya Vasileva email@example.com
Sili Hui firstname.lastname@example.org
Daeyun Shin email@example.com
Ayush Jain firstname.lastname@example.org
Ayush Fri - 14h00-16h00 or 2 - 4 pm, location: in front of 3304
Daeyun Thu - 11h00-13h00 or 11 am - 1 pm, location: 0207 Siebel
Mariya Wed - 15h00-17h00 or 3 - 5 pm, location: 0207 Siebel
Sili Thu - 12h00-14h00 or 12 - 2 pm, location: 0207 Siebel
DAF Mon - 14h00-15h00, Fri - 14h00-15h00
or swing by my office (3310 Siebel) and see if I'm busy
Evaluation is by homeworks and a take-home final.
I will shortly post a policy on collaboration and plagiarism.
Homework 7: Due 11 April 2016 23h59 (Mon; midnight)
You should do this homework in groups of up to three; details of how to submit have been posted on Piazza. You can use any programming language you care to, but I think you'll prefer R because it has tools for this (lm and glmnet).
Details and description subject to minor changes
Regression of spatial data using kernel functions: Luke Spadavecchia, of the University of Edinburgh, has collected and is looking after a dataset of temperature measurements from 112 weather stations in Oregon. You can find this data here. The data consists of two files. One gives the location of each weather station, in a variety of units. You will use the UTM units. The other gives maximum and minimum temperature for various days in the years 2000-2004. You will use kernel functions to build various linear regressions predicting the annual mean of the minimum temperature as a function of position (i.e. you'll use one annual mean of minimum temperature for 2000, 2001, 2002, 2003, and 2004 at each weather station).
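One way to turn station positions into regression features is to place a Gaussian kernel at each station, once for every scale under consideration. The sketch below is only an illustration of that idea in Python (rather than the R suggested above). The function name gaussian_kernel_features, the station coordinates, and the scale values are all made up for the example and are not taken from the Oregon data.

```python
import math

def gaussian_kernel_features(points, base_points, scales):
    """Return one feature row per point.

    Each row holds exp(-d^2 / (2 s^2)) for every (scale, base point) pair,
    where d is the Euclidean distance from the point to the base point.
    """
    rows = []
    for (x, y) in points:
        row = []
        for s in scales:
            for (bx, by) in base_points:
                d2 = (x - bx) ** 2 + (y - by) ** 2
                row.append(math.exp(-d2 / (2.0 * s * s)))
        rows.append(row)
    return rows

# Hypothetical station positions in metres (stand-ins for UTM east/north).
stations = [(0.0, 0.0), (50_000.0, 0.0), (0.0, 80_000.0), (120_000.0, 60_000.0)]
# Six hypothetical scales, also in metres.
scales = [10_000.0, 20_000.0, 40_000.0, 80_000.0, 160_000.0, 320_000.0]

# Every station serves as a base point, so there are
# len(stations) * len(scales) columns.
X = gaussian_kernel_features(stations, stations, scales)
print(len(X), len(X[0]))  # 4 24
```

The same function can be called with the grid points as `points` and the stations as `base_points` to build the feature rows needed for prediction on a grid.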
- Use a kernel method (section 9.1.1) to smooth this data, and predict the average annual temperature at each point on a 100x100 grid spanning the weather stations. You should use each data point as a base point, and you should search over a range of at least six scales, using cross-validation to choose the scale. Use a Gaussian kernel. Plot your prediction as an image. Compare to the Kriging result that you can find in figure 4 here.
- Regularize this kernel method (section 9.1.1; worked example 9.1; associated code) using the lasso, and predict the average annual temperature at each point on a 100x100 grid spanning the weather stations. You should choose the regularization constant using cross-validation (cv.glmnet will do the work for you; read the manual!). You should use each data point as a base point, and you should use a range of at least six scales. Plot your prediction as an image. Compare to the Kriging result that you can find in figure 4 here. How many predictors does your model use? How does prediction error change with the number of predictors?
- Now investigate the effect of different choices of elastic net constant (alpha)
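
To make the role of the regularization constant and of alpha concrete, here is a minimal pure-Python sketch of elastic net regression on Gaussian kernel features, with a small cross-validation loop over the regularization constant. It is not a substitute for glmnet or cv.glmnet: it assumes the usual elastic net objective written in the code comment, it does not standardize columns (glmnet does by default), and it runs a fixed number of coordinate descent sweeps. The data are synthetic (a 5x5 grid of made-up positions and a made-up temperature surface, with three scales to keep the run short), and all function names are hypothetical.

```python
import math
import random

def kernel_features(points, bases, scales):
    # One Gaussian bump exp(-d^2 / (2 s^2)) per (scale, base point) pair.
    return [[math.exp(-((px - bx) ** 2 + (py - by) ** 2) / (2.0 * s * s))
             for s in scales for (bx, by) in bases]
            for (px, py) in points]

def soft_threshold(z, t):
    return z - t if z > t else (z + t if z < -t else 0.0)

def elastic_net(X, y, lam, alpha, n_iter=30):
    # Coordinate descent on the assumed objective
    # (1/2n)*||y - b0 - X b||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2).
    # Columns are NOT standardised here.
    n, p = len(X), len(X[0])
    b0, beta = sum(y) / n, [0.0] * p
    resid = [yi - b0 for yi in y]
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        shift = sum(resid) / n  # exact update for the unpenalised intercept
        b0 += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                resid = [resid[i] - delta * X[i][j] for i in range(n)]
                beta[j] = new
    return b0, beta

def predict(X, b0, beta):
    return [b0 + sum(b * x for b, x in zip(beta, row)) for row in X]

def mse(y, yhat):
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)

def cv_error(X, y, lam, alpha, k=5, seed=0):
    # k-fold cross-validated mean squared error, using a fixed shuffle.
    order = list(range(len(y)))
    random.Random(seed).shuffle(order)
    total = 0.0
    for f in range(k):
        held = order[f::k]
        train = [i for i in order if i not in held]
        b0, beta = elastic_net([X[i] for i in train], [y[i] for i in train], lam, alpha)
        preds = predict([X[i] for i in held], b0, beta)
        total += sum((y[i] - p) ** 2 for i, p in zip(held, preds))
    return total / len(y)

# Synthetic stand-in data (NOT the Oregon measurements): positions in km
# on a 5x5 grid and a smooth made-up "temperature" surface.
stations = [(25.0 * i, 25.0 * j) for i in range(5) for j in range(5)]
temps = [5.0 + 0.05 * x - 0.03 * y for (x, y) in stations]
X = kernel_features(stations, stations, [20.0, 40.0, 80.0])

results = {}
for alpha in (0.25, 0.5, 1.0):
    # Pick the regularization constant with the lowest cross-validated error.
    best_lam = min((0.001, 0.01, 0.1), key=lambda lam: cv_error(X, temps, lam, alpha))
    b0, beta = elastic_net(X, temps, best_lam, alpha)
    n_used = sum(1 for b in beta if b != 0.0)  # predictors with nonzero weight
    results[alpha] = (best_lam, n_used, mse(temps, predict(X, b0, beta)))
    print(alpha, results[alpha])
```

Each printed row gives the chosen regularization constant, the number of predictors with nonzero coefficients, and the training error for one value of alpha. Setting alpha to 1 gives the lasso penalty; smaller values mix in more of the ridge penalty.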