D.A. Forsyth --- 3310 Siebel Center

The interior of a sunken 727, in low light.

TA's:

 

 

Homework 6: Due 26 Mar 2018 23h59 (Mon; midnight)

I have been pressured to slip this, and have succumbed to this temptation; the due date is now 2 April 2018. However, I wouldn't rely on this working every time.

 

You may do this homework in groups of up to 3 contributors. Groups of 1 or of 2 are just fine, too. A group can consist of any mixture of any type of student (MSCS-DS/Online/Face). We do not offer coordination services for complex group interactions, and you may want to take this into account when forming your group.

Submission: Submission details for Homework 6 are the same as for earlier homeworks.

Be careful about your choice of programming environment for this homework. You must use glmnet (the Python lasso tool has given trouble in the past). I have had real problems with glmnet in Matlab. I very strongly advise using glmnet in R.
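For orientation, here is a minimal sketch of how glmnet in R is typically driven when comparing regularizers. It is not a solution to the homework: the data is synthetic, and the names `x`, `y`, `beta`, `alphas`, and `fits` are assumptions made up for illustration. The one thing to take from it is that the `alpha` argument selects the penalty (0 for ridge, 1 for lasso, values in between for elastic net), while `cv.glmnet` picks the regularization weight `lambda` by cross-validation.

```r
# Sketch only: cross-validated ridge, elastic net, and lasso with glmnet.
# The data below is synthetic (an assumption for illustration).
library(glmnet)

set.seed(1)
n <- 200
p <- 20
x <- matrix(rnorm(n * p), nrow = n, ncol = p)  # glmnet expects a numeric matrix
beta <- c(3, -2, 1.5, rep(0, p - 3))           # hypothetical: only 3 features matter
y <- as.vector(x %*% beta + rnorm(n))

# alpha selects the penalty: 0 = ridge (L2), 1 = lasso (L1), between = elastic net
alphas <- c(ridge = 0, elastic_net = 0.5, lasso = 1)
fits <- lapply(alphas, function(a) {
  cv.glmnet(x, y, family = "gaussian", alpha = a)
})

# Report the lambda with the smallest cross-validated error for each penalty
for (name in names(fits)) {
  fit <- fits[[name]]
  i <- which(fit$lambda == fit$lambda.min)
  cat(sprintf("%-12s lambda.min = %.4f  cv mse = %.4f  nonzero = %d\n",
              name, fit$lambda.min, fit$cvm[i], fit$nzero[i]))
}

# plot(fits$lasso) would draw cross-validated error against log(lambda)
```

Calling `coef(fits$lasso, s = "lambda.min")` afterwards shows which coefficients the lasso kept at the selected `lambda`.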

 

  1. Linear regression with various regularizers. The UCI Machine Learning dataset repository hosts a dataset giving features of music, and the latitude and longitude from which that music originates, here. Investigate methods to predict latitude and longitude from these features, as below. There are actually two versions of this dataset. Either one is OK by me, but I think you'll find the one with more independent variables more interesting. You should ignore outliers (by this I mean you should ignore the whole question; do not try to deal with them). You should regard latitude and longitude as entirely independent; that is, predict each with its own regression.
  2. Logistic regression. The UCI Machine Learning dataset repository hosts a dataset recording whether a Taiwanese credit card user defaults, together with a variety of features, here. Use logistic regression to predict whether the user defaults. You should ignore outliers, but you should try the various regularization schemes we have discussed.
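For the logistic regression problem, the following is a minimal sketch of regularized logistic regression with glmnet in R, assuming synthetic data rather than the credit card dataset. The variables `x`, `y`, `logit`, `train`, and `acc` are hypothetical names chosen for illustration. It uses `family = "binomial"` for a two-class response and `type.measure = "class"` so that cross-validation scores misclassification error; changing `alpha` switches between ridge, elastic net, and lasso penalties.

```r
# Sketch only: cross-validated, regularized logistic regression with glmnet.
# The data below is synthetic (an assumption for illustration).
library(glmnet)

set.seed(2)
n <- 500
p <- 10
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
logit <- 1.5 * x[, 1] - 2 * x[, 2]                      # hypothetical true model
y <- rbinom(n, size = 1, prob = 1 / (1 + exp(-logit)))  # 1 = default, 0 = no default

# Hold out 100 examples to estimate accuracy on unseen data
train <- sample(n, 400)

# alpha = 1 is the lasso; try 0 (ridge) and intermediate values as well
fit <- cv.glmnet(x[train, ], factor(y[train]), family = "binomial",
                 alpha = 1, type.measure = "class")

# type = "class" returns predicted labels ("0" or "1") as a character matrix
pred <- predict(fit, newx = x[-train, ], s = "lambda.min", type = "class")
acc <- mean(as.integer(pred) == y[-train])
cat(sprintf("held-out accuracy: %.3f\n", acc))
```

Using `s = "lambda.1se"` instead of `"lambda.min"` selects a more heavily regularized model whose cross-validated error is within one standard error of the minimum.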