CS-498 Probability and Statistics for Computer Scientists

D.A. Forsyth --- 3310 Siebel Center

daf@uiuc.edu, daf@illinois.edu

MWF 1320 DCL, 09h00-09h50

TAs: Michael Sittig (sittig2@illinois.edu) and Nik Spirin (spirin2@illinois.edu)

Office Hours:

DAF: Monday, 10h00-11h00, Wednesday, 10h00-11h00, Friday, 10h00-11h00 (3310 Siebel)

Michael Sittig: Tuesdays, 14h00-15h00, Wednesdays, 17h00-18h00, Siebel Center downstairs TA office for 400-level classes

Nik Spirin: Thursdays, 17h00-18h00, Fridays, 11h00-12h00, Siebel Center downstairs TA office for 400-level classes

Homework 10: Due 15 Dec 2014

 

  1. Obtain the Kittiwake dataset (this one). It gives the population of each Kittiwake colony (Kittiwakes are a kind of sea bird) and the area of the colony. We are interested in the relationship between area and population. Use R's lm function (sample code available in the book) to compute four linear regressions: first, predicting population from area; second, predicting log(population) from area; third, predicting log(population) from log(area); and fourth, predicting population from log(area). For each regression, prepare a figure that shows a scatter plot of the actual data points together with the predictions made by the regression (i.e. at each x-value there should be two y-values: the true value and the value predicted by the regression). Which of these four models do you trust the most, and why?
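The mechanics of the four fits can be sketched as follows. This is only a rough outline under stated assumptions: the data frame `kitt`, its column names `area` and `population`, and the numbers in it are invented stand-ins, not the real Kittiwake data, and `plot_fit` is a hypothetical helper. Swap in the real file (for example with `read.table` or `read.csv`, depending on its format) before drawing any conclusions.

```r
# Made-up stand-in data: NOT the Kittiwake dataset.
# Column names `area` and `population` are assumptions.
set.seed(1)
kitt <- data.frame(area = c(50, 120, 300, 800, 1500, 2500, 4000))
kitt$population <- round(3 * kitt$area^0.6 *
                         exp(rnorm(nrow(kitt), sd = 0.1)))

# The four regressions named in the assignment.
fits <- list(
  pop_vs_area       = lm(population ~ area, data = kitt),
  logpop_vs_area    = lm(log(population) ~ area, data = kitt),
  logpop_vs_logarea = lm(log(population) ~ log(area), data = kitt),
  pop_vs_logarea    = lm(population ~ log(area), data = kitt)
)

# Hypothetical helper: scatter plot of the data (filled dots) with the
# regression's prediction at each x-value (red crosses).
# fitted() returns predictions on the scale of the model's response,
# so x and y must be passed on the same scales the model was fit on.
plot_fit <- function(fit, x, y, xlab, ylab) {
  plot(x, y, xlab = xlab, ylab = ylab, pch = 16)
  points(x, fitted(fit), col = "red", pch = 4)
}

plot_fit(fits$pop_vs_area, kitt$area, kitt$population,
         "area", "population")
plot_fit(fits$logpop_vs_area, kitt$area, log(kitt$population),
         "area", "log(population)")
plot_fit(fits$logpop_vs_logarea, log(kitt$area), log(kitt$population),
         "log(area)", "log(population)")
plot_fit(fits$pop_vs_logarea, log(kitt$area), kitt$population,
         "log(area)", "population")
```

Each figure is drawn in the coordinates the model was fit in, so the two y-values at every x are directly comparable. Deciding which model to trust is left to the reader; `summary(fits$pop_vs_area)` and a plot of `residuals()` against the fitted values are common places to start looking.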

  1. A kind person showed me how to remove this, but now I quite like it here.