ST790R -- Fall 2009
Homework #3 -- due Thursday, 24 September 2009
(** just turn in 5.13 (in text, as modified), and 4, 5, 6 below)

*5.13 *modified* Compare either the algorithm in 5.11 (code with a
loop) or (2.5.1) (*correct it*) with the 'usual' calculation of
SUM(x(i)^2) - (SUM(x(i)))^2/N, etc., and with R's 'var' function.
Use x(i) = (2^k) + i, for i = 1, ..., 48 and k = 12, 25, 27, 30.
Just compare the variance calculations.

0) Centering, or subtracting the mean from a covariate, is a common
practice in regression, and it has computational advantages. Writing
Xb = X*S*inv(S)*b = W*c, where X and W have the same column space and
the 2nd through p-th columns of W are the centered columns of X, write
out the p x p matrix S and construct its inverse.
( W = X*S and c = inv(S)*b )

R Exercises (from Homework #2):

1) Let y = (225, 215, 209, 175, 163, 135, 153, 125)' and let
x = (37, 41, 41, 45, 45, 49, 49, 53)' be the covariate of a simple
linear regression model, where the first column of the design matrix
X is a column of ones and the second column is x. Compute y'Pone y,
y'(Px - Pone)y, and y'(I - Px)y, where Pone is the matrix with each
element equal to 1/n (here n = 8), and Px = X*inv(X'X)*X' is the
usual symmetric projection matrix.
*** this time, do this using the qr() function ***

2) Read in the Cork Data from the file 'cork.dat' in the course
'rfiles' directory. My old-fashioned code is

    Cork <- t( matrix( scan("cork.dat"), 4, 28) )

which will make a 28 x 4 matrix.

3) Compute the sample covariance matrix using var(Cork) and call this
matrix V.

*4) Partition the matrix V into the first two and last two
rows/columns and compute V11 - V12*inv(V22)*V21.
*** For Homework #2, I wanted you to use chol() and forwardsolve()
(and that's all!), so redo (4). ***
*** this time, do this using either the qr() function or sweep ***
(you can use the original matrix 'Cork' instead of V)

*5) (Variation on 5.31)
a) Write the ridge regression estimator as the solution to a linear
least squares problem by adding observations.
b) In the 'rfiles' directory find a file 'longley.dat' with the
response (y) in the first column and six explanatory variables in the
other columns. Include an intercept in your model and compute the
ridge regression estimates for two different values of lambda using
two methods -- one orthogonalization method and one that is not
(e.g. Cholesky, sweep).

*6) Another variation on Cholesky starts at the lower right-hand
corner and constructs an upper triangular matrix R such that a
positive definite matrix A can be factored as A = RR'. Show the
algebra for the induction step and apply this method to the 3 x 3
matrix below:

     9   2  -2
     2   1   0
    -2   0   4

Illustrative R sketches for 5.13 and problems 0, 1, 4, 5, and 6
follow below.
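For 5.13, a minimal comparison harness (coding the loop algorithm of
5.11 is left to you; a stable two-pass formula stands in for it here).
Since x(i) = 2^k + i is just a shift of 1, ..., 48, the exact sample
variance is 48*49/12 = 196 for every k; the shortcut formula cancels
catastrophically as k grows, while var() stays close to 196.

    ## 5.13 sketch: shortcut one-pass formula vs. a stable two-pass
    ## formula vs. R's var(), for x(i) = 2^k + i, i = 1, ..., 48.
    for (k in c(12, 25, 27, 30)) {
      x <- 2^k + (1:48)
      n <- length(x)
      v.short <- (sum(x^2) - sum(x)^2 / n) / (n - 1)  # 'usual' shortcut
      v.two   <- sum((x - mean(x))^2) / (n - 1)       # centered two-pass
      cat("k =", k, " shortcut:", v.short,
          " two-pass:", v.two, " var():", var(x), "\n")
    }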
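For (0), one construction to check your algebra against (a sketch,
not the write-up itself): S is the p x p identity with -mean(x_j) in
position (1, j) for j = 2, ..., p, and its inverse flips those signs,
since (I + e1*v')(I - e1*v') = I whenever v'e1 = 0. A numerical check
with a hypothetical 8 x 3 design:

    ## Sketch for (0): build S and inv(S), then verify that
    ## W = X %*% S has centered columns 2..p and that S %*% Sinv = I.
    set.seed(1)
    X <- cbind(1, matrix(rnorm(8 * 2), 8, 2))   # n = 8, p = 3 design
    p <- ncol(X)
    xbar <- colMeans(X)
    S    <- diag(p); S[1, 2:p]    <- -xbar[2:p]
    Sinv <- diag(p); Sinv[1, 2:p] <-  xbar[2:p]
    W <- X %*% S
    round(colMeans(W), 10)       # should be (1, 0, 0)
    round(S %*% Sinv, 10)        # should be the identity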
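For (1), one way to use qr(): because the first column of X is the
ones column, the first column of Q from a Householder QR spans the
same space, so the elements of z = Q'y split the three quadratic
forms directly -- z[1]^2 = y'Pone y, z[2]^2 = y'(Px - Pone)y, and the
leftover sum of squares is y'(I - Px)y.

    ## Sketch for (1): quadratic forms from Q'y via qr.qty().
    y <- c(225, 215, 209, 175, 163, 135, 153, 125)
    x <- c(37, 41, 41, 45, 45, 49, 49, 53)
    X <- cbind(1, x)
    z <- qr.qty(qr(X), y)        # z = Q'y
    c(yP1y = z[1]^2,             # y' Pone y
      SSR  = z[2]^2,             # y' (Px - Pone) y
      SSE  = sum(z[-(1:2)]^2))   # y' (I - Px) y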
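For (4), a sketch of both routes to the Schur complement
V11 - V12*inv(V22)*V21. The chol()/forwardsolve() route is the
Homework #2 one. Note that base R's sweep() is an unrelated function,
so the SWEEP operator is hand-coded below (Goodnight's version, one
reading of the hint).

    ## Sketch for (4): Schur complement two ways.
    schur.chol <- function(V) {
      U <- chol(V[3:4, 3:4])                 # V22 = U'U, U upper triangular
      B <- forwardsolve(t(U), V[3:4, 1:2])   # B = inv(U') V21
      V[1:2, 1:2] - crossprod(B)             # V11 - B'B
    }
    sweep.op <- function(A, k) {             # SWEEP symmetric A on pivot k
      d <- A[k, k]
      B <- A - outer(A[, k], A[k, ]) / d
      B[k, ] <- A[k, ] / d
      B[, k] <- A[, k] / d
      B[k, k] <- -1 / d
      B
    }
    schur.sweep <- function(V) {             # sweep out rows/cols 3 and 4
      for (k in 3:4) V <- sweep.op(V, k)
      V[1:2, 1:2]
    }
    ## usage:  V <- var(Cork); schur.chol(V); schur.sweep(V)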
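For (5b), a sketch assuming 'longley.dat' is a plain
whitespace-delimited file with y in column 1. It uses the part-(a)
identity: appending sqrt(lambda)*I as p extra 'observations' with
zero responses makes ordinary least squares return the ridge
estimate, since Xa'Xa = X'X + lambda*I and Xa'ya = X'y. (Penalizing
the intercept along with the slopes is a simplification here; whether
to do so is a modeling choice.)

    ## Sketch for (5b): ridge by QR on the augmented problem, and by
    ## Cholesky on the penalized normal equations.
    dat <- as.matrix(read.table("longley.dat"))  # assumed file format
    y <- dat[, 1];  X <- cbind(1, dat[, 2:7])
    p <- ncol(X)
    ridge <- function(lambda) {
      Xa <- rbind(X, sqrt(lambda) * diag(p))     # p added observations
      ya <- c(y, rep(0, p))
      b.qr <- qr.coef(qr(Xa), ya)                # orthogonalization
      U <- chol(crossprod(X) + lambda * diag(p)) # X'X + lambda*I = U'U
      b.ch <- backsolve(U, forwardsolve(t(U), crossprod(X, y)))
      cbind(qr = b.qr, chol = b.ch)              # should agree
    }
    ridge(0.1); ridge(10)                        # two lambda values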
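For (6), the induction step in outline: split off the last row and
column, A = [ A11 a ; a' alpha ], and match against R = [ R11 r ;
0 rho ] in A = RR' to get rho = sqrt(alpha), r = a/rho, and
R11*R11' = A11 - r*r', the same factorization problem one dimension
smaller. A direct implementation and a check on the given matrix:

    ## Sketch for (6): 'reverse' Cholesky A = RR', R upper triangular,
    ## built from the lower right-hand corner up.
    revchol <- function(A) {
      n <- nrow(A)
      R <- matrix(0, n, n)
      for (k in n:1) {
        R[k, k] <- sqrt(A[k, k])               # rho = sqrt(alpha)
        if (k > 1) {
          R[1:(k-1), k] <- A[1:(k-1), k] / R[k, k]   # r = a / rho
          A[1:(k-1), 1:(k-1)] <- A[1:(k-1), 1:(k-1)] -
            tcrossprod(R[1:(k-1), k])          # A11 - r r'
        }
      }
      R
    }
    A <- matrix(c(9, 2, -2,  2, 1, 0,  -2, 0, 4), 3, 3)
    R <- revchol(A)
    R %*% t(R)   # should reproduce A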