Homework 2 St 708

/* ---------------------------------------Due Sep. 9 2003-----------
 | This exercise continues to review linear regression, illustrates |
 | projection ideas, and gives practice with more IML matrix tools. |
   ------------------------------------------------------------------ */


Questions

1. The SAS IML function DET(M) returns the determinant of M.  Using
   this function, compute the determinants of the matrices B and C
   below:

          1  3  2  5            1  3  1  5
          1  2  5  3        C = 1  2  0  3
      B = 1 -1  2  2            1 -1  2  2
          5  2 -1  4            5  2 -6  1

   What does this information tell you about these matrices?  The
   statement  D = inv(A);  returns D as the inverse of matrix A.  Use
   this to compute the inverses of B and C if possible.  If either B
   or C is not full rank, find a nontrivial column vector V such that
   BV (or CV) is a column of 0s.  Is there ANOTHER nontrivial V that
   will work and which is not just a multiple of your first V?  How do
   you know?  ("trivial" means all 0 entries).

   As a review of some matrix operations, pick out the middle 2
   columns of B and stack them on top of the middle 2 columns of C to
   get a new 8x2 matrix.  Output this to a dataset and plot the first
   column against the second.  Class demo programs IML_commands.sas
   and MAT2DAT.sas should have most of the commands you'll need.
   Submit the plot and the IML code you used to do the job.

2. Find, if possible, values a,b,c and d such that all four of these
   equations are satisfied:

        a + 3b + 2c + 5d = 10
        a + 2b + 5c + 3d =  3
        a -  b + 2c + 2d = -1
       5a + 2b -  c + 4d = 11

   Explain how knowing the inverse of matrix B from problem 1 can help
   you in solving these equations and show how to program the solution
   in SAS PROC IML.

3. Fill in the missing entry of matrix D in such a way that D has no
   inverse.  How many right answers are there to this problem?

          7  4   6
      D = 2  2  12
          4  5  __

4. Compute the matrix P = X (X'X)^(-1) X' for the matrix

          1 -2
          1 -1
      X = 1  0
          1  1
          1  2

   Check to see if P is idempotent.  Show how we can use a quick
   method from the book (or class lecture) to find the rank of P
   without knowing its relationship to X.  [Note: Because we know how
   P and X are related, we see that every column of P is a linear
   combination of the two X columns so we know from that fact that
   rank(P) = 2 and we know that P is idempotent.
   I am asking you to check these as though you did not know about X.]

   Write down any column vector with 5 entries, calling the vector Y.
   Compute PY, PPY, PPPY, and PPPPY.  Are the elements of PY "equally
   spaced" (in other words is the difference between the first and
   second entry the same as the difference between the second and
   third, third and fourth, etc.)?  If I run a simple linear
   regression (with intercept) of your column of 5 original Y values
   on the intercept column and a column X = (1 2 3 4 5)', take the
   residuals R, and then project them with P, what will be the
   resulting PR?  (hint: You can do the computation but you should
   know without doing it).

5. Show that the matrix X below is a projection matrix.

          | .5   0   0   0  .5 |
          |  0  .5   0  .5   0 |
      X = |  0   0   1   0   0 |
          |  0  .5   0  .5   0 |
          | .5   0   0   0  .5 |

   Compute the projections of the vectors V1 = {1 1 1 1 1}' and
   V2 = {1 3 3 3 1}' and V3 = {1 2 3 4 1}'.  Find, if possible, scalar
   constants a and b such that the projection of the vector aV1 + bV2
   is not just aV1 + bV2.

   Is the space spanned by V1 and V2 equal to the space into which X
   projects vectors?  Is it a subspace?  How do you know?

   Is the space spanned by V1 and V3 equal to the space into which X
   projects vectors?  Is it a subspace?  How do you know?

6. Here are some data from problem 1.4 page 30 and a simple PROC REG
   in SAS.  Y is resting heart rate and X is body weight in kilograms:

   DATA HEART; INPUT X Y; CARDS;
   90 62
   86 45
   67 40
   89 55
   81 64
   75 53
   ;
   PROC GPLOT; PLOT Y*X; symbol1 v=dot c=red i=RL;
   PROC REG; MODEL Y=X/P R XPX I COVB;
   ********************************************
   ** note how PROC REG borders the X'X matrix with X'Y, Y'Y etc.
      and similarly b and SSE form a border for the inverse of X'X
      This is part of the sweep operator.  Look at the log window
      where the equation of the line in the plot is shown.
   *******************************************;
   RUN;

   *** Now we try this in IML ***;
   PROC IML; reset spaces=5;
   X = { 1 90, 1 86, 1 67, 1 89, 1 81, 1 75};
   Y = {62, 45, 40, 55, 64, 53} ;
   XPX=X`*X; XPY=X`*Y; IXPX =inv(XPX);
   b=                    ;
   print X Y XPX XPY;
   ident = I(6);
   print IXPX b ident;

   questions:

   (a) Add the computation of the estimated parameter vector b to the
       program and verify that vector b gives the regression
       coefficients.

   (b) The error sum of squares is Y`Y - b`X`Y in matrix form and it
       has n-2 degrees of freedom.  Add the computation of the error
       MEAN square MSE to the IML program.

   (c) Verify that if you multiply the inverse X'X matrix, IXPX, by
       MSE you get the COVB matrix from PROC REG.

   (d) Verify that if you take the square roots of the diagonal
       elements in COVB you will get the "standard errors" delivered
       by PROC REG.

   (e) Compute, in IML, the projection matrix P for this regression
       problem.  Compute PY and verify that this delivers the same
       predicted values as PROC REG.

   (f) Compute the matrix I-P for this regression and show that
       (I-P)Y is the vector of residuals delivered by PROC REG.  Also
       compute and print Y'(I-P)Y.  Does this appear on your PROC REG
       output somewhere?

7. Optional (not graded) : For the matrix A below, write out a degree
   4 polynomial whose roots are the eigenvalues of A.

      A = {4 0 2 1, 0 4 1 0, 2 1 6 2, 1 0 2 2};

   I hope you will do this by hand, but the program below can be used
   to check your answer (and it contains some interesting SAS code).

   Note: In the following program, the determinant function is used to
   compute f(L) = | A - L*I | for a sequence of Ls.  Note that I(4) is
   a 4x4 identity matrix and note the method for passing matrices into
   SAS datasets.  Also V must be initialized so it exists within the
   DO loop, hence the shape function.  Next the program uses PROC GLM
   to find the polynomial equation for f(L).  Hence this can be used
   to check your algebra.
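   As a small worked illustration of expanding | M - L*I | by hand,
   here is the 2x2 case.  The matrix M is hypothetical and is not part
   of this assignment; it is only meant to show the pattern that the
   4x4 expansion follows.

```latex
M = \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}, \qquad
f(L) = |M - L I|
     = \begin{vmatrix} 2-L & 1 \\ 1 & 3-L \end{vmatrix}
     = (2-L)(3-L) - 1
     = L^2 - 5L + 5 .
```

   The roots of f(L) = 0, namely L = (5 +/- sqrt(5))/2, are the
   eigenvalues of M.  For a 4x4 matrix the same expansion gives a
   degree 4 polynomial, with more determinant terms to collect.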
=====================================================================
PROC IML;
A = {4 0 2 1, 0 4 1 0, 2 1 6 2, 1 0 2 2};
V = shape(0,8,2);   ** Dummy matrix of 0;
do L = 1 to 8;
  V[L,1]=L; V[L,2]= det(A - L*I(4));   ** I(4) = identity;
end;
** note {} matrices  [] options or elements  () functions;
create data1 from V [colname={L FL} ]; append from V;  ** read V into dataset;
proc glm; model FL= L L*L L*L*L L*L*L*L L*L*L*L*L; run;
=====================================================================

Some IML functions:

  C = Det(M)          (scalar C is determinant of M)
  V = shape(0,8,2)    (V is an 8x2 matrix of 0s)
  create & append     (form a dataset from a matrix - colname matrix
                       contains variable names)
  ID = I(4)           (ID is a 4x4 identity matrix)
  B = inv(A)          (B is inverse of A)
  + - *               (add subtract multiply)
  C = A*B             (ordinary product OR scalar product if A or B
                       scalar)
                      (can also have A +/- B with B scalar)
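
A minimal sketch of the functions listed above used together.  The
2x2 matrix M, the check matrix CHK, the dataset name DEMO, and the
variable names COL1 and COL2 are all hypothetical and chosen only for
illustration; none of them come from the homework problems.

```sas
proc iml;
  M   = {2 1, 1 3};        ** hypothetical 2x2 matrix;
  C   = det(M);            ** scalar determinant, 2*3 - 1*1 = 5;
  B   = inv(M);            ** inverse exists because C is not 0;
  ID  = I(2);              ** 2x2 identity matrix;
  CHK = M*B - ID;          ** should be all 0s up to rounding;
  V   = shape(0,3,2);      ** 3x2 matrix of 0s;
  V[1,1] = C;              ** [] picks out a single element;
  print C B CHK V;
  create demo from V [colname={COL1 COL2}];
  append from V;           ** write the rows of V into dataset DEMO;
  close demo;
quit;
proc print data=demo; run;
```

Printing CHK is a quick way to confirm that inv() returned a usable
inverse; for a matrix that is not full rank, inv() would be expected
to fail instead, which is why it helps to look at det() first.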