/******************************************************************* CHAPTER 8, EXAMPLE 3 Analysis of the hip replacement data using a general regression model in time and age - the repeated measurement factor is time (weeks) - there is one "treatment" factor, gender (0=female, 1 = male) - an additional covariate, age, is also available - the response is haematocrit We use the REPEATED statement of PROC MIXED with the TYPE= options to fit the model assuming different covariate structures. These data are unbalanced both in the sense that some patients were not observed at all times. *******************************************************************/ options ls=80 ps=59 nodate; run; /****************************************************************** Read in the data set Define variable 'WEEK2 = WEEK^2' Define variable 'TIME = WEEK' *******************************************************************/ data hips; infile "hips.dat"; input patient gender age week h; week2=week*week; time=week; run; proc print data=hips; run; /***************************************************************** DATA The SAS System 1 Obs patient gender age week h week2 time 1 1 1 66 0 47.10 0 0 2 1 1 66 1 31.05 1 1 3 1 1 66 3 32.80 9 3 4 2 1 70 0 44.10 0 0 5 2 1 70 1 31.50 1 1 6 2 1 70 3 37.00 9 3 7 3 1 44 0 39.70 0 0 8 3 1 44 1 33.70 1 1 9 3 1 44 3 24.50 9 3 10 4 1 70 0 43.30 0 0 11 4 1 70 1 18.35 1 1 12 4 1 70 3 36.60 9 3 13 5 1 74 0 37.40 0 0 14 5 1 74 1 32.25 1 1 15 5 1 74 3 29.05 9 3 16 6 1 65 0 45.70 0 0 17 6 1 65 1 35.50 1 1 18 6 1 65 3 39.80 9 3 19 7 1 54 0 44.90 0 0 20 7 1 54 1 34.10 1 1 21 7 1 54 3 32.05 9 3 22 8 1 63 0 42.90 0 0 23 8 1 63 1 32.05 1 1 24 9 1 71 0 46.05 0 0 25 9 1 71 1 28.80 1 1 26 9 1 71 3 37.80 9 3 27 10 1 68 0 42.10 0 0 28 10 1 68 1 34.40 1 1 29 10 1 68 2 34.00 4 2 30 10 1 68 3 36.05 9 3 31 11 1 69 0 38.25 0 0 32 11 1 69 1 29.40 1 1 33 11 1 69 2 32.85 4 2 34 11 1 69 3 30.50 9 3 ----- ************************************************************************/ /******************************************************************* Fit the straight line model assuming that the covariance structure of a data vector is diagonal with constant variance; i.e. using ordinary least squares. We use PROC GLM with the SOLUTION and NOINT options to fit the three separate intercepts/slopes parameterization. *******************************************************************/ title "FIT USING ORDINARY LEAST SQUARES"; proc glm data=hips; class patient time gender; model h = gender gender*week gender*week2 age / noint solution; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* ## Our MODEL: ## ## males: Yi(t_ij) = beta1 + beta2 t_ij + beta3 (t_ij)^2 + beta7 age_i +EPS_ij ## females: Yi(t_ij) = beta4 + beta5 t_ij + beta6 (t_ij)^2 + beta7 age_i +EPS_ij ## ## OR equivalently ## ## Yi(t_ij) = beta1 I(Gender_i=MALE) + beta4 I(Gender_i= FEMALE) ## + beta2 I(Gender_i=MALE) t_ij + beta5 I(Gender_i= FEMALE) t_ij ## + beta3 I(Gender_i=MALE) (t_ij)^2 + beta6 I(Gender_i= FEMALE) (t_ij)^2 ## + beta7 age_i ## + EPS_ij ## OVERALL ANOVA FIT USING ORDINARY LEAST SQUARES 6 The GLM Procedure Dependent Variable: h Sum of Source DF Squares Mean Square F Value Pr > F Model 7 117876.6500 16839.5214 923.93 <.0001 Error 92 1676.7875 18.2260 Uncorrected Total 99 119553.4375 R-Square Coeff Var Root MSE h Mean 0.492463 12.45853 4.269186 34.26717 --------------------------------------------------------------------- # Type I SS (recall these are sequential SS, obtained using a 'forward' technique) # At each step the SS corresponds to a model only for everything that is before # Eg: read 2nd line: SS for accounts ONLY for the information # read 4th line: SS for accounts for the , AND Source DF Type I SS Mean Square F Value Pr > F gender 2 116335.3869 58167.6935 3191.48 <.0001 week*gender 2 576.5998 288.2999 15.82 <.0001 week2*gender 2 956.3305 478.1652 26.24 <.0001 age 1 8.3328 8.3328 0.46 0.5006 # Type III SS (recall these are separate SS, which account for everything else). # # Eg: read 2nd line: SS for accounts for , AND # read 4th line: SS for accounts for , AND # Note the type I and III SS for coincide !!! Source DF Type III SS Mean Square F Value Pr > F gender 2 2751.501435 1375.750718 75.48 <.0001 week*gender 2 1299.352485 649.676243 35.65 <.0001 week2*gender 2 955.412729 477.706365 26.21 <.0001 age 1 8.332762 8.332762 0.46 0.5006 # OLS estimates of BETA: # These estimates use an independence assumption - very unlikely with longitudinal data # CONSEQUENCE: standard error of the estimators are misleading ! Parameter Estimate Error t Value Pr > |t| gender 0 36.15477261 3.32710583 10.87 <.0001 gender 1 40.12592025 3.26578601 12.29 <.0001 week*gender 0 -9.51744511 1.84332569 -5.16 <.0001 week*gender 1 -14.28704343 2.14023390 -6.68 <.0001 week2*gender 0 2.56534826 0.57617846 4.45 <.0001 week2*gender 1 3.84374965 0.67355325 5.71 <.0001 age 0.03169663 0.04687742 0.68 0.5006 *******************************************************************/ /********************************************************************** FINAL REMARKS about the OLS estimates of the mean parameters BETA: When de design is balanced (same timepoints across the subjects), the OLS estimates of BETA coincide with the GLS estimates, under the CS covariance assumption. Because these data are not balanced, the OLS estimates of the mean parameters, beta, are different from the GLS estimates with CS covariance structure. **********************************************************************/ /******************************************************************* Use PROC MIXED to fit the general quadratic regression model with assumptions about the covariance matrix of a data vector. The SOLUTION option in the MODEL statement requests that the estimates of the regression parameters be printed. The R option in the REPEATED statement as used here requests that the covariance matrix estimate be printed in matrix form. Here, because the data have unequal numbers of observations, we ask to see the matrices for 2 individuals with different numbers. Similarly for the RCORR option, which prints the corresponding correlation matrix. With the ar(1) and one-dependent structures, we have to be careful to communicate to PROC MIXED the fact that the data are imbalanced in the sense that the times are all the same for all patients, but some patients are not observed at some of the times. In our mean model, we want WEEK, the time factor, to be continuous; however, PROC MIXED needs also for the time factor to be a classification factor so that it can properly figure out the missingness pattern. We give it this information by defining TIME = WEEK and letting TIME be a classification factor in the REPEATED statement. *******************************************************************/ * unstructured; title "FIT WITH UNSTRUCTURED COMMON COVARIANCE"; proc mixed data=hips; class patient time gender; model h = gender gender*week gender*week2 age / noint solution chisq; repeated time / type = un subject=patient r= 1,10 rcorr=1,10; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* FIT WITH UNSTRUCTURED COMMON COVARIANCE 12 The Mixed Procedure Model Information Data Set WORK.HIPS Dependent Variable h Covariance Structure Unstructured Subject Effect patient Estimation Method REML Residual Variance Method None Fixed Effects SE Method Model-Based Degrees of Freedom Method Between-Within Class Level Information Class Levels Values patient 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 time 4 0 1 2 3 gender 2 0 1 Dimensions Covariance Parameters 10 Columns in X 7 Columns in Z 0 Subjects 30 Max Obs Per Subject 4 Number of Observations Number of Observations Read 99 Number of Observations Used 99 Number of Observations Not Used 0 # # NOT PRINTED: # 1. Iteration history (REML is used to fit the covariance parameters, and the number of iterations # to convergence is printed) # 2. Convergence status # We print the estimated covariances/correlations for 2 subjects (one indexed 1, and another indexed 10) Estimated R Matrix for patient 1 Row Col1 Col2 Col3 1 18.0680 4.6364 5.0947 2 4.6364 16.5021 0.4870 3 5.0947 0.4870 19.2076 Estimated R Correlation Matrix for patient 1 Row Col1 Col2 Col3 1 1.0000 0.2685 0.2735 2 0.2685 1.0000 0.02735 3 0.2735 0.02735 1.0000 Estimated R Matrix for patient 10 Row Col1 Col2 Col3 Col4 1 18.0680 4.6364 -13.9213 5.0947 2 4.6364 16.5021 2.8483 0.4870 3 -13.9213 2.8483 67.8805 25.1818 4 5.0947 0.4870 25.1818 19.2076 Estimated R Correlation Matrix for patient 10 Row Col1 Col2 Col3 Col4 1 1.0000 0.2685 -0.3975 0.2735 2 0.2685 1.0000 0.08510 0.02735 3 -0.3975 0.08510 1.0000 0.6974 4 0.2735 0.02735 0.6974 1.0000 # Covariance Parameter Estimates [recall an unstructured covariance matrix was assumed. This entails 10 parameters] Cov Parm Subject Estimate UN(1,1) patient 18.0680 UN(2,1) patient 4.6364 UN(2,2) patient 16.5021 UN(3,1) patient -13.9213 UN(3,2) patient 2.8483 UN(3,3) patient 67.8805 UN(4,1) patient 5.0947 UN(4,2) patient 0.4870 UN(4,3) patient 25.1818 UN(4,4) patient 19.2076 ## Fit Statistics [fitted likelihood; and 3 versions of penalized likelihoods (AIC, AICC , BIC)] -2 Res Log Likelihood 544.5 AIC (smaller is better) 564.5 AICC (smaller is better) 567.2 BIC (smaller is better) 578.5 ## Null Model Likelihood Ratio Test for testing a independence covariance structure for the data vs ## an unstrucred covariance structure, but common in the 2 gender groups ## Formally written this hypothesis test is ## H0: Sigma = sigma^2 Indentity vs H1: Sigma = UNSTRUCTURED ## ## Note: the covariance specified in H0 is parameterized by 1 PARAMTER ## the covariance specified in H1 is parameterized by 10 PARAMTERS ## ## The test used is LRT, which has Chi-square dist'n, with (10-1)=9 DEGREES OF FREEDOM DF Chi-Square Pr > ChiSq 9 16.60 0.0554 # GLS estimates of BETA, when COMMON unstructured covariance structure is assumed Solution for Fixed Effects Standard Effect gender Estimate Error DF t Value Pr > |t| gender 0 42.2823 3.1835 28 13.28 <.0001 gender 1 45.5650 3.1116 28 14.64 <.0001 week*gender 0 -11.4526 1.8018 28 -6.36 <.0001 week*gender 1 -15.8799 2.0222 28 -7.85 <.0001 week2*gender 0 2.9269 0.5640 28 5.19 <.0001 week2*gender 1 4.2369 0.6368 28 6.65 <.0001 age -0.04330 0.04465 28 -0.97 0.3405 *******************************************************************/ * PROC MIXED with compound symmetry; title "FIT WITH COMMON COMPOUND SYMMETRY"; proc mixed data=hips; class patient time gender; model h = gender gender*week gender*week2 age / noint solution chisq; repeated time / type = cs subject=patient rcorr=1,10; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* ## Fitted likelihoods and penalized likelihoods Fit Statistics -2 Res Log Likelihood 556.7 AIC (smaller is better) 560.7 AICC (smaller is better) 560.8 BIC (smaller is better) 563.5 # GLS estimates of BETA, when COMMON compound symmetric covariance structure is assumed. # # Note these estimates are different from the GLS estimates with unstructured covariance. # However, we note that these estimates are similar across methods. Because these are # longitudinal data, however, the estimates that are based on a model that take into account the # likely correlation among observations within the same unit is more credible, and the tests and # standard errors derived from such a model are more reliable. Solution for Fixed Effects Effect gender Estimate Error DF t Value Pr > |t| gender 0 35.7027 3.8826 28 9.20 <.0001 gender 1 39.6756 3.8088 28 10.42 <.0001 week*gender 0 -9.5954 1.6604 64 -5.78 <.0001 week*gender 1 -14.2653 1.9229 64 -7.42 <.0001 week2*gender 0 2.5899 0.5180 64 5.00 <.0001 week2*gender 1 3.8392 0.6046 64 6.35 <.0001 age 0.03853 0.05562 64 0.69 0.4910 *******************************************************************/ * ar(1); title "FIT WITH COMMON AR(1) STRUCTURE"; proc mixed data=hips; class patient time gender; model h = gender gender*week gender*week2 age / noint solution chisq; repeated time / type = ar(1) subject=patient rcorr=1,10,15; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* Cov Parm Subject Estimate AR(1) patient 0.2910 Residual 18.3070 ## Fitted likelihoods and penalized likelihoods -2 Res Log Likelihood 556.5 AIC (smaller is better) 560.5 AICC (smaller is better) 560.6 BIC (smaller is better) 563.3 # GLS estimates of BETA, when COMMON AR(1) covariance structure is assumed. # Solution for Fixed Effects Standard Effect gender Estimate Error DF t Value Pr > |t| gender 0 35.8838 3.7661 28 9.53 <.0001 gender 1 39.8949 3.6947 28 10.80 <.0001 week*gender 0 -9.8043 1.6356 64 -5.99 <.0001 week*gender 1 -14.6020 1.8736 64 -7.79 <.0001 week2*gender 0 2.6313 0.5094 64 5.17 <.0001 week2*gender 1 3.9150 0.5904 64 6.63 <.0001 age 0.03749 0.05369 64 0.70 0.4875 *******************************************************************/ * Markov; title "FIT WITH MARKOV STRUCTURE"; proc mixed data=hips; class patient time gender; model h = gender gender*week gender*week2 age / noint solution chisq covb; repeated / type = sp(pow)(week) subject=patient rcorr=1,10; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* Covariance Parameter Estimates Cov Parm Subject Estimate SP(POW) patient 0.2910 Residual 18.3070 ## Fitted likelihoods and penalized likelihoods -2 Res Log Likelihood 556.5 AIC (smaller is better) 560.5 AICC (smaller is better) 560.6 BIC (smaller is better) 563.3 # GLS estimates of BETA, when COMMON MARKOV covariance structure is assumed. # Solution for Fixed Effects Standard Effect gender Estimate Error DF t Value Pr > |t| gender 0 35.8838 3.7661 28 9.53 <.0001 gender 1 39.8949 3.6947 28 10.80 <.0001 week*gender 0 -9.8043 1.6356 64 -5.99 <.0001 week*gender 1 -14.6020 1.8736 64 -7.79 <.0001 week2*gender 0 2.6313 0.5094 64 5.17 <.0001 week2*gender 1 3.9150 0.5904 64 6.63 <.0001 age 0.03749 0.05369 64 0.70 0.4875 ## Notice the GLS estimates of beta with Markov cov structure are identical to the GLM estimates when ## AR(1) covariance structure is assumed, because the repeated measures are taken at integer-valued times ## 0,1,2,3 *******************************************************************/ * one-dependent; * and show use of CONTRAST statement; title "FIT WITH COMMON ONE-DEPENDENT STRUCTURE"; proc mixed data=hips; class patient time gender; model h = gender gender*week gender*week2 age / noint solution chisq covb; repeated time / type = toep(2) subject=patient rcorr=1,10; contrast 'f vs m, wk 3' gender 1 -1 gender*week 3 -3 gender*week2 9 -9 /chisq; estimate 'f vs m, wk 3' gender 1 -1 gender*week 3 -3 gender*week2 9 -9; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* ## Fitted likelihoods and penalized likelihoods Fit Statistics -2 Res Log Likelihood 556.1 AIC (smaller is better) 560.1 AICC (smaller is better) 560.3 BIC (smaller is better) 562.9 # GLS estimates of BETA, when COMMON ONE-DEPENDENT covariance structure is assumed. # Standard Effect gender Estimate Error DF t Value Pr > |t| gender 0 36.2941 3.7164 28 9.77 <.0001 gender 1 40.2860 3.6474 28 11.05 <.0001 week*gender 0 -9.9910 1.6592 64 -6.02 <.0001 week*gender 1 -14.8308 1.8879 64 -7.86 <.0001 week2*gender 0 2.6610 0.5222 64 5.10 <.0001 week2*gender 1 3.9601 0.6025 64 6.57 <.0001 age 0.03354 0.05284 64 0.63 0.5279 ## The estimated covariance of the GLS estimates of BETA =(beta1, beta2, ..., beta7). ## This is 7x7 matrix, with the diagonal giving the variance of the parameter estimates Covariance Matrix for Fixed Effects Row Effect gender Col1 Col2 Col3 Col4 Col5 1 gender 0 13.8117 12.2645 -1.4234 0.05482 0.3004 2 gender 1 12.2645 13.3033 -0.4160 -1.1484 0.09378 3 week*gender 0 -1.4234 -0.4160 2.7531 -0.00186 -0.8263 4 week*gender 1 0.05482 -1.1484 -0.00186 3.5640 0.000419 5 week2*gender 0 0.3004 0.09378 -0.8263 0.000419 0.2727 6 week2*gender 1 -0.01425 0.2285 0.000483 -1.0835 -0.00011 7 age -0.1880 -0.1821 0.006377 -0.00081 -0.00144 Covariance Matrix for Fixed Effects Row Col6 Col7 1 -0.01425 -0.1880 2 0.2285 -0.1821 3 0.000483 0.006377 4 -1.0835 -0.00081 5 -0.00011 -0.00144 6 0.3630 0.000212 7 0.000212 0.002792 ## Type III SS (recall these are separate SS, which account for everything else). ## ## Eg: read 2nd line: SS for accounts for , AND ## read 4th line: SS for accounts for , AND ## Type 3 Tests of Fixed Effects Num Den Effect DF DF Chi-Square F Value Pr > ChiSq Pr > F gender 2 28 122.28 61.14 <.0001 <.0001 week*gender 2 64 98.03 49.01 <.0001 <.0001 week2*gender 2 64 69.19 34.60 <.0001 <.0001 age 1 64 0.40 0.40 0.5257 0.5279 ## Estimation of difference in mean response between males and females at week 3 (i.e time t_ij = 3). ## Recall our MODEL: ## ## males: Yi(t_ij) = beta1 + beta2 t_ij + beta3 (t_ij)^2 + beta7 age_i +EPS_ij ## females: Yi(t_ij) = beta4 + beta5 t_ij + beta6 (t_ij)^2 + beta7 age_i +EPS_ij ## ## beta = (beta1, beta2, beta3, beta4, beta5, beta6, beta7) ## define L for contrast: L = (1 3 9 -1 -3 -9 0) ## ## Thus: 'difference in mean response between males and females at week 3' = L beta ## ## ESTIMATION. Estimate by L beta.hat (beta.hat = GLS estimate of beta) ## ## CONTRAST. Test H0: L beta=0 using the Wald statistic, T_L ## ## The Wald statistic - of the form estimate divided by standard error - is given in the result of the estimate ## statement and is equal to -0.72 (SEE below). ## PROC MIXED compares this to a t distribution; alternatively, a normal distribution could be used. ## The contrast statement with the chisq option produces the identical test, ## but printing the statistic TL = 0.52 = (-0.72)^2 instead (SEE below). ## FIT WITH COMMON ONE-DEPENDENT STRUCTURE The Mixed Procedure Estimates Standard Label Estimate Error DF t Value Pr > |t| f vs m, wk 3 -1.1649 1.6223 64 -0.72 0.4753 Contrasts Num Den Label DF DF Chi-Square F Value Pr > ChiSq Pr > F f vs m, wk 3 1 64 0.52 0.52 0.4727 0.4753 *******************************************************************/ * WISH MORE COMPLEX MODELS (eg. age has different effect for MALES/FEMALES)? NO PROBLEM; * Change STATEMENT to specify the model structure. A model with different age effect for M/F reads like; * model h = gender gender*age gender*week gender*week2 / noint solution chisq covb; /****************************************************************************************** ## The gender part of each term ensures that the model includes different such terms for males and females. ## ## OR MODEL is written mathematically as: ## ## Yi(t_ij) = beta1 I(Gender_i=MALE) + beta4 I(Gender_i= FEMALE) ## + beta2 I(Gender_i=MALE) t_ij + beta5 I(Gender_i= FEMALE) t_ij ## + beta3 I(Gender_i=MALE) (t_ij)^2 + beta6 I(Gender_i= FEMALE) (t_ij)^2 ## + beta7 I(Gender_i=MALE) age_i + beta8 I(Gender_i= FEMALE) age_i ## + EPS_ij ## ******************************************************************************************/ * one-dependent; * and CONTRAST statement ; title "FIT WITH COMMON ONE-DEPENDENT STRUCTURE"; proc mixed data=hips; class patient time gender; model h = gender gender*week gender*week2 age / noint solution chisq covb; repeated time / type = toep(2) subject=patient rcorr=1,10; contrast 'same rates of changes for f and m' gender 0 0 gender*week 1 -1 gender*week2 0 0 age 0, gender 0 0 gender*week 0 0 gender*week2 1 -1 age 0/chisq; run; /******************************************************************* ## The null hypothesis of same rates of changes for males and females (same slope + coeff of the quadratic term in ## both groups) may be written as ## H0 : Lbeta =0, where ## L = | 0 1 0 0 -1 0 0 | ## | 0 0 1 0 0 -1 0 | ## Contrasts Num Den Label DF DF Chi-Square F Value Pr > ChiSq Pr > F same rates of changes for f and m 2 64 4.17 2.08 0.1246 0.1330 *******************************************************************/