/******************************************************************* CHAPTER 8, EXAMPLE 1 Analysis of the dental study data by fitting a general linear regression model in time and gender structures using PROC MIXED. - the repeated measurement factor is age (time) - there is one "treatment" factor, gender For each gender, the "full" mean model is a straight line in time. We use the REPEATED statement of PROC MIXED with the TYPE= options to fit the model assuming several different covariance structures. *******************************************************************/ options ls=80 ps=59 nodate; run; /****************************************************************** Read in the data set (See Example 1 of Chapter 4) *******************************************************************/ data dent1; infile "dental.txt"; input obsno child age distance gender; ag = age*gender; run; /******************************************************************* Sort the data so we can do gender-by-gender fits. *******************************************************************/ proc sort data=dent1; by gender; run; proc print data=dent1; run; /******************************************************************* ******************************************************************** PROC MIXED ******************************************************************** ******************************************************************* /******************************************************************* Now use PROC MIXED to fit the more general regression model with assumptions about the covariance matrix of a data vector. For all of the fits, we use usual normal maximum likelihood (ML) rather than restricted maximum likelihood (REML), which is the default. We do this for each gender separately first using the unstructured assumption. The main goal is to get insight into whether it might be the case that the covariance matrix is different for each gender (e.g. variation is different for each). The SOLUTION option in the MODEL statement requests that the estimates of the regression parameters be printed. The R option in the REPEATED statement as used here requests that the covariance matrix estimate be printed in matrix form. The RCORR option requests that the corresponding correlation matrix be printed. *******************************************************************/ * unstructured covariance matrix for each group; * separate analysis requested by the 'by' statement; title "FIT WITH UNSTRUCTURED COVARIANCE FOR EACH GENDER"; proc mixed method=ml data=dent1; by gender; class child; model distance = age / solution; repeated / type = un subject=child r rcorr; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* Results only presented for GIRLS (gender=0) ----------------------------------- gender=0 ----------------------------------- The Mixed Procedure Model Information Data Set WORK.DENT1 Dependent Variable distance Covariance Structure Unstructured Subject Effect child Estimation Method ML Residual Variance Method None Fixed Effects SE Method Model-Based Degrees of Freedom Method Between-Within Class Level Information Class Levels Values child 11 1 2 3 4 5 6 7 8 9 10 11 Dimensions Covariance Parameters 10 Columns in X 2 Columns in Z 0 Subjects 11 Max Obs Per Subject 4 Number of Observations Number of Observations Read 44 Number of Observations Used 44 Number of Observations Not Used 0 Iteration History Iteration Evaluations -2 Log Like Criterion 0 1 190.75564656 1 2 130.64154698 0.00000000 Convergence criteria met. Estimated R Matrix for child 1 Row Col1 Col2 Col3 Col4 1 4.1129 3.0512 3.9496 3.9689 2 3.0512 3.2894 3.6632 3.7080 3 3.9496 3.6632 5.0966 4.9788 4 3.9689 3.7080 4.9788 5.4076 Estimated R Correlation Matrix for child 1 Row Col1 Col2 Col3 Col4 1 1.0000 0.8295 0.8627 0.8416 2 0.8295 1.0000 0.8946 0.8792 3 0.8627 0.8946 1.0000 0.9484 4 0.8416 0.8792 0.9484 1.0000 Covariance Parameter Estimates Cov Parm Subject Estimate UN(1,1) child 4.1129 UN(2,1) child 3.0512 UN(2,2) child 3.2894 UN(3,1) child 3.9496 UN(3,2) child 3.6632 UN(3,3) child 5.0966 UN(4,1) child 3.9689 UN(4,2) child 3.7080 UN(4,3) child 4.9788 UN(4,4) child 5.4076 Fit Statistics -2 Log Likelihood 130.6 AIC (smaller is better) 154.6 AICC (smaller is better) 164.7 BIC (smaller is better) 159.4 ---------------------------------------------------------------------- The hypothesis test is H0: Sigma is sigma2 x Identity vs H1 Sigma is totally unstructured. We see that the p-value is very small (<0.0001) suggesting strong evidence against H0. ---------------------------------------------------------------------- Null Model Likelihood Ratio Test DF Chi-Square Pr > ChiSq 9 60.11 <.0001 FIT WITH UNSTRUCTURED COVARIANCE FOR EACH GENDER 5 ----------------------------------- gender=0 ----------------------------------- The Mixed Procedure Solution for Fixed Effects Standard Effect Estimate Error DF t Value Pr > |t| Intercept 17.4220 0.6930 10 25.14 <.0001 age 0.4823 0.06144 10 7.85 <.0001 ---------------------------------------------------------------------- Type 3 tests are partial tests in the sense that they account for all the other fixed effects in the model. ---------------------------------------------------------------------- Type 3 Tests of Fixed Effects Num Den Effect DF DF F Value Pr > F age 1 10 61.62 <.0001 FIT WITH UNSTRUCTURED COVARIANCE FOR EACH GENDER 6 *******************************************************************/ /******************************************************************* Consider the 'DIFFERENCE PARAMETERIZATION' (both genders simultaneously) Fit the model with specified fixed effects, and different covariance structures Examine the AIC, BIC to get choose the most appropriate covariance structure *******************************************************************/ /******************************************************************* Consider several models, allowing the covariance matrix to be either the same or different for each gender using the GROUP = option, which allows for different covariance parameters for each GROUP (genders here). For all the fits, we just use the difference parameterization. At the end we illustrate how to fit one covariance structure (TYPE = CS) with the 'explicit parameterization'. The CHISQ option in the MODEL statement requests that the Wald chi-square test statistics be printed for certain contrasts of the regression parameters. When the 'difference parameterization' then the TESTS OF FIXED EFFECTS test for different intercepts/slopes between the groups. This will be discussed in more detail soon. *******************************************************************/ * compound symmetry with the "difference" parameterization; * same for each gender; title "COMMON COMPOUND SYMMETRY STRUCTURE"; proc mixed method=ml data=dent1; class gender child; model distance = gender age gender*age / solution chisq; repeated / type = cs subject = child r rcorr; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* Iteration History Iteration Evaluations -2 Log Like Criterion 0 1 478.24175986 1 1 428.63905802 0.00000000 Covariance Parameter Estimates Cov Parm Subject Estimate CS child 3.0306 Residual 1.8746 Fit Statistics -2 Log Likelihood 428.6 AIC (smaller is better) 440.6 AICC (smaller is better) 441.5 BIC (smaller is better) 448.4 *******************************************************************/ * ar(1) same for each gender; title "COMMON AR(1) STRUCTURE"; proc mixed method=ml data=dent1; class gender child ; model distance = gender age age*gender / solution chisq; repeated / type = ar(1) subject=child r rcorr; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* Covariance Parameter Estimates Cov Parm Subject Estimate AR(1) child 0.6071 Residual 4.8910 Fit Statistics -2 Log Likelihood 440.7 AIC (smaller is better) 452.7 AICC (smaller is better) 453.5 BIC (smaller is better) 460.5 *******************************************************************/ * one-dependent same for each gender; title "COMMON ONE-DEPENDENT STRUCTURE"; proc mixed method=ml data=dent1; class gender child ; model distance = gender age age*gender / solution chisq; repeated / type = toep(2) subject=child r rcorr; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* Covariance Parameter Estimates Cov Parm Subject Estimate TOEP(2) child 1.6120 Residual 4.5294 Fit Statistics -2 Log Likelihood 457.4 AIC (smaller is better) 469.4 AICC (smaller is better) 470.2 BIC (smaller is better) 477.2 *******************************************************************/ * compound symmetry, different for each gender; title "SEPARATE COMPOUND SYMMETRY FOR EACH GENDER"; proc mixed method=ml data=dent1; class gender child ; model distance = gender age age*gender / solution chisq; repeated / type = cs subject=child r rcorr group=gender; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* Covariance Parameter Estimates Cov Parm Subject Group Estimate Variance child gender 0 0.5900 CS child gender 0 3.8804 Variance child gender 1 2.7577 CS child gender 1 2.4463 Fit Statistics -2 Log Likelihood 408.8 AIC (smaller is better) 424.8 AICC (smaller is better) 426.3 BIC (smaller is better) 435.2 *******************************************************************/ * ar(1), different for each gender; title "SEPARATE AR(1) FOR EACH GENDER"; proc mixed method=ml data=dent1; class gender child ; model distance = gender age age*gender / solution chisq; repeated / type = ar(1) subject=child r rcorr group=gender; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* *******************************************************************/ * one-dependent, different for each gender; title "SEPARATE ONE-DEPENDENT FOR EACH GENDER"; proc mixed method=ml data=dent1; class gender child; model distance = gender age age*gender / solution chisq; repeated / type = toep(2) subject=child r rcorr group=gender; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* Covariance Parameter Estimates Cov Parm Subject Group Estimate Variance child gender 0 3.7093 TOEP(2) child gender 0 2.0415 Variance child gender 1 4.9891 TOEP(2) child gender 1 1.3289 Fit Statistics -2 Log Likelihood 444.6 AIC (smaller is better) 460.6 AICC (smaller is better) 462.1 BIC (smaller is better) 471.0 *******************************************************************/ * fitting a model using 'explicit parameterization'; * compound symmetry with separate intercept and slope for; * each gender; title "COMMON COMPOUND SYMMETRY STRUCTURE"; proc mixed method=ml data=dent1; class gender child; model distance = gender gender*age / noint solution ; repeated / type = cs subject = child r rcorr; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* Note: the estimates of the covariance parameters are identical to the fitting when the 'difference parameterization' was selected As well the likelihoods at the estimated parameter values are the identical, irrespective of whether the 'difference' or the 'explicit' parameterization is selected. ---------------------------------------------------------------------- Covariance Parameter Estimates Cov Parm Subject Estimate CS child 3.0306 Residual 1.8746 Fit Statistics -2 Log Likelihood 428.6 AIC (smaller is better) 440.6 AICC (smaller is better) 441.5 BIC (smaller is better) 448.4 Null Model Likelihood Ratio Test DF Chi-Square Pr > ChiSq 1 49.60 <.0001 Solution for Fixed Effects Standard Effect gender Estimate Error DF t Value Pr > |t| gender 0 17.3727 1.1615 25 14.96 <.0001 gender 1 16.3406 0.9631 25 16.97 <.0001 age*gender 0 0.4795 0.09231 79 5.20 <.0001 age*gender 1 0.7844 0.07654 79 10.25 <.0001 Type 3 Tests of Fixed Effects Num Den Effect DF DF F Value Pr > F gender 2 25 255.79 <.0001 age*gender 2 79 66.01 <.0001 NOTE: The "Type 3 Tests of Fixed Effects" table contains hypothesis tests for the significance of each of the fixed effects—that is, those effects you specify in the MODEL statement. By default, PROC MIXED computes these tests by first constructing a Type 3 'L' matrix for each effect. This matrix is then used to compute the following F statistic *******************************************************************/ /******************************************************************* Examination of the AIC, BIC, and loglikelihood ratios from the above fits indicates that - a model that allows a separate covariance matrix of the same type for each gender is preferred - the compound symmetry structure for each gender is preferred Next we fit this model and do inference about the parameter beta. The parameterization chosen is important ! To print the covariance matrix of the parameter estimator, beta-hat use the COVB option; For hypothesis testing of various hypotheses about beta, use Likelihood Ratio Test. The reduced mode - will define the null hypothesis ; the full model will define the alternative hypothesis. For example we are interested whether the slopes for the 2 groups are the same. The reduced model is given by 'equal slopes'; the full model is given by unequal slopes. The output using the difference parameterization gives the p-value for this test automatically. This approach will be presented first. Using the explicit parameterization, such test entails first fitting the full model, then fitting the reduced model and then calculating the value of the likelihood ratio test (by hand from the output) and comparing against the .95% critical value of chi-square with one degrees of freedom. This approach will be presented second. (We fit the first parameterization this time, so that the estimates are interpreted as the gender-specific intercepts and slopes.) Thus, the TESTS OF FIXED EFFECTS in the output should be disregarded. *******************************************************************/ * fit full model in difference parameterization to get chi-square tests; title "FULL MODEL, DIFFERENCE PARAMETERIZATION"; proc mixed method=ml data=dent1; class gender child; model distance = gender age gender*age / solution chisq covb; repeated / type=cs subject=child r rcorr group=gender; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* Null Model Likelihood Ratio Test DF Chi-Square Pr > ChiSq 3 69.43 <.0001 Solution for Fixed Effects Standard Effect gender Estimate Error DF t Value Pr > |t| Intercept 16.3406 1.1130 25 14.68 <.0001 gender 0 1.0321 1.3890 25 0.74 0.4644 gender 1 0 . . . . age 0.7844 0.09283 79 8.45 <.0001 age*gender 0 -0.3048 0.1063 79 -2.87 0.0053 ULL MODEL, DIFFERENCE PARAMETERIZATION 29 The Mixed Procedure Solution for Fixed Effects Standard Effect gender Estimate Error DF t Value Pr > |t| age*gender 1 0 . . . . Covariance Matrix for Fixed Effects Row Effect gender Col1 Col2 Col3 Col4 Col5 1 Intercept 1.2388 -1.2388 -0.09480 0.09480 2 gender 0 -1.2388 1.9294 0.09480 -0.1243 3 gender 1 4 age -0.09480 0.09480 0.008618 -0.00862 5 age*gender 0 0.09480 -0.1243 -0.00862 0.01130 6 age*gender 1 Type 3 Tests of Fixed Effects Num Den Effect DF DF Chi-Square F Value Pr > ChiSq Pr > F gender 1 25 0.55 0.55 0.4575 0.4644 age 1 79 141.37 141.37 <.0001 <.0001 age*gender 1 79 8.22 8.22 0.0041 0.0053 *******************************************************************/ * full model again with covariance matrix of betahat printed; title "FULL MODEL WITH COMPOUND SYMMETRY FOR EACH GENDER"; proc mixed method=ml data=dent1; class gender child; model distance = gender gender*age / noint solution covb; repeated / type=cs subject=child r rcorr group=gender; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* Fit Statistics -2 Log Likelihood 408.8 AIC (smaller is better) 424.8 AICC (smaller is better) 426.3 BIC (smaller is better) 435.2 Solution for Fixed Effects Standard Effect gender Estimate Error DF t Value Pr > |t| gender 0 17.3727 0.8311 25 20.90 <.0001 gender 1 16.3406 1.1130 25 14.68 <.0001 age*gender 0 0.4795 0.05179 79 9.26 <.0001 age*gender 1 0.7844 0.09283 79 8.45 <.0001 *******************************************************************/ * reduced model; title "REDUCED MODEL WITH COMPOUND SYMMETRY FOR EACH GENDER"; proc mixed method=ml data=dent1; class gender child; model distance = gender age / noint solution covb; repeated / type=cs subject=child r rcorr group=gender; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* Fit Statistics -2 Log Likelihood 416.6 AIC (smaller is better) 430.6 AICC (smaller is better) 431.7 BIC (smaller is better) 439.7 Solution for Fixed Effects Standard Effect gender Estimate Error DF t Value Pr > |t| gender 0 16.6218 0.7945 25 20.92 <.0001 gender 1 18.9429 0.6790 25 27.90 <.0001 age 0.5478 0.04681 80 11.70 <.0001 *******************************************************************/ * full model using REML (the default, so no METHOD= is specified); * use ESTIMATE statement to estimate the mean for a boy of age 11; title "FULL MODEL WITH COMPOUND SYMMETRY FOR EACH GENDER, REML"; proc mixed data=dent1; class gender child; model distance = gender gender*age / noint solution covb; repeated / type=cs subject=child r rcorr group=gender; estimate 'boy at 11' gender 0 1 gender*age 0 11; contrast 'b vs g at 11' gender 0 1 gender*age 0 11; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* Effect gender Estimate Error DF t Value Pr > |t| gender 0 17.3727 0.8587 25 20.23 <.0001 gender 1 16.3406 1.1287 25 14.48 <.0001 age*gender 0 0.4795 0.05259 79 9.12 <.0001 age*gender 1 0.7844 0.09382 79 8.36 <.0001 FULL MODEL WITH COMPOUND SYMMETRY FOR EACH GENDER, REML 41 FULL MODEL WITH COMPOUND SYMMETRY FOR EACH GENDER, REML 41 The Mixed Procedure Covariance Matrix for Fixed Effects Row Effect gender Col1 Col2 Col3 Col4 1 gender 0 0.7374 -0.03042 2 gender 1 1.2740 -0.09681 3 age*gender 0 -0.03042 0.002766 4 age*gender 1 -0.09681 0.008801 Type 3 Tests of Fixed Effects Num Den Effect DF DF F Value Pr > F gender 2 25 309.43 <.0001 age*gender 2 79 76.53 <.0001 Estimates Standard Label Estimate Error DF t Value Pr > |t| boy at 11 24.9688 0.4572 79 54.61 <.0001 Contrasts Num Den Label DF DF F Value Pr > F b vs g at 11 1 79 2982.25 <.0001 *******************************************************************/ proc mixed data=dent1; class gender child; model distance = gender|age|age|age / htype=1; repeated / type=un sub=child; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* Class Level Information Class Levels Values gender 2 0 1 child 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 Dimensions Covariance Parameters 10 Columns in X 12 Columns in Z 0 Subjects 27 Max Obs Per Subject 4 Number of Observations Number of Observations Read 108 Number of Observations Used 108 Number of Observations Not Used 0 Iteration History Iteration Evaluations -2 Res Log Like Criterion 0 1 497.06600535 1 1 440.60995993 0.00000000 Convergence criteria met. Covariance Parameter Estimates Cov Parm Subject Estimate UN(1,1) child 5.4155 UN(2,1) child 2.7168 UN(2,2) child 4.1848 UN(3,1) child 3.9102 UN(3,2) child 2.9272 UN(3,3) child 6.4557 UN(4,1) child 2.7102 UN(4,2) child 3.3172 UN(4,3) child 4.1307 UN(4,4) child 4.9857 Fit Statistics -2 Res Log Likelihood 440.6 AIC (smaller is better) 460.6 AICC (smaller is better) 463.1 BIC (smaller is better) 473.6 Null Model Likelihood Ratio Test DF Chi-Square Pr > ChiSq 9 56.46 <.0001 Type 1 Tests of Fixed Effects Num Den Effect DF DF F Value Pr > F gender 1 25 9.29 0.0054 age 1 25 99.45 <.0001 age*gender 1 25 5.12 0.0326 age*age 1 25 1.39 0.2497 age*age*gender 1 25 1.15 0.2935 age*age*age 1 25 0.15 0.6974 age*age*age*gender 1 25 0.27 0.6081 *******************************************************************/ /******************************************************************* ******************************************************************* COMPARE WITH PROC GLM [REPEATED statement, orthogonal POLYNOMIAL contrasts] ******************************************************************* *******************************************************************/ * first re-arrange the data to be in the 'myltivariate mode' ; data dent1; set dent1; if age=8 then age=1; if age=10 then age=2; if age=12 then age=3; if age=14 then age=4; drop obsno; run; proc print data=dent1; run; data dent2(keep=age1-age4 gender child); array aa{4} age1-age4; do age=1 to 4; set dent1; by gender child; aa{age}=distance; if last.child then return; end; run; proc print data=dent2; run; * use PROC GLM to test the polynomial contrasts; * the results will not matched exactly those obtained with procMIXED, because in GLM the contrasts are normalized, while in MIXED they are not; proc glm data=dent2; class gender; model age1 age2 age3 age4 = gender / nouni; repeated age 4 (8 10 12 14) polynomial /summary nou nom printm; run; /******************************************************************* PARTIAL OUTPUT ******************************************************************* FIT WITH UNSTRUCTURED COVARIANCE FOR EACH GENDER 17 The GLM Procedure Repeated Measures Analysis of Variance Analysis of Variance of Contrast Variables age_N represents the nth degree polynomial contrast for age Contrast Variable: age_1 Source DF Type III SS Mean Square F Value Pr > F Mean 1 208.2660038 208.2660038 88.00 <.0001 gender 1 12.1141519 12.1141519 5.12 0.0326 Error 25 59.1673295 2.3666932 Contrast Variable: age_2 Source DF Type III SS Mean Square F Value Pr > F Mean 1 0.95880682 0.95880682 0.92 0.3465 gender 1 1.19954756 1.19954756 1.15 0.2935 Error 25 26.04119318 1.04164773 Contrast Variable: age_3 Source DF Type III SS Mean Square F Value Pr > F Mean 1 0.21216330 0.21216330 0.08 0.7739 gender 1 0.67882997 0.67882997 0.27 0.6081 Error 25 62.91931818 2.51677273 *******************************************************************/