/******************************************************************

  CHAPTER 11, EXAMPLE 2

  Fit a loglinear regression model to the horse-kick data.
  (Poisson assumption)

******************************************************************/

options ls=80 ps=59 nodate; run;

/******************************************************************

  The data look like (first 6 records)

  year  c1  c2  c3  c4  c5  c6  c7  c8  c9  c10
  1875   0   0   0   0   1   1   0   0   1   0
  1876   0   0   1   0   0   0   0   0   1   1
  1877   0   0   0   0   1   0   0   1   2   0
  1878   2   1   1   0   0   0   0   1   1   0
  1879   0   1   1   2   0   1   0   0   1   0
  1880   2   1   1   1   0   0   2   1   3   0
  ...

  column 1       year
  columns 2-11   number of fatal horsekicks suffered by corps 1-10.

******************************************************************/

data kicks; infile 'kicks.dat';
  input year c1-c10;
run;

proc print data=kicks ; run;

/******************************************************************

  Reconfigure the data so that a single number of kicks for a
  particular year/corps combination appears on a separate line.

******************************************************************/

data kicks2; set kicks;
  array c{10} c1-c10;
  do corps=1 to 10;
    kicks = c{corps};
    output;
  end;
  drop c1-c10;
run;

proc print data=kicks2 ; run;

/*****************************************************************

  Obs    year    corps    kicks

    1    1875       1       0
    2    1875       2       0
    3    1875       3       0
    4    1875       4       0
    5    1875       5       1
    6    1875       6       1
    7    1875       7       0
    8    1875       8       0
    9    1875       9       1
   10    1875      10       0

*****************************************************************/

/*****************************************************************

  Fit the loglinear regression model using PROC GENMOD. Here,
  the dispersion parameter phi=1, so it is not estimated. We let SAS
  form the dummy variables through use of the CLASS statement.
  This results in the model for mean response being parameterized
  as in equation (11.20).

  The DIST=POISSON option in the model statement specifies
  that the Poisson probability distribution assumption, with its
  requirement that mean = variance, be used.
  The LINK=LOG option asks for the loglinear model. Other LINK=
  choices are available.

  We also use a CONTRAST statement to investigate whether there
  is evidence to suggest that 1875 differed from 1880 in terms of
  numbers of horsekick deaths. The WALD option asks that the usual
  large sample chi-square test statistic be used as the basis for
  the test.

*****************************************************************/

proc genmod data=kicks2;
  class year corps;
  model kicks = year corps / dist = poisson link = log;
  contrast '1875-1880' year 1 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 / wald;
run;

/****************************************************************************

                                 The SAS System                                6

                              The GENMOD Procedure

                               Model Information

                      Data Set              WORK.KICKS2
                      Distribution              Poisson
                      Link Function                 Log
                      Dependent Variable          kicks

                    Number of Observations Read         200
                    Number of Observations Used         200

                            Class Level Information

       Class      Levels    Values

       year           20    1875 1876 1877 1878 1879 1880 1881 1882 1883 1884
                            1885 1886 1887 1888 1889 1890 1891 1892 1893 1894
       corps          10    1 2 3 4 5 6 7 8 9 10

                             Parameter Information

                   Parameter       Effect       year    corps

                   Prm1            Intercept
                   Prm2            year         1875
                   Prm3            year         1876
                   Prm4            year         1877
                   Prm5            year         1878
                   Prm6            year         1879
                   Prm7            year         1880
                   Prm8            year         1881
                   Prm9            year         1882
                   Prm10           year         1883

                     Criteria For Assessing Goodness Of Fit

          Criterion                     DF           Value        Value/DF

          Deviance                     171        171.6395          1.0037
          Scaled Deviance              171        171.6395          1.0037
          Pearson Chi-Square           171        160.6793          0.9396
          Scaled Pearson X2            171        160.6793          0.9396
          Log Likelihood                         -161.8886
          Full Log Likelihood                    -185.6912
          AIC (smaller is better)                 429.3823
          AICC (smaller is better)                439.6176
          BIC (smaller is better)                 525.0335

  Algorithm converged.

               Analysis Of Maximum Likelihood Parameter Estimates

                                 Standard     Wald 95% Confidence       Wald
 Parameter         DF  Estimate     Error           Limits        Chi-Square

 Intercept          1   -2.0314    0.7854    -3.5707    -0.4921         6.69
 year      1875     1    0.4055    0.9129    -1.3837     2.1947         0.20
 year      1876     1    0.4055    0.9129    -1.3837     2.1947         0.20
 year      1877     1    0.6931    0.8660    -1.0042     2.3905         0.64
 year      1878     1    1.0986    0.8165    -0.5017     2.6989         1.81
 year      1879     1    1.0986    0.8165    -0.5017     2.6989         1.81
 year      1880     1    1.7047    0.7687     0.1981     3.2114         4.92
 year      1881     1    0.9163    0.8367    -0.7235     2.5561         1.20
 year      1894     0    0.0000    0.0000     0.0000     0.0000          .
 corps     1        1    0.4055    0.4564    -0.4891     1.3001         0.79
 corps     2        1    0.4055    0.4564    -0.4891     1.3001         0.79
 corps     3        1   -0.0000    0.5000    -0.9800     0.9800         0.00
 corps     4        1    0.3185    0.4647    -0.5923     1.2292         0.47
 corps     5        1    0.4055    0.4564    -0.4891     1.3001         0.79
 corps     6        1   -0.1335    0.5175    -1.1479     0.8808         0.07
 corps     7        1    0.4855    0.4494    -0.3952     1.3662         1.17
 corps     8        1    0.6286    0.4378    -0.2295     1.4867         2.06
 corps     9        1    1.0986    0.4082     0.2985     1.8988         7.24
 corps     10       0    0.0000    0.0000     0.0000     0.0000          .
 Scale              0    1.0000    0.0000     1.0000     1.0000

               Analysis Of Maximum Likelihood Parameter Estimates

                          Parameter         Pr > ChiSq

                          Intercept             0.0097
                          year      1875        0.6569
                          year      1876        0.6569
                          year      1877        0.4235

                                Contrast Results

                                         Chi-
              Contrast         DF      Square    Pr > ChiSq    Type

              1875-1880         1        3.98        0.0461    Wald

**********************************************************************************************/
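
/*****************************************************************

  A possible follow-up (a sketch, not part of the original example).
  The CONTRAST statement reports only a test statistic. Adding an
  ESTIMATE statement with the same coefficients should also report
  the estimated difference in log means between 1875 and 1880, and
  the EXP option requests its exponentiated value, which is the
  estimated ratio of mean counts for the two years. The label
  '1875 vs 1880' is our own choice.

  From the estimates printed in the listing, the difference is
  0.4055 - 1.7047 = -1.2992, so the reported ratio should be about
  exp(-1.2992) = 0.27.

*****************************************************************/

proc genmod data=kicks2;
  class year corps;
  model kicks = year corps / dist = poisson link = log;
  /* same coefficient vector as the CONTRAST statement: +1 for 1875, -1 for 1880 */
  estimate '1875 vs 1880' year 1 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 / exp;
run;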