
This course will provide a detailed treatment of regression
models and associated inferential methods for both univariate and
multivariate (e.g. repeated measures) response. The techniques to be
discussed are now an essential part of the modern statistician's
toolkit and are widely used in numerous application areas.
The first 1/2 to 2/3 of the course will focus on nonlinear regression
models for univariate response, including models for nonconstant
response variance. The remainder of the course will introduce
extensions of the univariate model to two popular types
of nonlinear regression models for multivariate response: (i)
"population-averaged" models, for which models for covariance structure
will also be discussed and whose fitting methods are popularly known in the
literature as "generalized estimating equations" (GEEs); and (ii)
"subject-specific" models, e.g., generalized linear and nonlinear
mixed effects models.
Properties of competing inferential techniques and the effects of
model misspecification will be studied via theoretical arguments
carried out at a nonrigorous, heuristic level and via simulation
exercises on the part of students. Although theoretical arguments
will be reviewed in class in some detail, and students will be
expected to understand and be able to carry out similar arguments at
the same level, the main objective will be for students to appreciate
the implications of the results for practice rather than the technical
details. Implementation of the methods and application to data will
be emphasized in the homework assignments.
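To give a flavor of the kind of simulation exercise mentioned above, here is a minimal Python sketch (not taken from the course materials). It compares ordinary and weighted least squares for a straight-line mean model when the response standard deviation is proportional to the mean. The design points, parameter values, number of replications, and the function names `wls_line` and `simulate` are assumptions made only for illustration.

```python
import random
import statistics


def wls_line(x, y, w):
    """Weighted least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def simulate(n_rep=2000, sigma=0.2, seed=762):
    """Monte Carlo mean and variance of the OLS and WLS slope estimators."""
    rng = random.Random(seed)
    x = [1.0 + 0.5 * j for j in range(20)]       # fixed design (assumed)
    b0, b1 = 1.0, 2.0                            # true parameters (assumed)
    mu = [b0 + b1 * xj for xj in x]
    unit = [1.0] * len(x)                        # OLS ignores the variance model
    true_w = [1.0 / m ** 2 for m in mu]          # weights = 1 / mean^2
    ols, wls = [], []
    for _ in range(n_rep):
        # Standard deviation proportional to the mean (constant CV)
        y = [m + sigma * m * rng.gauss(0.0, 1.0) for m in mu]
        ols.append(wls_line(x, y, unit)[1])
        wls.append(wls_line(x, y, true_w)[1])
    return {
        "ols_mean": statistics.mean(ols), "ols_var": statistics.variance(ols),
        "wls_mean": statistics.mean(wls), "wls_var": statistics.variance(wls),
    }


print(simulate())
```

Under these assumptions both slope estimators should be roughly unbiased, with the correctly weighted estimator showing the smaller Monte Carlo variance. Ignoring nonconstant variance here costs efficiency rather than consistency.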
The instructor last taught this course in Fall 2007.
Course prerequisites

ST 512R, Experimental Statistics for Biological Sciences II;
ST 552, Linear Models and Variance Components; and familiarity with
SAS or R/S-PLUS and a scientific computing language (e.g., MATLAB,
FORTRAN, C++, or SAS IML). Students should have a strong
background in probability and inference at the level of
ST 521 and ST 522 (the prerequisites for ST 552).
Course topics
- Introduction and motivation
- Models for univariate response
  - Introduction to nonlinear models
  - Implementation of generalized least squares (GLS), iteratively reweighted least squares
  - Generalized (non)linear models, quasilikelihood
  - Normal theory maximum likelihood (ML)
  - Unknown parameters in the variance function
  - Detecting and modeling nonconstant variance
  - Large sample theory - a casual approach
  - The "folklore" theorem and "optimality" of GLS
  - Linear vs. quadratic estimating equations for the regression parameter
  - Effect of estimating weights in GLS
  - Estimation of unknown parameters in variance function models
- Models for multivariate response
  - Modeling multivariate response - sources of correlation and "subject-specific" vs. "population-averaged" approaches
  - Generalized estimating equation methods for population-averaged models
  - Nonlinear and generalized linear mixed effects (subject-specific) models - approximate and "exact" methods
See the class notes below for more detailed information.
Syllabus
Class notes

Class notes in pdf format
If you are taking this class in Fall
2009, you will need to purchase the notes at Sir Speedy on
Hillsborough Street. Notes have been updated since Fall 2008 when
this course was last taught.
Homework assignments and tentative due dates
Homework solutions
- Homework 1 Solutions, GLS algorithm program in R and output, and GLS algorithm program in R and output.
Homework 1 Extra Problems Solutions, and program (in R) and output for Problem 6.
- Homework 2 Solutions, program in R and output for Problem 1; and GLS algorithm program in R and output, and IRWLS algorithm program in R and output for Problem 2(c) and (d).
Homework 2 Extra Problems Solutions.
- Homework 3 Solutions, program in R and output for Problem 1(b); program in R and output for Problem 1(c); program in R and output for Problem 1(d); program in R and output for Problem 1(e); and program in R and output for Problem 2.
Homework 3 Extra Problems Solutions.
- Homework 4 Solutions.
Homework 4 Extra Problems Solutions.
- Homework 5 Solutions, program and output for Problem 2(d); program and output for Problem 2(e); program and output for Problem 2(f); and program and output for Problem 3.
Note: The statement of Problem 2 was confusing; although the model
does contain an unknown variance parameter (sigma), the wording
implied that the only unknown covariance parameter was the correlation
parameter (alpha). Thus, some of you excluded the squared deviation
terms from the "response" vector. The solution given here includes
those terms. Because of the confusion, either solution is acceptable.
Homework 5 Extra Problems Solutions.
- Homework 6 Solutions, program and output for Problem 1; program and output for Problem 2(a); program and output for Problem 2(b); and program and output for Problem 2(c).
Homework 6 Extra Problems Solutions.
Data analysis project

- The data analysis project is due on Tuesday,
October 20, 2009. Here is the data set. Please be sure to type your
report, and prepare it for the investigators (not for me)!
Test

Final project

SAS and R examples (in class notes)

- Section 3.7, Program 3.1. IRWLS with theta known, SAS program and output.
- Section 3.7, Program 3.2. GLS algorithm with theta known, SAS program and output.
- Section 3.7, Program 3.3. GLS algorithm with theta known, R program and output.
- Section 6.8, Program 6.1. GLS algorithm with theta unknown and
estimated, SAS program and output using PL (quadratic estimating equation).
- Section 6.8, Program 6.2. GLS algorithm with theta unknown and
estimated, R program and output using PL (quadratic estimating equation).
- Section 14.7, Program 14.1. Fitting multivariate data using GEE
methods: linear estimating equation for beta and simple moment methods
for correlation parameters using SAS proc genmod. Demonstrated on the
epileptic seizure data of Thall and Vail (1990), data set, program, and output.
- Section 14.7, Program 14.2. Fitting multivariate data using GEE
methods: linear estimating equation for beta and simple moment methods
for correlation parameters using R function gee(), program, output,
and help file for gee().
- Section 14.7, Program 14.3. Fitting multivariate data using GEE
methods: linear estimating equation for beta and quadratic estimating
equation for correlation parameters using SAS macro nlinmix. We use the
most recent version available from SAS technical support as in the class
notes, program, log file, and output.
- Section 14.7, Program 14.4. Fitting multivariate data using GEE
methods: linear estimating equation for beta and quadratic estimating
equation for correlation parameters using SAS proc glimmix, program and output.
- Section 15.6, Program 15.1. Fitting nonlinear mixed effects models
using a two-stage approach with the EM algorithm for stage 2 using R,
data set, program, and output.
- Section 15.6, Program 15.2. Fitting nonlinear mixed effects
models using a two-stage approach with mixed model software for stage
2, program to create "data", program to fit stage 2 using proc mixed,
and output.
- Section 15.6, Program 15.3. Fitting nonlinear mixed effects models
using the first-order linearization method with linear estimating
equations using SAS macro nlinmix for version 8.0 and above. Using the
most recent version as in the notes, program, log file, and output.
- Section 15.6, Program 15.4. Fitting nonlinear mixed effects models
using the refined linear approximation about empirical Bayes estimates
of the random effects using SAS macro nlinmix (most recent version),
log file, and output.
- Section 15.6, Program 15.5. Fitting nonlinear mixed effects models
using the refined linear approximation about empirical Bayes estimates
of the random effects using R nlme(), program and output.
- Section 15.6, Program 15.6. Fitting nonlinear mixed effects models
using the "exact" likelihood method with integration carried out via
adaptive Gaussian quadrature using SAS proc nlmixed, program and output.
- Section 15.6, Program 15.7. Fitting generalized linear mixed models
using PQL with SAS macro glimmix, program, log file, and output; and
glimmix macro for version 8.0 and above of SAS.
- Section 15.6, Program 15.8. Fitting generalized linear mixed effects
models using the "exact" likelihood method with integration carried out
via adaptive Gaussian quadrature using SAS proc nlmixed, program and output.
- Section 15.6, Program 15.9. Fitting generalized linear mixed effects
models using the "exact" likelihood method with integration carried out
via adaptive Gaussian quadrature using SAS proc glimmix (SAS version
9.2), program and output.
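For readers who want the flavor of the GLS algorithm with theta known (the subject of Programs 3.2 and 3.3 above), here is a rough Python sketch of the usual three-step scheme: a preliminary unweighted fit, weights formed from the current fitted means, and a weighted refit with those weights held fixed. This is not the class program. It assumes a hypothetical exponential mean model f(x, beta) = beta1 exp(-beta2 x), a power-of-the-mean variance function Var(y) = sigma^2 f^(2 theta), simulated data, and user-supplied starting values. The names `mean_fn`, `wsse`, `wls_gauss_newton`, and `gls` are made up for illustration.

```python
import math
import random


def mean_fn(x, b):
    """Hypothetical mean model f(x, beta) = beta1 * exp(-beta2 * x)."""
    return b[0] * math.exp(-b[1] * x)


def wsse(x, y, w, b):
    """Weighted sum of squared residuals."""
    return sum(wj * (yj - mean_fn(xj, b)) ** 2 for xj, yj, wj in zip(x, y, w))


def wls_gauss_newton(x, y, w, start, max_iter=50, tol=1e-10):
    """Minimize the weighted SSE over beta with the weights w held fixed."""
    b = list(start)
    for _ in range(max_iter):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for xj, yj, wj in zip(x, y, w):
            e = math.exp(-b[1] * xj)
            d1, d2 = e, -b[0] * xj * e          # df/dbeta1, df/dbeta2
            r = yj - b[0] * e                   # residual
            a11 += wj * d1 * d1
            a12 += wj * d1 * d2
            a22 += wj * d2 * d2
            g1 += wj * d1 * r
            g2 += wj * d2 * r
        det = a11 * a22 - a12 * a12
        step = [(a22 * g1 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det]
        # Step-halving: shrink the step while it would increase the weighted SSE
        lam, old = 1.0, wsse(x, y, w, b)
        while lam > 1e-8 and wsse(
            x, y, w, [b[0] + lam * step[0], b[1] + lam * step[1]]
        ) > old:
            lam /= 2.0
        b = [b[0] + lam * step[0], b[1] + lam * step[1]]
        if max(abs(lam * s) for s in step) < tol:
            break
    return b


def gls(x, y, theta, start, n_cycles=10):
    """GLS with theta known in Var(y_j) = sigma^2 * f(x_j, beta)^(2 * theta)."""
    # Step 1: preliminary estimate by ordinary least squares (unit weights)
    b = wls_gauss_newton(x, y, [1.0] * len(x), start)
    for _ in range(n_cycles):
        # Step 2: estimated weights from the current fit (fitted means must be > 0)
        w = [mean_fn(xj, b) ** (-2.0 * theta) for xj in x]
        # Step 3: weighted least squares with those weights held fixed
        b = wls_gauss_newton(x, y, w, b)
    # Estimate sigma^2 from the weighted residuals at the final fit
    w = [mean_fn(xj, b) ** (-2.0 * theta) for xj in x]
    sigma2 = wsse(x, y, w, b) / (len(x) - len(b))
    return b, sigma2


# Simulated data under the assumed model, with theta = 1 (constant CV)
rng = random.Random(1)
x = [0.5 * (j + 1) for j in range(12)]
y = [mean_fn(xj, [10.0, 0.5]) * (1.0 + 0.05 * rng.gauss(0.0, 1.0)) for xj in x]
beta_hat, sigma2_hat = gls(x, y, theta=1.0, start=[9.0, 0.45])
print(beta_hat, sigma2_hat)
```

The number of cycles is fixed at 10 here purely for simplicity; a convergence check on successive estimates of beta would be a natural refinement. In this sketch, recomputing the weights inside every Gauss-Newton step instead of holding them fixed would give an IRWLS-style variant.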
Errata list

The errata list will be updated as we find typos!
Announcements (most recent shown first)

- We're DONE! No class on December 1 and December 3.
- The DUE DATE for Homework 5 is changed to .
- There will be NO CLASS on Thursday,
November 5 in light of the TEST in the
evening.
- TYPO ALERT! There are typos
on page 216 of the notes; see the Errata List!
- Data analysis project will be handed out on Tuesday, October
13 and will be due on Tuesday, October 20!!!
- See Homework 2 above for information regarding TYPOs in the statement
of Problem 2!!!! (posted September 15)
- See Homework 1 above for information regarding a TYPO in the statement
of Problem 2!!!! (posted August 28)