ST 732: Applied Longitudinal Data Analysis [SPRING 2013]

MW 1:30PM-2:45PM, 1108 SAS Hall       

Instructor: Dr. Ana-Maria Staicu [last name pronounced as "styku"]

Contact Information
Office: 5242 SAS Hall / Phone: 515-0644
Email: ana-maria_staicu [at]  ncsu [dot] edu 
Office Hours
Monday 11:00AM - 12:00PM
Friday 2:20PM - 3:00PM

Teaching Assistant: Shikai Luo. Office Hours: Tuesday 2-3PM, Wednesday 10-11AM. Room SAS Hall 1101

Syllabus 

Statistical software used in this course: SAS [on-line documentation for version 9.1.3 ]

Grading Policy:

Midterm [covers 1st half]         Wednesday, February 27 (1:30PM - 2:45PM)   35%
Final Exam [covers 2nd half]      Monday, May 6 (1:00PM - 4:00PM)            35%
Homework                                                                     10%
Project [presentation + report]                                              20%
Total                                                                        100%

Announcements


Class Evaluation at ClassEval


[M, Jan-7-2013] First Day of Class
[M, Jan-21-2013] Martin Luther King Jr. Day [NO CLASS]
[M, W, March-4,6-2013] Spring Break [NO CLASS]
[W, March-13-2013] Details about the final project and its deliverables.
[W, April-24-2013] Last Day of Class

Lecture Notes

Chapter 1: Introduction, class organization, grading, course overview  

Chapter 2: Matrix review (individual study). Multivariate normal review

Chapter 3: Review of linear regression for univariate and multivariate responses

Chapter 4: Introduction to modeling longitudinal data
Chapter 4.1: Balanced design arising from single population. Correlation Structures
Chapter 4.2:  Balanced design arising from two or more populations
                   SAS code: Exploratory tools for mean and covariance (Proc CORR, DISCRIM)
                   R code: Basic exploratory tools for mean and covariance (scatterplot matrix)

Chapter 5: Univariate repeated measures ANOVA
Chapter 5.1: Introduction. Statistical model (Split Plot)
Chapter 5.2: Questions of interest and Statistical Hypotheses
Chapter 5.3: ANalysis Of VAriance
Chapter 5.4: Violation of covariance matrix assumption 
Chapter 5.5: Specialized within-unit hypotheses and tests
                 SAS code: Proc GLM  with random/repeated statement
                 Additional reference:  Proc GLM

Chapter 6: Multivariate repeated measures ANOVA
Chapter 6.1: Introduction.
Chapter 6.2: General multivariate problem
Chapter 6.3: Profile Analysis
                 SAS code: Proc GLM  with MANOVA/repeated statement
                 Additional example: Pigs Diet Data 

Chapter 7: Limitations of the classical methods

Chapter 8: General linear models for longitudinal data
Chapter 8.1: Introduction
Chapter 8.2: General models for longitudinal data
Chapter 8.3: Modeling the covariance
Chapter 8.4: Mean regression parameters estimation: Maximum Likelihood and Restricted Maximum Likelihood
                Mean regression parameters distribution
                SAS code: Proc REG (OLS estimation)
                REML:  Direct derivation / Conditional likelihood
Chapter 8.5: Mean regression parameters inference. Model selection approaches (LRT, AIC, BIC)
Chapter 8.6: Final Remarks: main features and limitations
                SAS code: Proc MIXED/ repeated statement 
                SAS code: Proc MIXED  [Hip Study]

Sample Midterm
Review Part I
Midterm 1 Solutions


Chapter 9: Random Coefficient Model
Chapter 9.1: Introduction
Chapter 9.2: Random coefficient model
Chapter 9.3: Inference on mean regression parameters and covariance parameters
                     SAS code: Proc MIXED repeated/random statement 

Chapter 10: Linear Mixed Effects Model 
Chapter 10.1: General Linear mixed effects model
Chapter 10.2:  Inference on the regression parameters and covariance parameters
                     SAS code: Proc MIXED repeated/random statement
Chapter 10.3: Best Linear Unbiased Prediction (BLUP) for subjects effects and individual trajectories 
                     SAS code: Proc MIXED prediction
Chapter 10.4: Comparing nested models for the covariance: testing whether an effect is random
                     SAS code: Proc MIXED testing [Weight lifting study]
Chapter 10.5: Accounting for covariate information

Chapter 11: Generalized Linear Models
Chapter 11.1: Introduction
Chapter 11.2: Three-part specification of GLM
Chapter 11.3: Estimation and inference for regression parameter
                     Iterative re-weighted least squares (IRWLS)
Chapter 11.4: Illustrative examples: Logistic regression; Log-linear regression.
                     SAS code: Proc GENMOD [Myocardial Infarction]
                     SAS code: Proc GENMOD [Horsekicks]
                     SAS code: Proc MIXED testing [Weight lifting study]

Chapter 12: Population-averaged models (marginal models) for non-normal response measurements
Chapter 12.1: Introduction
Chapter 12.2: Specification of marginal models
Chapter 12.3: Estimation and inference for marginal models
                   Generalized Estimating Equations
                   SAS code: Proc GENMOD [Epileptic seizures]
                   SAS code: Proc GENMOD [Respiratory illness]
Chapter 12.4:  Generalized linear mixed models.
                   SAS code: Proc NLMIXED [Respiratory illness]
Chapter 12.5:  Population averaged vs subject specific approaches
Chapter 12.6: Illustrations.
                   SAS code: Proc NLMIXED [Epileptic seizures]

Sample Midterm
Review Part II


Homework (tentative deadlines)

Homework 1 (Due January 23, 2013)   Solution: HW1 soln. Additional files
Homework 2 (Due February 6, 2013)   Solution: HW2 soln. Additional files
Homework 3 (Due February 25, 2013) Solution: HW3 soln. Additional files
Homework 4 (Due March 27, 2013)     Solution: HW4 soln. Additional files
Homework 5 (Due April 11, 2013)       Solution: HW5 soln. Additional files

Group Project (email me if you have preferences regarding how the groups are formed)

Data submission (including brief description) due date:  March 27 [Wednesday]
Presentation due date: April 24 [Wednesday]
Report due date: April 26 [Friday]

Instructions for preparing your report: Each report should be no more than 10 pages long (12-point font, double-spaced), with a separate title page, a separate references page (if any), and an appendix containing code and relevant output. The title, references, and appendix pages do not count towards the 10-page limit. Additional information.

Instructions for the in-class presentation: Each presentation is 10 minutes long and should highlight the main points and findings of the project. It should not exceed 11 slides (excluding the title and references slides). Please be sure to finish within the time limit.

Schedule of the presentations:
1:30 - 1:40  Emma Morrison, Nicholas Meyer, Wesley Huneycutt
1:42 - 1:52  Guangning Xu, Yiqing Tian, Yunbo Cai
1:54 - 2:04  Kumud Dhakal
2:06 - 2:18  Sabina Rich, Merve Tekbudak, Sihan Wu
2:20 - 2:30  Lan Dong,  Lixia Zhang, Bo Shao
2:32 - 2:42  Bongseog Choi, So Young Park, Marcela Alfaro Cordoba

Data sets



Course Objectives

To introduce students to statistical models and methods for the analysis of longitudinal data, i.e., data collected repeatedly on individuals (humans, animals, plants, samples, etc.) over time (or under other conditions).

Prerequisite
ST 512, Experimental Statistics for Biological Sciences II, or equivalent. Students should therefore be familiar with basic notions of probability, random variables, statistical inference, analysis of variance, and (multiple) linear regression. Familiarity with matrix algebra is also useful. We will review matrix algebra at the beginning of the course and make considerable use of matrix notation and operations throughout. ST 512 involves the use of the SAS (Statistical Analysis System) software package, so students are expected to have had some exposure to SAS. The course is meant to be accessible to both majors and non-majors. The underlying mathematical theory will not be stressed; the main focus will be on concepts and applications. Please see the instructor if you have questions about the suitability of your background.

Required Text
Lecture notes prepared by Marie Davidian will be used. These may be purchased at the Sir Speedy across the street from Patterson on Hillsborough. You should obtain a copy.

Useful Links: