Time Series homework 2
 

Homework 2

Regression with Time Series Errors Questions
(Note: click on SAS Code at the bottom to see the initial SAS code for this problem. Copy and paste it into SAS, then modify as needed.)

Data were supplied by meteorology graduate student Bill Barnard (also of the USEPA). The data contain information on black carbon in the atmosphere and PM-10, among other things. PM-10 is the amount of particulate matter (PM) in the air that would be trapped by a filter with a certain pore size (10 micrometers). I asked Mr. Barnard to explain the other items in the dataset. In his words: "... It is taken from the Southern California Ozone Study that was done in 1997. It was a very heavily instrumented intensive study done in the LA Basin. The data here is from Riverside, CA. DUV is an integrated daily dosage of UV radiation in the 286-400 nanometer wavelength region. TOMS is an acronym for Total Ozone Mapping Spectrophotometer that NASA uses on the NIMBUS satellite to measure, among other things, the stratospheric ozone levels all over the earth. It is in polar orbit. It gives only one daily value for just about any location almost every day. Black carbon is just what you said. Its units are micrograms/cubic meter. PM-10 is the same thing as your statement. Particulate matter, 10 micrometers or less in diameter that is collected on an hourly basis. The ozone value is a measurement of the ground-level ozone pollution in parts per million."
  1. The data use so-called Julian dates. The Julian date for Feb. 2 of 1998 is 1998033 because Feb. 2 is day 033 of the year. Sometimes just the day of the year (033) rather than the whole thing is recorded. That is the case here, but we know the year is 1997. Create a SAS date variable with format date7. Print out the first 5 observations from your dataset with a nice descriptive title.
  2. Plot all the meteorological variables versus your nicely formatted date. You can put each plot on a different page, but if you want to learn more about SAS you can use a template.
  3. Run a correlation among all the meteorological variables. What other variables are highly correlated with DUV, the response of interest? Use PROC CORR in SAS. What assumptions are usually made when p-values are computed for correlations? Ignoring the normality assumption, what other assumption is likely to be violated when data are taken over time like this?
  4. Regress DUV on the other meteorological variables using PROC REG. Output the residuals r. Plot r against Lr=lag(r), i.e. r against its lagged value. You might try a gplot here along with SYMBOL1 V=DOT I=R C=RED; and check the SAS log window. I=JOIN connects points with lines in the order encountered and I=NONE is obvious, but what does I=R do? Make a histogram of the residuals. PROC GCHART or PROC CHART would be good choices here.
  5. Regress r on 3 of its lags. Assuming that the regression statistics are OK in large samples (see Fuller's text for a proof that they are, under rather mild assumptions), give the F test to see if you can leave out all but the first lag.
  6. Regardless of (5), regress r on Lr, where Lr is the lag of r, and, again assuming the test statistics are valid, discuss the statistical significance. This is one way to estimate an AR(1) structure for the residuals. Regardless of significance here, write down (using, say, 3-decimal accuracy) the estimated 4x4 Toeplitz covariance matrix of any set of 4 contiguous residuals using your estimated AR(1) structure. What is your conclusion about the regression in part (4)? Specifically, are your estimated coefficients unbiased? Can I trust the standard errors? Can I trust the p-values for my test statistics?
  7. Rerun the regression from part (4), changing PROC REG to PROC AUTOREG. Ask for the Durbin-Watson statistic and its p-value. What is the DW statistic, what is its p-value, and what is the implication? At the end of your MODEL statement, before the semicolon, put / NLAG=3 BACKSTEP. This will fit 3 lags to r and then, using t tests, eliminate the insignificant lags (BACKSTEP). How does the initial regression compare to PROC REG? The procedure has used the estimated autocorrelation structure to fit a generalized least squares (GLS) regression, or more specifically an estimated GLS (EGLS) regression. Summarize for the client how the coefficient and p-value on black carbon (the focus of his thesis) change when you correct for autocorrelation like this.
  8. Using t test statistics, eliminate one at a time, starting with the least significant, the terms in the model that are not significant at the 10% level. Use PROC AUTOREG. Just hand in a summary indicating which term was omitted at each step, and its EGLS p-value. I believe you will find a model that contains black carbon (and maybe other things), but its p-value is not less than 0.05. Now the presence of black carbon in the atmosphere could not possibly increase the amount of radiation DUV. It could only decrease. Explain how prior knowledge of this fact could be helpful to our client who is interested in showing an effect of black carbon on DUV. Use the final model you got by the model fitting you just did.
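The day-of-year arithmetic behind question 1 can be sketched outside SAS. The Python below is only an illustration of the conversion, not the SAS solution the assignment asks for; the function names day_of_year_to_date and format_date7 are hypothetical, and the second one merely imitates the appearance of a date7-style value.

```python
from datetime import date, timedelta


def day_of_year_to_date(year: int, day_of_year: int) -> date:
    """Convert a year and a day-of-year (1 = January 1) to a calendar date."""
    if day_of_year < 1:
        raise ValueError("day_of_year must be at least 1")
    # Day 1 is January 1, so add (day_of_year - 1) days to it.
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


def format_date7(d: date) -> str:
    """Imitate the look of a 7-character ddMONyy date (hypothetical helper)."""
    return d.strftime("%d%b%y").upper()


# The example from the text: day 033 of 1998 should be Feb. 2.
example = day_of_year_to_date(1998, 33)
print(example.isoformat(), format_date7(example))
```

The check against the Feb. 2 example in question 1 is a quick way to confirm that the off-by-one in "day 1 = January 1" is handled correctly before applying the same idea to the 1997 data.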
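For the Toeplitz matrix in question 6, a stationary AR(1) process r(t) = phi r(t-1) + e(t) with Var(e(t)) = sigma2 has autocovariance Gamma(h) = sigma2 phi^|h| / (1 - phi^2). The sketch below builds the matrix from that formula. The function name ar1_covariance is hypothetical, and phi = 0.5 and sigma2 = 3.0 are made-up illustrative values, not estimates from the homework data.

```python
def ar1_covariance(phi: float, sigma2: float, n: int = 4) -> list[list[float]]:
    """Covariance matrix of n contiguous values of a stationary AR(1) process.

    phi    : AR(1) coefficient, must satisfy |phi| < 1
    sigma2 : variance of the white-noise shocks e(t)
    """
    if abs(phi) >= 1:
        raise ValueError("stationarity requires |phi| < 1")
    gamma0 = sigma2 / (1.0 - phi ** 2)  # variance of a single value
    # Entry (i, j) depends only on |i - j|, which is what makes it Toeplitz.
    return [[gamma0 * phi ** abs(i - j) for j in range(n)] for i in range(n)]


# Illustrative values only (assumptions, not fitted to any data).
matrix = ar1_covariance(phi=0.5, sigma2=3.0)
for row in matrix:
    print("  ".join(f"{value:7.3f}" for value in row))
```

With your own estimates, phi would be the slope from regressing r on Lr and sigma2 the error mean square from that same regression.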
Short questions:
  1. Here are some theoretical autocorrelations. Give the AR, MA, or ARMA model that would give these autocorrelations.

     Lag        0    1     2     3      4       5       6  7  8  9  10
     Model I    1   .5   .25  .125  .0625  .03125   .....
     Model II   1  0.2     0     0      0       0   .....
  2. A moving average order 1 model has mean 90 and error variance Var( e(t) ) = 100. The model is Y(t) - 90 = e(t) - .8 e(t-1). My last two observations are Y(99) = 105 and Y(100) = 98. I do not care about the next observation, Y(101), but I do want to predict the average ( Y(102)+Y(103)+Y(104)+Y(105) )/4.
    • What would be the best predictor (BLUP) of this average of future values, assuming all the given model parameters are known values, not estimates?
    • Find the variance of an individual Y value (Gamma(0)).
    • Find the variance of the mean of 4 values given above - note that they are NOT uncorrelated with each other.
    • Find the correlation of that mean of 4 with each past data value Y(1), ...,Y(100) and if needed, go back and correct your answer to the first part!
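The reminder that the 4 values are NOT uncorrelated can be made concrete. For an MA(1) model written as Y(t) - mu = e(t) - theta e(t-1), the autocovariances are Gamma(0) = sigma2 (1 + theta^2), Gamma(1) = -theta sigma2, and Gamma(h) = 0 for h > 1, and the variance of a mean of m consecutive values sums every entry of their covariance matrix and divides by m^2. The sketch below uses hypothetical function names and deliberately different, made-up parameters (theta = 0.5, sigma2 = 4), so it illustrates the method rather than the answer to this problem.

```python
def ma1_autocovariance(theta: float, sigma2: float, lag: int) -> float:
    """Autocovariance of Y(t) - mu = e(t) - theta * e(t-1) at a given lag."""
    lag = abs(lag)
    if lag == 0:
        return sigma2 * (1.0 + theta ** 2)
    if lag == 1:
        return -theta * sigma2
    return 0.0  # an MA(1) process is uncorrelated beyond lag 1


def variance_of_mean(theta: float, sigma2: float, m: int) -> float:
    """Variance of the average of m consecutive values of the MA(1) process."""
    # Sum all m*m covariances, then divide by m**2.
    total = sum(
        ma1_autocovariance(theta, sigma2, i - j)
        for i in range(m)
        for j in range(m)
    )
    return total / m ** 2


# Illustrative values only (assumptions, not the parameters in the problem).
theta, sigma2 = 0.5, 4.0
print("Gamma(0):", ma1_autocovariance(theta, sigma2, 0))
print("Gamma(1):", ma1_autocovariance(theta, sigma2, 1))
print("Var(mean of 4):", variance_of_mean(theta, sigma2, 4))
print("Naive Gamma(0)/4:", ma1_autocovariance(theta, sigma2, 0) / 4)
```

Comparing the last two printed numbers shows how far the naive "variance over m" formula can be from the correct value when adjacent observations are negatively correlated.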
SAS Code SAS Online Documentation