lab08, homework
 

SASCODE 8

Transfer Function Simulation

* For this homework we will try some simulation.
  Create an AR(1) series X(t);

Data GenX;
  Z=0;
  do t = 1 to 100;
    Z=.8*Z+normal(1827655);   /* AR(1) recursion, coefficient .8, N(0,1) shocks */
    X= 100 + 3*Z;             /* shift to mean 100 and scale by 3 */
    output;
  end;
run;

proc gplot;
  plot X*T;
  symbol1 v=none i=join;
  title "X Series";
run;

* Estimate an AR(2) model. Is the second order autoregressive
  coefficient significant? Should it be?

  Estimate an AR(1) model for the data. How many standard errors away
  from the true value is the estimated mean? How many standard errors
  away from the true value is the autoregressive coefficient?

  Generate a series
    Y(t) = -350 + 3*X(t-1) + 2*X(t-2) + e(t) + .6*e(t-1);

Data GenY;
  set GenX;
  e1=e;                               /* previous shock e(t-1), carried over by RETAIN */
  e=normal(12345);                    /* current shock e(t) */
  if _n_=1 then e1=normal(8523609);   /* no previous shock exists for the first observation */
  Y = -350 +3*X1 + 2*X2 +e + .6*e1;   /* Y is missing for t=1,2 because X1, X2 are not yet set */
  X2=X1; X1=X;                        /* update the lags after Y has been computed */
  retain X1 X2 e;
run;

PROC GPLOT;
  PLOT (X Y)*t/overlay;
  title "X and Y Series";
run;

* What is the theoretical correlation between Y(t) and X(t)?

  Compute the theoretical correlation between Y(t) - .8 Y(t-1) and
  X(t-j) - .8 X(t-j-1) for j = 0, 1, 2, 3, 4.

  Compute the theoretical mean of Y.

  Notice that the transformation from Y(t) to Y(t) - .8 Y(t-1), and
  similarly for X, gives cross-correlations that show the true
  relationship between Y and X. Why is .8 used here? What sort of time
  series model describes the transformed series X(t) - .8 X(t-1)?

  Compute the cross-correlations between X(t) and Y(t) as follows;

PROC ARIMA data=GENY;
  IDENTIFY VAR=Y CROSSCOR=(X);
run;

* Explain why these cross-correlations, even though they are computed
  correctly, are not as informative as one might like.

  Compute the cross-correlations between the transformed Y and X. This
  is easy in PROC ARIMA. If you have a model for X before your Y
  identify statement, SAS will automatically remember your X model and
  use it to "prewhiten" (transform) both X and Y for the
  cross-correlations. Which of these SHOULD be significant? Does the
  simulation match theory reasonably well? Just comment on the
  estimated cross-correlations.

  The ESTIMATE statement has an INPUT option.
  For example, ESTIMATE INPUT = (X1 X2) p=2 q=1 fits a regression of Y,
  say, on X1 and X2 with an ARMA(2,1) error series. The statement
  ESTIMATE INPUT = (X1 X2) PLOT will produce the ACF etc. of the
  regression residuals.

  Any INPUT variable can be preceded by a backshift operator. The form is
      s$(n1, n2, ..., nq)/(d1, d2, ..., dp)
  where s is a pure delay, n1, n2, etc. are numerator lags and
  d1, d2, ... are denominator lags.

  Examples:
      3$(1,2)    indicates (A0 - A1 B - A2 B**2) X(t-3)
      2$/(1)     indicates A0/(1 - A1 B) X(t-2)
                 = A0 (X(t-2) + A1 X(t-3) + A1**2 X(t-4) + A1**3 X(t-5) + ...)
      $(1)/(1)   indicates (A0 - A1 B)/(1 - A2 B) X(t)
  where A0, A1, etc. are parameters to be estimated.

  With these facts in mind, fit the input model suggested by the
  cross-correlations. Use the PLOT option and explain whether or not the
  estimated ACF etc. have the shapes they should based on the true error
  model used to generate the data.

  Now refit with the appropriate error model, forecast 10 periods ahead,
  and plot the original series with the forecasts and upper and lower
  95% prediction limits coming off the end of the historic data.

  Note: In PROC ARIMA, if you model X and then fit a transfer function
  asking for forecasts, ARIMA will automatically generate enough X
  forecasts to allow the requested Y forecasts to be computed. The error
  in forecasting X will be incorporated into the Y forecast intervals.
  Why are the first couple of forecast intervals so much narrower than
  the others?

  If you SUPPLY future values of X they are treated as known in advance,
  that is, no error in future Xs is assumed. This is like PROC AUTOREG,
  where future values of the inputs MUST be supplied and are assumed to
  be known without error. With this in mind you might want to
  (Optional - not graded) append the forecasts of X to the X data set
  and rerun the transfer function model. Your forecasts should match
  those above but the intervals should be narrower, as suggested in the
  above paragraph. ;
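
* Illustrative sketch only, not part of the original assignment text.
  It shows where each piece described above might go in one PROC ARIMA
  run. The input form 3$(1,2) is simply the first example from the
  notation list and q=1 is a placeholder error model. Both are
  assumptions made for illustration and should be replaced with
  whatever your prewhitened cross-correlations and residual ACF
  suggest. The data set name FCAST is hypothetical ;

proc arima data=GenY;
  identify var=X;                      /* model X first ...                      */
  estimate p=1;                        /* ... so ARIMA can prewhiten with it     */
  identify var=Y crosscorr=(X);        /* prewhitened cross-correlations         */
  estimate input=( 3$(1,2) X ) plot;   /* placeholder input, inspect residuals   */
  estimate input=( 3$(1,2) X ) q=1;    /* refit with a placeholder error model   */
  forecast lead=10 id=t out=Fcast;     /* X forecasts are generated automatically */
run;
quit;

* The OUT= data set holds Y together with FORECAST, L95 and U95, so the
  history and the forecast limits can be overlaid in one plot ;

proc gplot data=Fcast;
  plot (Y forecast l95 u95)*t / overlay;
  title "Y with Forecasts and 95 Percent Limits";
run;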