lab08, homework
SASCODE 8
Transfer Function Simulation
* For this homework we will try some simulation.
Create an AR(1) series X(t);
Data GenX;
  Z=0;
  do t = 1 to 100;
    Z = .8*Z + normal(1827655);   /* AR(1) with coefficient .8 */
    X = 100 + 3*Z; output;        /* shift and scale so X has mean 100 */
  end;
run;
proc gplot data=GenX; plot X*t;
  symbol1 v=none i=join; title "X Series";
run;
* Estimate an AR(2) model. Is the second-order autoregressive
coefficient significant? Should it be?
Estimate an AR(1) model for the data. How many standard errors
away from the true value is the estimated mean? How many
standard errors away from the true value is the autoregressive
coefficient?
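A minimal sketch of the two fits, assuming the GenX data set built
above and the PROC ARIMA default estimation method (the assignment
does not name one), might look like this;
```sas
proc arima data=GenX;
  identify var=X;      /* ACF, PACF and white noise check for X */
  estimate p=2;        /* AR(2): inspect the t value reported for AR1,2 */
  estimate p=1;        /* AR(1): compare MU and AR1,1 with the true values */
run;
quit;
```
* For reference when reading the output, the generating code above
uses a mean of 100 and an AR coefficient of .8 for X.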
Generate a series
Y(t) = -350 + 3*X(t-1) + 2*X(t-2) + e(t) + .6*e(t-1);
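Data GenY; set GenX;
  retain X1 X2 e;                       /* keep lagged X and e across rows */
  e1=e; e=normal(12345);                /* e1 is the previous shock e(t-1) */
  if _n_=1 then e1=normal(8523609);     /* start-up value for e(0) */
  Y = -350 + 3*X1 + 2*X2 + e + .6*e1;   /* missing for t=1,2: no lags yet */
  X2=X1; X1=X;
run;
PROC GPLOT data=GenY; PLOT (X Y)*t/overlay; run;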
* What is the theoretical correlation between
Y(t) and X(t)? Compute the theoretical correlation
between Y(t)-.8 Y(t-1) and X(t-j) - .8 X(t-j-1) for
j=0, 1, 2, 3, 4. Compute the theoretical mean of Y.
Notice that the transformation from Y(t) to Y(t)-.8 Y(t-1)
and similarly for X gives cross correlations that show
the true relationship between Y and X. Why is .8 used
here? What sort of time series model describes the
transformed series X(t)-.8 X(t-1)?
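As a starting point for the hand calculations, the stationary mean
and variance implied by the generating code can be checked
numerically. The sketch below assumes stationarity (the simulation
actually starts Z at 0) and uses hypothetical variable names phi,
varX, meanX and meanY;
```sas
data _null_;
  phi   = .8;                        /* AR coefficient used to build Z */
  varX  = 9 / (1 - phi**2);          /* Var(X) = 9*Var(Z), Var(Z) = 1/(1-phi**2) */
  meanX = 100;                       /* E[X] = 100 + 3*E[Z], with E[Z] = 0 */
  meanY = -350 + 3*meanX + 2*meanX;  /* E[Y], using E[e] = 0 */
  put varX= meanX= meanY=;           /* values are written to the SAS log */
run;
```
* The correlation questions above can be approached the same way,
by writing each filtered series in terms of the underlying shocks.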
Compute the cross correlations between X(t) and Y(t) as
follows;
PROC ARIMA data=GENY;
IDENTIFY VAR=Y CROSSCOR=(X);
* Explain why these cross-correlations, even though they
are computed correctly, are not as informative as one might
like.
Compute the cross-correlations between the transformed Y and X.
This is easy in PROC ARIMA. If you have a model for X before
your Y identify statement, SAS will automatically remember
your X model and use it to "prewhiten" (transform) both
X and Y for the cross-correlations. Which of these SHOULD be
significant? Does the simulation match theory reasonably well?
Just comment on the estimated cross-correlations.
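A minimal sketch of this prewhitening sequence, assuming the GenY
data set from above and an AR(1) model for X (the form used to
generate it), could be;
```sas
proc arima data=GenY;
  identify var=X;                /* step 1: identify the input series */
  estimate p=1;                  /* fitted AR(1) becomes the prewhitening filter */
  identify var=Y crosscor=(X);   /* X and Y are both filtered before cross-correlating */
run;
quit;
```
* Note that the filter in this sketch uses the estimated AR(1)
coefficient for X rather than the true value .8.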
The ESTIMATE statement has an INPUT option. For example,
   ESTIMATE INPUT = (X1 X2) p=2 q=1
fits a regression of Y, say, on X1 and X2 with an ARMA(2,1)
error series. The statement
   ESTIMATE INPUT = (X1 X2) PLOT
will produce the ACF etc. of the regression residuals.
Any INPUT variable can be preceded by a backshift operator.
The form is s$(n1, n2, ..., nq)/(d1, d2, ..., dp)
where s is a pure delay, n1, n2 etc. are numerator lags and
d1, d2, ... are denominator lags.
Examples:
   3$(1,2)   indicates (A0 - A1 B - A2 B**2) X(t-3)
   2$/(1)    indicates A0/(1 - A1 B) X(t-2)
             = A0 (X(t-2) + A1 X(t-3) + A1**2 X(t-4) + A1**3 X(t-5) + ...)
   $(1)/(1)  indicates (A0 - A1 B)/(1 - A2 B) X(t)
where A0, A1, etc. are parameters to be estimated.
With these facts in mind, fit the input model suggested by the
cross-correlations. Use the PLOT option and explain whether or
not the estimated ACF etc. have the shapes they should based on
the true error model used to generate the data.
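One possible form of this fit is sketched below. It assumes the
cross-correlations point to lags 1 and 2 of X, as in the generating
equation, so the input is written as 1$(1)X, and it refits the
AR(1) model for X first so that the cross-correlations are
prewhitened;
```sas
proc arima data=GenY;
  identify var=X;
  estimate p=1;                        /* AR(1) model for the input X */
  identify var=Y crosscor=(X);
  /* 1$(1)X requests (A0 - A1 B) X(t-1), that is, lags 1 and 2 of X.   */
  /* No p= or q= is given, so the errors are treated as white noise    */
  /* and PLOT shows the ACF etc. of the resulting residuals.           */
  estimate input=( 1 $ (1) X ) plot;
run;
quit;
```
* With the minus sign in the numerator form, the generating
equation corresponds to A0 = 3 and A1 = -2.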
Now refit with the appropriate error model, forecast 10
periods ahead and plot the original series with the forecasts
and upper and lower 95% prediction limits coming off the end
of the historic data. Note: In PROC ARIMA, if you model X
then fit a transfer function asking for forecasts, ARIMA
will automatically generate enough X forecasts to allow the
requested Y forecasts to be computed. The error in forecasting
X will be incorporated into the Y forecast intervals. Why are
the first couple of forecast intervals so much narrower than
the others?
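A sketch of the refit and the forecast plot follows, assuming an
MA(1) error (q=1) as in the generating equation and a hypothetical
output data set named Fore;
```sas
proc arima data=GenY;
  identify var=X;
  estimate p=1;                        /* X model, used to forecast future X */
  identify var=Y crosscor=(X);
  estimate input=( 1 $ (1) X ) q=1;    /* assumed MA(1) error model */
  forecast lead=10 id=t out=Fore;      /* Fore is a hypothetical data set name */
run;
quit;

proc gplot data=Fore;
  plot (Y forecast l95 u95)*t / overlay;   /* history, forecasts, 95% limits */
  symbol1 v=none i=join;
  title "Y with forecasts and 95% limits";
run;
quit;
```
* The OUT= data set holds Y, FORECAST, STD, L95, U95 and RESIDUAL
along with the ID variable t, which is extended by 1 per forecast
period.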
If you SUPPLY future values of X they are treated as known in
advance, that is, no error in future Xs is assumed. This is
like PROC AUTOREG where future values of the inputs MUST
be supplied and are assumed to be known without error. With
this in mind you might want to (optional, not graded)
append the forecasts of X to the X data set and rerun the
transfer function model. Your forecasts should match those
above, but the intervals should be narrower, as suggested in
the above paragraph.
;