Legacy Documentation

Time Series (2011)

This is documentation for an obsolete product.

1.5.2 The Asymptotic Distribution of the Sample Correlation Function

Let {Xt} be a stationary process with correlation function ρ(k), and let ρ̂ denote the sample correlation function. Let ρ(h) = (ρ(1), ρ(2), ... , ρ(h)) and ρ̂(h) = (ρ̂(1), ρ̂(2), ... , ρ̂(h)). It can be shown (see, for example, Brockwell and Davis (1987), p. 214) that under certain general conditions ρ̂(h) has the asymptotic joint normal distribution with mean ρ(h) and variance C/n as n → ∞. The (i, j) element of the matrix C, c_ij, is given by

    c_ij = Σ_{k=-∞}^{∞} [ρ(k+i)ρ(k+j) + ρ(k-i)ρ(k+j) - 2ρ(i)ρ(k)ρ(k+j) - 2ρ(j)ρ(k)ρ(k+i) + 2ρ(i)ρ(j)ρ(k)²].   (5.4)
This formula was first derived by Bartlett in 1946 and is called Bartlett's formula. Any stationary ARMA model with {Zt} independently and identically distributed with zero mean and finite variance satisfies the conditions of Bartlett's formula.
Hence for large n, the sample correlation at lag i, ρ̂(i), is approximately normally distributed with mean ρ(i) and variance c_ii/n, where

    c_ii = Σ_{k=-∞}^{∞} [ρ(k+i)² + ρ(k-i)ρ(k+i) - 4ρ(i)ρ(k)ρ(k+i) + 2ρ(i)²ρ(k)²].   (5.5)
Bartlett's formula, (5.4) or (5.5), is extremely useful since it gives us a handle on deciding whether a small value in the sample correlation function is in fact significantly different from zero or is just a result of fluctuations due to the smallness of n. Next we give two examples where Bartlett's formula is used to determine if the sample correlation is zero.
Example 5.2 For a white noise process, ρ(k) = 0 for k ≠ 0, so from (5.5) we have c_ii = 1 for i ≥ 1. That is, for large n, ρ̂(i) is normally distributed with mean zero and variance 1/n for i = 1, 2, ... , h. This implies that 95 percent of the time the plot of ρ̂(i) should fall within the bounds ±1.96/√n. In practice, 2 rather than 1.96 is often used in calculating the bounds.
Here we generate a normally distributed random sequence of length 200 with mean 0 and variance 1.5. The sample correlation function is calculated up to lag 50.
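The original input cells are not reproduced here; the following is a minimal sketch of the same steps in Wolfram Language, using only core built-ins rather than the Time Series package's own commands (the helper sampleCorrelation and the variable names data, corr, and bound are ours).

    (* sample correlation function: rho-hat(h) = gamma-hat(h)/gamma-hat(0), with
       gamma-hat(h) = (1/n) Sum over t of (x_t - mean)(x_{t+h} - mean) *)
    sampleCorrelation[x_List, hmax_Integer] :=
      Module[{n = Length[x], d = x - Mean[x], g},
        g = Table[Total[Take[d, n - h] Drop[d, h]]/n, {h, 0, hmax}];
        g/First[g]]

    SeedRandom[123];                                              (* fixed seed, for reproducibility *)
    data = RandomVariate[NormalDistribution[0, Sqrt[1.5]], 200];  (* mean 0, variance 1.5 *)
    corr = sampleCorrelation[data, 50];                           (* lags 0 through 50 *)
    bound = 2/Sqrt[Length[data]]                                  (* white noise bound, about 0.14 *)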
We can display this sample correlation function along with the bounds using Show. The function Plot is used to draw the two constant lines that form the bounds.
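A sketch of such a display, continuing with the corr and bound values defined above (the plotting options are our choices, not necessarily those of the original):

    Show[
      ListPlot[Transpose[{Range[0, 50], corr}], Filling -> Axis,
        AxesLabel -> {"k", "correlation"}],
      Plot[{bound, -bound}, {k, 0, 50}, PlotStyle -> Dashed],   (* the two constant bound lines *)
      PlotRange -> All]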
We see that ρ̂(k) falls within the bounds for all k > 0. We have no reason to reject the hypothesis that the set of data constitutes a realization of a white noise process.
You can also define your own function that plots the given correlation function and the bounds. For example, you can define the following.
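The original definition is not reproduced here; one possible sketch of such a helper (the name plotCorrelationWithBounds is ours) is:

    (* plot a correlation function {rho(0), rho(1), ...} together with the bounds +/- b *)
    plotCorrelationWithBounds[rho_List, b_?NumericQ] :=
      Module[{hmax = Length[rho] - 1},
        Show[
          ListPlot[Transpose[{Range[0, hmax], rho}], Filling -> Axis,
            AxesLabel -> {"k", "correlation"}],
          Plot[{b, -b}, {k, 0, hmax}, PlotStyle -> Dashed],
          PlotRange -> All]]

    (* for example *)
    plotCorrelationWithBounds[corr, 2/Sqrt[200]]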
Example 5.3 For an MA(q) process, ρ(k) = 0 for k > q. From Bartlett's formula (5.5), it is easy to see that for i > q only the first term in the sum survives. Therefore, for i > q we have

    c_ii = Σ_{k=-∞}^{∞} ρ(k)² = 1 + 2(ρ(1)² + ρ(2)² + ... + ρ(q)²).   (5.6)
If the data of length n (n large) are truly a realization of an MA(q) process, we expect the sample correlation function for i > q to fall within the bounds ±2 √(c_ii/n), with c_ii given by (5.6), about 95 percent of the time. In practice, the true correlation function is unknown and (5.6) is used with the sample correlation function in place of ρ.
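As a worked illustration of (5.6), the following sketch computes the theoretical correlations of an MA(2) model and the resulting large-sample bound; the helper names maCorrelation and maBound are ours, and the coefficients -0.4 and 1.1 and the length n = 200 anticipate the example below.

    (* theoretical correlation function of an MA(q) process
       X_t = Z_t + theta_1 Z_{t-1} + ... + theta_q Z_{t-q} *)
    maCorrelation[theta_List, k_Integer] :=
      Module[{th = Prepend[theta, 1], q = Length[theta]},
        If[k > q, 0,
          Total[th[[1 ;; q + 1 - k]] th[[k + 1 ;; q + 1]]]/Total[th^2]]]

    (* large-sample bound 2 Sqrt[c_ii/n] from (5.6), valid for lags i > q *)
    maBound[theta_List, n_Integer] :=
      2 Sqrt[(1 + 2 Sum[maCorrelation[theta, k]^2, {k, 1, Length[theta]}])/n]

    maCorrelation[{-0.4, 1.1}, 1]    (* about -0.354 *)
    maCorrelation[{-0.4, 1.1}, 2]    (* about  0.464 *)
    maBound[{-0.4, 1.1}, 200]        (* about  0.183 *)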
Here we are given a set of stationary, zero-mean data of length 200 that is generated from an MA(2) process X_t = Z_t - 0.4 Z_{t-1} + 1.1 Z_{t-2}. We would like to determine the process that generated the data.
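The original data-generating cells are not reproduced here; a minimal sketch of simulating such a series with core built-ins (the seed and variable names are ours):

    SeedRandom[456];                                      (* fixed seed, for reproducibility *)
    z = RandomVariate[NormalDistribution[0, 1], 202];     (* the white noise Z_t *)
    (* X_t = Z_t - 0.4 Z_{t-1} + 1.1 Z_{t-2}, giving 200 observations *)
    data2 = Table[z[[t]] - 0.4 z[[t - 1]] + 1.1 z[[t - 2]], {t, 3, 202}];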
We first calculate the sample correlation function and plot it along with the bounds for white noise.
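A sketch of these two steps, reusing the sampleCorrelation and plotCorrelationWithBounds helpers defined above:

    corr = sampleCorrelation[data2, 50];            (* sample correlation up to lag 50 *)
    plotCorrelationWithBounds[corr, 2/Sqrt[200]]    (* white noise bounds +/- 2/Sqrt[n] *)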
Since the sample correlations at lags 1 and 2, ρ̂(1) and ρ̂(2), are well beyond the bounds, we conclude that they differ significantly from zero and the data are not likely to be random noise. Since the correlations beyond lag 2 are all rather small, we may suspect that the data can be modeled by an MA(2) process. The variance of ρ̂(k) for k > 2 can be calculated using (5.6), with the sample correlation function replacing the true correlation function; that is, we calculate (1 + 2(ρ̂(1)² + ρ̂(2)²))/n.
We first get the sample correlation up to k=2. This is done by extracting the first three elements of corr using Take.
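A sketch of this computation, with n = 200 (the names r and newBound are ours):

    r = Take[corr, 3];                           (* rho-hat(0), rho-hat(1), rho-hat(2) *)
    newBound = 2 Sqrt[(2 Total[r^2] - 1)/200]    (* 2 Sqrt[c_ii/n], with c_ii as in (5.6) *)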
We have subtracted 1 to get rid of the overcounting of ρ̂(0)² (= 1). Now we can display the sample correlation function again with the bounds we just calculated.
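For example, reusing the plotting helper sketched earlier:

    plotCorrelationWithBounds[corr, newBound]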
This supports our conclusion that the data are from an MA(2) process.