Department of Economics
& Tornado Alley
Which Are the Cumulative Sum
of Random Disturbances
Consider variables of the form

$$T(t) = T(t-1) + U(t),$$

so that T(t) is the cumulative sum of the disturbances U(s) up to period t, where the U(s)'s are independent variables, random or otherwise.
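Now consider averaging over intervals. First take two-period intervals. Since

$$T(t) = T(t-1) + U(t), \qquad T(t+1) = T(t-1) + U(t) + U(t+1),$$

the moving average $\bar{T}(t) = \tfrac{1}{2}[T(t) + T(t+1)]$ is given by

$$\bar{T}(t) = T(t-1) + \tfrac{1}{2}\,[\,2U(t) + U(t+1)\,].$$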
The weight of U(t) in the average is twice that of U(t+1).
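The formulas for three-period and four-period averages are

$$\bar{T}(t) = T(t-1) + \tfrac{1}{3}\,[\,3U(t) + 2U(t+1) + U(t+2)\,]$$

$$\bar{T}(t) = T(t-1) + \tfrac{1}{4}\,[\,4U(t) + 3U(t+1) + 2U(t+2) + U(t+3)\,]$$

The weight of the first disturbance, U(t), is three and four times, respectively, that of the last disturbance in the average.

The general formula is clear:

$$\bar{T}(t) = T(t-1) + \frac{1}{n}\sum_{j=0}^{n-1} (n-j)\,U(t+j)$$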
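The weighting pattern can be checked by brute force. The following is a minimal sketch, not part of the original derivation: it expands each T(t+i) in the n-period average into T(t-1) plus disturbances and tallies the coefficient on each U(t+j). The function name `average_weights` is hypothetical, and exact fractions are used so the comparison with (n-j)/n is not blurred by rounding.

```python
from fractions import Fraction

def average_weights(n):
    """Coefficients on U(t), ..., U(t+n-1) in the n-period average of T,
    measured relative to T(t-1). (Hypothetical helper for illustration.)"""
    weights = [Fraction(0)] * n
    for i in range(n):
        # T(t+i) = T(t-1) + U(t) + ... + U(t+i); each term enters with 1/n
        for j in range(i + 1):
            weights[j] += Fraction(1, n)
    return weights

for n in (2, 3, 4, 12, 365):
    w = average_weights(n)
    # The tallied weights should follow the pattern (n - j)/n
    assert w == [Fraction(n - j, n) for j in range(n)]
    print(f"n={n}: first weight {w[0]}, last weight {w[-1]}, ratio {w[0] / w[-1]}")
```

The ratio of the first weight to the last comes out equal to n, which is the twelve-to-one and 365-to-one imbalance for annual averages of monthly and daily values.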
For annual averages of monthly values, the disturbances during Januaries have twelve times the weight of disturbances during Decembers, and for annual averages of daily values the disturbances on January firsts have 365 times the weight of disturbances occurring on December thirty-firsts. Likewise, for daily averages of hourly values, the disturbances occurring between midnight and 1 A.M. have 24 times the weight of disturbances occurring between 11 P.M. and midnight. This suggests that for statistical analysis it is not a good idea to work with interval averages. Instead, the values at a specified point in the interval, say the ends or the midpoints of the intervals, should be used.
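Consider the two-period averages $\bar{T}(t) = \tfrac{1}{2}[T(t) + T(t+1)]$. Since $\bar{T}(t+1) = \tfrac{1}{2}[T(t+1) + T(t+2)]$, it follows that

$$\bar{T}(t+1) - \bar{T}(t) = \tfrac{1}{2}[T(t+2) - T(t)] = \tfrac{1}{2}[U(t+1) + U(t+2)]$$

and likewise

$$\bar{T}(t+2) - \bar{T}(t+1) = \tfrac{1}{2}[U(t+2) + U(t+3)].$$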
Because $\bar{T}(t+1) - \bar{T}(t)$ and $\bar{T}(t+2) - \bar{T}(t+1)$ both depend upon ½U(t+2), there is a positive serial correlation for the first differences of the averages even if there is no serial correlation for the U(t)'s.
Also, since $\bar{T}(t+1) - \bar{T}(t)$ and $\bar{T}(t)$ both depend upon U(t+1), there will be a positive correlation between the change in $\bar{T}(t)$ and its value. There would be no such correlation between the unaveraged T(t) and (T(t+1)−T(t)), because T(t+1)−T(t) is just U(t+1), which does not enter T(t). Thus averaging introduces spurious correlations into the statistical series.
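The serial correlation can extend beyond a one-period lag. Consider now an averaging over a three-period interval. Then

$$\bar{T}(t+1) - \bar{T}(t) = \tfrac{1}{3}[U(t+1) + U(t+2) + U(t+3)]$$

$$\bar{T}(t+2) - \bar{T}(t+1) = \tfrac{1}{3}[U(t+2) + U(t+3) + U(t+4)]$$

$$\bar{T}(t+3) - \bar{T}(t+2) = \tfrac{1}{3}[U(t+3) + U(t+4) + U(t+5)]$$

This means there will be a positive correlation between $\bar{T}(t+1) - \bar{T}(t)$ and $\bar{T}(t+2) - \bar{T}(t+1)$, and also between $\bar{T}(t+1) - \bar{T}(t)$ and $\bar{T}(t+3) - \bar{T}(t+2)$, because of their common dependencies.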
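The size of these spurious correlations can be computed exactly rather than simulated. The sketch below is an illustration under stated assumptions: the U's are serially uncorrelated with a common variance, T(0) is set to zero for convenience, and the helper names (`avg_coeffs`, `diff_coeffs`, `correlation`) are hypothetical. Each averaged value is written as a vector of coefficients on the U's, so the covariance of two first differences is proportional to the dot product of their coefficient vectors.

```python
from fractions import Fraction

def avg_coeffs(start, n, horizon):
    """Coefficients on U(1..horizon) of the n-period average of
    T(start), ..., T(start+n-1), assuming T(0) = 0."""
    c = [Fraction(0)] * (horizon + 1)
    for i in range(start, start + n):
        for j in range(1, i + 1):      # T(i) = U(1) + ... + U(i)
            c[j] += Fraction(1, n)
    return c

def diff_coeffs(t, n, horizon):
    """Coefficients of the first difference Tbar(t+1) - Tbar(t)."""
    later = avg_coeffs(t + 1, n, horizon)
    earlier = avg_coeffs(t, n, horizon)
    return [x - y for x, y in zip(later, earlier)]

def correlation(n, lag, t=1):
    """Correlation between Tbar(t+1) - Tbar(t) and the same difference
    `lag` periods later, for uncorrelated U's with equal variance."""
    horizon = t + lag + n + 1
    d0 = diff_coeffs(t, n, horizon)
    d1 = diff_coeffs(t + lag, n, horizon)
    cov = sum(x * y for x, y in zip(d0, d1))
    var = sum(x * x for x in d0)       # both differences have the same variance
    return cov / var

for n in (2, 3):
    for lag in (1, 2, 3):
        print(f"n={n}, lag={lag}: correlation {correlation(n, lag)}")
```

Under these assumptions the two-period average gives a lag-one correlation of 1/2 and nothing beyond, while the three-period average gives 2/3 at lag one and 1/3 at lag two.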
Suppose an estimate of the trend in a variable T(t) is defined by

$$k = \frac{\bar{T}(t+s) - \bar{T}(t)}{s}$$

where $\bar{T}(t)$ is the average of T over an interval of n periods and T(t) is given by

$$T(t) = T(t-1) + U(t).$$
The true trend is defined to be the common expected value of the U(t)'s; i.e., K = E{U(t)}. One question is whether k is an unbiased estimate of K; i.e., is the expected value of k equal to K? A second question is what the standard deviation of k is and how it depends upon s and n.
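From the definition of T(t) as T(t-1) + U(t) and $\bar{T}(t) = (1/n)[T(t) + T(t+1) + T(t+2) + \dots + T(t+(n-1))]$ it was shown previously that

$$\bar{T}(t) = T(t-1) + \frac{1}{n}\sum_{j=0}^{n-1} (n-j)\,U(t+j)$$

Therefore the difference of the averages is

$$\bar{T}(t+s) - \bar{T}(t) = [T(t+s-1) - T(t-1)] + \frac{1}{n}\sum_{j=0}^{n-1} (n-j)\,U(t+s+j) - \frac{1}{n}\sum_{j=0}^{n-1} (n-j)\,U(t+j)$$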
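The term [T(t+s-1)−T(t-1)] is just the sum of the values for U from t to t+s-1. The values for t to t+n-1 correspond to the values in the second summation on the right. Thus, with a little rearrangement (and taking s to be at least n),

$$\bar{T}(t+s) - \bar{T}(t) = \sum_{j=0}^{s+n-1} w_j\,U(t+j), \qquad w_j = \begin{cases} j/n & 0 \le j \le n-1 \\ 1 & n \le j \le s-1 \\ (s+n-j)/n & s \le j \le s+n-1 \end{cases}$$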
The question is what the sum of the weights, w_j, is. Let H be the number of intervals, H=s/n. There are H-1 intervals for which the weights are unity. Therefore the sum of their weights is n(H-1). The weights in the first and last intervals can be combined into pairs whose weights sum to unity; therefore the sum of the weights in the first and last intervals is equal to n. Thus the sum of all the weights is equal to n(H-1)+n, or nH, which is the same as s.
Therefore the expected value of the trend k is

$$E\{k\} = \frac{1}{s}\sum_{j} w_j\,E\{U(t+j)\} = \frac{1}{s}\,(s\,K) = K$$
Thus k is an unbiased estimate of K.
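The weight-sum argument can be verified numerically. This is a minimal sketch, assuming s is at least n as in the argument above; `trend_weights` is a hypothetical helper that builds the coefficient on each U(t+j) directly from the definition of the averages, without using any closed-form expression for the weights.

```python
from fractions import Fraction

def trend_weights(s, n):
    """Coefficients w_j on U(t+j), j = 0..s+n-1, in Tbar(t+s) - Tbar(t),
    where Tbar is the n-period average. (Hypothetical helper.)"""
    w = [Fraction(0)] * (s + n)
    for i in range(n):
        # T(t+s+i) - T(t+i) = U(t+i+1) + ... + U(t+s+i), entering with 1/n
        for j in range(i + 1, s + i + 1):
            w[j] += Fraction(1, n)
    return w

for s, n in [(4, 2), (12, 3), (24, 12), (365, 5)]:
    w = trend_weights(s, n)
    # Weights sum to s, so E{k} = (1/s) * s * K = K
    assert sum(w) == s
    print(f"s={s}, n={n}: sum of weights = {sum(w)}")

print([str(x) for x in trend_weights(4, 2)])
```

For s=4 and n=2 the weights rise from 0 through 1/2 to 1 and fall back to 1/2 at the end, which shows the pairing of first-interval and last-interval weights into sums of unity.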
For serially uncorrelated U(t)'s the variance σk² of k is given by

$$\sigma_k^2 = \frac{\sigma^2}{s^2}\sum_{j} w_j^2$$
where σ² is the common variance of the U(t)'s.
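The sum of the squared weights is given by

$$\sum_{j} w_j^2 = \sum_{j=0}^{n-1}\left(\frac{j}{n}\right)^2 + (s-n) + \sum_{j=1}^{n}\left(\frac{j}{n}\right)^2 = s - \frac{n^2-1}{3n}$$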
Thus the variance of k will be larger than would be the case for equal weights by a factor that decreases with s but changes in an uncertain direction with n.
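To see how the variance of k moves with s and n, the ratio σk²/σ² can be tabulated directly. The sketch below is an illustration only: it assumes serially uncorrelated U's with common variance and s at least n, and the names `trend_weights`, `variance_factor` and `closed_form` are hypothetical. The closed form s − (n²−1)/(3n) for the sum of squared weights is a working assumption, obtained by summing the squares of the weights j/n, 1 and (s+n−j)/n, and it is checked here against the brute-force weights rather than taken on trust.

```python
from fractions import Fraction

def trend_weights(s, n):
    """Coefficients on U(t+j), j = 0..s+n-1, in Tbar(t+s) - Tbar(t)."""
    w = [Fraction(0)] * (s + n)
    for i in range(n):
        for j in range(i + 1, s + i + 1):
            w[j] += Fraction(1, n)
    return w

def variance_factor(s, n):
    """Var(k) / sigma^2, computed from the brute-force weights."""
    w = trend_weights(s, n)
    return sum(x * x for x in w) / (s * s)

def closed_form(s, n):
    """Same ratio from the assumed closed form for the sum of squares."""
    return (Fraction(s) - Fraction(n * n - 1, 3 * n)) / (s * s)

for n in (1, 2, 3, 4, 12):
    for s in (n, 2 * n, 4 * n):
        v = variance_factor(s, n)
        assert v == closed_form(s, n)
        print(f"n={n:2d}, s={s:2d}: Var(k)/sigma^2 = {v} (about {float(v):.4f})")
```

Running the table for other combinations of s and n is a quick way to explore how the variance responds to the averaging interval and the span of the trend estimate.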
HOME PAGE OF Thayer Watkins