Time Series Forecasting

In this lesson, you’re expected to learn how to analyze time series data in order to extract meaningful statistics and other characteristics of the data. You’ll also find out how to use a model to predict future values based on previously observed values.

The Autocorrelation Function

A very important topic in time series analysis is the concept of autocorrelation. Autocorrelation can be defined as the correlation between the elements of a series and the elements of the same series separated by a lag, or time step.

To understand this concept, let’s think about the evolution of temperature over two years. We can expect each value of the series to be positively correlated with the temperature registered one year earlier. We would also expect temperature to show small variations from one day to the next.

For example, if yesterday we registered 33ºC, it is unlikely that tomorrow the temperature will drop to 5ºC. Thus temperature is also correlated with the values observed in the previous days.

On the other hand, the temperature will be negatively correlated with the temperature registered 6 months ago: in Spain, temperatures reach their minimum values in January, while the highest records are observed in July.

Now that we’ve understood the concept of autocorrelation, let’s define some important terms.

Lag k (or time interval): the value of the variable registered at time t-k, that is, k time steps prior to the observation at time t.

Autocorrelation of lag k: the correlation of the variable with its own values registered k time instants before. Since it is a measure of correlation, it ranges from -1 to +1: -1 implies a perfect negative correlation, 0 an absence of correlation, and +1 a perfect positive correlation.

Thus, the autocorrelation function is the series of autocorrelation values computed for every lag, from zero up to the maximum lag considered.
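
To make this concrete, here is a minimal sketch in Python (assuming numpy and pandas are installed) that measures the lag-k autocorrelation of a synthetic daily temperature series; the data is simulated for illustration, not real measurements.

```python
import numpy as np
import pandas as pd

# Two years of synthetic daily temperatures: a yearly sinusoidal
# cycle around 18ºC plus random noise.
rng = np.random.default_rng(0)
days = pd.date_range("2020-01-01", periods=730, freq="D")
temp = 18 + 15 * np.sin(2 * np.pi * np.arange(730) / 365) + rng.normal(0, 2, 730)
series = pd.Series(temp, index=days)

# Autocorrelation at a few lags: one day, ~6 months, ~1 year.
for lag in (1, 182, 365):
    print(f"lag {lag:>3}: autocorrelation = {series.autocorr(lag=lag):+.2f}")
# Expected pattern: strongly positive at lag 1, negative near 182,
# positive again near 365, matching the temperature discussion above.
```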

Usually the visual representation of this function is used to analyze the dependence of the values of a time series on its past records. *
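
For reference, the formula in question is the sample autocorrelation: for observations $y_1, \dots, y_n$ with mean $\bar{y}$, the autocorrelation at lag $k$ is commonly written as

$$r_k = \frac{\sum_{t=k+1}^{n} (y_t - \bar{y})(y_{t-k} - \bar{y})}{\sum_{t=1}^{n} (y_t - \bar{y})^2}$$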

* You’re not required to know this formula though!
The Partial Autocorrelation Function

In time series analysis, the partial autocorrelation function gives the partial correlation of a time series with its own lagged values, controlling for the values of the time series at all shorter lags.

It contrasts with the autocorrelation function, which does not control for the effect of the previous lags.
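
As a quick illustration, the following sketch computes both functions with statsmodels (assuming it is installed, and reusing the `series` built in the earlier snippet).

```python
from statsmodels.tsa.stattools import acf, pacf

acf_vals = acf(series, nlags=24)    # autocorrelations at lags 0..24
pacf_vals = pacf(series, nlags=24)  # same lags, shorter lags controlled for

for lag in (1, 2, 3):
    print(f"lag {lag}: ACF={acf_vals[lag]:+.2f}  PACF={pacf_vals[lag]:+.2f}")
# The ACF at lag 2 mixes the direct effect of lag 2 with the effect that
# flows through lag 1; the PACF at lag 2 removes the lag-1 contribution.
```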

Look at the following figure. It represents the autocorrelation function for the temperature observed at Heathrow airport. Do you think this figure is real? Does it match the plot you would expect? Why?
[Figure: autocorrelation function of monthly temperatures at Heathrow airport. X axis: lag (in months); Y axis: ACF.]

Enlarged version: http://bit.ly/2mrDrZe
Source: http://www.statsref.com

Yes, the figure can perfectly represent the autocorrelation function for the temperatures of a city or airport.

As discussed before, we would expect to find a positive correlation in the short term that decreases as the lag increases, reaching its most negative value around a lag of six months. After that point, it should increase again, reaching a new maximum around 12 months. This is exactly what we observe in the figure.

[Optional] What Is Time Series Forecasting?
Decomposition Method
Using the decomposition method, we think of a time series as a series consisting of three components:
– a seasonal component,
– a trend-cycle component (containing both trend and cycle), and
– a noise component (containing anything else in the time series).

The idea is that we can extract an existing pattern from the series, and that existing pattern has two components: the trend and the seasonal component.

There are two types of decomposition models:

1) Additive Model

This model is appropriate when the magnitude of the seasonal component doesn’t change with the level of the time series.

2) Multiplicative Model

This model is appropriate when the magnitude of the seasonal component is proportional to the level of the time series.

The following table shows when each model is appropriate. For example, for series with a trend where the amplitude of the seasonal oscillation is proportional to the level of the series, we use a multiplicative model.

However, if the amplitude of the variation of the seasonal component is approximately constant and independent of the level of the series, we use an additive model.

Enlarged version: http://bit.ly/2nlXwPZ
Additive Models

A classical additive decomposition proceeds as follows:

– Compute the trend-cycle component, typically with a moving average. This gives $\hat{T}_t$.
– Estimate the seasonal component. It is assumed to be constant from one cycle to the next, and is obtained by stringing together all the seasonal indices for each year of data. This gives $\hat{S}_t$.
– The remainder component is the irregular component and is calculated by subtracting the estimated seasonal and trend-cycle components: $\hat{R}_t = y_t - \hat{T}_t - \hat{S}_t$

Multiplicative Models

A classical multiplicative decomposition is very similar to the additive procedure except the subtractions are replaced by divisions.

– Compute the trend
– Calculate the ratio between each value and the moving average
– Compute the seasonal component where each element is assumed to be constant during the cycle

The remainder component is the irregular component: $\hat{R}_t = y_t / (\hat{T}_t \hat{S}_t)$
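
As a sketch of how this looks in practice, statsmodels provides a classical decomposition routine; the example below reuses the synthetic `series` from the first snippet and picks the additive model.

```python
from statsmodels.tsa.seasonal import seasonal_decompose

# period=365 matches the yearly cycle of the synthetic daily series.
result = seasonal_decompose(series, model="additive", period=365)

trend = result.trend        # T_t, estimated with a centred moving average
seasonal = result.seasonal  # S_t, the repeating seasonal indices
remainder = result.resid    # R_t = y_t - T_t - S_t

# For a series whose seasonal swing grows with its level, switch to
# model="multiplicative", in which case R_t = y_t / (T_t * S_t).
```
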
Moving Averages

A moving average model is a model in which each point is the average of the most recent observations in a time series. Thus, for a 4-window moving average, the value at time t is the average of the 4 most recent values: $(y_{t-3} + y_{t-2} + y_{t-1} + y_t) / 4$.

In general, a k-lag moving average is the average of the k most recent observations.
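
A minimal sketch with pandas, reusing the `series` from the first snippet:

```python
# 4-point moving average: each value averages the current and
# previous three observations.
smoothed = series.rolling(window=4).mean()

# The first 3 entries are NaN because a 4-point window needs 4 observations.
print(smoothed.head(6))
```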

How do we choose the right value of the lag k?  
To estimate the right k we usually rely on the autocorrelation function: if the autocorrelation function of the process becomes approximately zero, i.e. has a cutoff, at lag k + 1, then k is the value to use.
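
A rough sketch of reading such a cutoff from the ACF, again reusing `series`; the 0.2 threshold below is an arbitrary illustrative choice, not a formal significance bound.

```python
from statsmodels.tsa.stattools import acf

acf_vals = acf(series, nlags=120)

# First lag at which the ACF is "approximately zero".
cutoff = next(
    (lag for lag in range(1, len(acf_vals)) if abs(acf_vals[lag]) < 0.2),
    None,
)

if cutoff is not None:
    print(f"ACF cuts off at lag {cutoff}; suggested window k = {cutoff - 1}")
else:
    print("No cutoff within the computed lags; try a larger nlags.")
```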

Exponential Smoothing

Exponential smoothing is a technique that reduces the noise or irregularities of a time series to provide a clearer view of its underlying patterns. It is a robust approach used to forecast future values.

But how is it different from a simple moving average? Why could it be a more robust solution?

In a simple moving average, all k previous values are weighted equally, while in exponential smoothing the weights of past observations decrease exponentially over time.
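
The recursion behind simple exponential smoothing is short enough to write out directly; below is a minimal sketch, with alpha = 0.3 chosen arbitrarily for illustration.

```python
def simple_exp_smoothing(values, alpha=0.3):
    """Smooth a sequence via s_t = alpha * y_t + (1 - alpha) * s_{t-1}."""
    smoothed = [values[0]]  # initialise with the first observation
    for y in values[1:]:
        # Each new value mixes the latest observation with the running
        # smoothed value, so older observations fade exponentially.
        smoothed.append(alpha * y + (1 - alpha) * smoothed[-1])
    return smoothed

# pandas implements the same recursion:
# series.ewm(alpha=0.3, adjust=False).mean()
```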

When should you not use it? 
When the data has a systematic trend or seasonal component.

[Optional] Time-series methods of forecasting