(Fourth in a series)
In last week’s Forecast Friday post, we discussed moving average forecasting methods, both simple and weighted. When a time series is stationary, that is, exhibits no discernable trend or seasonality and is subject only to the randomness of everyday existence, then moving average methods – or even a simple average of the entire series – are useful for forecasting the next few periods. However, most time series are anything but stationary: retail sales have trend, seasonal, and cyclical elements, while public utilities have trend and seasonal components that impact the usage of electricity and heat. Hence, moving average forecasting approaches may provide less than desirable results. Moreover, the most recent sales figures typically are more indicative of future sales, so there is often a need to have a forecasting system that places greater weight on more recent observations. Enter exponential smoothing.
Unlike moving average models, which use a fixed number of the most recent values in the time series for smoothing and forecasting, exponential smoothing incorporates all values in the time series, placing the heaviest weight on the most recent observation and weights on older observations that diminish exponentially over time. Because each forecast is built from the previous forecast, which in turn reflects all earlier periods in the data set, the exponential smoothing model is recursive. When a time series exhibits no strong or discernable seasonality or trend, the simplest form of exponential smoothing – single exponential smoothing – can be applied. The formula for single exponential smoothing is:
Ŷ_{t+1} = αY_{t} + (1 − α)Ŷ_{t}
In this equation, Ŷ_{t+1} represents the forecast value for period t + 1; Y_{t} is the actual value of the current period, t; Ŷ_{t} is the forecast value for the current period, t; and α is the smoothing constant, or alpha, a number between 0 and 1. Alpha is the weight you assign to the most recent observation in your time series. Essentially, you are basing your forecast for the next period on the actual value for this period, and the value you forecasted for this period, which in turn was based on forecasts for periods before that.
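To see where the exponentially diminishing weights come from, it can help to unroll the recursion. The expansion below is a sketch that follows directly from the formula above, using the same symbols; Ŷ_{1} is whatever initial forecast you choose for the first period.

```latex
\begin{aligned}
\hat{Y}_{t+1} &= \alpha Y_t + (1-\alpha)\hat{Y}_t \\
              &= \alpha Y_t + \alpha(1-\alpha) Y_{t-1} + (1-\alpha)^2 \hat{Y}_{t-1} \\
              &= \alpha \sum_{k=0}^{t-1} (1-\alpha)^k Y_{t-k} + (1-\alpha)^t \hat{Y}_1
\end{aligned}
```

With α = 0.5, for example, the most recent actual value gets a weight of 0.5, the one before it 0.25, the one before that 0.125, and so on.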
Let’s assume you’ve been in business for 10 weeks and want to forecast sales for the 11th week. Sales for those first 10 weeks are:
| Week (t) | Sales (Y_{t}) |
|---|---|
| 1 | 200 |
| 2 | 215 |
| 3 | 210 |
| 4 | 220 |
| 5 | 230 |
| 6 | 220 |
| 7 | 235 |
| 8 | 215 |
| 9 | 220 |
| 10 | 210 |
From the equation above, you know that in order to come up with a forecast for week 11, you need forecasted values for weeks 10, 9, and all the way down to week 1. You also know that week 1 does not have any preceding period, so it cannot be forecast from the equation. And, you need to determine the smoothing constant, or alpha, to use for your forecasts.
Determining the Initial Forecast
The first step in constructing your exponential smoothing model is to generate a forecast value for the first period in your time series. The most common practice is to set the forecasted value of week 1 equal to the actual value, 200, which we will do in our example. Alternatively, if you have sales data from before week 1 that you are not using to construct the model, you might average the last couple of those earlier periods and use that as the initial forecast. How you determine your initial forecast is subjective.
How Big Should Alpha Be?
This too is a judgment call, and finding the appropriate alpha is subject to trial and error. Generally, if your time series is very stable, a small α is appropriate. Visual inspection of your sales on a graph is also useful in trying to pinpoint an alpha to start with. Why is the size of α important? Because the closer α is to 1, the more weight that is assigned to the most recent value in determining your forecast, the more rapidly your forecast adjusts to patterns in your time series and the less smoothing that occurs. Likewise, the closer α is to 0, the more weight that is placed on earlier observations in determining the forecast, the more slowly your forecast adjusts to patterns in the time series, and the more smoothing that occurs. Let’s visually inspect the 10 weeks of sales:
The Exponential Smoothing Process
The sales appear somewhat jagged, oscillating between 200 and 235. Let’s start with an alpha of 0.5. That gives us the following table:
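| Week (t) | Sales (Y_{t}) | Forecast for This Period (Ŷ_{t}) |
|---|---|---|
| 1 | 200 | 200.0 |
| 2 | 215 | 200.0 |
| 3 | 210 | 207.5 |
| 4 | 220 | 208.8 |
| 5 | 230 | 214.4 |
| 6 | 220 | 222.2 |
| 7 | 235 | 221.1 |
| 8 | 215 | 228.0 |
| 9 | 220 | 221.5 |
| 10 | 210 | 220.8 |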
Notice how, even though your forecasts aren’t precise, when your actual value for a particular week is higher than what you forecasted (weeks 2 through 5, for example), your forecasts for each of the subsequent weeks (weeks 3 through 6) adjust upward; when your actual values are lower than your forecast (e.g., weeks 6, 8, 9, and 10), your forecasts for the following weeks adjust downward. Also notice that, as you move to later periods, your earlier forecasts play less and less of a role in your later forecasts, as their weight diminishes exponentially. Just by looking at the table above, you know that the forecast for week 11 will be lower than 220.8, your forecast for week 10:
Ŷ_{11} = 0.5Y_{10} + (1 − 0.5)Ŷ_{10}
= 0.5(210) + 0.5(220.8)
= 105 + 110.4
= 215.4
So, based on our alpha and our past sales, our best guess is that sales in week 11 will be 215.4. Take a look at the graph of actual vs. forecasted sales for weeks 1–10:
Notice that the forecasted sales are smoother than actual, and you can see how the forecasted sales line adjusts to spikes and dips in the actual sales time series.
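If you would rather let a short script do the arithmetic, the process can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `single_exponential_smoothing` and its `initial` parameter are made up for this example, and the first forecast is assumed to equal the first actual value, as in the table above.

```python
def single_exponential_smoothing(actuals, alpha, initial=None):
    """Return forecasts for periods 1 through n+1.

    forecasts[i] is the forecast for period i+1; the last element is the
    forecast for the period after the final observation.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    # Assumption: seed the first forecast with the first actual value
    # unless a different starting forecast is supplied.
    forecasts = [actuals[0] if initial is None else initial]
    for y in actuals:
        # Next forecast = alpha * this period's actual
        #               + (1 - alpha) * this period's forecast
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


sales = [200, 215, 210, 220, 230, 220, 235, 215, 220, 210]
forecasts = single_exponential_smoothing(sales, alpha=0.5)

for week, forecast in enumerate(forecasts, start=1):
    print(f"week {week:2d}: forecast {forecast:.1f}")
```

The final line printed is the week 11 forecast, which should round to the 215.4 worked out by hand above.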
What if we Had Used a Smaller or Larger Alpha?
We’ll demonstrate by using both an alpha of .30 and one of .70. That gives us the following table and graph:
| Week (t) | Sales (Y_{t}) | Forecast α=0.50 | Forecast α=0.30 | Forecast α=0.70 |
|---|---|---|---|---|
| 1 | 200 | 200.0 | 200.0 | 200.0 |
| 2 | 215 | 200.0 | 200.0 | 200.0 |
| 3 | 210 | 207.5 | 204.5 | 210.5 |
| 4 | 220 | 208.8 | 206.2 | 210.2 |
| 5 | 230 | 214.4 | 210.3 | 217.0 |
| 6 | 220 | 222.2 | 216.2 | 226.1 |
| 7 | 235 | 221.1 | 217.3 | 221.8 |
| 8 | 215 | 228.0 | 222.6 | 231.1 |
| 9 | 220 | 221.5 | 220.4 | 219.8 |
| 10 | 210 | 220.8 | 220.2 | 219.9 |
As you can see, the smaller the α, the smoother the curve for forecasted sales; the larger the α, the bumpier the curve, which becomes evident as you move from .30 to .50 to .70. Notice how much faster an α of .70 adjusts to the actual sales than the smaller α’s. The forecasts for week 11 would be 217.2 with α=.30 and 213.0 with α=.70.
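The side-by-side comparison can be reproduced with a few lines of Python. This is only a sketch: the helper name `smooth` is made up for the example, and it assumes, as in the tables above, that the week 1 forecast equals week 1 actual sales.

```python
def smooth(actuals, alpha):
    """Single exponential smoothing; returns forecasts for periods 1..n+1."""
    forecasts = [actuals[0]]  # assumption: first forecast = first actual
    for y in actuals:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


sales = [200, 215, 210, 220, 230, 220, 235, 215, 220, 210]
alphas = (0.3, 0.5, 0.7)
by_alpha = {a: smooth(sales, a) for a in alphas}

print("week  actual  a=0.3  a=0.5  a=0.7")
for i, y in enumerate(sales):
    row = "  ".join(f"{by_alpha[a][i]:5.1f}" for a in alphas)
    print(f"{i + 1:4d}  {y:6d}  {row}")

# The last element of each list is the forecast for week 11.
week11 = {a: round(by_alpha[a][-1], 1) for a in alphas}
print("week 11 forecasts:", week11)
```

Trying other values of α is just a matter of changing the `alphas` tuple.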
Which α is best?
As with moving average models, the Mean Absolute Deviation (MAD) can be used to determine which alpha best fits the data. The MADs for each alpha are computed below:
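Absolute deviations by week:

| Week | α=.30 | α=.50 | α=.70 |
|---|---|---|---|
| 1 | – | – | – |
| 2 | 15.0 | 15.0 | 15.0 |
| 3 | 5.5 | 2.5 | 0.5 |
| 4 | 13.9 | 11.3 | 9.8 |
| 5 | 19.7 | 15.6 | 13.0 |
| 6 | 3.8 | 2.2 | 6.1 |
| 7 | 17.7 | 13.9 | 13.2 |
| 8 | 7.6 | 13.0 | 16.1 |
| 9 | 0.4 | 1.5 | 0.2 |
| 10 | 10.2 | 10.8 | 9.9 |
| MAD | 9.4 | 8.6 | 8.4 |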
Using an alpha of 0.70, we end up with the lowest MAD of the three constants. Keep in mind that judging the dependability of forecasts isn’t always about minimizing MAD. MAD, after all, is an average of deviations. Notice how dramatically the absolute deviations for each of the alphas change from week to week. Forecasts might be more reliable using an alpha that produces a higher MAD, but has less variance among its individual deviations.
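The MAD comparison can also be sketched in Python. The helper names `smooth` and `mean_absolute_deviation` are made up for this example. One assumption is worth flagging: the MAD figures in the table match an average taken over all 10 weeks, with week 1 contributing a deviation of zero because its forecast was set equal to its actual value. Averaging over weeks 2 through 10 only would give larger numbers but the same ranking of the three alphas.

```python
def smooth(actuals, alpha):
    """Single exponential smoothing; returns forecasts for periods 1..n+1."""
    forecasts = [actuals[0]]  # assumption: first forecast = first actual
    for y in actuals:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


def mean_absolute_deviation(actuals, forecasts):
    """Average of |actual - forecast| over the periods that have an actual."""
    # zip stops at the shorter list, so the extra forecast for the next,
    # not yet observed, period is ignored.
    deviations = [abs(y - f) for y, f in zip(actuals, forecasts)]
    return sum(deviations) / len(deviations)


sales = [200, 215, 210, 220, 230, 220, 235, 215, 220, 210]
mads = {a: mean_absolute_deviation(sales, smooth(sales, a))
        for a in (0.3, 0.5, 0.7)}

for alpha, value in mads.items():
    print(f"alpha={alpha}: MAD={value:.1f}")

best_alpha = min(mads, key=mads.get)
print("lowest MAD at alpha =", best_alpha)
```

As the paragraph above cautions, the lowest MAD is a starting point for choosing α, not the final word.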
Limits on Exponential Smoothing
Exponential smoothing is not intended for long-term forecasting. Usually it is used to predict one or two, but rarely more than three periods ahead. Also, if there is a sudden drastic change in the level of sales or values, and the time series continues at that new level, then the algorithm will be slow to catch up with the sudden change. Hence, there will be greater forecasting error. In situations like that, it would be best to ignore the previous periods before the change, and begin the exponential smoothing process with the new level. Finally, this post discussed single exponential smoothing, which is used when there is no noticeable seasonality or trend in the data. When there is a noticeable trend or seasonal pattern in the data, single exponential smoothing will yield significant forecast error. Double exponential smoothing is needed here to adjust for those patterns. We will cover double exponential smoothing in next week’s Forecast Friday post.
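The slow catch-up after a sudden change in level is easy to see with a made-up series. The sketch below uses purely hypothetical numbers (five periods at 100 followed by five periods at 150), an α of 0.3, and a helper named `smooth` invented for this illustration. It then restarts the smoothing at the new level, as suggested above.

```python
def smooth(actuals, alpha):
    """Single exponential smoothing; returns forecasts for periods 1..n+1."""
    forecasts = [actuals[0]]  # assumption: first forecast = first actual
    for y in actuals:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


# Hypothetical data: the level jumps from 100 to 150 in period 6 and stays.
series = [100] * 5 + [150] * 5
alpha = 0.3

forecasts = smooth(series, alpha)
errors = [y - f for y, f in zip(series, forecasts)]
for period, err in enumerate(errors, start=1):
    print(f"period {period:2d}: error {err:6.2f}")

# Restarting at the new level: smooth only the periods after the change.
restarted = smooth(series[5:], alpha)
restart_errors = [y - f for y, f in zip(series[5:], restarted)]
print("largest error after restart:", max(abs(e) for e in restart_errors))
```

In this artificial case the error after the jump shrinks by a factor of (1 − α) each period, so a small α takes many periods to close the gap, while the restarted series has no gap to close.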
Still don’t know why our Forecast Friday posts appear on Thursday? Find out at: http://tinyurl.com/26cm6ma
May 13, 2010 at 11:50 am 
Alex,
Methods like exponential smoothing are used for a couple of reasons:
1)They are easy to compute and fast
2)They are easy to understand as they are simple
The big downfall is:
1)They violate the assumptions on which the modeling process is based, namely that the residuals from the model are normally, independently, and identically distributed (N.I.I.D.). This means that these models don’t care to model the data as they are just trying to fit the data based on minimizing some statistic. For example, if there is an outlier, both the fit and the forecast will be skewed by that value
2)You can’t bring in causal variables like price, promotion, holidays, events, “fixed effects”, etc.
May 13, 2010 at 1:09 pm 
Tom,
Those are excellent points you make. Outliers are always a problem and data is always dirty and requires adjustments in those circumstances.
In my experience, short-term forecasting techniques like exponential smoothing are often used solely for prediction – say to determine how many units of materials to purchase next week. Most times the users don’t necessarily care what is driving the forecast; they just want to know how to plan for the period or two ahead. When they do care about causal factors, it’s because new changes in operations, marketing, or other functions are being implemented, which will affect the future course of the time series and the forecasts they generate. Quite often, causal variables become a factor in longer-term forecasting, or in micro-level predictive models that get incorporated into forecast models.
Each method has its place. Many decision makers combine forecasting methods to arrive at a final composite forecast, incorporating qualitative judgment throughout. Your observations make one thing perfectly clear: models should be used to aid – not replace – the decision process. At the end of the day, people – not models – make the decisions.
May 21, 2010 at 6:13 pm 
How would you forecast more than one period ahead given that the formula for y_t+1 requires both the actual and the forecast value of y_t?
Thanks!
May 22, 2010 at 1:02 pm 
Brian,
You’ve easily figured out that exponential smoothing will produce the same forecast for the second, third, fourth (and onward) periods ahead, as it did the first period ahead. As you can see, exponential smoothing is intended only for short-term forecasting. This is a good reason why you need to have good intuition about your business’ operations and use these forecasting tools to aid – but not replace – your decision making.
There are some approaches you can try, all subjective and none entirely satisfying. You might generate the forecast for period t+1 using exponential smoothing, and then assume that actual will be the same as in period t, or a moving (or weighted moving) average of periods t, t−1, and t−2. Then you would “guesstimate” a forecast for period t+2 using exponential smoothing. And you would repeat the process for forecasting period t+3, t+4, etc.
You might also poll your fellow decision makers on where they think actual sales for periods t+1, t+2, and t+3 will be, based on their experience. Then you average their estimates, use them as actuals and then compare them to exponential smoothing.
The problem in both of these examples is that you’re generating forecasts based on forecasts. Hence, the potential for forecast error is high. But as long as your assumptions are aligned with your business’ practices and day-to-day realities, and as long as you’re using the exponential smoothing technique for short-term forecasting, it can still provide valuable planning insights.
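For anyone who wants to experiment with the first approach, here is a rough Python sketch. It is one possible reading of the idea, not a recommended method: the function name `forecast_ahead` and the `window` parameter are made up for the example, and each unknown actual is replaced by a simple moving average of the last three values in the running history (which includes earlier stand-ins once you go more than two periods out).

```python
def forecast_ahead(actuals, alpha, horizon, window=3):
    """Forecast `horizon` periods ahead, substituting a moving average
    of the last `window` values for each actual that is not yet known."""
    history = list(actuals)
    forecast = history[0]  # assumption: first forecast = first actual
    for y in history:
        forecast = alpha * y + (1 - alpha) * forecast
    results = [forecast]  # ordinary one-period-ahead forecast
    for _ in range(horizon - 1):
        # Stand-in for the unknown actual of the period just forecast.
        stand_in = sum(history[-window:]) / window
        history.append(stand_in)
        forecast = alpha * stand_in + (1 - alpha) * forecast
        results.append(forecast)
    return results


sales = [200, 215, 210, 220, 230, 220, 235, 215, 220, 210]
ahead = forecast_ahead(sales, alpha=0.5, horizon=3)
for offset, value in enumerate(ahead, start=1):
    print(f"week {len(sales) + offset}: {value:.1f}")
```

Every value after the first is a forecast built on an assumed actual, so treat the later periods with the caution described above.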
Readers, would you like to share your thoughts on what Brian can do here? You’re more than welcome to weigh in!
December 3, 2010 at 7:19 am 
There are numerous techniques that can be used to accomplish the goal of forecasting. Exponential smoothing is one of the best techniques; it is used when recent sales numbers show some relevance to future sales and are therefore given more weight.