Business forecasting 101

Forecast Fit

Residuals are the differences between the values your chosen forecasting method produces and the actuals. You can look at residuals over time, and at their distribution, to understand how well the chosen forecast method fits your historical data.

A residual time graph shows the difference between forecasts (red line) and actuals (blue line).

Reading a Residual Time Graph
The residual time graph illustrates the difference between forecast values and your actual historical data over time.

Ideally, a residual graph will look like noise. It should contain no discernible patterns or repeated column groupings. Otherwise, the graph indicates that the forecasting method is not picking up on seasonality, recurring promotions, price changes, or other events. You may also be missing exponential growth or decay.
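One rough way to check a residual series for a repeating pattern is to measure its autocorrelation at a suspected seasonal lag. The sketch below is illustrative only: the data, the 4-period lag, and the `autocorrelation` helper are hypothetical, the residual sign convention (forecast minus actual) is an assumption, and the 2/sqrt(n) threshold is a common rule of thumb rather than a Vanguard setting.

```python
import math

# Hypothetical demand with a repeating 4-period pattern that a flat forecast misses.
actuals = [100, 120, 100, 80, 100, 120, 100, 80, 100, 120, 100, 80]
forecasts = [100.0] * len(actuals)  # a flat "level" forecast

# Assumed convention: residual = forecast minus actual.
residuals = [f - a for f, a in zip(forecasts, actuals)]


def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values)
    if variance == 0:
        return 0.0  # a constant series has no pattern to detect
    covariance = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


seasonal_lag = 4
r = autocorrelation(residuals, seasonal_lag)

# Rule-of-thumb bound: values beyond it suggest the residuals are not pure noise.
threshold = 2 / math.sqrt(len(residuals))
has_pattern = abs(r) > threshold
print(f"lag-{seasonal_lag} autocorrelation: {r:.2f}, pattern suspected: {has_pattern}")
```

A high autocorrelation at the seasonal lag suggests the method is missing a recurring effect, which is the same conclusion you would draw from seeing repeated groupings in the graph.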

Reading a Residual Distribution Graph
The peak of this graph should always be centered at or around zero. If it is not peaked near zero, it indicates that your forecasting method is biased, meaning it is systematically over- or under-forecasting. For example, your data might contain a growing or declining trend but is being modeled with a level method. Alternatively, your data might contain exponential growth but is being modeled with a linear trend method.

If your graph does not have a bell shape (i.e., a hill shape), it most likely means that you have outliers that are skewing your bell curve so that it has a longer tail on one side than the other. In some cases, you might simply have too little data to create a smooth distribution graph; in that case, you should ignore the graph until you have a more representative sample of residuals.
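A numeric counterpart to "is the peak centered near zero?" is to compare the average residual with its standard error. This is a minimal sketch, assuming residuals are forecast minus actual; the data and the two-standard-error cutoff are illustrative choices, not part of the original guidance.

```python
from statistics import mean, stdev

# Hypothetical residuals from a level method applied to data with a growing
# trend: the forecast falls further and further below the actuals.
residuals = [-2, -5, -4, -8, -9, -12, -11, -15]

bias = mean(residuals)    # near zero for an unbiased method
spread = stdev(residuals)  # sample standard deviation of the residuals

# Standard error of the mean: how far the average could drift by chance alone.
std_error = spread / len(residuals) ** 0.5

# Illustrative cutoff: flag bias when the mean is more than 2 standard errors from zero.
is_biased = abs(bias) > 2 * std_error
print(f"mean residual: {bias:.2f}, biased: {is_biased}")
```

With very few residuals the standard error is large and the check is unreliable, which mirrors the advice to ignore the distribution graph until the sample is more representative.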

Forecast fit refers to how successfully your chosen forecast method fits your actuals. A forecast is considered a good fit if it captures all patterns and trends but ignores random noise.

To determine whether your forecast method fits well, check out the following:
– Forecast Fit – Residual Analysis
– Out of Sample Testing / Holdout Sample
– Forecast Error

Vanguard business forecasting applications display the forecast, actuals, residuals, errors, and the holdout sample for a complete view of your forecast fit.

Out-of-sample testing is a popular way to test the likely accuracy of a forecasting method. It is unbiased since it’s stripped of all adjustments and filters.

To conduct this test, set aside the most recent periods of demand history (the holdout sample) as if they did not exist. The number of time periods that you remove should correspond to your normal forecast horizon (e.g. if you forecast three months into the future, the holdout should be at least three months long). You can then apply different forecasting methods to the remaining history and measure each method's error against the holdout sample, typically as the Mean Absolute Deviation/Error (MAD or MAE). This will tend to be higher than the mean error, since taking absolute values keeps over- and under-forecasts from canceling each other out. The method with the lowest MAD will likely be more accurate.
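The holdout procedure can be sketched in a few lines. Everything here is hypothetical: the demand series, the three-period horizon, and the two toy methods (`level_forecast` and `linear_trend_forecast`) stand in for whatever methods you are comparing.

```python
def mad(forecasts, actuals):
    """Mean absolute deviation between forecasts and actuals."""
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)


def level_forecast(history, horizon):
    """Flat forecast at the mean of the history."""
    level = sum(history) / len(history)
    return [level] * horizon


def linear_trend_forecast(history, horizon):
    """Least-squares straight line through the history, extended forward."""
    n = len(history)
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    slope = sum(
        (x - x_mean) * (y - y_mean) for x, y in enumerate(history)
    ) / sum((x - x_mean) ** 2 for x in range(n))
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + h) for h in range(horizon)]


# Hypothetical monthly demand with a steady upward trend.
demand = [100, 104, 109, 113, 118, 121, 127, 130, 136, 139, 145, 148]
horizon = 3  # matches a three-month forecast horizon

# Hide the most recent periods from the methods being tested.
history, holdout = demand[:-horizon], demand[-horizon:]

results = {
    "level": mad(level_forecast(history, horizon), holdout),
    "linear trend": mad(linear_trend_forecast(history, horizon), holdout),
}
best_method = min(results, key=results.get)
print(results, "->", best_method)
```

On this trending series the linear trend method has the lower holdout MAD, but with only three holdout periods the comparison is a hint rather than proof.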

The holdout sample is strictly anecdotal data since it covers only a limited number of periods. So while it’s helpful to test different methods, it does not, by itself, determine which method is best.

A common misconception about forecasting is that straight-line forecasts are poor forecasts because your data is not a straight line. Your data usually has inherent variability, making it jump around. Sometimes that variability exhibits a trend or pattern and sometimes it is just “noise”. In the absence of real patterns, the best forecast might just be a straight line.

Symmetric Mean Absolute Percent Error (SMAPE) is an alternative to Mean Absolute Percent Error (MAPE) for items with zero or near-zero demand. SMAPE self-limits to an error rate of 200%, reducing the influence of these low-volume items. Low-volume items are problematic because they could otherwise have infinitely high error rates that skew the overall error rate.

SMAPE is the forecast minus actuals divided by the sum of forecasts and actuals as expressed in this formula:
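One common way to write SMAPE that is consistent with the 200% ceiling is given here. The symbols are assumptions rather than the original notation: F_t is the forecast for period t, A_t is the actual, and N is the number of periods.

```latex
\mathrm{SMAPE} = \frac{2}{N} \sum_{t=1}^{N} \frac{\lvert F_t - A_t \rvert}{\lvert F_t \rvert + \lvert A_t \rvert}
```

Each term is at most 2, reached when either the forecast or the actual is zero and the other is not, so the average cannot exceed 200%. A period in which both are zero has no defined term and is usually treated as zero error.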

Mean Absolute Percent Error (MAPE) is the most common measure of forecast error. MAPE functions best when there are no extremes in the data (including zeros).

With zeros or near-zeros, MAPE can give a distorted picture of error. The error on a near-zero item can be infinitely high, causing a distortion to the overall error rate when it is averaged in. For forecasts of items that are near or at zero volume, Symmetric Mean Absolute Percent Error (SMAPE) is a better measure.
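The distortion is easy to see with a small example. This is a sketch under stated assumptions: the data are hypothetical, and `smape` uses the common "2 × |F − A| / (|F| + |A|)" form, which is one of several SMAPE definitions in use.

```python
def mape(forecasts, actuals):
    """Mean absolute percent error as a fraction (1.0 == 100%).

    Undefined when any actual is zero (division by zero).
    """
    return sum(abs(f - a) / abs(a) for f, a in zip(forecasts, actuals)) / len(actuals)


def smape(forecasts, actuals):
    """Symmetric MAPE as a fraction, assumed form 2*|F-A| / (|F|+|A|).

    Periods where forecast and actual are both zero count as zero error.
    """
    total = 0.0
    for f, a in zip(forecasts, actuals):
        denominator = abs(f) + abs(a)
        if denominator:
            total += 2 * abs(f - a) / denominator
    return total / len(actuals)


# Two healthy items forecast within 10%, plus one near-zero item.
forecasts = [110, 45, 5]
actuals = [100, 50, 1]

mape_value = mape(forecasts, actuals)    # dominated by the near-zero item
smape_value = smape(forecasts, actuals)  # the same item contributes at most 2.0

# A forecast of 5 against an actual of 0 hits the ceiling of 2.0 (200%).
worst_case = smape([5], [0])
print(f"MAPE {mape_value:.0%}, SMAPE {smape_value:.0%}, ceiling {worst_case:.0%}")
```

The single low-volume item pushes MAPE well above 100% even though the other two forecasts are within 10%, while SMAPE stays bounded.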

MAPE is the average of the absolute percent errors across time periods, where each period's percent error is the forecast minus the actual, divided by the actual:
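In symbols, using assumed notation (F_t for the forecast in period t, A_t for the actual, N for the number of periods), the standard definition is:

```latex
\mathrm{MAPE} = \frac{100\%}{N} \sum_{t=1}^{N} \left\lvert \frac{F_t - A_t}{A_t} \right\rvert
```

The division by A_t is what makes the measure blow up when an actual is at or near zero.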

The Last Absolute Deviation Z-Score is the Vanguard-recommended approach to getting an actionable measure of forecast error. This measure is very useful in checking your historical data for anomalies in the last period.

The Z-Score, unlike other measures, relies on the most recent data rather than an average of all historical forecasts and actuals. It expresses the gap between the last period's forecast and its actual in units of standard deviation. A Z-Score of two or more (that is, two or more standard deviations) indicates a recent significant structural change. Was an item discontinued? Did a large new client recently join? Did a competitor exit, causing sales to ramp up?

The Z-score is obtained using this formula:
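One plausible reading, offered as an assumption rather than as Vanguard's exact definition, is a standard z-score applied to the last period's absolute deviation: subtract the mean of the earlier absolute deviations and divide by their standard deviation. The function name and data below are hypothetical.

```python
from statistics import mean, stdev


def last_absolute_deviation_z_score(forecasts, actuals):
    """Z-score of the final period's absolute deviation.

    Assumption: the last period is compared against the mean and sample
    standard deviation of all earlier absolute deviations.
    """
    abs_devs = [abs(f - a) for f, a in zip(forecasts, actuals)]
    earlier, last = abs_devs[:-1], abs_devs[-1]
    spread = stdev(earlier)
    if spread == 0:
        raise ValueError("earlier deviations are identical; z-score is undefined")
    return (last - mean(earlier)) / spread


# Hypothetical item: stable demand, then a jump in the most recent period.
forecasts = [100, 100, 100, 100, 100, 100, 100]
actuals = [98, 103, 101, 97, 102, 99, 130]

z = last_absolute_deviation_z_score(forecasts, actuals)
needs_review = z >= 2  # threshold of two taken from the text above
print(f"z-score: {z:.1f}, review recommended: {needs_review}")
```

Whether the last period itself should be included in the mean and standard deviation is a design choice; excluding it, as here, keeps a single large miss from inflating its own baseline.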

Both the Mean Absolute Deviation (MAD) and the Mean Absolute Error (MAE) refer to the same method for measuring forecast error.

MAD is most useful when linked to revenue, APS, COGS or some other independent measure of value. MAD can reveal which high-value forecasts are causing higher error rates.

MAD takes the absolute value of forecast errors and averages them over the entirety of the forecast time periods. Taking an absolute value of a number disregards whether the number is negative or positive and, in this case, avoids the positives and negatives canceling each other out.
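A tiny example shows why the absolute value matters. The error values are hypothetical.

```python
# Hypothetical forecast errors (forecast minus actual) over four periods.
errors = [10, -10, 6, -6]

# Signed errors cancel, hiding the fact that every forecast missed.
mean_error = sum(errors) / len(errors)

# Absolute errors keep the magnitude of each miss.
mad = sum(abs(e) for e in errors) / len(errors)

print(mean_error)  # 0.0
print(mad)         # 8.0
```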

MAD is obtained by using the following formula:
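Written out with assumed symbols (F_t for the forecast in period t, A_t for the actual, N for the number of periods), the description above corresponds to:

```latex
\mathrm{MAD} = \frac{1}{N} \sum_{t=1}^{N} \lvert F_t - A_t \rvert
```

The result is in the same units as the data, which is why it pairs naturally with a value measure such as revenue.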

Mean Absolute Deviation Percent (MADP) takes the Mean Absolute Deviation/Error (MAD or MAE) and divides it by the average of the actuals. When choosing between error methods, MADP is preferable to MAPE because it does not skew error rates when actuals are at or approaching zero. MADP is represented by the following formula:
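Following the description of MAD divided by the average of the actuals, and using assumed symbols (F_t for the forecast in period t, A_t for the actual, N for the number of periods), this can be written as:

```latex
\mathrm{MADP} = \frac{\frac{1}{N} \sum_{t=1}^{N} \lvert F_t - A_t \rvert}{\frac{1}{N} \sum_{t=1}^{N} A_t}
             = \frac{\sum_{t=1}^{N} \lvert F_t - A_t \rvert}{\sum_{t=1}^{N} A_t}
```

Because the denominator is the total of the actuals rather than each period's actual, a single zero or near-zero period cannot inflate the result the way it does with MAPE.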