Demand forecasting accounts for many sources of variation, such as seasonality, by tracking patterns in historical data. Outliers are points in historical data that deviate significantly from other data points. They can be unusually large or small. They are often not considered part of the overall demand pattern because they lie far outside the expected data range.
Outliers can be unpredictable, but they can also result from planned events, such as sales promotions, and from unanticipated ones, including competitor promotions and natural disasters. They can also be caused by errors in measurement or data entry.
If businesses ignore outliers, their forecasts could be skewed, which affects profitability. Businesses can detect outliers using visualization tools or statistical methods.
Following are examples of simple tools to visualize data and spot outliers:
- Histogram: A histogram is used to check for outliers in univariate data, data that contains a single variable. It divides the range of values into various groups and then shows the frequency with which the data falls into each group. When these groups are arranged in increasing order, outliers stand out either at the far left, indicating very small values, or at the far right, signaling very large values.
- Scatter plot: A scatter plot is used to check for outliers in bivariate data, data that contains two related variables. It displays a collection of points on X-Y coordinate axes, where the X or horizontal axis represents one variable and the Y or vertical axis represents the other. Outliers appear away from the majority of points on the scatter plot.
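The binning idea behind a histogram can be sketched in a few lines of Python. This is a minimal illustration rather than a plotting tool: the function name `histogram_bins`, the bin width of 50, and the demand figures are all made up for the example, and the sketch assumes integer demand values.

```python
from collections import Counter


def histogram_bins(values, bin_width):
    """Group integer values into fixed-width bins.

    Returns (bin_start, count) pairs in increasing order, including
    empty bins, so that gaps before an isolated value stay visible.
    """
    counts = Counter((v // bin_width) * bin_width for v in values)
    first, last = min(counts), max(counts)
    return [(start, counts[start])
            for start in range(first, last + bin_width, bin_width)]


# Hypothetical weekly demand; the value 410 is the planted outlier.
demand = [120, 135, 148, 410, 152, 139, 160, 145, 131, 157]

BIN_WIDTH = 50
bins = histogram_bins(demand, BIN_WIDTH)
for start, count in bins:
    # One '#' per observation that falls into the bin.
    print(f"{start:>4}-{start + BIN_WIDTH - 1:<4} {'#' * count}")
```

With these figures, the single observation in the rightmost bin is separated from the rest by several empty bins, which is how a very large outlier shows up in a histogram.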
Statistical techniques are more thorough in identifying outliers. Following are some common examples:
- Z-score: The Z-score indicates how many standard deviations a data point lies from the mean (average) of a group of values. A Z-score of 0 indicates that the data point is identical to the mean. Data points whose Z-scores are far from zero are treated as outliers. Usually, data points with a Z-score greater than 3 or less than -3 are considered outliers.
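- Interquartile range (IQR): The IQR is the difference between the third quartile (75th percentile) and the first quartile (25th percentile) of an ordered range of data. It spans the middle 50 percent of the distribution. A common rule flags values that lie more than 1.5 times the IQR below the first quartile or above the third quartile as outliers. Because quartiles are barely affected by extreme values, the IQR is considered more robust than methods based on the mean and standard deviation.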
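Both statistical checks can be sketched with Python's standard `statistics` module. The function names, the sample demand figures, and the use of the population standard deviation are assumptions made for this illustration. The 1.5 multiplier on the IQR is a widely used convention, not a fixed rule, and the quartile values depend on the interpolation method (here, the default of `statistics.quantiles`).

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return indices of values whose Z-score magnitude exceeds the threshold."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)  # population standard deviation (an assumption)
    if stdev == 0:
        return []  # all values identical, so nothing can be an outlier
    return [i for i, v in enumerate(values)
            if abs((v - mean) / stdev) > threshold]


def iqr_outliers(values, multiplier=1.5):
    """Return indices of values outside [Q1 - m*IQR, Q3 + m*IQR]."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - multiplier * iqr, q3 + multiplier * iqr
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical demand history: 23 ordinary periods and one spike at the end.
demand = [100, 102, 98, 101, 99, 103, 97, 100, 102, 98, 101, 99,
          100, 104, 96, 100, 101, 99, 102, 98, 100, 101, 99, 400]

print("Z-score flags:", zscore_outliers(demand))
print("IQR flags:", iqr_outliers(demand))
```

One caveat about the Z-score threshold: in very short series a single point cannot reach a Z-score of 3, because the outlier itself inflates the standard deviation. The sketch therefore uses 24 periods.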
Businesses can deal with outliers in many ways:
- Resurvey or delete outliers: Re-collect the affected data points or delete them if they result from incorrect measurement or data entry, or if they distort the results or violate the assumptions made about the data set.
- Replace outliers: Reduce the outlier’s effect by replacing it with the average of the two values from the periods immediately before and after the period in question. This is useful if outliers are recent occurrences or are in the middle of the data. If the outliers are from the distant past, businesses can discard them along with the data points that precede them. Correcting a severe outlier often improves the forecast. However, correcting mild outliers may make it worse.
- Separate the demand streams: Split the time series and do separate forecasts for the two resultant demand streams when you know at what point the outlier occurred. If the demand streams are difficult to split, you should manage them with a single time series.
- Forecast by modeling outliers: Consider forecasting by recreating those events as exactly as possible if you know the events that caused the outliers.
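- Winsorization: Lessen the outliers’ influence by capping extreme values at a chosen limit, such as the 5th and 95th percentiles, so that they are replaced with values more representative of the data set. A related option is to keep the values but assign them lower weights.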
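Two of the options above, replacing an outlier with the average of its neighbors and winsorization, can be sketched as follows. The function names and demand figures are hypothetical. Winsorization is taken here in its usual statistical sense of capping the `k` smallest and `k` largest values at the nearest remaining value, which is an assumption about the intended variant.

```python
def replace_with_neighbor_average(values, index):
    """Return a copy with values[index] replaced by the mean of its two neighbors."""
    if not 0 < index < len(values) - 1:
        raise ValueError("index must have a neighbor on both sides")
    cleaned = list(values)
    cleaned[index] = (values[index - 1] + values[index + 1]) / 2
    return cleaned


def winsorize(values, k):
    """Return a copy with the k lowest and k highest values capped.

    The k smallest values are raised to the (k+1)-th smallest value and
    the k largest are lowered to the (k+1)-th largest value.
    """
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must be non-negative and less than half the series length")
    ordered = sorted(values)
    low, high = ordered[k], ordered[-k - 1]
    return [min(max(v, low), high) for v in values]


# Hypothetical demand history with a spike in period 3.
demand = [100, 102, 98, 400, 101, 99, 103, 97, 100, 102]

smoothed = replace_with_neighbor_average(demand, 3)
capped = winsorize(demand, k=1)
print(smoothed)
print(capped)
```

Winsorization caps both tails, so in this sketch the smallest value is adjusted as well as the spike. Neighbor averaging changes only the period that was identified as an outlier.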
Extreme outliers in demand data can distort forecasts, and businesses suffer losses when their forecasts go awry. Businesses should use appropriate methods and technologies to identify and handle outliers.