An Outlier In The Confidence Interval
In today's department meeting, I shared a phenomenon that appeared in the demo materials for our India customer. An outlier is shown in the figure as a small red triangle.
The blue line just above the outlier is the upper historical limit, which means that the outlier lies inside the historical interval. The historical interval is similar to what we know as a confidence interval: it is built around the fitted values of the model at a 95% confidence level.
An outlier inside the historical interval conflicts a little with my statistical sense. By common sense, an outlier is a data point outside the range we expect, and we generally use a confidence interval to represent that range. Intuitively, then, an outlier should fall outside the interval. So why, in practice, is it inside?
Numerically, the software flags the outlier because its fitted error (actual value - fitted value) is the largest of all fitted errors and exceeds 3 times the standard deviation of all fitted errors from the initial model, which is fitted to the uncorrected data. For the interval, the software uses the standard deviation of the model's fitted errors, and the interval also widens with the level because of the model's trend (the model is in fact exponential smoothing with linear trend and multiplicative seasonality). Therefore, the value of an outlier can be included in the interval.
The above is a complicated explanation of the cause. The main reason is that the interval widens with the trend of the model. Numerically, I can understand how this happens, because the historical interval and the outliers are produced by different processes. In my case, the fitted error of the outlier is therefore smaller than the distance from the fitted value to the historical limit at that point.
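To make the numerical argument concrete, here is a minimal sketch in Python. It is not the software's actual algorithm. The fitted values and fitted errors are made-up numbers, and the interval formula (a half-width of 1.96 times the standard deviation, scaled by the fitted level relative to the mean level) is only my assumption about how an interval can widen with the level. The detection rule is the one described above: the largest fitted error, exceeding 3 standard deviations.

```python
from statistics import pstdev

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution

# Hypothetical fitted values with a strong upward linear trend.
fitted = [20 + 18 * i for i in range(21)]

# Hypothetical fitted errors: small alternating noise plus one large error
# at the last point, where the level is highest.
errors = [5 if i % 2 == 0 else -5 for i in range(21)]
errors[-1] = 23
actual = [f + e for f, e in zip(fitted, errors)]

sigma = pstdev(errors)                  # standard deviation of all fitted errors
mean_level = sum(fitted) / len(fitted)  # reference level for the scaling (assumption)

# Detection rule from the text: the largest fitted error, above 3 sigma.
idx = max(range(len(errors)), key=lambda i: abs(errors[i]))
is_outlier = abs(errors[idx]) > 3 * sigma

# Assumed interval: the half-width grows in proportion to the fitted level.
half_width = Z_95 * sigma * fitted[idx] / mean_level
inside_interval = abs(actual[idx] - fitted[idx]) <= half_width

# For comparison: a constant-width interval that ignores the level.
constant_half_width = Z_95 * sigma
inside_constant = abs(errors[idx]) <= constant_half_width

print(f"sigma = {sigma:.2f}, 3 * sigma = {3 * sigma:.2f}")
print(f"error at point {idx} = {errors[idx]}, flagged as outlier: {is_outlier}")
print(f"level-scaled half-width = {half_width:.2f}, inside: {inside_interval}")
print(f"constant half-width = {constant_half_width:.2f}, inside: {inside_constant}")
```

In this toy data the same point is flagged by the 3-sigma rule and still falls inside the level-scaled interval, while a constant-width interval would exclude it. The two rules measure the same error against different yardsticks.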
But how can we explain it logically? Suppose I construct a 95% confidence interval saying that my daughter's exam score will be between 72 and 88. After the exam, she gets 80. The score is inside the interval, yet the score is an outlier. Hmm... that is far from what I would expect, though not beyond imagination.