Discussions on Normality

What is normal and how normal does a distribution need to be for a control chart to be effective? The answer, surprisingly, may be "not very".

In this article, I will present two avenues of looking at this issue: first, reports of work on the effect of skewness and kurtosis on the control chart parameters, and second, the tσ limits in relation to Shewhart's early work.

Wheeler (2004) refers to Burr's work with skewed distributions, in which Burr found that the theoretical values of d2 and d3 for these distributions did not differ significantly from those of a normal distribution (skewness = 0, kurtosis = 3). As summarized by Wheeler, Burr indicated that highly skewed and long-tailed distributions could therefore be analyzed with a control chart using the "normal" values of d2 and d3. Wheeler reproduces several of Burr's distributions as well as graphs of the d2 and d3 values for the skewed distributions. The graphs indicate that the differences are not significant.

Ryan (2000) also discusses Burr's work and states, regarding Burr's tabulation of constants, "These constants, however, simply facilitate the construction of 3-sigma limits in the presence of nonnormality. The resultant limits are not probability limits and the probability of a point falling outside the limits will, in general, be unknown."

However, the general sense here is that even highly skewed distributions can be analyzed effectively with the standard control chart calculations.

When Shewhart (1931, Chapter XIV) presented the control charts, he used the Tchebycheff inequality in his discussion of identifying out-of-control processes. The Tchebycheff inequality states that for any distribution (or, more generally, any set of numbers), at least 1 - 1/t² of the values must fall within t standard deviations of the mean, that is, within the tσ limits. Shewhart also discussed the tighter, but more restrictive, Camp-Meidell inequality, but based his work on the Tchebycheff inequality. Other authors have also discussed these inequalities as foundations of control chart work.
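Because the Tchebycheff inequality holds for any set of numbers, it can be checked directly on data. The following is a minimal sketch, not anything taken from Shewhart: the data values are hypothetical and deliberately skewed, the function names are illustrative, and the standard deviation is the population form (dividing by n), which is the form the inequality assumes for a finite set of numbers.

```python
from statistics import fmean, pstdev


def fraction_within(values, t):
    """Fraction of values lying within t population standard deviations of the mean."""
    mu = fmean(values)
    sigma = pstdev(values, mu)  # population standard deviation (divides by n)
    return sum(abs(x - mu) <= t * sigma for x in values) / len(values)


def tchebycheff_lower_bound(t):
    """Minimum fraction guaranteed to lie within the t-sigma limits: 1 - 1/t^2."""
    return 1.0 - 1.0 / t**2


# Hypothetical, strongly right-skewed data chosen only for illustration.
data = [1, 1, 1, 2, 2, 3, 3, 4, 5, 8, 13, 40]

for t in (1.5, 2.0, 3.0):
    observed = fraction_within(data, t)
    bound = tchebycheff_lower_bound(t)
    print(f"t = {t}: observed fraction {observed:.3f}, guaranteed at least {bound:.3f}")
```

Whatever numbers are substituted for `data`, the observed fraction should never fall below the bound; how far above the bound it sits depends on the shape of the data.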

Grant and Leavenworth (1990, pg. 106-108) discussed the Tchebycheff inequality in their Chapter 3: Why The Control Chart Works: Some Statistical Concepts. They presented a table showing the proportion of cases that will fall outside the tσ limits for several values of t under a "roughly normal" distribution, the Camp-Meidell inequality, and the Tchebycheff inequality. For example, for 3σ limits, ~0.27% of the points will be outside the limits for the roughly normal distribution, while the Tchebycheff inequality states that no more than ~11.1% of the cases will fall outside the limits.

Grant and Leavenworth also discuss the Camp-Meidell inequality in the same section. This inequality states that, for certain distributions, at least 1 - 1/(2.25t²) of the points must fall within the tσ limits. The constraints on these distributions are that the distribution be unimodal, that the mode be the same as the mean, and that the "frequencies must decline continuously on each side of the mode". Grant and Leavenworth state that "(m)any of the distributions that are not normal actually come close enough to meeting these conditions for the Camp-Meidell inequality to be applied with confidence."

For the same 3σ limits as above, this inequality tells us that no more than ~4.9% of the cases will fall outside the limits.
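The three figures quoted above for 3σ limits can be recomputed from the formulas in the text. The sketch below is illustrative and does not reproduce Grant and Leavenworth's table: it uses the exact normal tail area as a stand-in for their "roughly normal" column, the function names are my own, and the values of t are arbitrary choices at or above 2.

```python
import math


def normal_outside(t):
    """Two-sided tail area of an exactly normal distribution beyond the t-sigma limits."""
    return math.erfc(t / math.sqrt(2))


def camp_meidell_outside(t):
    """Camp-Meidell upper bound on the fraction outside the t-sigma limits: 1/(2.25 t^2)."""
    return 1.0 / (2.25 * t**2)


def tchebycheff_outside(t):
    """Tchebycheff upper bound on the fraction outside the t-sigma limits: 1/t^2."""
    return 1.0 / t**2


print(f"{'t':>4} {'normal':>10} {'Camp-Meidell':>14} {'Tchebycheff':>13}")
for t in (2.0, 2.5, 3.0, 3.5, 4.0):
    print(
        f"{t:>4} "
        f"{100 * normal_outside(t):>9.2f}% "
        f"{100 * camp_meidell_outside(t):>13.2f}% "
        f"{100 * tchebycheff_outside(t):>12.2f}%"
    )
```

The normal column is an actual probability, while the other two columns are upper bounds, so the row for t = 3 puts ~0.27% beside ceilings of ~4.9% and ~11.1%.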

Florac and Carleton (1999) also discussed both inequalities and stated that "When we couple these empirical observations with the information provided by Tchebycheff's inequality and the Camp-Meidell inequality, it is safe to say that 3-sigma limits will never cause an excessive rate of false alarms, even when the underlying distribution is distinctly nonnormal." Florac and Carleton are referring to Wheeler's Empirical Rule.

Both inequalities give us only bounds on the probability, with no indication of what the actual probabilities are. We know from Tchebycheff that for 3σ limits at least 1 - 1/t² = 1 - 1/9 ≈ 0.889 of the values must fall within 3σ of the mean. The actual proportion could be higher, but we just don't know what that proportion is. Ryan's comment quoted above certainly applies here as well.
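To illustrate the gap between a bound and an actual proportion, the fraction outside the 3σ limits can be computed exactly when the distribution is fully specified. The sketch below is an assumption-laden illustration: the three distributions (standard normal, exponential with rate 1, uniform on [0, 1]) are my own choices and do not come from any of the cited authors, and in each case the mean and standard deviation are treated as known rather than estimated.

```python
import math

TCHEBYCHEFF_BOUND_3SIGMA = 1.0 / 3.0**2  # no more than 1/9 outside the 3-sigma limits


def normal_outside_3sigma():
    """Exact two-sided tail area beyond 3 sigma for a normal distribution."""
    return math.erfc(3.0 / math.sqrt(2))


def exponential_outside_3sigma():
    """Exponential with rate 1: mean 1, sigma 1, so the limits are -2 and 4.

    No probability lies below 0, so only the upper tail P(X > 4) = exp(-4) counts.
    """
    return math.exp(-4.0)


def uniform_outside_3sigma():
    """Uniform on [0, 1]: mean 0.5, sigma = 1/sqrt(12)."""
    sigma = 1.0 / math.sqrt(12)
    lower, upper = 0.5 - 3.0 * sigma, 0.5 + 3.0 * sigma
    # Length of the part of [0, 1] covered by the limits equals the probability inside.
    inside = min(upper, 1.0) - max(lower, 0.0)
    return 1.0 - inside


for name, fraction in (
    ("normal", normal_outside_3sigma()),
    ("exponential (rate 1)", exponential_outside_3sigma()),
    ("uniform on [0, 1]", uniform_outside_3sigma()),
):
    print(f"{name:<22} {100 * fraction:6.2f}% outside, "
          f"bound {100 * TCHEBYCHEFF_BOUND_3SIGMA:.1f}%")
```

All three proportions sit well below the Tchebycheff ceiling, but they differ from one another, which is the point of the paragraph above: the bound alone does not say which of these situations applies.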

The function of control charts is to indicate the possible presence of assignable cause variation. With a truly normal distribution, the probability that a point falls outside the control limits due to chance causes is much less than for a distribution where only the Tchebycheff or Camp-Meidell inequalities apply. We can assign a probability for a normal distribution if we know μ and σ. However, if we estimate these parameters, our confidence in this probability suffers.

Control charts can still be effective even for nonnormal distributions; Shewhart showed this. The knowledge that the probability of a Type I error (deciding that the universe has changed when in actuality it hasn't) may be greater because of a less than perfect distribution may temper the level of your efforts to find assignable causes. For 3σ limits, no more than ~11.1% of your points will show as out-of-control because of chance causes. Without knowing the actual distribution, you don't know the probability that you may be chasing a chance cause of variation. But since we never have a perfect distribution, and because we estimate the parameters of the distribution, we never really know what that probability is.

References:

Florac, W.A. & Carleton, A.D. (1999). Measuring the Software Process: Statistical Process Control for Software Process Improvement. Boston: Addison-Wesley.
Grant, E.L. & Leavenworth, R.S. (1999). Statistical Quality Control (7th ed.). Boston: McGraw-Hill.
Ryan, T.P. (2000). Statistical Methods for Quality Improvement (2nd ed.). New York: John Wiley & Sons, Inc.
Shewhart, W.A. (1931). Economic Control of Quality of Manufactured Product. New York: D. Van Nostrand Company, Inc.
Wheeler, D.J. (2004). Advanced Topics in Statistical Process Control: The Power of Shewhart's Charts (2nd ed.). Knoxville, TN: SPC Press.