
Estimating Standard Deviation

Q: I need an estimate of the standard deviation of the weights of a production lot of large and bulky parts, but I don't have time or facilities to measure a bunch of parts. Is there any "quick and easy" way to estimate the standard deviation?

A: You can use some properties of the normal distribution to estimate the standard deviation of a population quickly. The approximation works reasonably well as long as the underlying distribution is approximately normal, as is often the case with process data.

The normal distribution is the familiar, mound-shaped curve from the Statware logo. The normal distribution is frequently found in process data, where the majority of measurements cluster around a central location, the mean, and the number of measurements observed decreases as the distance from the mean increases. A normal distribution curve is shown below (from Montgomery, Introduction to Process Quality Control).

As shown in the figure, 68.26% of the observations of a normal population will be found within 1 standard deviation of the mean. 95.46% of the observations will be found within 2 standard deviations, while 99.73% will be found within 3 standard deviations. Thus, almost 100% of the observations will fall within a span of six standard deviations, three below the mean and three above the mean. (This is why process capability indices are calculated by dividing the specification width by six standard deviations.)
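The coverage percentages can be checked numerically. The following is a minimal sketch, assuming only the Python standard library; the function name coverage_within is chosen here for illustration and does not come from the original article.

```python
import math


def coverage_within(k):
    """Fraction of a normal population lying within k standard
    deviations of the mean, computed from the error function."""
    return math.erf(k / math.sqrt(2))


# Print the coverage for 1, 2, and 3 standard deviations.
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {coverage_within(k):.2%}")
```

The computed values agree with the percentages quoted from the figure to within about a hundredth of a percentage point; the small differences come from rounding in older printed tables.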

To estimate the standard deviation of a population, first determine the largest observation that could reasonably be expected. This should be a real measurement, not a reading that might arise from measurement error. Some processes define a value called the upper reasonable limit, which can be used here. Next, determine the smallest observation that could reasonably be expected, or use the lower reasonable limit.

The estimate of the standard deviation can now be calculated using:

    estimated standard deviation = (largest expected observation - smallest expected observation) / 6

While this estimate is not as reliable as one calculated from measurements of a large number of parts, it is often useful as a preliminary estimate. The range is often divided by 4 instead of 6, especially when the largest and smallest possible observations are not well known. Dividing by 4 treats the stated range as covering only two standard deviations on each side of the mean (roughly 95% of the population), which allows for larger or smaller observations that are unknown. Because the divisor is smaller, the resulting estimate is larger, and therefore more conservative.
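The range-based estimate, with either divisor, can be sketched in a few lines. This is an illustration under stated assumptions rather than a definitive implementation: the function name estimate_sd_from_range and the example weights (53.0 and 47.0) are hypothetical and are not taken from any real process.

```python
def estimate_sd_from_range(largest, smallest, divisor=6.0):
    """Rough standard deviation estimate: expected range / divisor.

    Use divisor=6 when the largest and smallest expected observations
    are well known, or divisor=4 for a larger, more conservative
    estimate when they are uncertain.
    """
    if largest < smallest:
        raise ValueError("largest must be at least as large as smallest")
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    return (largest - smallest) / divisor


# Hypothetical part weights: largest expected 53.0, smallest expected 47.0.
sd_six = estimate_sd_from_range(53.0, 47.0)              # range / 6
sd_four = estimate_sd_from_range(53.0, 47.0, divisor=4)  # range / 4
print(sd_six, sd_four)
```

For these hypothetical limits the range is 6.0, so dividing by 6 gives 1.0 and dividing by 4 gives 1.5, which shows how the smaller divisor yields the larger estimate.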