Authors: Robert & Marilyn Hart
Q: How do I use a normal probability
plot to estimate process capability regardless
of the shape of the distribution?
A: A probability plot graphs the relative
cumulative frequency of the data using a plotting
convention and a special normal probability
graph scale. This graph can then be used to
estimate the distribution of the parent population
from which the data came.
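The points on such a plot can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the plotting convention (i - 0.5)/n is an assumption, since the text does not say which convention is used, and the function name and sample values are hypothetical.

```python
from statistics import NormalDist

def probability_plot_points(data):
    """Return (x, cumulative probability, normal score) for each sorted value."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, x in enumerate(xs, start=1):
        p = (i - 0.5) / n          # ASSUMED plotting convention
        z = std_normal.inv_cdf(p)  # position on the normal probability scale
        points.append((x, p, z))
    return points

# Made-up measurements for illustration only.
sample = [4.1, 2.3, 7.8, 3.0, 5.5]
for x, p, z in probability_plot_points(sample):
    print(f"x = {x:4.1f}   cumulative p = {p:.3f}   normal score = {z:+.3f}")
```

Plotting x against the normal score z on ordinary linear axes gives the same picture as plotting x against p on normal probability paper.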
The construction of a probability plot was
presented in the November 1999 Quality Practice
Tips. It was noted that the normal probability
plot provides a quick check on normality. However,
the normal probability plot is very useful for
any set of data whether the data are normally
distributed or not.
Example:
The histogram of the lengths of 200 pins is
found in Figure 1. Note that the data are not
normally distributed. This is verified by the
probability plot in Figure 2.
Recall that:
1. The X-axis is a linear scale of the measurement.
2. The Y-axis is the cumulative probability of a piece being as large as, or smaller than, the X-axis value.
3. If the data are normally distributed, the probability plot will yield a straight line.
Figure 1. Histogram of the Lengths of 200 Pins

Figure 2. Probability Plot of the Lengths of
200 Pins
The "best-fit curve" is drawn to
fit the data points in Figure 3. Also in Figure
3, two horizontal lines are drawn from the Y-axis
scale at cumulative probabilities of 0.00135
and 0.99865 (0.135% and 99.865%) to where they
intersect the curve. At the points of intersection,
vertical lines are dropped to the X-axis and
the corresponding X values are read. These X
values estimate the measurements that exceed
0.135% and 99.865%, respectively, of the parent
population. In Figure
3, these values are approximately 1 and 24.
These values yield the best estimate for the
extent of the "middle" 99.73% (i.e.,
99.865% - 0.135%) of the data.
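The graphical reading can be approximated numerically. The sketch below is an assumption-laden stand-in for the hand-drawn curve: it joins the plotted points with straight segments on the normal probability scale and extends the outermost segments when the requested probability lies beyond the data, which is cruder than a smooth best-fit curve. The function name, the (i - 0.5)/n plotting convention, and the sample values are all hypothetical.

```python
from bisect import bisect_left
from statistics import NormalDist

def x_at_probability(data, p):
    """Estimate the X value at cumulative probability p (needs >= 2 points)."""
    xs = sorted(data)
    n = len(xs)
    nd = NormalDist()
    # Normal scores of the plotted points (ASSUMED convention: (i - 0.5)/n).
    zs = [nd.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    z = nd.inv_cdf(p)
    # Choose the segment containing z; clamp to the first or last segment
    # so that tail probabilities are linearly extrapolated.
    j = min(max(bisect_left(zs, z), 1), n - 1)
    z0, z1 = zs[j - 1], zs[j]
    x0, x1 = xs[j - 1], xs[j]
    return x0 + (x1 - x0) * (z - z0) / (z1 - z0)

# Made-up, right-skewed lengths for illustration only (not the pin data).
lengths = [1.5, 2.0, 2.2, 2.9, 3.1, 3.8, 4.4, 5.0, 6.3, 7.9, 10.2, 14.8]
lower = x_at_probability(lengths, 0.00135)
upper = x_at_probability(lengths, 0.99865)
print(f"estimated middle 99.73% of the data: {lower:.2f} to {upper:.2f}")
```

With small samples the two tail estimates rest almost entirely on extrapolation, so they should be treated as rough, just as a curve extended by eye would be.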
Figure 3. Probability Plot Displaying the "Process
Capability Limits"
Recall that for a normal distribution, the
"process capability" is often defined
as the mean plus or minus three standard deviations,
i.e., the "middle" 99.73% of the data.
The mean plus or minus three standard deviations
will not bound the "middle" 99.73% of this data
set because these data are not normally
distributed.
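A short calculation shows how the three-standard-deviation rule can mislead on skewed data. The data below are made up for illustration and are not the 200 pin lengths; the function name is hypothetical.

```python
from statistics import mean, stdev

def three_sigma_limits(data):
    """Return (mean - 3s, mean + 3s), the normal-theory capability limits."""
    m = mean(data)
    s = stdev(data)  # sample standard deviation
    return m - 3 * s, m + 3 * s

# Made-up, right-skewed lengths for illustration only (not the authors' data).
skewed = [1, 1, 2, 2, 2, 3, 3, 4, 5, 7, 10, 16, 24]
low, high = three_sigma_limits(skewed)
print(f"mean - 3s = {low:.1f}, mean + 3s = {high:.1f}")
print(f"smallest observed value = {min(skewed)}")
```

For this sample the lower limit comes out negative, which is impossible for a length, while limits read from the probability plot stay within the range the data support.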
This procedure can actually be used to find
the X-value corresponding to any cumulative
percentage of the distribution. For instance,
suppose one wants to find the length which will
be exceeded by only 5% of the pins. In Figure
4 it can be seen that a length of approximately 8.3
exceeds 95% of the data, so only about 5% of the pins are longer.
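When the raw data are available, the same kind of lookup can be cross-checked with an empirical percentile. This sketch swaps in Python's `statistics.quantiles` (default "exclusive" method) for the graphical reading, so its answer may differ slightly from a value read off a hand-drawn curve; the function name and the data are hypothetical.

```python
from statistics import quantiles

def upper_tail_cutoff(data, percent_exceeding=5):
    """Return the value exceeded by roughly `percent_exceeding` % of the data.

    `percent_exceeding` must be a whole number from 1 to 99.
    """
    cuts = quantiles(data, n=100)          # 99 cut points: 1st..99th percentile
    return cuts[100 - percent_exceeding - 1]

# Made-up data for illustration only: the integers 1 through 100.
values = list(range(1, 101))
cutoff = upper_tail_cutoff(values, percent_exceeding=5)
share_above = sum(v > cutoff for v in values) / len(values)
print(f"cutoff = {cutoff}, share of data above it = {share_above:.0%}")
```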
Figure 4. Probability Plot Displaying Where
95% of the Data Fall Below