Cpk vs. Ppk
Q: What is the difference between the
Ppk values reported by Statit and the Cpk values?
Why are they both reported? Which one is correct?
A: For Pp and Ppk calculations, the
standard deviation used in the denominator is
based on all of the data evaluated as one sample,
without regard to any subgrouping. This is sometimes
referred to as the overall standard deviation,
σ_{total}.
For Cp and Cpk calculations, the standard deviation
is estimated from subgroups of the data, using subgroup
ranges, subgroup standard deviations, or moving ranges.
This "within-subgroup" process variation
can be considerably smaller than the overall
standard deviation estimate, especially when
there are long-term trends in the data.
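The distinction between the two estimates can be sketched in Python. This is a minimal illustration, not Statit's implementation: the function names are hypothetical, the within-subgroup estimate uses a pooled standard deviation (one of the options named above; range-based estimates are another), and no bias-correction constants are applied.

```python
import math
from statistics import mean, stdev

def overall_sigma(subgroups):
    """Sample standard deviation of all observations treated as one sample."""
    all_values = [x for group in subgroups for x in group]
    return stdev(all_values)

def within_sigma(subgroups):
    """Pooled within-subgroup standard deviation.

    Assumes every subgroup has at least two observations.
    """
    sum_sq = 0.0
    dof = 0
    for group in subgroups:
        m = mean(group)
        sum_sq += sum((x - m) ** 2 for x in group)
        dof += len(group) - 1
    return math.sqrt(sum_sq / dof)

def capability(subgroups, lsl, usl):
    """Return Cp, Cpk (within sigma) and Pp, Ppk (overall sigma)."""
    center = mean(x for group in subgroups for x in group)

    def indices(sigma):
        spread_index = (usl - lsl) / (6.0 * sigma)
        nearest_index = min(usl - center, center - lsl) / (3.0 * sigma)
        return spread_index, nearest_index

    cp, cpk = indices(within_sigma(subgroups))
    pp, ppk = indices(overall_sigma(subgroups))
    return {"Cp": cp, "Cpk": cpk, "Pp": pp, "Ppk": ppk}

# Hypothetical data: three tight subgroups whose means step upward.
result = capability([[1, 2, 3], [4, 5, 6], [7, 8, 9]], lsl=-1, usl=11)
print(result)
```

The only difference between the two pairs of indices is which sigma goes in the denominator; the specification limits and the process mean are the same in both.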
When there are slow fluctuations
or trends in the data, the estimate of the process
variability based on the subgroups can be smaller
than the estimate using all of the process data
as one sample. This often occurs when the differences
among observations within the subgroup are small,
but the range of the entire dataset is significantly
larger. Since the within-subgroup variation
measures tend to ignore the range of the entire
dataset, they can underestimate the overall process
variation.
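A small simulation can make this effect concrete. The sketch below uses an invented process (20 subgroups of 5 parts, a steady upward drift of 0.2 units per subgroup, and short-term noise with a standard deviation of 0.5); none of these numbers come from the example discussed in this article, and the pooled standard deviation stands in for whichever within-subgroup estimator is actually in use.

```python
import math
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical drifting process: the subgroup mean rises by 0.2 per subgroup,
# while the short-term (within-subgroup) noise stays at sigma = 0.5.
subgroups = [
    [10.0 + 0.2 * i + random.gauss(0.0, 0.5) for _ in range(5)]
    for i in range(20)
]

# Overall estimate: every observation treated as one sample.
all_values = [x for group in subgroups for x in group]
sigma_overall = stdev(all_values)

# Within-subgroup estimate: pooled standard deviation around each subgroup mean.
sum_sq = 0.0
dof = 0
for group in subgroups:
    m = mean(group)
    sum_sq += sum((x - m) ** 2 for x in group)
    dof += len(group) - 1
sigma_within = math.sqrt(sum_sq / dof)

print(f"within-subgroup sigma: {sigma_within:.3f}")
print(f"overall sigma:         {sigma_overall:.3f}")
```

With this setup the within-subgroup estimate should land near the short-term noise level of 0.5, while the overall estimate is noticeably larger because it also absorbs the drift.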
What matters when characterizing the capability
of a process to stay within the specification
limits over time is the variability of all of the
observations taken together. Underestimating
the variability inflates the process capability
estimate represented by Cp or Cpk, so
these estimates may not be truly representative
of the process.
The following box plot shows data where the
within-subgroup variability is small, but there
are both upward and downward trends in the data.
A substantial number of observations fall
beyond the specification limits.
When the Process Capability procedure in Statit
is run on these data, there are significant
differences between the estimates of Pp and
Cp (and, analogously, Ppk and Cpk).
For example, the calculated Cpk, which uses
the within-subgroup estimate of the process
variability, is 1.077. This would typically be
considered to represent a marginally capable
process, one with only about 0.12% of
the output beyond the specifications (12 out
of 10,000 parts). However, the calculated Ppk
value, which uses the variability estimate of
the total sample, is only 0.672. This would
indicate a process that is not capable and probably
produces a high percentage of output beyond
the specifications. Note that the actual amount
of production beyond the specifications is 5%,
or roughly 1 out of every 20 parts.
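The link between an index value and an expected percentage out of specification can be sketched as follows. This is a rough illustration that assumes normally distributed output and a process centered between the specification limits, so that each limit sits 3 × index standard deviations from the mean; the function names are hypothetical and this is not necessarily how Statit computes its estimate.

```python
import math

def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def expected_fraction_out(index):
    """Expected fraction beyond both limits for a centered, normal process."""
    z = 3.0 * index  # distance from the mean to each limit, in sigmas
    return 2.0 * normal_upper_tail(z)

for name, value in [("Cpk", 1.077), ("Ppk", 0.672)]:
    print(f"{name} = {value}: {100.0 * expected_fraction_out(value):.2f}% expected out of spec")
```

Under these assumptions, an index of 1.077 corresponds to roughly 0.12% out of specification, while 0.672 corresponds to a little over 4%, which is of the same order as the 5% actually observed.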
Which of these values is correct? Both are
calculated correctly according to their equations,
but here the Ppk value is probably more
representative of the ability of the process
to produce parts within the specifications.
Note: One way to determine whether the variability
estimate is truly representative of the
process is to compare the Estimated and Actual
values for the Product beyond Specifications
in the Statit output. If the estimated percentage
of samples beyond specification is significantly
different from the actual percentage reported,
then more investigation and analysis of the
data is warranted to achieve the best
Process Capability estimates possible based
on the data.
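The same comparison can be made outside Statit. The sketch below counts the observed fraction beyond the limits and compares it with the fraction predicted by a normal model for a given mean and standard deviation. The helper names and the sample values are invented for illustration, and the normal model is an assumption.

```python
import math
from statistics import mean, stdev

def actual_fraction_out(values, lsl, usl):
    """Observed fraction of values below LSL or above USL."""
    return sum(1 for x in values if x < lsl or x > usl) / len(values)

def estimated_fraction_out(center, sigma, lsl, usl):
    """Fraction beyond the limits predicted by a normal model."""
    def cdf(x):
        return 0.5 * math.erfc(-(x - center) / (sigma * math.sqrt(2.0)))
    return cdf(lsl) + (1.0 - cdf(usl))

# Hypothetical measurements and specification limits.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
lsl, usl = 2.5, 9.5

observed = actual_fraction_out(values, lsl, usl)
predicted = estimated_fraction_out(mean(values), stdev(values), lsl, usl)
print(f"actual: {100 * observed:.1f}%  estimated: {100 * predicted:.1f}%")
```

Passing the within-subgroup sigma instead of the overall sigma to `estimated_fraction_out` shows how far the prediction moves when the variability is underestimated; a large gap between the estimated and actual percentages is the warning sign described in the note.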
