Control Charts for Skewed Data
Robert F. Hart, Ph.D.
Marilyn K. Hart, Ph.D.
When encountering the common problem of highly
skewed variables data^{1}, it has been shown that
the first step is to transform the data
to attain a "near-normal" distribution^{2}.
In the example of 50 consecutive surgery
times from the more recent reference, the inverse
transformation (i.e., the reciprocal) of the
surgery times satisfied the need for near-normality.
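As a rough illustration of the inverse transformation, the sketch below takes reciprocals and compares a moment-based skewness before and after. The surgery times are hypothetical (they are not the 50 values from the reference), and skewness is only one assumed indicator of near-normality; the references may use a different criterion.

```python
from statistics import fmean

def sample_skewness(values):
    """Moment-based skewness m3 / m2**1.5 (zero for a symmetric sample)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def reciprocal(values):
    """Inverse transformation: minutes per procedure -> procedures per minute."""
    return [1.0 / v for v in values]

# Hypothetical, right-skewed surgery times in minutes (illustration only).
times = [45, 50, 52, 55, 58, 60, 62, 65, 70, 75, 80, 90, 105, 130, 180, 260]
rates = reciprocal(times)

skew_before = sample_skewness(times)
skew_after = sample_skewness(rates)
print(f"skewness of times:   {skew_before:.2f}")
print(f"skewness of 1/times: {skew_after:.2f}")
```

A skewness much closer to zero after the transformation suggests the reciprocal scale is more nearly symmetric, though a probability plot or a formal test would give a fuller check.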
The next step is to make control charts to
determine whether the process is stable over
time. If the I chart on the original surgery
times in minutes per procedure (Figure 1) were
made in spite of the fact that the data were
severely skewed, two points would be found above
the upper control limit. The cause(s) of these
outages (points beyond the control limits) cannot
be determined from this chart alone.
They may be due to the skewed distribution,
to a process that is not stable over time,
or to both.
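For readers who want to reproduce an I chart, the sketch below uses the common textbook calculation: the center line is the mean, and the limits are the mean plus or minus 2.66 times the average moving range. That this matches the exact procedure behind Figure 1 is an assumption, and the data are hypothetical.

```python
from statistics import fmean

# 2.66 = 3 / 1.128, the usual I chart constant for moving ranges of two points.
MR_FACTOR = 2.66

def i_chart_limits(values):
    """Return (LCL, center line, UCL) for an individuals (I) chart."""
    center = fmean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    half_width = MR_FACTOR * fmean(moving_ranges)
    return center - half_width, center, center + half_width

def points_outside(values, lcl, ucl):
    """Return 1-based positions of points beyond either control limit."""
    return [i for i, v in enumerate(values, start=1) if v < lcl or v > ucl]

# Hypothetical surgery times in minutes, in time order (not the article's data).
times = [62, 55, 70, 58, 65, 90, 52, 75, 60, 180, 68, 57]
lcl, center, ucl = i_chart_limits(times)
print(f"LCL={lcl:.1f}  CL={center:.1f}  UCL={ucl:.1f}")
print("points outside limits:", points_outside(times, lcl, ucl))
```

The same two functions can be applied to the reciprocals of the data to build the chart on the transformed scale.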
Figure 1. I Chart on the Original
Data (Surgery Time in Minutes per Procedure)
The cause of the outages in Figure 1 is made
clear by the I chart on the transformed data,
Figure 2, where the plotted values are now in
procedures per minute (rather than minutes per
procedure) owing to the inverse transformation.
Since there is no evidence of instability over
time in Figure 2, one may infer that the process
is stable over time and that the outages in
Figure 1 were solely due to the skewed distribution.
Be aware that because of the inverse transformation,
the three high points in Figure 1 are the three
low points in Figure 2.
Figure 2. I Chart on Transformed
Data (Procedures per Minute)
If this is to be only a retrospective study
to determine stability, the task might be considered
complete. However, if one wants to look at the
process in the original units, minutes per procedure,
the I chart in Figure 3 is required. Here the
plotted points are the same as in Figure 1,
but the control limits are found by "back-transforming"
the limits in Figure 2. For example, the UPPER
control limit for Figure 3 (280.11 minutes
per procedure) is 1/(0.00357 procedures per
minute), where 0.00357 procedures per minute
is the LOWER control limit of Figure 2.
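The back-transformation arithmetic can be checked with a few lines of code. This is a minimal sketch: the function name is hypothetical, and the only number used is the 0.00357 procedures per minute quoted in the text.

```python
def back_transform_limit(limit_rate):
    """Convert a control limit in procedures per minute to minutes per procedure."""
    if limit_rate <= 0:
        # A zero or negative limit on the rate scale has no usable reciprocal.
        raise ValueError("reciprocal is not meaningful for a limit <= 0")
    return 1.0 / limit_rate

# Value quoted in the text: the LOWER control limit of Figure 2.
lcl_rate = 0.00357  # procedures per minute
ucl_minutes = back_transform_limit(lcl_rate)
print(f"UCL in original units: {ucl_minutes:.2f} minutes per procedure")
# prints "UCL in original units: 280.11 minutes per procedure"
```

Because taking the reciprocal reverses order, the lower limit on the transformed scale becomes the upper limit in the original units, and vice versa.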
Figure 3 may be easier to explain to others
than Figure 2. Figure 3 would also be preferred
for ongoing process control, since the plotted
points are as measured, with no need
to take the reciprocal of each before plotting
it.
Figure 3. I Chart on the Original
Data (Minutes per Procedure) with the Back-Transformed
Control Limits
Note that even the Xbar chart relies on an underlying
normality assumption in the calculation of its
control limits. The chart is fairly robust, so
that if the data are close to being normally
distributed, the control chart may still work
satisfactorily. However, if the data are severely
skewed, the control chart may give false indications
of lack of control.
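The effect of severe skewness on an Xbar chart can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not a result from the references: it uses subgroups of size 5, the standard constant A2 = 0.577 for that size, a lognormal distribution as a stand-in for "severely skewed" data, and an arbitrary random seed. Both simulated processes are stable, so every point outside the limits is a false indication.

```python
import random
from statistics import fmean

SUBGROUP_SIZE = 5
A2 = 0.577  # standard Xbar-R chart constant for subgroups of size 5

def xbar_false_alarm_rate(draw, n_subgroups=2000):
    """Fraction of subgroup means outside Xbar limits for a stable process."""
    subgroups = [[draw() for _ in range(SUBGROUP_SIZE)]
                 for _ in range(n_subgroups)]
    means = [fmean(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = fmean(means)
    half_width = A2 * fmean(ranges)
    lcl, ucl = grand_mean - half_width, grand_mean + half_width
    outside = sum(1 for m in means if m < lcl or m > ucl)
    return outside / n_subgroups

rng = random.Random(12345)  # arbitrary seed so the run is repeatable
normal_rate = xbar_false_alarm_rate(lambda: rng.gauss(100.0, 10.0))
skewed_rate = xbar_false_alarm_rate(lambda: rng.lognormvariate(0.0, 1.0))
print(f"false-alarm rate, normal data: {normal_rate:.4f}")
print(f"false-alarm rate, skewed data: {skewed_rate:.4f}")
```

With normal data the rate should sit near the nominal 0.27 percent, while the skewed data should produce noticeably more points beyond the limits even though nothing about the simulated process has changed.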
References
For more information, contact Drs. Robert and
Marilyn Hart at robthart@aol.com or (541) 412-0425.
