Control Charts for Skewed Time-Based Data
Q: I would like to monitor the duration
of time that it takes to process a customer
service call. However, a histogram of my data
reveals that there are often some calls which
take much longer than most of the other calls.
Will this hurt my control charts? If so, what
else can I do?
A: Time-related measurements are often
problematic for control charting: they have a
minimum at zero and tend to cluster around a
typical time, yet they include some valid data
points with very long service times. Because this
type of data is skewed and non-normal, conventional
control charts probably should not be used on
the raw data. In particular, an individuals
(X) chart should not be used when there are
serious departures from normality.
[Figure: Histogram of Original (Skewed) Data]
[Figure: Probability Plot of Original (Skewed) Data]
However, skewed data can be transformed to make
it more nearly normal. You can check for normality
by drawing a normal probability plot and looking
for a nearly straight line. A common transformation
for highly right-skewed data is the fourth root
of the original data:

    y = x^(1/4)
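As a rough illustration outside Statit, the fourth-root transformation and the straight-line check can be sketched in Python. The call durations below are hypothetical, and the helper name `ppcc` is an assumption made for this example. It computes the correlation between the sorted data and the matching normal quantiles, which is one numeric way to summarize how straight a normal probability plot is (a value near 1 suggests a nearly straight line). It does not replace looking at the plot itself.

```python
from statistics import NormalDist, mean


def ppcc(values):
    """Correlation between sorted data and normal quantiles.

    A value near 1 corresponds to a nearly straight normal
    probability plot. (Hypothetical helper for this sketch.)
    """
    ordered = sorted(values)
    n = len(ordered)
    # Normal quantiles at plotting positions (rank - 0.5) / n, rank = 1..n.
    quantiles = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    mq, mv = mean(quantiles), mean(ordered)
    sqv = sum((q - mq) * (v - mv) for q, v in zip(quantiles, ordered))
    sqq = sum((q - mq) ** 2 for q in quantiles)
    svv = sum((v - mv) ** 2 for v in ordered)
    return sqv / (sqq * svv) ** 0.5


# Hypothetical call durations in minutes (illustrative, not real data).
durations = [2.1, 3.4, 2.8, 4.0, 3.1, 2.5, 5.2, 3.7, 2.9, 18.5,
             3.3, 4.4, 2.7, 6.1, 3.0, 41.0, 3.9, 2.6, 7.8, 3.5]

# Fourth root of each value, i.e. x**0.25.
fourth_root = [x ** 0.25 for x in durations]

print(f"straightness, raw data:    {ppcc(durations):.3f}")
print(f"straightness, fourth root: {ppcc(fourth_root):.3f}")
```

If the second number is clearly closer to 1 than the first, the transformed data follow a straight line more closely than the raw data do.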
In Statit, you can use the Compute/Transform
command to create a new variable equal to the
fourth root of the original variable, that is,
the original value raised to the one-fourth
power: x**0.25. After transforming, you
should check for normality again. If the data
appear normal, it is then appropriate to make
an individuals chart or X-bar chart
on the transformed data. This should give you
some information as to whether service times
are "in control." You can "back-transform"
the values of the center line and control
limits into real times again (by raising them
to the fourth power), if desired, to aid in
the practical interpretation of these values.
[Figure: Histogram of Transformed Data]
[Figure: Probability Plot of Transformed Data]
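The charting and back-transformation steps described above can be sketched as follows. This is a minimal illustration in Python, not Statit's own procedure. It assumes the usual individuals-chart convention of estimating variation from the average moving range with the constant 2.66 (3 / 1.128), and it uses hypothetical call durations listed in time order. The function name `individuals_limits` is made up for this example.

```python
from statistics import mean


def individuals_limits(values):
    """Return (LCL, center line, UCL) for an individuals (X) chart.

    Variation is estimated from the average moving range of
    consecutive points, scaled by 2.66 (= 3 / 1.128).
    """
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    half_width = 2.66 * mean(moving_ranges)
    return center - half_width, center, center + half_width


# Hypothetical call durations in minutes, in time order.
durations = [2.1, 3.4, 2.8, 4.0, 3.1, 2.5, 5.2, 3.7, 2.9, 18.5,
             3.3, 4.4, 2.7, 6.1, 3.0, 41.0, 3.9, 2.6, 7.8, 3.5]

# Chart the fourth-root values rather than the raw durations.
transformed = [x ** 0.25 for x in durations]
lcl_t, center_t, ucl_t = individuals_limits(transformed)

# Back-transform to minutes by raising to the fourth power.
# A negative lower limit has no meaning for a fourth root, so it is
# treated as zero before back-transforming.
lcl = max(lcl_t, 0.0) ** 4
center = center_t ** 4
ucl = ucl_t ** 4

# Points outside the limits on the transformed scale, reported in minutes.
out_of_control = [x for x, t in zip(durations, transformed)
                  if t < lcl_t or t > ucl_t]

print(f"transformed scale: LCL={lcl_t:.3f} CL={center_t:.3f} UCL={ucl_t:.3f}")
print(f"minutes:           LCL={lcl:.2f} CL={center:.2f} UCL={ucl:.2f}")
print("signals (minutes):", out_of_control)
```

Note that the back-transformed center line is generally not the arithmetic mean of the original durations, and the back-transformed limits are not symmetric around it, so they are best read as reference values in real time units.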
If the fourth-root transformation doesn’t
adequately correct the non-normality, you may
need to try other transformations, such as the
square root, cube root, fifth root, or log,
depending on the particular data. Finding the
"best" transformation for a data set often
takes a bit of experimentation.
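One way to organize that experimentation is to apply each candidate transformation and compare a simple summary of asymmetry. The sketch below uses sample skewness (near 0 for symmetric data) as a quick screen. That choice is an assumption of this example, not something the procedure above prescribes, and the data are hypothetical. A normal probability plot of the chosen variable is still the check to rely on.

```python
import math
from statistics import mean


def skewness(values):
    """Sample skewness (third standardized moment); near 0 if symmetric."""
    m = mean(values)
    m2 = mean([(v - m) ** 2 for v in values])
    m3 = mean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5


# Hypothetical call durations in minutes (all positive, as log requires).
durations = [2.1, 3.4, 2.8, 4.0, 3.1, 2.5, 5.2, 3.7, 2.9, 18.5,
             3.3, 4.4, 2.7, 6.1, 3.0, 41.0, 3.9, 2.6, 7.8, 3.5]

# Candidate transformations, roughly from weakest to strongest.
candidates = {
    "none": lambda x: x,
    "square root": lambda x: x ** 0.5,
    "cube root": lambda x: x ** (1 / 3),
    "fourth root": lambda x: x ** 0.25,
    "fifth root": lambda x: x ** 0.2,
    "log": math.log,
}

# Skewness of the data after each transformation.
results = {name: skewness([f(x) for x in durations])
           for name, f in candidates.items()}

# The candidate whose skewness is closest to zero.
best = min(results, key=lambda name: abs(results[name]))

for name, value in results.items():
    print(f"{name:12s} skewness = {value:6.3f}")
print("closest to symmetric:", best)
```

Skewness close to zero does not by itself establish normality, so the result is only a shortlist of transformations worth plotting.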
