Robert F. Hart, Ph.D.
Marilyn K. Hart, Ph.D.
Before one can make a valid control chart for
variables data (a.k.a. measurement data or continuous
data), it is necessary for the data distribution
to be "near-normal" [Testing for "Near-Normality...",
September, 2004]. However,
it is common in health care for the data distribution
to be unsymmetrical with a long tail of high
values. Such a distribution is said to be "skewed
to the right." Instead of lying along a straight
line, the points on the probability plot tend
to fall along a smooth curve which is convex upward.
Consider, for example, surgery times. There
is some minimum time that the surgery will take,
and it obviously cannot take less than 0 minutes.
However, there is no real upper bound. Table
1 gives the surgery times for 50 consecutive
procedures. Such data are typically severely
skewed to the right as is the case here. This
is reflected in the probability plot, Figure
1.
Table 1. Surgery Times
| Surgery Number | Surgery Time (in minutes) | Surgery Number | Surgery Time (in minutes) |
|---|---|---|---|
| 1 | 75 | 26 | 130 |
| 2 | 65 | 27 | 115 |
| 3 | 165 | 28 | 75 |
| 4 | 60 | 29 | 95 |
| 5 | 75 | 30 | 80 |
| 6 | 75 | 31 | 125 |
| 7 | 85 | 32 | 105 |
| 8 | 80 | 33 | 70 |
| 9 | 85 | 34 | 72 |
| 10 | 95 | 35 | 95 |
| 11 | 65 | 36 | 90 |
| 12 | 65 | 37 | 120 |
| 13 | 85 | 38 | 75 |
| 14 | 68 | 39 | 90 |
| 15 | 190 | 40 | 85 |
| 16 | 120 | 41 | 90 |
| 17 | 105 | 42 | 80 |
| 18 | 115 | 43 | 115 |
| 19 | 58 | 44 | 80 |
| 20 | 70 | 45 | 65 |
| 21 | 80 | 46 | 70 |
| 22 | 75 | 47 | 485 |
| 23 | 65 | 48 | 75 |
| 24 | 75 | 49 | 90 |
| 25 | 90 | 50 | 120 |
Figure 1. Probability Plot of Surgery Times in Minutes
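
The coordinates behind a plot such as Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used to draw the figure: the plotting position (i - 0.5)/n is one common convention among several, and the names `times` and `probability_plot_points` are hypothetical.

```python
from statistics import NormalDist, mean, median

# Surgery times in minutes for surgeries 1 to 50, copied from Table 1.
times = [
    75, 65, 165, 60, 75, 75, 85, 80, 85, 95,
    65, 65, 85, 68, 190, 120, 105, 115, 58, 70,
    80, 75, 65, 75, 90, 130, 115, 75, 95, 80,
    125, 105, 70, 72, 95, 90, 120, 75, 90, 85,
    90, 80, 115, 80, 65, 70, 485, 75, 90, 120,
]


def probability_plot_points(data):
    """Return (normal score, ordered value) pairs for a normal probability plot.

    Assumption: the i-th ordered value is paired with the standard normal
    quantile at (i - 0.5) / n. Other plotting positions are also in use.
    """
    n = len(data)
    ordered = sorted(data)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    scores = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(scores, ordered))


points = probability_plot_points(times)

# A mean well above the median is a quick numeric symptom of right skew.
print(f"mean = {mean(times):.2f} minutes, median = {median(times):.1f} minutes")

# The last few points show how far the long tail departs from the rest.
for score, value in points[-3:]:
    print(f"normal score {score:+.2f} -> {value} minutes")
```

Plotting the ordered times against the normal scores gives the probability plot. For near-normal data the points would lie close to a straight line; here the few very long surgeries pull the upper end of the plot away from it.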
A convenient way to handle this problem of
data skewed to the right is to "transform"
the data into a data set which has a near-normal
distribution [Hart and Hart, 2002]. This may
often be accomplished by choosing a particular
power of the data to make it meet the assumption
(where the zero power is taken to be the natural
logarithm). This family of transformations may
be expressed as transf(X) = X^p, where
X and transf(X) are the original and transformed
data respectively and p is the power to which
X is raised. The more severely X is skewed to
the right, the lower the value of p required
to obtain a near-normal transformation. A suitable
transformation is found by trial and error.
Experience has shown that one of three trial
transformations will often be satisfactory;
p = 0.25 (the fourth root), the natural logarithm,
or p = -1 (the reciprocal).
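
The trial-and-error search can be sketched as follows. The article judges near-normality by inspecting the probability plot; the correlation between the ordered data and their normal scores is used here only as a rough numeric stand-in for "how straight is the plot," which is an assumption of this example and not the authors' stated criterion. The names `power_transform` and `plot_straightness` are hypothetical, as is the (i - 0.5)/n plotting position.

```python
import math
from statistics import NormalDist

# Surgery times in minutes for surgeries 1 to 50, copied from Table 1.
times = [
    75, 65, 165, 60, 75, 75, 85, 80, 85, 95,
    65, 65, 85, 68, 190, 120, 105, 115, 58, 70,
    80, 75, 65, 75, 90, 130, 115, 75, 95, 80,
    125, 105, 70, 72, 95, 90, 120, 75, 90, 85,
    90, 80, 115, 80, 65, 70, 485, 75, 90, 120,
]


def power_transform(x, p):
    """transf(X) = X^p, with the zero power taken to be the natural logarithm."""
    return math.log(x) if p == 0 else x ** p


def plot_straightness(data):
    """Correlation between ordered data and normal scores.

    Values near 1 suggest the probability plot is close to a straight line.
    This summary is an assumption of the sketch, not the article's method.
    """
    n = len(data)
    ordered = sorted(data)
    scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mean_s = sum(scores) / n
    mean_y = sum(ordered) / n
    sxy = sum((s - mean_s) * (y - mean_y) for s, y in zip(scores, ordered))
    sxx = sum((s - mean_s) ** 2 for s in scores)
    syy = sum((y - mean_y) ** 2 for y in ordered)
    return sxy / math.sqrt(sxx * syy)


# The untransformed data (p = 1) plus the three trial transformations.
trial_powers = {
    "untransformed": 1,
    "fourth root": 0.25,
    "natural log": 0,
    "reciprocal": -1,
}

straightness = {
    name: plot_straightness([power_transform(x, p) for x in times])
    for name, p in trial_powers.items()
}

for name, r in straightness.items():
    print(f"{name:14s} p = {trial_powers[name]:5.2f}   r = {r:.3f}")
```

A numeric summary of this kind can help rank the trial transformations, but the probability plot itself should still be examined before a transformation is accepted.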
The reader is encouraged to make a number of
trial transformations to become familiar with
the method. The transformation chosen here is
transf(X) = X^(-1), that is, X to the -1 power,
which is the reciprocal of X. This transformation
provides the needed near-normal distribution
(as shown in Figure 2) and makes physical sense
in that the transformed variable is in procedures
per minute, whereas the original data were expressed
in minutes per procedure.

Figure 2. Probability Plot of Transformed Surgery
Times, Procedures per Minute
A future paper will consider the control charts
of the original data and the transformed data,
and will explore the "back-transformation"
of the results to make a valid control chart
in the original units, minutes per procedure.
References
M. Hart and R. Hart, Statistical Process Control
for Health Care, Pacific Grove, CA: Duxbury, 2002.
M. Hart and R. Hart, "Testing for 'Near-Normality':
the Probability Plot," Statit Bulletin, September 2004.