As I travel around to different manufacturing
sites, I often see signs that say something
like, "45 days without a lost time accident".
But what do these signs tell us? From the standpoint
of improving our accident rate, not a lot. There
is no history and no idea if we are improving
or getting worse. But what happens if we can
monitor this process with a Statistical Process
Control Chart? Then we could be on the road
to process improvement.
Events like lost-time accidents and chemical
spills are (hopefully) rare events in the manufacturing
plant. As such, they are difficult to analyze
with the standard Shewhart charts, such as the c,
p, or u chart, as you can see from the following
example of a c chart on chemical spills.

Figure 1
Looking at this chart, we really can't
tell what our process is doing. We are looking
at the number of spills in a one-month time frame.
Obviously, this chart will not be of much use
to us. Even if we increased the time frame to
quarters, we would not have a chart that we
could really use for decisions on our processes.
And we would not have timely information for
detecting quickly that something was going wrong.
If you explore this with other charts, such
as the p, u or even the I chart, you will find
the same situation: no real information that
you could use for decisions.
With this chart we are looking at a somewhat
arbitrary subgroup (month) and we would need
more spills in the month to get a signal. However,
suppose we had spills that were grouped around
the end of the month. We may have had a cluster
of spills that we would like to know about but
because of the arbitrary monthly subgroup, we
might not detect it.
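The effect of the arbitrary monthly subgroup can be shown with a small sketch. The spill dates below are hypothetical, invented only to illustrate a cluster that straddles a month boundary; they are not taken from the spills data discussed in this article.

```python
from collections import Counter
from datetime import date

# Hypothetical spill dates (invented for illustration): three spills in five days.
spills = [date(2023, 1, 30), date(2023, 2, 1), date(2023, 2, 3)]

# Monthly subgrouping: count spills per calendar month.
monthly_counts = Counter((d.year, d.month) for d in spills)

# Days between consecutive spills, which ignores calendar boundaries.
days_between = [(b - a).days for a, b in zip(spills, spills[1:])]

print(dict(monthly_counts))
print(days_between)
```

The monthly view splits the cluster into two ordinary-looking months, while the days-between view shows how close together the spills actually were.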
In the manufacturing environment, we may have
a number of events that happen very rarely,
and we would like to keep it that way. Some
examples of these events might be things like
lost-time accidents, chemical spills, quality
events or perhaps unexpected downtime. Or perhaps
we would like to look at rare, but critical,
defects. In a high yield manufacturing process,
we would not have a lot of defects, so this
may be an area we could talk about as well as
rare events.
These types of data are what can be called rare
events. How do we spot rare event data?
First, the definition of rare events could
be functional. Rare events are basically those
events that are so infrequent that they do not
fit the models of standard SPC charts. The data
are not Poisson, Binomial or Normal. Perhaps
the data are so infrequent that we cannot get
a good feel for the type of distribution; we
would not have enough data to model the distribution.
You could think of rare events as those with
very few occurrences over a relatively long
time period. For chemical spills, it might be
some small number of events in a year. The same
might be true of lost time injuries or unexpected
down time. For high yield manufacturing, we
may have very few of the critical defects such
that we have several batches with no defects,
followed by a batch with one critical defect.
In this case, the time frame would be shorter
than the previous examples but relatively long
in terms of the batches. This would be a situational
determination of the definition of rare events.
By standard SPC, we would need to decide on
the subgroup size. Often this would be based
on some time period over which we are counting
rare events. For the spills we might count the
number of spills in a month again with the caveat
of arbitrary subgrouping. For the batch situation,
we might look at defects per day of production.
But we may also look at defects per batch as
our subgroup.
But let's try to put some decision criteria
to whether we have a rare event situation. Basically,
we could consider those infrequent events where
the assumptions of the standard SPC chart are
not met. Rare events are those that:
We need to keep the subgroup time frame fairly
reasonable, otherwise we would need to wait
too long to make the decision. And if we use
a large denominator for a p or u chart with
a small numerator, we could get into another
issue called overdispersion.
In the standard charts, we don't really
have good options to analyze these data. So
how do we monitor these data if standard SPC
charts do not do a good job of this?
As an example, let's look at the chemical
spills data used by Wheeler. There were eight spills
over four years. A c chart of these data is shown
in Figure 1. As we've discussed, this is
not a helpful chart. We cannot make good decisions
based on this type of information. We don't
know if we are getting more spills or not. And
it would not be different with other standard
SPC charts either.
Why is this? Well, we are looking at count
data and we are looking at a subgroup of one
month. As we can see from this chart, the average
spills per month is less than 1, at 0.151. When
we are looking at count data with an average
count of less than 1, the count charts become
ineffective. At that point, the data is too
discrete, as we can see from this example.
Notice the relation of the single spill to
the average. It is almost 7 times the average
but still does not show as an out of control
point. This illustrates that the c, u, p and
np charts become very insensitive for these
types of data. Statistically, the data do not fit
the c/u chart model of the Poisson distribution.
While not shown here, there are similar issues
with other charts. For example, data for the
p chart should be such that np >= 5. If we
have a p of 0.01, that would mean that we would
need 500 units per subgroup. If we were trying
for 20-25 points on a chart, we would need something
like 12,500 total observations.
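The arithmetic behind these two observations can be checked with a short sketch. It assumes the usual 3-sigma Poisson limits for a c chart and the np >= 5 rule of thumb quoted above; the function names are illustrative only.

```python
import math

def c_chart_limits(c_bar):
    """Return (LCL, UCL) for a c chart using 3-sigma Poisson limits."""
    sigma = math.sqrt(c_bar)           # Poisson: variance equals the mean
    lcl = max(0.0, c_bar - 3 * sigma)  # counts cannot be negative
    ucl = c_bar + 3 * sigma
    return lcl, ucl

def min_subgroup_size(p, min_expected=5):
    """Smallest subgroup size n such that n * p >= min_expected."""
    return math.ceil(round(min_expected / p, 9))

c_bar = 0.151                 # average spills per month, from the text
lcl, ucl = c_chart_limits(c_bar)
ratio = 1 / c_bar             # one spill relative to the average

n = min_subgroup_size(0.01)   # p = 0.01, from the text
total = 25 * n                # 25 subgroups on the chart

print(f"c chart: LCL={lcl:.3f}, UCL={ucl:.3f}, one spill is {ratio:.1f}x the mean")
print(f"p chart: n={n} per subgroup, {total} observations for 25 points")
```

With an average of 0.151, the upper limit lands between 1 and 2, so a month with a single spill can never signal even though it is several times the average.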
However, some alternatives have been proposed.
Rate on an XmR Chart
The first one we will explore is discussed
in Wheeler's Advanced Topics in Statistical
Process Control and uses the spills data.
This method calculates the days between spills
and converts it to a rate of spills per year,
a rate that can then be charted with an XmR
chart.
You first calculate the number of days between
spills. To use this information with usual SPC
charts you would then convert this to a rate.
For these data, a convenient rate might be Spills
per year. So we would divide 1 by the number
of days between to arrive at Spills per day.
Multiply that by 365 to get Spills per year.
This information could be then displayed with
an XmR chart.
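As a rough sketch of the calculation just described, the following converts event dates to days between, then to spills per year, and then computes XmR limits. The dates are hypothetical, invented for illustration; they are not Wheeler's data, and this is not the Statit macro. The constants 2.66 and 3.268 are the standard XmR scaling factors.

```python
from datetime import date

# Hypothetical spill dates (invented for illustration, not Wheeler's data).
spill_dates = [
    date(2020, 2, 10), date(2020, 9, 1), date(2021, 3, 15),
    date(2021, 7, 4), date(2021, 12, 20), date(2022, 3, 1),
]

# Days between consecutive spills (must be > 0 to avoid dividing by zero).
days_between = [(b - a).days for a, b in zip(spill_dates, spill_dates[1:])]

# 1 / days between = spills per day; times 365 = spills per year.
rates = [365 / d for d in days_between]

# Moving ranges: absolute differences between successive rates.
moving_ranges = [abs(b - a) for a, b in zip(rates, rates[1:])]

x_bar = sum(rates) / len(rates)
mr_bar = sum(moving_ranges) / len(moving_ranges)

# Standard XmR limits for the individuals (X) chart and the moving range chart.
x_ucl = x_bar + 2.66 * mr_bar
x_lcl = max(0.0, x_bar - 2.66 * mr_bar)  # a rate cannot be negative
mr_ucl = 3.268 * mr_bar

print(f"X chart:  LCL={x_lcl:.2f}  CL={x_bar:.2f}  UCL={x_ucl:.2f}")
print(f"mR chart: CL={mr_bar:.2f}  UCL={mr_ucl:.2f}")
```

Note that each rate is only known once the next spill occurs, so the chart gains a point per event rather than per calendar period.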
You can see my method of calculation at live.statit.com
under the Rare Event Webinar. Choose the Calculate
and XmR macro. In the bottom of the display
window is a View Macro Source link that will
show you the Statit language to build this chart.
With the XmR chart we would certainly like
to have the rate as low as possible. A lower
rate translates to a larger number of days between
events.
Figure 2
Depending on the Spill rate, you may want to
adjust this chart to get something that is more
meaningful to the viewers by using a different
time frame. For example, is a rate of 1.82 spills
per year meaningful?
Note that it would also be possible to calculate
the batches between and then calculate the rate
as defects / batch rate. We could also calculate
parts between.
This approach uses two charts. Since we are using the I
chart, we often want the companion Moving Range
chart. Also remember that the I chart is most
effective when we are working with a distribution
fairly close to normal. We may be way off base
with that assumption.
g Chart
The other alternative is what is called the
g chart. This chart has been increasingly used
in healthcare, where they need to monitor rare
events like Surgical Site Infections and other
events known as Never Events, those that should
never happen. But it was originally designed
by Dr. James Benneyan for manufacturing applications.
The article in the references at the bottom
of this presentation contains the justification
of the g chart. It has proved very effective
in healthcare data.
The idea of the g chart is that these types
of events are more closely modeled on the geometric
distribution. The geometric distribution is
the memoryless discrete distribution. To give
you an idea of a geometric distribution, suppose
you play a game where you toss a coin until
a head comes up. The number of tosses is X.
If you repeat this game several times recording
X each time, the distribution of this random
variable is the geometric distribution. The
geometric distribution is the discrete version
of the exponential distribution. This distribution
would model the coin flips until or between
heads.
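The coin-toss game can be simulated in a few lines. This is only an illustrative sketch: the seed, the number of games, and the function name are arbitrary choices, and the memoryless check compares two estimated probabilities rather than proving anything.

```python
import random

def tosses_until_head(rng, p_head=0.5):
    """Play one game: toss until a head appears and return the toss count X."""
    tosses = 1
    while rng.random() >= p_head:  # tail: toss again
        tosses += 1
    return tosses

rng = random.Random(42)  # fixed seed so the run is repeatable
games = [tosses_until_head(rng) for _ in range(100_000)]

# For a fair coin the mean number of tosses is 1 / 0.5 = 2.
mean_tosses = sum(games) / len(games)

# Memoryless property: P(X > 3 | X > 1) should be close to P(X > 2).
beyond_1 = sum(1 for x in games if x > 1)
beyond_2 = sum(1 for x in games if x > 2)
beyond_3 = sum(1 for x in games if x > 3)
p_conditional = beyond_3 / beyond_1
p_unconditional = beyond_2 / len(games)

print(f"mean tosses = {mean_tosses:.3f}")
print(f"P(X>3 | X>1) = {p_conditional:.3f}, P(X>2) = {p_unconditional:.3f}")
```

Having already seen one tail tells you nothing about how many more tosses are needed, which is the sense in which the distribution has no memory.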
Often with the data we are talking about (8
spills in 4 years), it may be difficult to ascertain
that we are looking at a geometric distribution,
simply because we do not have enough data. This
is approximately what a histogram of a geometric
distribution would look like.
Figure 3
Obviously it would take a large number of spills
or accidents to show that it is geometric.
The g chart measures time between events or
occurrences between events. The control limits
are based on the same principles as other charts,
but are calculated based on the properties of
the geometric distribution. So we might look
at something like Days Between or Hours Between
or Batches Between. A g chart could also looks
at occurrences between, such as parts produced
between defects. This is an example of the g
chart.
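One common way to compute g chart limits is sketched below. It assumes a geometric count that starts at zero, for which the variance equals mean x (mean + 1), and it uses 3-sigma limits. This is an assumption for illustration; it is not necessarily the calculation Statit performs, and some packages use probability-based limits instead. The days-between values are hypothetical.

```python
import math

def g_chart_limits(counts_between, k=3):
    """Return (LCL, center line, UCL) for a g chart with k-sigma limits.

    Assumes a geometric distribution with support starting at 0,
    so that sigma = sqrt(mean * (mean + 1)).
    """
    g_bar = sum(counts_between) / len(counts_between)
    sigma = math.sqrt(g_bar * (g_bar + 1))
    lcl = max(0.0, g_bar - k * sigma)  # bounded below by zero
    ucl = g_bar + k * sigma
    return lcl, g_bar, ucl

# Hypothetical days between spills (invented for illustration).
days_between = [120, 45, 200, 80, 150, 60, 95]

lcl, center, ucl = g_chart_limits(days_between)
print(f"LCL={lcl:.1f}  CL={center:.1f}  UCL={ucl:.1f}")
```

Because sigma is always slightly larger than the mean under this model, the lower limit is clipped at zero, which is consistent with the lower control limit rarely coming into play.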
Figure 4
Notice that with the g chart, we want the days
between to be higher. So for the g chart, you
see the Good Direction at the top. And that
is an important distinction between the XmR
chart of the first alternative and the g chart
alternative. With the XmR chart, we are looking
at the rate and we want a lower rate for the
Rare Event. We want fewer spills per year. But
with the g chart, we want more days between
spills. Notice in this case that we are not
getting a signal for this process as we did
with the XmR chart. We are seeing a steady decline,
but we need one more decreasing point for a
signal.
Again there are examples of the g chart with
other data on our live.statit.com
site under the Rare Event Webinar.
The g chart is simple to implement. Your data
can be in the form of dates or datetime that
an event took place. Or it can be a column of
days or time between individual events. Or a
column of the counts of events between adverse
events.
The g chart also has many of the options available
to other Statit SPC charts, including phases,
reference lines, Assignable Cause points, etc.
The g chart needs only one chart to analyze the
process. Generally with the g chart, the lower
control limit does not come into play and is
bounded by zero.
Notice on this chart that I have displayed
the number of days since the last spill. This
number is an indication of how the process is
behaving since the last spill. It would be important
to know how we are doing in relation to that
last event but until we get the next spill,
we cannot calculate the number of days between.
However, I have color-coded the subtitle such
that it is red if the days since the last spill
is less than the mean, yellow if it is between
the mean and the Upper Control Limit, and green
if it is above the Upper Control Limit.
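The color-coding rule can be expressed as a small function. This is a sketch only: the function name, the dates, and the mean and Upper Control Limit values are hypothetical, and treating a value exactly equal to the Upper Control Limit as yellow is an assumption.

```python
from datetime import date

def status_color(days_since_last, mean_between, ucl):
    """Map days since the last event to the subtitle color described above."""
    if days_since_last < mean_between:
        return "red"     # below the mean days between
    if days_since_last <= ucl:
        return "yellow"  # between the mean and the Upper Control Limit
    return "green"       # above the Upper Control Limit

# Hypothetical values for illustration.
last_spill = date(2023, 1, 15)
today = date(2023, 6, 1)
mean_between = 107.0
ucl = 430.0

days_since = (today - last_spill).days
print(days_since, status_color(days_since, mean_between, ucl))
```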
XmR on Rate        | g Chart
Double Charts      | Single Chart
Down is Good       | Up is Good
Data Manipulation  | Ease of Use (Simple Data)
So for a g chart, higher is better. We only
need one chart because one parameter describes
the distribution. The data is simple, but the
Lower Control Limit has little use on the g
chart. So to compare the two alternatives, we
see that major differences are in the interpretation
and the simplicity of the data. Lower is better
on XmR and higher is better on g chart. XmR
may require data calculation and manipulation.
G chart is simple to implement with simpler
data. XmR chart may require analyzing two charts.
G chart does not give us much information with
the Lower Control Limit.
Now, suppose we want this information to be
displayed to the organization. With Statit e-QC,
you have the ability to schedule reports on
a frequent basis to give you up-to-date information.
And, you can access these reports from your
intranet. So you could publish a scheduled report
as shown on live.statit.com
and link it to your intranet so that everyone
knows how you are doing currently and how that
compares to the past and whether or not you
are getting better. This, of course, gives a
better indicator than just a simple sign.
Of course, there are other alternatives such
as transformations or Cusum charts. But I find
that these charts are easier to interpret for
most users.
References:
Benneyan, J.C. (2001). Number-Between
g-Type Statistical Quality Control Charts for
Monitoring Adverse Events. Healthcare Management
Science, 4.
Wheeler, D.J. (2004). Advanced Topics
in Statistical Process Control: The Power of Shewhart's
Charts (2nd ed.). Knoxville, TN: SPC Press.