Histograms with Two or More Peaks
Q: When I make a histogram of my data,
it seems to have two distinct peaks or "humps."
What could be causing this?
[Figure: Histogram of Bimodal Data]
A: A histogram with two peaks is called
"bimodal" because two values or data ranges
occur most often in the data. In a process
that is repeated over time, we typically expect
the data to follow the familiar bell-shaped
curve of the normal distribution, so a bimodal
histogram can signal something out of the
ordinary. Histograms can also be multimodal
(more than two peaks), and the following
discussion applies to those shapes, too.
A bimodal histogram shape often reflects the
presence of two different processes being "mixed"
in the displayed data. For example, the data
could contain information from two different
machines, two shifts, weekdays and weekends,
or two offices. Control charts based on mixed
data often have overly wide control limits relative
to the individual processes, because the spread
of the combined data includes the difference
between the processes. These wide limits can
seriously reduce the ability of the chart to
signal shifts and changes in the individual
processes over time.
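
To illustrate how mixing widens the limits, here is a minimal
Python sketch. The measurements are made up, and the limits are
a simplified "mean plus or minus three sample standard deviations"
rule chosen for illustration; it is not necessarily the calculation
used by any particular control chart or software product.

```python
import statistics

# Hypothetical measurements from two machines with different means.
machine_a = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.9, 10.1]
machine_b = [13.9, 14.1, 14.0, 14.2, 13.8, 14.0, 14.1, 13.9]
mixed = machine_a + machine_b


def three_sigma_limits(values):
    """Return (lower, upper) as mean -/+ 3 sample standard deviations."""
    center = statistics.mean(values)
    spread = statistics.stdev(values)
    return center - 3 * spread, center + 3 * spread


def limit_width(values):
    """Return the distance between the lower and upper limits."""
    lower, upper = three_sigma_limits(values)
    return upper - lower


for name, values in [("A", machine_a), ("B", machine_b), ("mixed", mixed)]:
    lower, upper = three_sigma_limits(values)
    print(f"{name:>5}: limits {lower:.2f} to {upper:.2f}, "
          f"width {upper - lower:.2f}")
```

With these made-up numbers, a reading of 11.0 from machine A falls
outside machine A's own limits but well inside the limits computed
from the mixed data, so the mixed chart would not flag it.
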
The best solution for mixed data is to separate
the data based on the individual processes,
and then make separate histograms and/or control
charts for each process. Process management
efforts can then be directed to each process
individually to (1) determine the cause or causes
of the differences between the processes, (2)
monitor and control each process, and (3) improve
one or both of the processes.
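
The separation step can be sketched in a few lines of Python.
This assumes each observation was recorded together with a label
identifying its process; the labels "A" and "B", the data, and the
helper names are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical records: (process label, measurement).
records = [
    ("A", 9.8), ("B", 14.1), ("A", 10.1), ("B", 13.9), ("A", 10.0),
    ("B", 14.0), ("A", 9.9), ("B", 14.2), ("A", 10.2), ("B", 13.8),
]


def split_by_process(records):
    """Group measurements by their process label."""
    groups = defaultdict(list)
    for label, value in records:
        groups[label].append(value)
    return dict(groups)


def bin_counts(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter(
        math.floor(v / bin_width) * bin_width for v in values
    )
    return dict(sorted(counts.items()))


groups = split_by_process(records)
for label, values in groups.items():
    print(f"Process {label}")
    # Print a simple text histogram, one row per bin.
    for edge, count in bin_counts(values, 1.0).items():
        print(f"  {edge:5.1f} | {'*' * count}")
```

Each group can then be charted and monitored on its own. If no
process label was recorded with the data, the separation has to
start with finding out which observations came from which process.
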
Alternatively, if the process can change over
time, a bimodal histogram could indicate that
the process mean shifted during the period
covered by the data. For example, this could
occur if the observations spanned a significant
process or "phase" change, such as a machine
calibration, tool change, new service method,
or new supplier.
The best approach to analyzing data that spans
different phases is to make a separate histogram
for each phase. Control charts should also
segregate the data, so that the control limits
for each phase are based only on the data from
that phase. Differences between phases in the
position of the center line and/or the width
of the control limits indicate whether there
have been any phase-related changes in process
performance. Statware's family of process
management products can base control limits
on the data from the individual phases.
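
As a rough illustration of phase-based limits, the Python sketch
below computes individuals-chart limits separately for two phases,
using the common rule of center line plus or minus 2.66 times the
average moving range. The data and phase names are hypothetical,
and this is not a description of how any Statware product performs
the calculation.

```python
import statistics

# Hypothetical time-ordered observations before and after a change.
phases = {
    "before tool change": [10.0, 10.2, 9.9, 10.1, 9.8, 10.0],
    "after tool change": [12.1, 11.9, 12.0, 12.2, 11.8, 12.0],
}


def individuals_limits(values):
    """Return (lower, center, upper) for an individuals chart.

    Uses center +/- 2.66 * (average moving range), where a moving
    range is the absolute difference between consecutive values.
    """
    center = statistics.mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    half_width = 2.66 * statistics.mean(moving_ranges)
    return center - half_width, center, center + half_width


limits = {name: individuals_limits(v) for name, v in phases.items()}
for name, (lower, center, upper) in limits.items():
    print(f"{name}: LCL {lower:.2f}, center {center:.2f}, UCL {upper:.2f}")
```

In this made-up example the two phases have similar limit widths
but clearly different center lines, which is the pattern to expect
when a phase change shifts the process mean without changing its
variation.
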