Central Limit Theorem
Q: My process data doesn't look very
normal when I look at the data using a histogram.
Will this affect the validity of my x-bar and
R (or S) control charts?
A: Control chart construction and theory
for x-bar, R and S charts are based on the assumption
of normality of the points being plotted, not
necessarily the underlying data. Fortunately,
the Central Limit Theorem comes into play to
make it possible to use an x-bar chart (of subgroup
averages) with data from non-normal distributions.
(But you should also have some idea of why your
data is non-normal to begin with: are there
nonrepresentative samples included? Do you have
a mixture of different processes? Is there a
limit that truncates the data artificially?
These types of situations should probably be
addressed before trying to use a control chart.)
The Central Limit Theorem states that the distribution
of subgroup means (averages) from any distribution
with finite variance will approach a normal
distribution as the subgroup size increases.
Improvements in normality can
be seen with subgroups as small as size 3. For
distributions that are very non-normal, the
subgroups will need to be larger than for distributions
that are fairly symmetric.
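A rough way to see why skewed data needs larger subgroups is to simulate it. The sketch below is illustrative only: it assumes an exponential distribution as the "very non-normal" case and a uniform distribution as the "fairly symmetric" case, and it uses skewness (0 for a symmetric shape) as a simple stand-in for "how non-normal". The function names are hypothetical and the standard library is used throughout.

```python
import random
import statistics


def skewness(values):
    """Third standardized moment; 0 for a perfectly symmetric sample."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def subgroup_means(draw, n, num_subgroups):
    """Return the averages of num_subgroups subgroups, each of n draws."""
    return [
        statistics.fmean([draw() for _ in range(n)])
        for _ in range(num_subgroups)
    ]


def skew_by_subgroup_size(draw, sizes, num_subgroups=5000):
    """Map each subgroup size to the skewness of its subgroup means."""
    return {n: skewness(subgroup_means(draw, n, num_subgroups)) for n in sizes}


rng = random.Random(1)  # fixed seed so the run is repeatable
sizes = (1, 3, 10, 30)

# Assumed examples: exponential = strongly skewed, uniform = symmetric.
skewed = skew_by_subgroup_size(lambda: rng.expovariate(1.0), sizes)
symmetric = skew_by_subgroup_size(lambda: rng.uniform(0.0, 1.0), sizes)

for n in sizes:
    print(f"n={n:2d}  exponential skew={skewed[n]:6.3f}  "
          f"uniform skew={symmetric[n]:6.3f}")
```

In theory the skewness of a mean of n independent values shrinks in proportion to 1/sqrt(n), so the exponential column should fall slowly toward 0 while the uniform column stays near 0 at every subgroup size.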
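Central Limit Theorem: If $X_1, X_2, \ldots, X_n$
is a random sample of size n taken from a
population with mean $\mu$ and finite variance
$\sigma^2$, and if $\bar{X}$
is the sample mean, then the limiting form of
the distribution of

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

is the standard normal distribution
as n approaches infinity.
In statistical notation:

$$\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty$$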
For distributions of any shape (normal,
skewed, uniform, etc.), if we take the means
of subsamples of size n and then standardize,
we get an approximately normal distribution
with a mean equal to 0 and a standard deviation
equal to one, as n gets larger.
In addition, as the sample size, n, gets larger,
the standard deviation of the means gets smaller:
it equals $\sigma/\sqrt{n}$, where $\sigma$ is the
standard deviation of the individual values.
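The standardizing step can be checked numerically. The following is a minimal sketch, assuming an exponential distribution with rate 1 as the skewed population (so its mean and standard deviation are both 1); the helper names are hypothetical. It standardizes each subgroup mean as Z = (x-bar - mean) / (sigma / sqrt(n)) and reports the mean of Z, the standard deviation of Z, and the share of Z values within plus or minus 1.

```python
import math
import random
import statistics

MU = 1.0     # mean of an exponential distribution with rate 1 (assumed example)
SIGMA = 1.0  # standard deviation of that same distribution


def standardized_means(n, num_subgroups, rng):
    """Return Z = (x_bar - MU) / (SIGMA / sqrt(n)) for each subgroup of size n."""
    std_error = SIGMA / math.sqrt(n)
    z_values = []
    for _ in range(num_subgroups):
        x_bar = statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        z_values.append((x_bar - MU) / std_error)
    return z_values


def share_within_one(z_values):
    """Fraction of standardized values with |Z| <= 1."""
    return sum(abs(z) <= 1.0 for z in z_values) / len(z_values)


rng = random.Random(7)  # fixed seed so the run is repeatable
summary = {}
for n in (1, 5, 30):
    z = standardized_means(n, 5000, rng)
    summary[n] = (statistics.fmean(z), statistics.pstdev(z), share_within_one(z))
    print(f"n={n:2d}  mean(Z)={summary[n][0]:6.3f}  "
          f"sd(Z)={summary[n][1]:5.3f}  share |Z|<=1: {summary[n][2]:.3f}")
```

The mean and standard deviation of Z should sit near 0 and 1 for every n, because standardizing fixes those by construction. What changes with n is the shape: for a standard normal distribution about 0.683 of values fall within plus or minus 1, whereas for single exponential values the theoretical share is 1 - e^(-2), about 0.865.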
The example below shows data that was simulated
using Statit. The data follows a uniform
distribution on the range 1 to 5. A histogram
of the data is shown below. This data does not
fit the superimposed normal curve.
The next histogram shows the distribution of
the means of the data when taken as subgroups
of size 3. Even with this small subgroup size,
a clear improvement in normality can be seen
in the subgroup means.
Increased normality can be seen when viewing
the means of subgroups of size 5 and 7, as shown
in the histograms below. As the subgroup size
increases, the variability of the distribution
of the means decreases.
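The same experiment can be approximated without Statit. The sketch below is not the original simulation: it assumes Python's standard library random number generator, an arbitrary seed, and 4000 subgroups per run, and its function names are hypothetical. It draws uniform data on 1 to 5, averages it in subgroups of size 1, 3, 5 and 7, prints a crude text histogram of the means, and compares the observed standard deviation of the means with sigma/sqrt(n), where sigma = (5 - 1)/sqrt(12) for a uniform distribution.

```python
import math
import random
import statistics

LOW, HIGH = 1.0, 5.0                    # range of the uniform data in the text
SIGMA = (HIGH - LOW) / math.sqrt(12.0)  # SD of a uniform distribution


def uniform_subgroup_means(n, num_subgroups, rng):
    """Average n uniform(LOW, HIGH) draws per subgroup."""
    return [
        statistics.fmean([rng.uniform(LOW, HIGH) for _ in range(n)])
        for _ in range(num_subgroups)
    ]


def text_histogram(values, bins=8, width=40):
    """Return a crude histogram over [LOW, HIGH] as a list of text rows."""
    counts = [0] * bins
    step = (HIGH - LOW) / bins
    for v in values:
        index = min(int((v - LOW) / step), bins - 1)  # clamp v == HIGH
        counts[index] += 1
    peak = max(counts)
    return [
        f"{LOW + i * step:4.1f}-{LOW + (i + 1) * step:4.1f} | "
        + "#" * round(width * c / peak)
        for i, c in enumerate(counts)
    ]


rng = random.Random(2024)  # fixed seed so the run is repeatable
observed_sd = {}
for n in (1, 3, 5, 7):
    means = uniform_subgroup_means(n, 4000, rng)
    observed_sd[n] = statistics.pstdev(means)
    print(f"\nsubgroup size {n}: SD of means = {observed_sd[n]:.3f}, "
          f"sigma/sqrt(n) = {SIGMA / math.sqrt(n):.3f}")
    print("\n".join(text_histogram(means)))
```

With subgroup size 1 the bars should be roughly level, as expected for uniform data; as the subgroup size grows the bars should pile up around the center value of 3 and the spread should narrow.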