
Central Limit Theorem

Q: My process data doesn't look very normal when I look at the data using a histogram. Will this affect the validity of my x-bar and R (or S) control charts?

A: Control chart construction and theory for x-bar, R, and S charts assume that the points being plotted are normally distributed, not necessarily the underlying data. Fortunately, the Central Limit Theorem makes it possible to use an x-bar chart (of subgroup averages) with data from non-normal distributions. (But you should also have some idea of why your data is non-normal to begin with: Are nonrepresentative samples included? Do you have a mixture of different processes? Is there a limit that truncates the data artificially? These situations should be addressed before trying to use a control chart.)

The Central Limit Theorem states that the distribution of subgroup means (averages) from any distribution with finite variance will approach a normal distribution as the subgroup size increases. Improvements in normality can be seen with subgroups as small as size 3. Very non-normal distributions (strongly skewed ones, for example) need larger subgroups than fairly symmetric distributions do.
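The effect of subgroup size on a very non-normal distribution can be checked with a small simulation. The sketch below is illustrative only: the exponential distribution, the subgroup sizes, the number of subgroups, and the helper names `sample_skewness` and `subgroup_means` are assumptions chosen for the example, not part of the original article. It uses sample skewness (0 for a symmetric distribution) as a rough measure of how far the subgroup means are from normal.

```python
import random
import statistics


def sample_skewness(values):
    """Return the moment-based sample skewness (0 for a symmetric distribution)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


def subgroup_means(draw, subgroup_size, num_subgroups):
    """Average `subgroup_size` independent draws, repeated `num_subgroups` times."""
    return [
        statistics.fmean(draw() for _ in range(subgroup_size))
        for _ in range(num_subgroups)
    ]


rng = random.Random(12345)  # fixed seed so the run is repeatable


def draw_exponential():
    # Assumed example population: exponential with rate 1,
    # which is strongly right-skewed (theoretical skewness 2).
    return rng.expovariate(1.0)


skew_by_size = {}
for n in (1, 3, 10, 30):
    means = subgroup_means(draw_exponential, n, num_subgroups=20000)
    skew_by_size[n] = sample_skewness(means)
    print(f"subgroup size {n:2d}: skewness of means = {skew_by_size[n]:.2f}")
```

For this population the skewness of the subgroup means should shrink roughly in proportion to 1/sqrt(n), so a skewed process needs noticeably larger subgroups than a symmetric one before the means look normal.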

Central Limit Theorem: If $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ taken from a population with mean $\mu$ and finite variance $\sigma^2$, and if $\bar{X}$ is the sample mean, then the limiting form of the distribution of $Z = (\bar{X} - \mu) / (\sigma / \sqrt{n})$ is the standard normal distribution as $n$ approaches infinity.

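In statistical notation:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \;\xrightarrow{d}\; N(0, 1) \quad \text{as } n \to \infty$$

In English: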
For distributions of any shape (normal, skewed, uniform, etc.), if we take the means of subsamples of size n and then standardize, we get an approximately normal distribution with a mean equal to 0 and a standard deviation equal to one, as n gets larger.
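The standardization step can be sketched in code. Here "standardize" is taken to mean computing Z = (x-bar - mu) / (sigma / sqrt(n)) for each subgroup mean. The population (exponential with rate 1, so mu = 1 and sigma = 1), the subgroup size of 30, and the number of subgroups are assumptions made for this illustration.

```python
import math
import random
import statistics

rng = random.Random(2024)  # fixed seed so the run is repeatable

# Assumed population for illustration: exponential with rate 1,
# which has mean mu = 1 and standard deviation sigma = 1.
mu, sigma = 1.0, 1.0
n = 30                 # subgroup size (assumed)
num_subgroups = 20000  # number of simulated subgroups (assumed)

z_scores = []
for _ in range(num_subgroups):
    x_bar = statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    # Standardize: subtract the population mean, divide by sigma / sqrt(n).
    z_scores.append((x_bar - mu) / (sigma / math.sqrt(n)))

z_mean = statistics.fmean(z_scores)
z_sd = statistics.pstdev(z_scores)
within_196 = sum(abs(z) <= 1.96 for z in z_scores) / num_subgroups

print(f"mean of Z      = {z_mean:.3f}  (standard normal: 0)")
print(f"std dev of Z   = {z_sd:.3f}  (standard normal: 1)")
print(f"P(|Z| <= 1.96) = {within_196:.3f}  (standard normal: 0.950)")
```

Even though the individual values are far from normal, the standardized means should come out with a mean near 0, a standard deviation near 1, and roughly 95% of values within 1.96 of zero.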

In addition, as the sample size, $n$, gets larger, the standard deviation of the means gets smaller, since $\sigma_{\bar{X}} = \sigma / \sqrt{n}$.
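The reason the standard deviation of the means shrinks with the square root of n follows from a short derivation. It assumes the observations $X_1, \ldots, X_n$ in a subgroup are independent and share the same variance $\sigma^2$, as in a random sample:

$$
\begin{aligned}
\operatorname{Var}(\bar{X})
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{n\,\sigma^2}{n^2}
   = \frac{\sigma^2}{n}, \\
\sigma_{\bar{X}} &= \sqrt{\operatorname{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}.
\end{aligned}
$$

So quadrupling the subgroup size halves the standard deviation of the subgroup means.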

The example below shows data that was simulated using Statit. The data is distributed according to a uniform distribution with a range of 1 to 5. A histogram of the data is shown below. This data does not fit the superimposed normal distribution.

The next histogram shows the distribution of the means of the data when taken as subgroups of size 3. Even with this small subgroup size, a significant improvement in normality can be seen when viewing the subgroup means.

Increased normality can be seen when viewing the means of subgroups of size 5 and 7, as shown in the histograms below. As the subgroup size increases, the variability of the distribution of the means decreases.
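A similar experiment can be reproduced outside Statit. The sketch below is not the original simulation: the number of subgroups, the random seed, and the helper `excess_kurtosis` are assumptions for illustration. It draws from a uniform distribution on 1 to 5, forms subgroups of size 1, 3, 5, and 7, and compares the observed standard deviation of the means with sigma / sqrt(n). Excess kurtosis (0 for a normal distribution, about -1.2 for a uniform one) is used as a rough numerical stand-in for the visual check against the normal curve.

```python
import math
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

LOW, HIGH = 1.0, 5.0
sigma = (HIGH - LOW) / math.sqrt(12)  # std dev of a uniform distribution
num_subgroups = 20000                 # assumed; the original sample count is not stated


def excess_kurtosis(values):
    """Fourth standardized moment minus 3; equals 0 for a normal distribution."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values)
    m4 = sum((v - mean) ** 4 for v in values) / len(values)
    return m4 / var ** 2 - 3.0


results = {}
for n in (1, 3, 5, 7):
    means = [
        statistics.fmean(rng.uniform(LOW, HIGH) for _ in range(n))
        for _ in range(num_subgroups)
    ]
    # Store (observed std dev, theoretical sigma/sqrt(n), excess kurtosis).
    results[n] = (statistics.pstdev(means), sigma / math.sqrt(n), excess_kurtosis(means))
    observed, expected, kurt = results[n]
    print(f"n={n}: std dev of means = {observed:.3f} "
          f"(sigma/sqrt(n) = {expected:.3f}), excess kurtosis = {kurt:+.2f}")
```

The observed standard deviations should track sigma / sqrt(n) closely, and the excess kurtosis should move toward 0 as the subgroup size grows, which matches the narrowing, increasingly bell-shaped histograms described above.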