Normal Probability Plot Interpretation


Q: How do I use a normal probability plot to assess the normality of a population?

A: Probability plotting is a graphical method for determining whether sample data conform to a hypothesized distribution, based on a subjective visual examination of the data. In process management we are typically concerned about whether the data is distributed according to a normal distribution, since many of the statistical inference procedures that we use require the assumption of normality of the data.

To construct a probability plot:

1. The observations are ranked from smallest to largest, x(1), x(2), . . ., x(n).
2. The ordered observations x(j) are plotted against their observed cumulative frequency, typically j/(n + 1), on a graph with the y-axis appropriately scaled for the hypothesized distribution.
3. If the hypothesized distribution adequately describes the data, the plotted points fall approximately along a straight line. If the plotted points deviate significantly from the straight line, especially at the ends, then the hypothesized distribution is not appropriate.
4. In assessing the "closeness" of the points to a straight line, the "fat pencil" test is often used: imagine a fat pencil lying along the line. If the points are all covered by the imaginary pencil, then the hypothesized distribution is likely to be appropriate.
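Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration, not how Statit computes the plot; the function name plotting_positions and the three-point sample are made up for the example.

```python
def plotting_positions(observations):
    """Return (x_(j), j/(n+1)) pairs for the ordered observations."""
    ordered = sorted(observations)  # step 1: rank from smallest to largest
    n = len(ordered)
    # step 2: pair the j-th ordered value with its cumulative frequency j/(n+1)
    return [(x, j / (n + 1)) for j, x in enumerate(ordered, start=1)]

# Made-up three-point sample, used only to show the mechanics.
points = plotting_positions([5.0, 3.0, 4.0])
print(points)  # [(3.0, 0.25), (4.0, 0.5), (5.0, 0.75)]
```

Dividing by n + 1 rather than n keeps every cumulative frequency strictly between 0 and 1, so each point can be placed on a probability-scaled axis.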

Example: The following data represents the thickness of plastic sheet, in microns:

43, 52, 55, 47, 47, 49, 53, 56, 48, 48

Ordered data    Rank order (j)    Cumulative frequency (j/(n + 1))
43              1                 1/11 = .0909
47              2                 2/11 = .1818
47              3                 3/11 = .2727
48              4                 4/11 = .3636
48              5                 5/11 = .4545
49              6                 6/11 = .5455
52              7                 7/11 = .6364
53              8                 8/11 = .7273
55              9                 9/11 = .8182
56              10                10/11 = .9091

The ordered data is then plotted against its respective cumulative frequency. Note how the y-axis is scaled so that a straight line will result for normal data.
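The axis scaling can be mimicked numerically by converting each cumulative frequency to a standard normal score (z-score); normal data then fall near a straight line when plotted against those scores. The sketch below does this for the thickness data using only the Python standard library. The correlation coefficient it prints is an assumed numerical stand-in for eyeballing straightness, not part of the "fat pencil" test or of Statit's output.

```python
from math import sqrt
from statistics import NormalDist, mean

thickness = [43, 52, 55, 47, 47, 49, 53, 56, 48, 48]  # microns, from the example

ordered = sorted(thickness)
n = len(ordered)
# Normal-scaled axis: convert each cumulative frequency j/(n+1) to a z-score.
z_scores = [NormalDist().inv_cdf(j / (n + 1)) for j in range(1, n + 1)]

def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# A value close to 1 means the points lie close to a straight line.
r = pearson_r(ordered, z_scores)
print(f"correlation between ordered data and normal scores: {r:.3f}")
```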

Based on the normal probability plot and the "fat pencil" criterion, the thickness data appear to be normally distributed. Thus, using further statistical tests that require the assumption of normality is appropriate. Statistical tests based on the t-distribution and the F-distribution are fairly robust to minor departures from normality, so a subjective visual examination of the probability plot is usually sufficient to use these tests with confidence.

Statistical Test for Normality

Statit can also perform a Shapiro-Wilk hypothesis test on the normality of the data for sample sizes in the range [10,1000]. The null hypothesis is that the data comes from a normal distribution:

H0: Population is normal

If the p-value is smaller than the chosen significance level, usually 0.05, H0 is rejected and we conclude that the population is not normal. In the above case, the p-value for the test of normality is 0.246, so we do not reject H0: the data give no evidence against normality, and we proceed as if the underlying population is normal. This is consistent with the conclusion we reached using the "fat pencil" test on the probability plot.
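The decision rule can be written out as a small Python sketch. The function name normality_decision is hypothetical, and the p-value is simply the one quoted above rather than something this code computes.

```python
def normality_decision(p_value, alpha=0.05):
    """Apply the decision rule to a p-value from a test of normality."""
    if p_value < alpha:
        return "reject H0: evidence that the population is not normal"
    # Failing to reject does not prove normality; it only means the
    # data show no significant departure from it.
    return "do not reject H0: no evidence against normality"

# p-value quoted above for the thickness data
decision = normality_decision(0.246)
print(decision)  # do not reject H0: no evidence against normality
```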

Advantages of Probability Plots

  •  Normal probability plots work well as a quick check on normality.
  •  Probability plots work well for both large and small samples, unlike formal statistical tests, which have more limited ranges of sample sizes. For example, Shapiro-Wilk can usually only be used for sample sizes in the range [10,1000], while goodness of fit tests, such as the Chi-Square test, usually require at least 50 to 100 observations for meaningful tests.
  •  Probability plots help us investigate the normality of residuals from regression or ANOVA models. Residuals are not independent of each other, since they are all calculated from the same model that was fit to the data, whereas the other statistical tests of normality require independent observations.
  •  Probability plots can be constructed for distributions other than the normal distribution.
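As an illustration of the last point, the same construction works for another hypothesized distribution once the axis is scaled with that distribution's quantile function. The sketch below assumes an exponential distribution; the function name and the waiting-time values are made up for the example.

```python
from math import log

def exponential_plot_points(observations):
    """Pair each ordered observation with the unit-rate exponential
    quantile at cumulative frequency j/(n+1)."""
    ordered = sorted(observations)
    n = len(ordered)
    # Quantile function of the unit-rate exponential: Q(p) = -ln(1 - p)
    return [(x, -log(1 - j / (n + 1))) for j, x in enumerate(ordered, start=1)]

# Made-up waiting times, for illustration only.
exp_points = exponential_plot_points([0.2, 1.9, 0.7])
for x, q in exp_points:
    print(f"{x:4.1f}  {q:.4f}")
```

If the data are exponential, these points fall near a straight line through the origin, whatever the rate of the distribution.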

Disadvantages of Probability Plots

  •  Interpretation is subjective: different people can read the same plot differently, or use fatter or thinner pencils.
  •  Normal probability plots alone do not yield a p-value regarding the decision. A formal test, such as the Shapiro-Wilk test, must be performed to get a p-value.


How to Have Statit Construct a Normal Probability Plot

Select Graphics -> Distribution Plots -> Probability…

Variable: Click on ->, Click on the desired variable. Click on Close.

When you are ready to draw and view the graph, choose OK.

NOTE: Normal probability plots are also useful as process management tools with the addition of lines at the probabilities associated with ± 3 sigma. For more information on this, see Probability Plot Use in QC.
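The cumulative probabilities that correspond to ± 3 sigma under a normal distribution can be checked with the Python standard library. This is a sketch of the arithmetic only, not of how Statit draws the lines.

```python
from statistics import NormalDist

std_normal = NormalDist()    # mean 0, standard deviation 1
lower = std_normal.cdf(-3)   # cumulative probability at -3 sigma
upper = std_normal.cdf(3)    # cumulative probability at +3 sigma
print(f"-3 sigma: {lower:.5f}, +3 sigma: {upper:.5f}")
# -3 sigma: 0.00135, +3 sigma: 0.99865
```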

If you would like additional information, please send email to statit.support@acs-inc.com.