
The Shewhart p Chart for Comparisons

Robert F. Hart, Ph.D.
Marilyn K. Hart, Ph.D.

An important question regarding test result validity is: "Given a positive test result, what is the probability that this test result is wrong?" This question is answered by the positive predictive error rate: the number of false positive test results divided by the total number of positive test results. The case study here comes from Statistical Process Control for Health Care by Marilyn and Robert Hart (Duxbury, 2002). Table 1 gives data for evaluating the positive predictive error rate in ultrasound testing for acute appendicitis for seven radiologists. The first three radiologists are radiology specialists; the last four are non-specialists. Table 2 summarizes these results for the two groups.

Table 1. Acute Appendicitis: Positive Predictive Errors by Radiologist

  Radiologist   False Positives   Total Positives
       1               1                 8
       2               4                12
       3               2                 8
       4               3                 5
       5               5                 7
       6               2                 6
       7               2                 4

Table 2. Acute Appendicitis: Positive Predictive Errors by Radiologist Group

  Radiologist Group   False Positives   Total Positives
  Specialists                7                28
  Non-specialists           12                22
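The group totals in Table 2 are simply the sums over the radiologists in each group, and the error rates are false positives divided by total positives. A short sketch (in Python, which the article itself does not use) that derives Table 2 from Table 1:

```python
# Data from Table 1; radiologists 1-3 are the specialists.
false_pos = [1, 4, 2, 3, 5, 2, 2]
total_pos = [8, 12, 8, 5, 7, 6, 4]

# Per-radiologist positive predictive error rates (false / total positives).
rates = [fp / tp for fp, tp in zip(false_pos, total_pos)]

# Group totals: specialists (first three) vs. non-specialists (last four).
spec_fp, spec_tp = sum(false_pos[:3]), sum(total_pos[:3])
non_fp, non_tp = sum(false_pos[3:]), sum(total_pos[3:])

print(f"Specialists:     {spec_fp}/{spec_tp} = {spec_fp/spec_tp:.3f}")
print(f"Non-specialists: {non_fp}/{non_tp} = {non_fp/non_tp:.3f}")
```

This reproduces the 7/28 and 12/22 figures of Table 2.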

One might make a p chart with seven subgroups to compare the radiologists and/or with two subgroups to compare the groups. It is common for time-ordered control charts to have 25 subgroups, in which case the common 3-sigma limits are appropriate. However, for fewer subgroups, 3-sigma limits are too wide to be effective. Recommended values of T for T-sigma limits are given in Table 3.

Table 3. Process Evaluation for Special-cause Variation: Recommended Values of T for T-sigma Limits.*

  # of Subgroups      T
        2            1.5
       3-4           2.0
       5-9           2.5
      10-34          3.0
      35-199         3.5
     200-1500        4.0

* The tabular values of T may be used for the usual case of "no standard given" with all attribute and variables charts.
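For a p chart, T-sigma limits are pBar ± T·sqrt(pBar(1 − pBar)/n), where n is the subgroup size. A minimal sketch (in Python, with function names of our own choosing) of the Table 3 lookup and the limit calculation:

```python
import math

def t_sigma(k):
    """Recommended T for k subgroups (Table 3), 'no standard given' case."""
    for upper, t in [(2, 1.5), (4, 2.0), (9, 2.5), (34, 3.0), (199, 3.5), (1500, 4.0)]:
        if k <= upper:
            return t
    raise ValueError("Table 3 gives no recommendation beyond 1500 subgroups")

def p_limits(p_bar, n, t):
    """T-sigma control limits for a p chart subgroup of size n, clipped to [0, 1]."""
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)
    return max(0.0, p_bar - t * sigma), min(1.0, p_bar + t * sigma)
```

For example, `t_sigma(7)` returns 2.5 for the seven-radiologist chart and `t_sigma(2)` returns 1.5 for the two-group chart.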

A common problem with p charts is that the subgroup sizes are too small to give valid results. The required minimum subgroup size is a function of pBar. For a point to be accepted as valid, a minimum subgroup size of 1/pBar is needed, and for a point above the upper control limit to be accepted as valid, a minimum subgroup size of 4/pBar is needed. Here pBar for the error rate is 19/50 = 0.38 (whether one subgroups the data by radiologist or by radiologist group).
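These two minimum-size rules work out as follows for pBar = 0.38 (a Python sketch; the variable names are ours):

```python
import math

p_bar = 19 / 50  # 0.38 overall positive predictive error rate

# Minimum subgroup size for any point to be accepted as valid.
min_valid = math.ceil(1 / p_bar)       # 1/0.38 = 2.63, rounded up

# Minimum subgroup size for a point ABOVE the UCL to be accepted as valid.
min_above_ucl = math.ceil(4 / p_bar)   # 4/0.38 = 10.53, rounded up

print(min_valid, min_above_ucl)
```

Every subgroup in Table 1 meets the first requirement, and both subgroups in Table 2 meet the second.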

The p chart subgrouped by radiologist (not shown here) shows that all seven points fall below the 2.5-sigma upper control limit, with all subgroup sizes adequate (i.e., each exceeds the minimum requirement of 1/0.38 = 2.63, rounded up to three). From this analysis alone, one would conclude that special-cause variation between radiologists was NOT DEMONSTRATED. This does not mean that no special-cause variation existed -- only that a more powerful method of subgrouping was needed to detect the lack of statistical control.
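The seven-subgroup check can be sketched as follows (Python, assuming the binomial sigma formula sqrt(pBar(1 − pBar)/n) and T = 2.5 from Table 3 for 5-9 subgroups):

```python
import math

p_bar = 0.38                       # overall error rate from the text
false_pos = [1, 4, 2, 3, 5, 2, 2]  # Table 1
total_pos = [8, 12, 8, 5, 7, 6, 4]

out_of_control = []
for fp, n in zip(false_pos, total_pos):
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)  # sigma for subgroup of size n
    ucl = p_bar + 2.5 * sigma                   # 2.5-sigma upper control limit
    if fp / n > ucl:
        out_of_control.append(n)

print("points above UCL:", len(out_of_control))
```

No radiologist's rate exceeds the 2.5-sigma UCL, matching the conclusion above.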

Figure 1 is a p chart with 1.5-sigma limits profiling the two radiologist groups on their positive predictive error rate. This chart illuminates the superior performance of the specialists, which should be no surprise. It should be noted that the out-of-control condition is to be accepted as valid since the minimum required subgroup size of 4/0.38 = 10.53 is met.
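The two-group comparison behind Figure 1 can be sketched as follows (Python; T = 1.5 from Table 3 for two subgroups):

```python
import math

p_bar = 19 / 50   # 0.38 overall error rate
T = 1.5           # Table 3 recommendation for two subgroups
groups = {"Specialists": (7, 28), "Non-specialists": (12, 22)}  # Table 2

signals = {}
for name, (fp, n) in groups.items():
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)
    lcl, ucl = p_bar - T * sigma, p_bar + T * sigma
    p = fp / n
    signals[name] = p > ucl or p < lcl   # True if the point is outside the limits
    print(f"{name}: p={p:.3f}, limits=({lcl:.3f}, {ucl:.3f})")
```

The non-specialists' rate falls above the 1.5-sigma UCL, which is the out-of-control signal discussed in the text.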

Figure 1. p Chart Profiling Radiologist Group on Positive Predictive Error Rate, 1.5-sigma Limits

The p chart on individual radiologists had insufficient power to detect any special-cause variation. By subgrouping the seven radiologists into specialists and non-specialists, the larger subgroup sizes provided the increased power needed to detect the special-cause variation.

Should only specialists perform the ultrasound tests?

For more information, contact Drs. Robert and Marilyn Hart at robthart@aol.com or (541)412-0425.