Turning SPC Charts into Confidence Interval Charts

Bill Farrell, Ph.D.
Senior Analyst
Sutter Health

Sutter Health is a family of not-for-profit hospitals and physician organizations that share resources and expertise to advance health care quality. Serving more than 100 communities in Northern California, Sutter Health is a regional leader in cardiac care, cancer treatment, orthopedics, obstetrics, and newborn intensive care, and is a pioneer in advanced patient safety technology.

As Sutter Health has promulgated statistical process control for tracking outcomes, some confusion has arisen regarding the interpretation of SPC charts and the kind of information that can be placed on them.

Specifically, people have wanted to see Sutter Health norms, California norms, national benchmarks, etc. drawn on SPC charts, so they can get an idea of where they stand with respect to a target. The problem with doing this is that it fosters a mentality where attention is focused on the target, rather than the process. Statistical process control concerns itself with the variability of the process, not the level of the process. When we see a significant trigger on a patient satisfaction SPC chart, that data point is significant because of the past history of the process, not because the point is above or below some norm.

Still, people have every right to know where they stand in relation to others. How can we present this information in a meaningful and statistically valid way? The answer lies in confidence intervals.

Let's say that a sample of 300 patients discharged from a hospital in 1Q 2004 gives a rating of 86 on nursing satisfaction. If we asked every patient discharged in the first quarter to rate nursing satisfaction, would we still get 86? Probably not; there is some uncertainty associated with that number. How much uncertainty? Let's say there were 1,000 patients discharged from the hospital in the first quarter. How confident would we be in our conclusions if we had sampled two patients? How confident would we be if we had sampled 993 patients?

Confidence intervals are intimately tied to sample size. The larger the sample, the narrower the confidence interval; the smaller the sample, the wider the confidence interval, all other things being equal. Statements about confidence intervals usually take the following form: "The nursing satisfaction score is 86, and the 95% confidence interval ranges from 82 to 90." We can interpret this statement as saying that we are 95% confident that the "true" nursing satisfaction score lies between 82 and 90. If we had sampled 900 patients instead of 300, the 95% confidence interval would have been narrower (85 to 87, say); and if we had sampled only 50 of the 1,000 patients, the 95% confidence interval would have been wider (maybe 76 to 96).
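To make the arithmetic concrete, here is a minimal sketch, in Python, of how those interval widths arise. It assumes the 0-100 score can be treated as a percentage (86% of the maximum) and uses the normal approximation; it also ignores the finite-population correction, which starts to matter when the sample is a large share of the 1,000 discharges. The function name and numbers are illustrative only.

    import math

    def ci_95(score_pct, n):
        # 95% confidence interval for a 0-100 satisfaction score, treating
        # the score as a percentage and using the normal approximation
        p = score_pct / 100.0                # score as a proportion
        se = math.sqrt(p * (1 - p) / n)      # standard error of a proportion
        half_width = 1.96 * se * 100         # 1.96 SEs, back on the 0-100 scale
        return score_pct - half_width, score_pct + half_width

    for n in (50, 300, 900):
        low, high = ci_95(86, n)
        print(f"n = {n:>3}: 95% CI = ({low:.1f}, {high:.1f})")

    # n =  50: 95% CI = (76.4, 95.6)
    # n = 300: 95% CI = (82.1, 89.9)
    # n = 900: 95% CI = (83.7, 88.3)

The computed widths are in the same ballpark as the illustrative intervals quoted above: roughly plus or minus 4 points at n = 300, 2 points at n = 900, and 10 points at n = 50.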

We hear people using confidence intervals almost every day, though we may not realize it. If a news anchor tells us "The president's approval rating is 67%, subject to a 3% margin of error," we can be 95% certain that the true rating lies between 64% and 70%.
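Margins of error are conventionally stated at the 95% level, which is what justifies that reading. Assuming the pollster followed that convention, a short sketch can even back out the approximate sample size behind the broadcast numbers (the figures are the article's; the calculation is illustrative):

    p, moe = 0.67, 0.03    # reported approval rating and margin of error

    # 95% interval implied by the stated margin of error
    print(f"95% CI: {100 * (p - moe):.0f}% to {100 * (p + moe):.0f}%")  # 64% to 70%

    # moe = 1.96 * sqrt(p * (1 - p) / n)  =>  n = 1.96**2 * p * (1 - p) / moe**2
    n = 1.96 ** 2 * p * (1 - p) / moe ** 2
    print(f"implied sample size: about {n:.0f} respondents")            # about 944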

Why do we use 95% confidence intervals, when we could just as easily calculate 90% or 99% confidence intervals? It's directly related to the p value criterion of .05 (or 5%) that researchers have used for decades. What we're saying is that we're willing to take a 1 in 20 chance of being wrong when we claim that something is statistically significant. People use different p values and confidence intervals in special situations, but .05 and 95% seem to work most of the time.
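The correspondence between the confidence level and the p value criterion is easy to see numerically. Here is a quick sketch using scipy (any statistics package has an equivalent inverse-normal function):

    from scipy.stats import norm

    # Two-sided critical z value for a given confidence level; the p value
    # criterion is the total probability left in the two tails
    for conf in (0.90, 0.95, 0.99):
        alpha = 1 - conf                # e.g. .05 for a 95% interval
        z = norm.ppf(1 - alpha / 2)     # upper-tail critical value
        print(f"{conf:.0%} CI  ->  p criterion {alpha:.2f},  z = {z:.3f}")

    # 90% CI  ->  p criterion 0.10,  z = 1.645
    # 95% CI  ->  p criterion 0.05,  z = 1.960
    # 99% CI  ->  p criterion 0.01,  z = 2.576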

Some years ago Dr. Brent James of Intermountain Health Care developed a technique whereby an SPC chart can quickly be converted into a (pseudo) confidence interval chart. The first chart below is a standard three-sigma SPC chart showing fictitious patient satisfaction scores for St. Elsewhere Hospital.

From the alphabet soup of significant triggers, it's clear that the folks at St. Elsewhere have been doing something right over the last 4-5 years. Looking at the centerline of the chart, however, we note that it's around 68, which seems a little low.

Suppose we know that the national norm for patient satisfaction is 75. With two simple tricks we can turn this three-sigma SPC chart into a 95% confidence interval chart. First, we change the "number of sigmas" from the default of three to two. Second, we "fix" the centerline at 75, rather than letting it default to the mean of (around) 68. The chart below shows the result.
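The original charts were drawn in Statit, which supports both tricks directly. For readers without that software, here is a rough matplotlib sketch of the same idea, with made-up quarterly scores (the St. Elsewhere data are fictitious and not reproduced here). It treats the 0-100 score as a percentage and uses the p-chart sigma, the square root of p(1-p)/n; those are assumptions for illustration, not the article's exact calculation.

    import math
    import matplotlib.pyplot as plt

    # Made-up quarterly satisfaction scores for five years (20 quarters)
    scores = [64.5, 64.8, 65.0, 65.4, 65.3, 65.9, 66.2, 66.0, 66.8, 67.1,
              67.0, 67.6, 67.9, 68.2, 68.0, 68.7, 69.0, 68.9, 69.3, 69.5]
    ns = [300] * len(scores)      # sample size behind each point
    target = 75                   # national norm, used as the fixed centerline

    # Trick 1: a two-sigma envelope instead of the SPC default of three.
    # Trick 2: center the envelope on the target, not on the process mean.
    upper, lower = [], []
    for n in ns:
        sigma = math.sqrt((target / 100) * (1 - target / 100) / n) * 100
        upper.append(target + 2 * sigma)
        lower.append(target - 2 * sigma)

    quarters = list(range(1, len(scores) + 1))
    plt.plot(quarters, scores, marker="o", label="score")
    plt.plot(quarters, upper, linestyle="--", label="target + 2 sigma")
    plt.plot(quarters, lower, linestyle="--", label="target - 2 sigma")
    plt.axhline(target, color="gray", label="target (75)")
    plt.xlabel("Quarter")
    plt.ylabel("Patient satisfaction score")
    plt.legend()
    plt.show()

With n = 300 and a target of 75, sigma works out to 2.5 points, so the envelope runs from 70 to 80, and every made-up point falls below it.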

These two charts look a little different because of different y-axis scales, but they are plotting exactly the same data (note, for example, that the first data point is just under 65). The "rule letters" have been turned off in the second chart, since they have no bearing on its interpretation. We interpret this chart as follows: if a data point is inside the two-sigma envelope, it is not different (statistically) from the target. If it lies outside the envelope, it is significantly above (or below) the target.
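In code, that reading reduces to a three-way comparison. A sketch, where compare_to_target is a hypothetical helper and sigma = 2.5 comes from the example above:

    def compare_to_target(score, target, sigma):
        # Pseudo-CI verdict: inside the two-sigma envelope means the point
        # is not statistically different from the target
        if score > target + 2 * sigma:
            return "significantly above target"
        if score < target - 2 * sigma:
            return "significantly below target"
        return "not different from target"

    print(compare_to_target(64.5, 75, 2.5))    # significantly below target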

I made up this example deliberately to show how two very different conclusions could be drawn from the same set of data. The SPC chart shows significant positive progress; the CI chart shows five years of significant negative performance. Looking at the SPC chart alone, leadership at St. Elsewhere might have been content to rest on their laurels. Looking at the CI chart, they realize that given their very low starting point, they still have some work to do.

Patterns like this have shown up at Sutter Health, but so have many others. There are several cases where a process has been completely under control for 6-7 years, with every data point significantly favorable to target. The variations are endless, and looking at our data this way has been eye-opening.

Technical Note: I called this a pseudo confidence interval chart because, except on p charts, two sigma is not the same thing as two (strictly, 1.96) standard errors. It is very close, however, and I should point out that we're using these CI charts at Sutter Health for directional guidance, not precision reporting or inferential statistics.
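For the p-chart exception specifically: the p-chart sigma is the square root of p(1-p)/n, which is exactly the standard error of a proportion, so the only gap left is the 2-versus-1.96 multiplier, a difference of about 2%. A quick check with illustrative numbers:

    import math

    p, n = 0.75, 300
    sigma = math.sqrt(p * (1 - p) / n)   # p-chart sigma = SE of a proportion
    print(f"2 sigma = {2 * sigma:.4f},  1.96 SE = {1.96 * sigma:.4f}")
    # 2 sigma = 0.0500,  1.96 SE = 0.0490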