Summarized Data: Choosing Appropriate Charts

Circumstances sometimes arise that require the use of summarized data as input to control charts. Once data are summarized, information critical to control charting may be lost or obscured. This discussion is not intended to discourage the use of control charts for summarized data. Instead, the goal is to present issues associated with the data that must be taken into consideration in order to have confidence that the resultant control chart is, in fact, valid.

Control charts exist because the processes being monitored have inherent variability. When data are summarized, the information needed to characterize that variability may not be available. For example, a dataset that contains an average value or a ratio, but no measure of dispersion or number of observations, limits our ability to characterize the variability of the data. However, it is acceptable practice to assess normality from summarized values. In fact, summarized values may more closely approximate a normal distribution than the raw data.

Let us explore the possibilities of working with attribute data that are summarized as a ratio. If the available values are normally distributed, then an Individuals chart can be considered. The issue with this choice is that the control limits may be narrower (more conservative) than those calculated from the raw data. Limits that are narrower than the actual control limits of the process can flag points as control violations when they are truly part of the normal variation of the process rather than an indication that the process is out of control. If the data represent the occurrence of an event, e.g. patient falls or medication errors per 1000 patient days, then the Individuals chart is a viable choice.
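As a rough illustration of how Individuals chart limits arise from nothing but the summarized ratios, the sketch below uses the common moving-range estimate of sigma (average moving range divided by 1.128). The function name and the monthly rates are made up for illustration, and the software's own calculation may differ in detail.

```python
from statistics import mean


def individuals_limits(values):
    """Return (center, lcl, ucl) for an Individuals chart.

    Sigma is estimated from the average moving range of consecutive
    points divided by d2 = 1.128, the constant for a span of two.
    """
    if len(values) < 2:
        raise ValueError("need at least two values to form a moving range")
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    sigma_hat = mean(moving_ranges) / 1.128
    return center, center - 3 * sigma_hat, center + 3 * sigma_hat


# Hypothetical monthly falls per 1000 patient days (made-up numbers).
falls_per_1000 = [3.1, 2.8, 3.6, 2.9, 3.3, 4.0, 3.2, 2.7]
center, lcl, ucl = individuals_limits(falls_per_1000)
print(f"center={center:.3f}  LCL={lcl:.3f}  UCL={ucl:.3f}")
```

Note that the limits depend only on point-to-point differences of the ratios; the denominators behind each ratio never enter the calculation, which is why these limits can differ from those computed on the raw data.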

Another alternative is the P-chart. The P-chart does not require normally distributed data or equal subgroup sizes. If the actual subgroup size is available, the ratio can be plotted as the "data variable" and the subgroup size variable supplied as the subgroup size. The control limits will then more accurately reflect the possible variability of the data. If the subgroup size is not known, choose an appropriate subgroup size, e.g. 100 or 1000, as representative of the ratio denominator. Depending on the actual subgroup size, the resultant P-chart may have control limits that are wider than those that would be calculated from the non-summarized data. Wider control limits may mask control violations that would be detected with more realistic limits.
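The effect of the chosen subgroup size on P-chart limits can be sketched with the usual three-sigma binomial formula, p-bar plus or minus 3 * sqrt(p-bar * (1 - p-bar) / n). The ratios and subgroup sizes below are invented for illustration, and the function name is hypothetical rather than anything provided by the software.

```python
from math import sqrt


def p_chart_limits(proportions, sizes):
    """Return (p_bar, [(lcl, ucl), ...]) with one limit pair per subgroup."""
    if len(proportions) != len(sizes):
        raise ValueError("proportions and sizes must have the same length")
    # Center line: total events divided by total opportunities.
    p_bar = sum(p * n for p, n in zip(proportions, sizes)) / sum(sizes)
    limits = []
    for n in sizes:
        half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
        # Proportions cannot fall outside [0, 1], so clamp the limits.
        limits.append((max(0.0, p_bar - half_width), min(1.0, p_bar + half_width)))
    return p_bar, limits


# Made-up summarized ratios and two candidate subgroup-size choices.
ratios = [0.04, 0.06, 0.05, 0.03, 0.07]
actual_sizes = [480, 510, 495, 520, 505]   # hypothetical true denominators
assumed_sizes = [100] * len(ratios)        # stand-in when sizes are unknown

_, actual_limits = p_chart_limits(ratios, actual_sizes)
p_bar_assumed, assumed_limits = p_chart_limits(ratios, assumed_sizes)
for (a_lo, a_hi), (s_lo, s_hi) in zip(actual_limits, assumed_limits):
    print(f"actual n: [{a_lo:.4f}, {a_hi:.4f}]   assumed n=100: [{s_lo:.4f}, {s_hi:.4f}]")
```

In this made-up case the assumed size of 100 is smaller than the true denominators, so the assumed limits come out wider than the limits based on the actual sizes.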

If the data in question are variable data that are normally distributed and already summarized into group means, ranges and/or standard deviations, then X-bar, R and S charts can be used. Summarized data can be used in variable charts by selecting the appropriate options in the dialogue boxes. The following examples illustrate how to generate X-bar, R and S charts using these data.

The first example generates an X-bar and R chart. The data are summarized for each diagnosis. The data needed to produce the chart are the average PBD values, range values, subgroup designations and subgroup sizes. The dialogue to produce the chart begins the same way as for a chart using raw data. Within the X-bar chart dialogue, select Avg_PBD as the data variable, choose Subgroup Size using a variable and select Cases. The important difference for this chart is to select the Summarized Data button, check the box labeled Data are summarized and provide the variable containing the ranges or standard deviations. In this case, the subgroup sizes are reasonably small, so we use the ranges in the PBD_Ranges variable. This dialogue is shown in Figure 1.

Figure 1

The control chart is displayed in Figure 2. Since the subgroup sizes are not constant, the control limits are not constant either. Because of this, adding the values of the control limits to the chart does not provide helpful information. A more useful alternative is to add variables to the data tips. In this example, it is advantageous to add the range, subgroup size (n), upper control limit and lower control limit to the default data tips for each point.

Figure 2
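To see why the limits vary from point to point, the following sketch computes X-bar limits directly from summarized means, ranges and subgroup sizes. It assumes one common estimator (the simple average of each range divided by its d2 constant) and a grand mean weighted by subgroup size; the software may use a different weighting. The variable names echo Avg_PBD, PBD_Ranges and Cases, but the values are invented, and the d2 table is truncated at n = 10 for brevity.

```python
from math import sqrt

# Standard d2 constants: expected range of n normal observations, in sigma units.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534,
      7: 2.704, 8: 2.847, 9: 2.970, 10: 3.078}


def xbar_limits_from_ranges(means, ranges, sizes):
    """Return (grand_mean, sigma_hat, [(lcl, ucl), ...]) from summarized subgroups."""
    if not (len(means) == len(ranges) == len(sizes)):
        raise ValueError("means, ranges and sizes must have the same length")
    for n in sizes:
        if n not in D2:
            raise ValueError(f"no d2 constant tabulated here for subgroup size {n}")
    # Assumption: grand mean weighted by subgroup size.
    grand_mean = sum(m * n for m, n in zip(means, sizes)) / sum(sizes)
    # Assumption: sigma estimated as the unweighted average of R_i / d2(n_i).
    sigma_hat = sum(r / D2[n] for r, n in zip(ranges, sizes)) / len(ranges)
    limits = [(grand_mean - 3 * sigma_hat / sqrt(n),
               grand_mean + 3 * sigma_hat / sqrt(n)) for n in sizes]
    return grand_mean, sigma_hat, limits


# Made-up summarized data, one row per diagnosis.
avg_pbd = [4.2, 5.1, 3.8, 4.6]
pbd_ranges = [2.0, 3.1, 1.5, 2.6]
cases = [4, 6, 3, 5]

grand_mean, sigma_hat, limits = xbar_limits_from_ranges(avg_pbd, pbd_ranges, cases)
for n, (lcl, ucl) in zip(cases, limits):
    print(f"n={n}: LCL={lcl:.3f}  UCL={ucl:.3f}")
```

Each subgroup gets its own pair of limits because the half-width is proportional to 1 / sqrt(n), which is the reason the data tips are a more practical place to show the limits than a single label on the chart.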

The accompanying R chart for these data also requires that the user check the Data are summarized box under the Summarized Data button. An example of this dialogue can be seen in Figure 3. It is helpful to include the subgroup size and control limits in the data tips for this chart as well.

Figure 3

The resultant R chart is displayed in Figure 4.

Figure 4
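The R chart limits can be sketched from the same kind of summarized input. The example below assumes the textbook form, in which the center line for a subgroup of size n is d2(n) times the estimated sigma and the limits sit three d3(n) sigma on either side, with the lower limit floored at zero. The data are invented, the constants are tabulated only up to n = 10, and the software's exact method is not confirmed by this sketch.

```python
# Standard constants for the range of n normal observations (in sigma units):
# d2 is the expected range, d3 is the standard deviation of the range.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534,
      7: 2.704, 8: 2.847, 9: 2.970, 10: 3.078}
D3 = {2: 0.853, 3: 0.888, 4: 0.880, 5: 0.864, 6: 0.848,
      7: 0.833, 8: 0.820, 9: 0.808, 10: 0.797}


def r_chart_limits(ranges, sizes):
    """Return [(lcl, center, ucl), ...] for an R chart with varying subgroup sizes."""
    if len(ranges) != len(sizes):
        raise ValueError("ranges and sizes must have the same length")
    for n in sizes:
        if n not in D2:
            raise ValueError(f"no constants tabulated here for subgroup size {n}")
    # Assumption: sigma estimated as the unweighted average of R_i / d2(n_i).
    sigma_hat = sum(r / D2[n] for r, n in zip(ranges, sizes)) / len(ranges)
    rows = []
    for n in sizes:
        center = D2[n] * sigma_hat
        spread = 3 * D3[n] * sigma_hat
        rows.append((max(0.0, center - spread), center, center + spread))
    return rows


# Made-up subgroup ranges and sizes.
pbd_ranges = [2.0, 3.1, 1.5, 2.6]
cases = [4, 6, 3, 5]

r_rows = r_chart_limits(pbd_ranges, cases)
for n, (lcl, center, ucl) in zip(cases, r_rows):
    print(f"n={n}: LCL={lcl:.3f}  CL={center:.3f}  UCL={ucl:.3f}")
```

Unlike the X-bar chart, the center line itself shifts with the subgroup size here, since larger subgroups are expected to show larger ranges.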

The data set in the following example has larger subgroup sizes. Instead of a variable with the subgroup ranges, this dataset contains the subgroup standard deviation as shown in Figure 5. When entering the variable, there is no option to specify the type of value being passed to the control chart.

Figure 5

By default, control limits for X-bar charts are calculated using subgroup range values. However, the use of range values is valid only when the subgroup sizes are between 2 and 30, inclusive. This dataset uses the Regional_Cases variable for subgroup sizes, and these values are well beyond the allowable subgroup size. Attempting to produce an X-bar chart with the current selections would result in an error, as shown in the example in Figure 6. Note that this is basically the same dialogue that was used to generate the X-bar chart in Figure 3, even though this data set has much larger subgroup sizes and contains sample standard deviation values instead of range values. The software recognizes that the subgroup sizes are not compatible with the choices that have been made up to this point. It is, therefore, necessary to specify that the control limits be calculated using the subgroup standard deviations.

Figure 6

The flag to specify that the control limits are to be calculated based on standard deviations instead of ranges is found in the dialogue in the Control Limits button. The dialogue is shown in Figure 7.

Figure 7

As discussed in the previous series of charts, the control limits vary. The subgroup sizes and control limits are added to the plotted data tips. The X-bar chart is displayed in Figure 8.

Figure 8
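For large subgroups summarized as means and standard deviations, X-bar limits can be sketched with a pooled standard deviation in place of the range-based estimate. The pooling formula below is one standard choice and is an assumption here, not a statement of what the software computes; the function name is hypothetical and the regional figures are invented, with subgroup sizes in the spirit of the Regional_Cases variable.

```python
from math import sqrt


def xbar_limits_from_sds(means, sds, sizes):
    """Return (grand_mean, sigma_hat, [(lcl, ucl), ...]) from means and standard deviations."""
    if not (len(means) == len(sds) == len(sizes)):
        raise ValueError("means, sds and sizes must have the same length")
    if any(n < 2 for n in sizes):
        raise ValueError("each subgroup needs at least two observations")
    grand_mean = sum(m * n for m, n in zip(means, sizes)) / sum(sizes)
    # Assumption: pooled variance, weighting each subgroup by its degrees of freedom.
    pooled_var = (sum((n - 1) * s * s for s, n in zip(sds, sizes))
                  / sum(n - 1 for n in sizes))
    sigma_hat = sqrt(pooled_var)
    limits = [(grand_mean - 3 * sigma_hat / sqrt(n),
               grand_mean + 3 * sigma_hat / sqrt(n)) for n in sizes]
    return grand_mean, sigma_hat, limits


# Made-up regional summaries with large subgroup sizes.
regional_means = [5.2, 4.9, 5.4]
regional_sds = [1.9, 2.3, 2.1]
regional_cases = [120, 85, 240]

gm, sigma_pooled, sd_limits = xbar_limits_from_sds(regional_means, regional_sds, regional_cases)
for n, (lcl, ucl) in zip(regional_cases, sd_limits):
    print(f"n={n}: LCL={lcl:.3f}  UCL={ucl:.3f}")
```

No range constants are needed in this version, so the calculation is not restricted to the small subgroup sizes for which range-based limits are valid.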

Generating the S chart involves choices similar to those for the R chart. The user specifies the variable containing the standard deviation values and checks the summarized data box under the Summarized Data button, as illustrated in Figure 9.

Figure 9

Selecting additional data tip variables produces the final S chart in Figure 10.

Figure 10
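As a companion sketch, S chart limits can be written in terms of the c4 constant, which relates the expected sample standard deviation to sigma. The version below assumes the textbook form (center line c4 * sigma, limits at plus or minus 3 * sigma * sqrt(1 - c4^2), lower limit floored at zero) with a pooled estimate of sigma; the function names and data are illustrative only, and the software may estimate sigma differently.

```python
from math import exp, lgamma, sqrt


def c4(n):
    """Unbiasing constant: E[s] = c4(n) * sigma for a normal sample of size n."""
    if n < 2:
        raise ValueError("c4 is defined for n >= 2")
    # Log-gamma keeps the ratio of gamma functions stable for large n.
    return sqrt(2.0 / (n - 1)) * exp(lgamma(n / 2.0) - lgamma((n - 1) / 2.0))


def s_chart_limits(sds, sizes):
    """Return [(lcl, center, ucl), ...] for an S chart with varying subgroup sizes."""
    if len(sds) != len(sizes):
        raise ValueError("sds and sizes must have the same length")
    # Assumption: sigma estimated by the pooled standard deviation.
    pooled_var = (sum((n - 1) * s * s for s, n in zip(sds, sizes))
                  / sum(n - 1 for n in sizes))
    sigma_hat = sqrt(pooled_var)
    rows = []
    for n in sizes:
        center = c4(n) * sigma_hat
        spread = 3 * sigma_hat * sqrt(1 - c4(n) ** 2)
        rows.append((max(0.0, center - spread), center, center + spread))
    return rows


# Made-up regional standard deviations and subgroup sizes.
regional_sds = [1.9, 2.3, 2.1]
regional_cases = [120, 85, 240]

s_rows = s_chart_limits(regional_sds, regional_cases)
for n, (lcl, center, ucl) in zip(regional_cases, s_rows):
    print(f"n={n}: LCL={lcl:.3f}  CL={center:.3f}  UCL={ucl:.3f}")
```

With subgroups this large, c4 is very close to 1, so the center line is nearly the pooled standard deviation and the limits tighten as the subgroup size grows.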

It is important to have as much knowledge about your data as possible. When confronted with summarized data, investigate the possibility of getting more detail, such as subgroup size. Identify what is known and not known about the data and how the charts could be affected. Finally, formulate a realistic idea about what can be expected from the chart. If the results differ from expectations, consider a different strategy or re-evaluate the validity of the data.