Rational Subgrouping


Choosing the correct subgrouping scheme is critical to the proper analysis of a Shewhart chart. An improper subgrouping rationale can hide process changes or indicate process changes where in actuality none exist. The wrong subgrouping scheme can render a chart useless or worse.

Control charts that use subgrouping (Xbar, R, S, Median, p) can give erroneous or misleading results if the subgrouping method has not been carefully considered or if the analyst does not understand the subgrouping scheme.

For each of these charts, the idea is to subgroup so that the units measured in each subgroup are likely to be homogeneous and the subgroups have a higher probability of being unlike one another. Homogeneous means that the measurements are highly likely to be nearly the same because they are drawn from the same population.

Sources of Variation
In order to choose the correct subgrouping scheme we need to understand the sources of variation. There are several sources of variation in manufactured product. The first is lot-to-lot variation. Certainly we would like to minimize the variation between lots. Lots are usually manufactured as separate units and as such they are likely to have some differences in manufacturing. Minimizing that variation provides for a more predictable process.

The second is stream-to-stream variation, which may arise when an inspection is done where several process streams meet. This variation may also occur when we are not able to capture the information identifying which stream a product came from. For example, the inspected product could be coming from several machines. If the data contain no differentiation by machine, stream-to-stream variation may be incorporated into the control chart. Other common examples are multiple cavity molds, different operators, or different inspections.

Third is the time-to-time variation. This is the primary source of variation that we attempt to address with control charts. Is our process changing over time or is it predictably stable?

A fourth source of variation is piece positional variation, produced by the choice of measurement location on the part. For example, the diameter of a shaft or the electrical resistivity of a silicon wafer could be measured at several different locations on the same part.

The fifth source, error of measurement, is the one usually addressed with Gage R&R studies. Error of measurement has both instrument and human components, which a Gage R&R study can quantify.

One subgrouping scheme may include more than one source of variation. Understanding the magnitude of the variation from each source helps in choosing subgrouping schemes and in analyzing Shewhart charts.

Questions of a Control Chart
The question that the Shewhart chart asks is, “Is the pattern of variation among the subgroups consistent with the averaged pattern of variation within the subgroup?” [1] The question is answered by whether or not a point falls outside the control limits. If a point is outside the control limits, the answer is “No”, and we conclude that there is a high probability that an assignable cause exists in the process.

The control limits of a Shewhart chart are based on the averaged variation within the subgroups. Minimizing the variation within the subgroups keeps the control limits tighter and increases the sensitivity of the chart to detect process changes between subgroups.
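
As a rough illustration of how the limits follow from the averaged within-subgroup variation, the sketch below computes Xbar and R chart limits from the average subgroup range. The function name `xbar_r_limits` and the sample measurements are hypothetical; the constants A2, D3 and D4 are the standard tabulated Shewhart chart factors for subgroup sizes 2 to 5.

```python
import numpy as np

# Standard tabulated Shewhart chart factors for subgroup sizes 2 to 5.
A2 = {2: 1.880, 3: 1.023, 4: 0.729, 5: 0.577}
D3 = {2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0}
D4 = {2: 3.267, 3: 2.574, 4: 2.282, 5: 2.114}

def xbar_r_limits(subgroups):
    """Return Xbar and R chart limits for a 2-D array (one row per subgroup)."""
    data = np.asarray(subgroups, dtype=float)
    n = data.shape[1]                              # subgroup size
    xbar = data.mean(axis=1)                       # subgroup averages
    ranges = data.max(axis=1) - data.min(axis=1)   # subgroup ranges
    grand_mean = xbar.mean()
    rbar = ranges.mean()                           # averaged within-subgroup variation
    return {
        "xbar_center": grand_mean,
        "xbar_lcl": grand_mean - A2[n] * rbar,
        "xbar_ucl": grand_mean + A2[n] * rbar,
        "r_center": rbar,
        "r_lcl": D3[n] * rbar,
        "r_ucl": D4[n] * rbar,
    }

# Hypothetical measurements: 3 subgroups of 4 units each.
example = [[10, 12, 11, 13], [11, 11, 12, 14], [9, 12, 10, 13]]
limits = xbar_r_limits(example)
```

Because every limit is a multiple of the average range, any source of variation that ends up inside the subgroups widens the limits directly.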

To properly analyze the chart, we need to understand what makes up the variation within the subgroup. If we are grouping by lot, then our variation is lot-to-lot. However, if the lot is manufactured by more than one machine, then we have to understand that machine-to-machine variation is also included in the subgroup variation.

Example
To illustrate, let’s look at an example from Wheeler [1]. An injection molding press produces 4 parts with each cycle of the press via a mold that has 4 cavities. There seemed to be a problem with the process, so the QA professional used control charts to analyze the source of the variation.

Each product sample was collected from 5 consecutive press cycles and measured. The measurements were recorded and identified with the hour of the measurement (1-20), press cycle (A,B,C,D,E) and mold cavity (I,II,III,IV). The first chart the analyst produced follows:

This chart is asking the questions:

1) Are there hour-to-hour detectable differences?
2) Are there cycle-to-cycle detectable differences?
3) Are the cavity-to-cavity differences consistent?

The answer to all these questions is "No". The process appears to be in control, but the spread of the points on the range chart indicates that there could be an issue with subgrouping. We would expect the ranges to be spread more widely over the 3-sigma range, with about 2/3 of the points in Zone C and about 95% within the Zone B boundaries. As it is, most of the points fall within Zone C. The chart highlights "trend rule" violations in a different color, and the Range Chart shows several violations of the rule of 15 points in a row in Zone C. The analyst decided to investigate further, thinking that perhaps the chart was asking the wrong questions.
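
The "15 points in a row in Zone C" check can be sketched in a few lines. This is a minimal illustration, not the charting software's implementation: the function name `zone_c_runs` is hypothetical, and it assumes Zone C is the band within one sigma of the center line, with sigma taken as one third of the distance from the center line to a control limit.

```python
def zone_c_runs(values, center, sigma, run_length=15):
    """Return the start index of every window of `run_length` consecutive
    points that all lie within one sigma of the center line (Zone C)."""
    starts = []
    streak = 0
    for i, value in enumerate(values):
        # Extend the current streak if the point is in Zone C, else reset it.
        streak = streak + 1 if abs(value - center) < sigma else 0
        if streak >= run_length:
            starts.append(i - run_length + 1)
    return starts

# Hypothetical series: 16 points hugging the center line, then one far away.
points = [10.1] * 16 + [13.0]
violations = zone_c_runs(points, center=10.0, sigma=1.0)
```

A long run of points hugging the center line suggests that the within-subgroup variation used to set the limits is larger than the variation actually seen between subgroups, which is the symptom described above.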

The analyst then produced this chart.

In this chart, each point is the statistic for a single cavity in a single hour, computed across the five press cycles. This chart is asking the questions:

1) Are there hour-to-hour detectable differences?
2) Are the cycle-to-cycle differences consistent?
3) Are there cavity-to-cavity detectable differences?

These two charts show us several things. When we look for a signal of process change on an hour-to-hour and cycle-to-cycle basis, we do not see an alert. But when we look for a signal on an hour-to-hour and cavity-to-cavity basis, we see several signals. This tells us that we have some stream-to-stream variation. That is, each of the cavities (I, II, III, IV) is a separate stream.

We also see that the control limits on the first Xbar chart are wider than those on the second Xbar chart. This illustrates that the cycle-to-cycle within-group variation is smaller than the cavity-to-cavity within-group variation. A subgrouping scheme with lower within-group variation has a better chance of detecting signals of a process change. The within-subgroup variation sets the amount of between-subgroup variation that is needed to signal an alert.
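
The effect of the two subgrouping schemes on the width of the Xbar limits can be reproduced with made-up numbers. The sketch below uses synthetic data, not Wheeler's measurements: the cavity offsets, the noise level and the helper name `xbar_limit_halfwidth` are all assumptions chosen only to mimic a mold whose cavities differ from one another.

```python
import numpy as np

A2 = {4: 0.729, 5: 0.577}  # standard Xbar chart factors for n = 4 and n = 5

def xbar_limit_halfwidth(subgroups):
    """A2 times the average range: distance from center line to each Xbar limit."""
    ranges = subgroups.max(axis=1) - subgroups.min(axis=1)
    return A2[subgroups.shape[1]] * ranges.mean()

# Synthetic data with shape (hours, cycles, cavities) = (20, 5, 4).
rng = np.random.default_rng(0)
cavity_offsets = np.array([1.5, 0.0, -0.5, -1.0])  # hypothetical cavity differences
data = 50 + cavity_offsets + rng.normal(0, 0.2, size=(20, 5, 4))

# Scheme 1: each subgroup holds the 4 cavities of one press cycle (n = 4),
# so cavity-to-cavity variation is inside the subgroups.
by_cycle = data.reshape(20 * 5, 4)

# Scheme 2: each subgroup holds the 5 cycles of one cavity in one hour (n = 5),
# so only cycle-to-cycle variation is inside the subgroups.
by_cavity = data.transpose(0, 2, 1).reshape(20 * 4, 5)

wide = xbar_limit_halfwidth(by_cycle)
narrow = xbar_limit_halfwidth(by_cavity)
```

With these assumed offsets, `wide` comes out larger than `narrow`, which matches the pattern in the two charts: putting the cavities inside the subgroup inflates the limits and hides the cavity differences.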

The above chart gives an indication of what to do next, and the analyst put together this chart:

This chart clearly shows the difference between cavities. But it shows the long-term hour-to-hour variation as well as the short-term cavity variation. Because we get a number of out-of-control alerts on each of the cavities, we see that there is some long-term variation: the hour-to-hour variation is large enough to produce a signal. The chart also shows that one cavity, I, is consistently higher than the others.
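
One way to look at each stream separately is to build an Xbar chart per cavity, with one subgroup per hour made up of that cavity's press cycles. The sketch below is an assumption about how such a chart could be computed, not a description of the chart above: the function name `per_cavity_xbar`, the array layout (hours, cycles, cavities) and the default factor for subgroups of 5 are all illustrative.

```python
import numpy as np

def per_cavity_xbar(data, a2=0.577):
    """Compute Xbar chart values for each cavity separately.

    data: array with shape (hours, cycles, cavities).
    a2:   Xbar chart factor; 0.577 assumes 5 cycles per subgroup.
    Returns {cavity_index: (center, lcl, ucl, hourly_means)}.
    """
    charts = {}
    for cav in range(data.shape[2]):
        sub = data[:, :, cav]                       # rows = hours, columns = cycles
        xbar = sub.mean(axis=1)                     # one average per hour
        rbar = (sub.max(axis=1) - sub.min(axis=1)).mean()
        center = xbar.mean()
        charts[cav] = (center, center - a2 * rbar, center + a2 * rbar, xbar)
    return charts

# Hypothetical data: 3 hours, 5 cycles, 2 cavities, second cavity 10 units higher.
demo = np.fromfunction(lambda h, c, k: k * 10 + c, (3, 5, 2))
charts = per_cavity_xbar(demo)
```

Comparing the center lines across cavities shows which stream runs high, while points outside an individual cavity's limits point to hour-to-hour change within that stream.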

Stream-to-stream variation is not uncommon. There are times when it is of little consequence, but the analyst must be aware that it could come into play. In the example above, the solution was to clean the mold more thoroughly and more often. With that change in procedure, it might not be necessary to run individual charts on each stream. The analyst should still check the different sources of variation periodically to see whether they have returned.

This example shows the importance of:

  • Understanding the sources of variation in your process
  • Using Shewhart charts to understand the magnitude of those sources of variation
  • Choosing subgroups that minimize variation within subgroup
  • Choosing subgroups that maximize the opportunity for variation between subgroups

Conclusion
The key to choosing a subgrouping scheme is to understand the sources of variation that exist in your process. Shewhart charts can help you to understand the magnitude of the sources of variation. As we’ve seen, if you have the data to differentiate other causes of variation, you can produce Shewhart charts that will give you insight into the magnitude of the various sources of variation. With that knowledge you can choose the subgrouping scheme to use to monitor the process.

Choose subgroups that minimize variation within subgroups. If the items in a subgroup are similar in time, space, or product, the measurements are likely to be similar.

Choose subgroups that maximize the opportunity for variation between subgroups, so that if there is a process change we will detect it. The source of variation indicated between subgroups is usually time-to-time but can also include one or more of the other sources. That is fine as long as it is understood by those interpreting the charts.

References
Florac, W.A., & Carleton, A.D. (1999). Measuring the software process: Statistical process control for software process improvement. Boston: Addison-Wesley. [4]

Grant, E.L., & Leavenworth, R.S. (1996). Statistical quality control (5th ed.). Boston: McGraw-Hill. [2]

Shewhart, W.A. (1931). Economic control of quality of manufactured product. New York: D. Van Nostrand. [3]

Wheeler, D.J. (2004). Advanced topics in statistical process control: The power of Shewhart’s charts (2nd ed.). Knoxville, TN: SPC Press. [1]
