Control Charts for Skewed Time-Based Data

Q: I would like to monitor how long it takes to process a customer service call. However, a histogram of my data shows that some calls take much longer than most of the others. Will this hurt my control charts? If so, what else can I do?

A: Time-related measurements are often problematic: they cannot fall below zero and tend to cluster around a typical value, yet they include some valid data points with very long service times. Because this type of data is skewed and non-normal, conventional control charts generally should not be used on the raw data. In particular, an individuals (X) chart should not be used when there are serious departures from normality.

[Figure: Histogram of Original (Skewed) Data]
[Figure: Probability Plot of Original (Skewed) Data]

However, skewed data can be transformed to make it more nearly normal. You can check for normality with a normal probability plot, looking for a nearly straight line. A common transformation for highly right-skewed data is the fourth root of the original data.

In Statit, you can use the Compute/Transform command to create a new variable equal to the fourth root of the original variable, that is, the value raised to the one-fourth power: x**0.25.

After transforming, check for normality again. If the transformed data appears normal, it is then appropriate to make an individuals chart or X-bar chart on the transformed data. This should give you some information as to whether service times are "in control." If desired, you can "back transform" the center line and control limits into real times (by raising them to the fourth power) to aid in the practical interpretation of these values.
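The transform, chart, and back-transform steps can be sketched outside Statit as well. The following is a minimal Python illustration, not Statit's own procedure. It assumes the usual individuals-chart convention of estimating the limits from the average moving range (center line plus or minus 2.66 times the mean moving range), and it clamps a negative lower limit to zero because a fourth root cannot be negative. The function names and the sample call durations are hypothetical.

```python
from statistics import mean


def individuals_limits(values):
    """Return (lcl, center, ucl) for an individuals (X) chart.

    Assumes the common moving-range method: limits are the mean
    plus or minus 2.66 times the average moving range of span 2
    (2.66 is approximately 3 / 1.128).
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    half_width = 2.66 * mean(moving_ranges)
    return center - half_width, center, center + half_width


def fourth_root_chart(times):
    """Chart limits on the fourth-root scale and back-transformed to real times."""
    if any(t < 0 for t in times):
        raise ValueError("durations must be non-negative")
    transformed = [t ** 0.25 for t in times]
    lcl, center, ucl = individuals_limits(transformed)
    lcl = max(lcl, 0.0)  # assumption: a fourth root cannot go below zero
    # Back-transform by raising each value to the fourth power.
    back = tuple(v ** 4 for v in (lcl, center, ucl))
    return (lcl, center, ucl), back


if __name__ == "__main__":
    # Hypothetical call durations in minutes, with a few long calls.
    call_minutes = [3.1, 2.4, 4.0, 2.9, 18.5, 3.3, 2.2, 5.1, 41.0, 3.8]
    on_root_scale, in_minutes = fourth_root_chart(call_minutes)
    print("limits on fourth-root scale:", on_root_scale)
    print("limits back in minutes:     ", in_minutes)
```

Note that the back-transformed center line is not the arithmetic mean of the raw times. For right-skewed data it is smaller, so it should be read as a typical service time rather than an average.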
[Figure: Histogram of Transformed Data]
[Figure: Probability Plot of Transformed Data]

If the fourth root transformation does not adequately correct the non-normality, you may need to try other transformations, such as the square root, cube root, fifth root, or log, depending on the particular data. Finding the "best" transformation for a data set often takes a bit of experimentation.
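One way to make that experimentation less subjective is to score each candidate by how straight its normal probability plot is. The sketch below is an illustration only, not a Statit feature: it computes the correlation between the sorted transformed values and the corresponding normal quantiles, where a value closer to 1 means a straighter plot. The plotting positions (i - 0.375) / (n + 0.25) are one common choice and are an assumption here, as are the function names. It also requires strictly positive times so that the log is defined.

```python
import math
from statistics import NormalDist, mean


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("values must not all be identical")
    return sxy / math.sqrt(sxx * syy)


def probability_plot_r(values):
    """Correlation between sorted data and normal quantiles (straightness score)."""
    n = len(values)
    normal = NormalDist()
    # Assumed plotting positions: (i - 0.375) / (n + 0.25) for i = 1..n.
    quantiles = [normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    return pearson(quantiles, sorted(values))


# Candidate transformations named in the text, plus the raw data for reference.
TRANSFORMS = {
    "none": lambda x: x,
    "square root": lambda x: x ** 0.5,
    "cube root": lambda x: x ** (1 / 3),
    "fourth root": lambda x: x ** 0.25,
    "fifth root": lambda x: x ** 0.2,
    "log": math.log,
}


def rank_transformations(times):
    """Return (name, score) pairs sorted from straightest plot to least straight."""
    if any(t <= 0 for t in times):
        raise ValueError("times must be strictly positive for the log transform")
    scores = {
        name: probability_plot_r([f(t) for t in times])
        for name, f in TRANSFORMS.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical call durations in minutes.
    call_minutes = [3.1, 2.4, 4.0, 2.9, 18.5, 3.3, 2.2, 5.1, 41.0, 3.8]
    for name, score in rank_transformations(call_minutes):
        print(f"{name:12s} r = {score:.4f}")
```

The score is only a screening aid. Small differences between candidates are not meaningful, and the probability plot of the chosen transformation should still be inspected by eye.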