Saturday, February 16, 2019

# What's Up, Post-hoc?

## If there is a difference, what is it?

Last month, I described a simple problem to determine which gear material resulted in longer wear. We reviewed the extremely powerful technique called Analysis of Variance (ANOVA) and found a statistically significant and important effect on the average wear due to gear material. I also promised to tell you about an infrequently used next step that would make you a lot of money.  That is what we are going to talk about this month.

(In the event that you missed last month’s column, due to being hit by a meteor or some other equally valid excuse, click here for the research description and data.)

We showed with our initial ANOVA that there was a significant difference in wear for the five different materials, and using the statistic ω², we showed that the differences were large enough to be considered important as compared to the total variation. But the ANOVA only tells us that there’s a difference somewhere; it doesn’t tell us where the differences are.

Sometimes that may be enough. Let’s say that all the different materials we’re testing cost the same for raw material, processing, and use. If higher is better, then we’re assured that the material with the highest average is significantly different from at least one of the other materials, so we would choose that one.

But what if the total costs aren’t the same? Although material 4 is the highest on average, if it costs twice as much as material 2, and the two are statistically indistinguishable, maybe we can use material 2 instead and capture the same benefit while reducing costs.

Let’s extend our example and say that the remediation cost of a unit is \$1. This remediation cost is where the specification and the Taguchi loss function (TLF) intersect (click here to read my article discussing the TLF), and it includes the cost of rework, scrap, customer complaints, returns, and warranty claims. Our lower specification limit is 38, which we have trouble meeting consistently right now (our Cpk = 0.629). We don’t have an upper specification, so no cost is incurred in that direction, but it would be ideal to achieve an average of 52 if possible; that would put us above all our competitors. We make 200,000 units per month. Material 5 is the current material, and materials 1 through 4 are the ones we’re considering changing over to. They’re provided by four different vendors, and some have different purchase prices:

| Material | Purchase price |
|---|---|
| Material 1 | \$0.0375/unit |
| Material 2 | \$0.0625/unit |
| Material 3 | \$0.0625/unit |
| Material 4 | \$0.0729/unit |
| Material 5 | \$0.0417/unit |
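To put those prices in monthly terms, here is a minimal sketch in Python that multiplies each purchase price by our 200,000 units per month and compares the result with the current material 5. It covers purchase price only and says nothing yet about remediation costs; the function and variable names are mine, chosen for illustration.

```python
# Purchase prices per unit (USD) and monthly volume, taken from the column.
PRICE_PER_UNIT = {1: 0.0375, 2: 0.0625, 3: 0.0625, 4: 0.0729, 5: 0.0417}
UNITS_PER_MONTH = 200_000
CURRENT_MATERIAL = 5  # material 5 is what we use today


def monthly_material_cost(material: int) -> float:
    """Monthly purchase cost of one material at the stated volume."""
    return PRICE_PER_UNIT[material] * UNITS_PER_MONTH


def monthly_cost_delta(material: int) -> float:
    """Extra (or saved, if negative) purchase cost per month vs. the current material."""
    return monthly_material_cost(material) - monthly_material_cost(CURRENT_MATERIAL)


if __name__ == "__main__":
    for m in sorted(PRICE_PER_UNIT):
        cost = monthly_material_cost(m)
        delta = monthly_cost_delta(m)
        print(f"Material {m}: {cost:,.0f} USD/month ({delta:+,.0f} vs. current)")
```

On purchase price alone, material 4 costs about 6,240 USD more per month than material 5, while material 1 saves about 840 USD. Whether either change pays off depends on what each material does to the remediation cost.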

To make a decision about which one or ones are most economical, we need to have a way to determine if certain settings can be shown to be statistically different. Fortunately, there are a number of ways to do this type of post hoc analysis. Which one you choose depends on whether the variances for the different settings are equal, and on your tolerance for Type I error. (Remember that Type I error is when you conclude that there is a difference when in fact there is not.)

All of the post-hoc tests are based on the t-test, but with controls for the Type I error inflation I mentioned last month.
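As a rough illustration of that inflation, the sketch below counts the pairwise comparisons among five materials and estimates the chance of at least one false alarm if each comparison were run as a plain test at α = 0.05. It assumes the comparisons are independent, which pairwise tests that share groups are not, so treat the result as an approximation. The function name is mine.

```python
from math import comb


def familywise_error_rate(n_groups: int, alpha: float = 0.05) -> float:
    """Approximate chance of at least one Type I error across all pairwise
    comparisons, ASSUMING the comparisons are independent (a simplification)."""
    n_comparisons = comb(n_groups, 2)  # number of distinct pairs
    return 1.0 - (1.0 - alpha) ** n_comparisons


if __name__ == "__main__":
    print(comb(5, 2))                          # 10 pairs among 5 materials
    print(round(familywise_error_rate(5), 3))  # 0.401
```

So ten uncorrected tests at 5 percent each carry something like a 40 percent chance of at least one false difference, which is why the post-hoc procedures build in a correction.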

You might be wondering why in the world we didn’t just do these rather than an ANOVA. Well, as is typical in statistics (and in real life) there ain’t no such thing as a free lunch. In controlling for Type I error we lose power—the ability to detect a change if it’s there. So we do the ANOVA first, which is a high-power test to see if there’s a change, and then come back with a post-hoc test to determine where the differences might be. (This means that sometimes you can have a situation where your ANOVA indicates a significant difference, but none of the post-hocs do. In this case, you can either gather more data or just conclude that at this point we only know for sure that the extremes are different.)

The key to performing these tests correctly is to do all the tests you need to do, but not one more. To figure out what we want to test, let’s take a look at the data grouped by material.

 Figure 1 - Box and Whisker Plot of Experimental Results

The ANOVA indicated significant differences, so we know for sure that material 5 (our current material) is at least different than material 4. But because material 4 is the most expensive of the options as well, we don’t want to jump right on that unless we can show a good cost basis for it.

Now I have to give away what is going to happen next month, because to choose the right post hoc test I need to know if the materials all have the same variability. We’re going to find that the dispersions for the materials aren’t all the same. Now forget I said that.

We’ll use the Games-Howell procedure to control Type I inflation with unequal dispersion. This tests each pair of settings to determine if they’re different. The table below sums up what we find. For example, materials 1 and 2 are said to be equal at a p-value of 0.103.
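For readers who want to see what happens inside a Games-Howell comparison, here is a minimal sketch of the statistic for one pair of groups: a studentized-range statistic built from each group's own variance, with Welch-Satterthwaite degrees of freedom. The sample values are hypothetical and are not the gear data from last month; the function name is mine. This is not the PHAST-TM implementation, just the textbook form of the calculation.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def games_howell_statistic(a, b):
    """Return (q, df) for one pair of groups.

    q  : studentized-range statistic using separate (unpooled) variances
    df : Welch-Satterthwaite degrees of freedom
    """
    va = variance(a) / len(a)  # sample variance of the mean, group a
    vb = variance(b) / len(b)  # sample variance of the mean, group b
    q = abs(mean(a) - mean(b)) / sqrt((va + vb) / 2.0)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return q, df


# HYPOTHETICAL samples for illustration only (not the column's data).
samples = {
    "A": [40, 42, 41, 43, 39, 41],
    "B": [45, 47, 44, 49, 46, 45],
    "C": [52, 50, 55, 51, 53, 54],
}

if __name__ == "__main__":
    for (name_a, a), (name_b, b) in combinations(samples.items(), 2):
        q, df = games_howell_statistic(a, b)
        print(f"{name_a} vs {name_b}: q = {q:.3f}, df = {df:.2f}")
```

The p-value for each pair is the upper tail of the studentized range distribution for k groups and df degrees of freedom. If SciPy 1.7 or later is available, `scipy.stats.studentized_range.sf(q, k, df)` should give it, with k = 3 for the hypothetical data above.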

Table 1 - Output from PHAST-TM

| Group | Mean | Material 1 | Material 2 | Material 3 | Material 4 |
|---|---|---|---|---|---|
| Material 1 | 41.625 | | | | |
| Material 2 | 45 | Equal (0.103) | | | |
| Material 3 | 44.5 | Unequal (0.021) | Equal (0.99) | | |
| Material 4 | 52.25 | Unequal (0) | Unequal (0.001) | Unequal (0) | |
| Material 5 | 40.125 | Equal (0.341) | Unequal (0.011) | Unequal (0) | Unequal (0) |

Another way to sum this up is to show a table of homogeneous subsets:

 Material S1 S2 S3 S4 5 40.125 1 41.625 41.625 3 44.5 44.5 2 45 45 4 52.25

Or, using parameter notation: μ5 = μ1; μ1 = μ2; μ2 = μ3; μ5, μ1, μ3, μ2 < μ4. So material 4 is definitively the highest in wear resistance, and materials 2 and 3 are also higher on average than our current gear material. Material 1, although cheaper, doesn’t show any difference from our current gear.
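The same summary can be read mechanically off the p-values in Table 1. The sketch below assumes a cutoff of α = 0.05, which matches the Equal/Unequal labels in the table but is my assumption about how they were assigned. It lists the pairs we cannot tell apart and the materials that are significantly higher than the current one. The names are mine, and a reported p-value of 0 is taken to mean "rounds to zero."

```python
ALPHA = 0.05  # ASSUMED cutoff; consistent with the Equal/Unequal labels in Table 1

# Games-Howell p-values as reported in Table 1, keyed by (lower, higher) material number.
P_VALUES = {
    (1, 2): 0.103,
    (1, 3): 0.021, (2, 3): 0.99,
    (1, 4): 0.0, (2, 4): 0.001, (3, 4): 0.0,
    (1, 5): 0.341, (2, 5): 0.011, (3, 5): 0.0, (4, 5): 0.0,
}
MEANS = {1: 41.625, 2: 45.0, 3: 44.5, 4: 52.25, 5: 40.125}


def indistinguishable_pairs(p_values=P_VALUES, alpha=ALPHA):
    """Pairs whose difference is not statistically significant."""
    return sorted(pair for pair, p in p_values.items() if p >= alpha)


def significantly_higher_than(reference, p_values=P_VALUES, means=MEANS, alpha=ALPHA):
    """Materials with a higher mean AND a significant difference vs. `reference`."""
    higher = []
    for m in sorted(means):
        if m == reference:
            continue
        pair = (min(m, reference), max(m, reference))
        if p_values[pair] < alpha and means[m] > means[reference]:
            higher.append(m)
    return higher


if __name__ == "__main__":
    print(indistinguishable_pairs())     # [(1, 2), (1, 5), (2, 3)]
    print(significantly_higher_than(5))  # [2, 3, 4]
```

Note that "indistinguishable" is not transitive: materials 1 and 2 pair up, and so do 2 and 3, yet 1 and 3 are significantly different.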

But this still doesn’t answer which is the most economical material to select. To answer that, we have to know if there are significant differences in variation amongst the materials. It would be terrible to choose a material that has higher strength on average, but has so much variability that we end up failing a large percentage. Remember, the TLF shows us that we lose money due to variation even within the specification.

Will we end up choosing the high-wear material that is more expensive, or will it be more economical to choose one of the materials that are higher than our current material, but not quite as high as material 4? Or how about material 1? Our procurement department likes that one because it’s cheaper than anything else. Maybe it’s the most economical way to go. And will material 5 finally admit to material 1 that it’s pregnant with material 3’s baby?

Tune in next month for the next issue of “As the Post-Hoc Turns.” I bet you can hardly wait!

But I could be wrong.

## Six Sigma Heretic Article

These articles were originally published in Quality Digest, an online magazine.