Wednesday, January 16, 2019
Six Sigma Heretic: How Do You Measure Up?

Measurement Level Part 1

There is something that will get you into trouble if you don’t know about it, and which in my experience very few Black Belts or even Master Black Belts know about. It’s only in the mind of the person doing the research, and no software program will know about it. This mysterious property is “measurement level,” and that is what I want to talk about this month.

What are measurement levels?

For the technoid-minded, there’s a great article* by Warren Sarle that explains this in more detail, but because measurement levels affect anyone who uses data to make decisions, I thought a less technical version might be useful.

Let’s go back to basics for a minute. What are data?

Well, that’s a profound question if you think about it. Something happens, and we want to describe it. Perhaps our eyeballs and brains translate electromagnetic waves into an idea of “red.” Perhaps a micrometer translates a thickness into centimeters. Something needs to take what happened and convert it into something we can describe. This process is measurement. A numerical output from measurement is a datum.

Data aren’t events, and in fact can be pretty far removed from the events themselves. A micrometer measurement in centimeters is probably pretty closely related to what you are interested in—thickness. On the other hand, “red” to one person might be “orange-red” to another, or even be indistinguishable from green to a color-blind person. What is important is the relationship of the data and measurement back to the thing you are really interested in.

You have probably heard the terms “discrete” and “continuous” data, but knowing whether the data are discrete or continuous doesn’t tell you the level of measurement.

So who cares?

You should, for one. The level of measurement affects what you can do with the data, including transformations; what statistics you can calculate; and what statistical procedures you can use.

Transformations can be anything from taking the natural log of the data to bring them closer to a normal distribution to recoding data for easier data entry. A classic example of an invalid transformation is when your local meteorologist says, “Yesterday it was 35°, and today it’s 70°, twice as hot.” (This can happen in Colorado, where I live.) Here the transformation (multiplying by two) isn’t correct because of the relationship of the data (degrees Fahrenheit) to the underlying concept (temperature). Zero on the Fahrenheit scale doesn’t mean an absence of temperature, so multiplying by two doesn’t preserve the actual relationship between 35° and 70°.
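To put a number on the meteorologist's mistake, here is a minimal sketch that converts both readings to an absolute scale (Kelvin) before taking the ratio. The function name and the choice of Kelvin are my own for illustration; any absolute temperature scale would make the same point.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert degrees Fahrenheit to kelvins (an absolute scale)."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15

yesterday_f, today_f = 35.0, 70.0

# Ratio of the raw Fahrenheit readings: what the meteorologist computed.
naive_ratio = today_f / yesterday_f

# Ratio after moving to a scale whose zero means "no temperature".
absolute_ratio = fahrenheit_to_kelvin(today_f) / fahrenheit_to_kelvin(yesterday_f)

print(f"Ratio of Fahrenheit readings: {naive_ratio:.2f}")
print(f"Ratio on the Kelvin scale:    {absolute_ratio:.2f}")
```

On the absolute scale the ratio is only a few percent above 1, nowhere near 2.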

So to understand the level of measurement, we have to know what we’re really interested in, and how we’re generating the data.

Nominal data

If the data we’re generating are just names for the characteristic, they’re nominal data. For example, if I’m quantifying the different religions in a voting group, I might assign the number “1” to Jewish or Catholic or Muslim or anything else. It doesn’t matter what number I assign to what religion; the number is just a convenient name. ZIP codes are another example of this. The ZIP code 80501 isn’t more than 11102; 80501 and 11102 aren’t related in any way.

With nominal data, you can use frequencies to describe the results and determine the mode(s) (e.g., the most frequent religious affiliation), but you can’t calculate a mean or a standard deviation. The “average religious affiliation” doesn’t even make sense. What’s the average ZIP code? You could have assigned any number to any religion, so a different coding would give you a different “average” from the very same data. You can use tests like χ² and tests based on the binomial distribution to answer your research questions, but most other statistical procedures are inappropriate.
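A short sketch makes the point concrete. The responses and both numeric codings below are made up for illustration; the only thing that matters is that the two codings are equally arbitrary.

```python
from collections import Counter
from statistics import mean

# Hypothetical survey responses (made-up data for illustration only).
responses = ["Catholic", "Jewish", "Muslim", "Catholic", "Catholic", "Jewish"]

# Two equally arbitrary numeric codings of the same categories.
coding_a = {"Jewish": 1, "Catholic": 2, "Muslim": 3}
coding_b = {"Muslim": 1, "Jewish": 2, "Catholic": 3}

def summarize(coding):
    """Return (mean of the codes, name of the most frequent category)."""
    codes = [coding[r] for r in responses]
    decode = {number: name for name, number in coding.items()}
    modal_code, _count = Counter(codes).most_common(1)[0]
    return mean(codes), decode[modal_code]

mean_a, mode_a = summarize(coding_a)
mean_b, mode_b = summarize(coding_b)

print(f"Coding A: mean code = {mean_a:.2f}, mode = {mode_a}")
print(f"Coding B: mean code = {mean_b:.2f}, mode = {mode_b}")
```

The mode names the same category under either coding, while the "mean" changes with the labels, which is why it carries no information about the respondents.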

Ordinal data

If the order of the data means something, but the difference between, say, a 1 and a 2 may not be the same as the difference between a 2 and a 3, you have ordinal data. Ratings and rankings are examples of this. If you go to the doctor and she asks you how much it hurts on a scale of 1 to 10, the number you choose is ordinal. The difference between a 3 and a 4 isn’t necessarily the same as between a 9 and a 10. If we were to graph the relationship, we might find something like this:

Figure 1. Ordinal level measurement’s relationship to the underlying property


The curved line is the actual relationship between whatever you’re interested in and the numbers that you get from your measurement system. It shows how the difference between a 2 and a 3 on the measurement scale corresponds to a smaller difference in the underlying property than the difference between a 9 and a 10. If I get a 2 on my measurement device, I draw a line from the 2 across to the relationship line and then follow that point down to the x-axis to get the “real” number for the underlying property in which I’m interested. Of course, in real situations we either don’t know or can’t measure the true underlying property (e.g., the “true” pain at the doctor’s), but we can still reason about how the measurement is related to that property.

With ordinal data, you can take a mode or a median to describe the location of the data, you can do any order-preserving transformation, and you can use many nonparametric statistical procedures to determine differences, such as the sign test for location or the Mann-Whitney U test. You cannot use something like the t-test reliably, however. I have seen many people do this, especially in survey analysis, and you can very well come to the wrong conclusion if you do so.
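Here is a small sketch of how a mean-based comparison can mislead with ordinal data. The pain ratings are made up, and squaring is just one arbitrary order-preserving recoding I picked for illustration. Because the scale only carries order, the recoded numbers are as legitimate as the originals, yet the comparison of means reverses while the comparison of medians does not.

```python
from statistics import mean, median

# Hypothetical pain ratings on a 1-to-10 scale (made-up data).
group_a = [1, 5, 5]
group_b = [4, 4, 4]

def stretch(score):
    """An arbitrary order-preserving recoding: larger scores stay larger."""
    return score ** 2

a_stretched = [stretch(s) for s in group_a]
b_stretched = [stretch(s) for s in group_b]

# Does group A look "higher" than group B according to each statistic?
mean_before = mean(group_a) > mean(group_b)
mean_after = mean(a_stretched) > mean(b_stretched)
median_before = median(group_a) > median(group_b)
median_after = median(a_stretched) > median(b_stretched)

print(f"Mean says A > B:   before={mean_before}, after={mean_after}")
print(f"Median says A > B: before={median_before}, after={median_after}")
```

The median gives the same answer before and after the recoding; the mean does not, which is the kind of trap a t-test on survey ratings can fall into.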

Interval data

Data are interval scale when the differences between any two numbers correspond to proportional differences in the underlying property. Our temperature example is of interval level. Height above sea-level would be another one. If you were to graph the relationship, it might come out like this:

Figure 2. Interval-level measurement’s relationship to the underlying property


You’ll notice that the line doesn’t go through the 0,0 origin point, and that’s characteristic of interval data. Here you can see that an interval, for example, the difference between a 6 and an 8, results in the same difference in the underlying property as for a different equal interval, say between 12 and 14.

With interval-level data, you can calculate means and standard deviations (though if the data are non-normal, the standard deviation doesn’t tell you much) and use many statistical procedures, like the t-test, as long as their assumptions are met. You can do any linear transformation on these data as well.
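As a rough illustration, the sketch below uses the Fahrenheit-to-Celsius conversion as a linear transformation. The readings are made up, and the function name is my own. It checks three things: the mean survives the transformation, equal intervals stay equal, and ratios of raw readings do not survive.

```python
from math import isclose
from statistics import mean

def f_to_c(deg_f):
    """Linear transformation from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0

# Hypothetical temperature readings in degrees Fahrenheit (made-up data).
readings_f = [35.0, 50.0, 70.0, 41.0]
readings_c = [f_to_c(t) for t in readings_f]

# The mean of the transformed data equals the transformed mean.
mean_matches = isclose(mean(readings_c), f_to_c(mean(readings_f)))

# Two equal intervals (6 to 8 and 12 to 14) remain equal to each other.
gap_low_c = f_to_c(8.0) - f_to_c(6.0)
gap_high_c = f_to_c(14.0) - f_to_c(12.0)
intervals_still_equal = isclose(gap_low_c, gap_high_c)

# Ratios of raw readings are not preserved, so they were never meaningful.
ratio_f = 70.0 / 35.0
ratio_c = f_to_c(70.0) / f_to_c(35.0)

print(f"Mean preserved: {mean_matches}")
print(f"Equal intervals preserved: {intervals_still_equal}")
print(f"70/35 in Fahrenheit: {ratio_f:.2f}; same readings in Celsius: {ratio_c:.2f}")
```

Means and differences behave consistently under the change of units, which is what makes them legitimate summaries for interval-level data.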

Ratio data

Ratio-level data are proportional to the underlying property, and when there’s an absence of that property the number is zero. An example is temperature as measured on an absolute scale (e.g., Kelvin), or a thickness as measured by a micrometer. Graphing the relationship would look like this:

Figure 3. Ratio-level measurement’s relationship to the underlying property


Here you can see that, in addition to equal intervals