04 April 2010

Statistical Insignificance

Our thanks to Tom Siegfried for raising the issue; a simple example is needed:

For reasons known only to him, your lunch companion takes out two coins: a quarter and a nickel. He flips both, first the quarter, then the nickel. He does this five times and, every time, the nickel lands the same way as the quarter: heads when the quarter lands heads, tails when it lands tails. "This," he says, "can't be coincidence; the quarter must be forcing how the nickel lands."

Being naturally skeptical, you immediately set out to discover whether this hypothesis, that the quarter forced the nickel, is tenable. You assume a null hypothesis: that there's no connection between the coins and that the five matches happened by chance. Each flip of a fair nickel has a 1/2 chance of matching the quarter, so the probability of five matches in a row by chance is (1/2)^5 = 1/32, or about 3%. If you repeated the exercise every day, you'd get a five-fold match about once a month on average.
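For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes fair coins and independent flips, which is exactly what the null hypothesis says; the function name `prob_all_match` is made up for this illustration.

```python
from fractions import Fraction


def prob_all_match(n_flips: int) -> Fraction:
    """Chance that a fair nickel matches a fair quarter on every one of
    n_flips paired flips, assuming the two coins are independent."""
    # Whatever the quarter shows, the nickel agrees with probability 1/2.
    return Fraction(1, 2) ** n_flips


p = prob_all_match(5)
print(p, float(p))  # exact fraction and its decimal value
print(1 / p)        # average number of daily tries per five-fold match
```

Exact fractions are used rather than floats so that the 1/32 shows up as 1/32 and not as a rounded decimal.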

The usual scientific threshold for statistical significance is 5%. The USEPA uses 10% to classify a Group A carcinogen. Since smaller is more significant, a probability of 3% is safely inside those thresholds and appears to decisively reject the null hypothesis [cue the theremin]. You've found a "statistically significant result". The quarter appears to be forcing the nickel.

You might even say that the probability is 97% that the quarter is forcing the nickel. You'd be in good company.

And you'd be wrong on so many levels that it hurts my head to think about it.

Yet there has to be a 97% somewhere—and here it is: If the null hypothesis is correct—if there's no causal connection between the quarter and the nickel—the probability of at least one mismatch in five tosses is 97%.
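A quick simulation makes the same point. This is only a sketch under the null hypothesis: two fair coins with no connection between them, flipped in sessions of five. The helper names, the number of sessions, and the seed are arbitrary choices for the illustration.

```python
import random


def session_has_mismatch(rng: random.Random, n_flips: int = 5) -> bool:
    """Flip a quarter and a nickel n_flips times; report any disagreement."""
    for _ in range(n_flips):
        quarter = rng.choice("HT")
        nickel = rng.choice("HT")  # independent of the quarter: the null hypothesis
        if quarter != nickel:
            return True
    return False


def mismatch_rate(n_sessions: int, seed: int = 0) -> float:
    """Fraction of simulated sessions that contain at least one mismatch."""
    rng = random.Random(seed)
    hits = sum(session_has_mismatch(rng) for _ in range(n_sessions))
    return hits / n_sessions


# The estimate should land close to the exact value 31/32.
print(mismatch_rate(100_000), 31 / 32)
```

Note what is being estimated: how often chance alone produces a mismatch. It says nothing about how likely the forcing hypothesis is.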

Which didn't happen. And that's too bad. If we'd had just one mismatch, we'd have laid to rest the forcing hypothesis and that would be the end of the matter. Nonetheless, we can't say that we proved the forcing hypothesis.

As it is, all we can say is that we failed to invalidate the forcing hypothesis with a stronger argument in favor of pure chance. Without more information or more experiments, we have no idea what the probability is that the quarter forced the nickel, or not. We might say, on the strength of five matches, that for small values of smidgen, it's a smidgen more probable than it was before.
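To see why five matches alone can't give you the probability of forcing, here is a hedged sketch using Bayes' rule. It rests on two assumptions that are not in the story above: that a forcing quarter would produce a match every single time, and that you can put a number on how plausible forcing was before lunch (the `prior`). Both are illustrative only, and the function name `posterior_forcing` is invented for this example.

```python
def posterior_forcing(prior: float, n_matches: int = 5) -> float:
    """Probability of forcing after n_matches matches in a row, given a prior.

    Assumes forcing predicts a match with certainty (an assumption made
    for this sketch) and that chance predicts one with probability 1/2.
    """
    likelihood_forcing = 1.0
    likelihood_chance = 0.5 ** n_matches
    evidence = prior * likelihood_forcing + (1 - prior) * likelihood_chance
    return prior * likelihood_forcing / evidence


# The answer is driven by the prior, which the experiment doesn't supply.
for prior in (1e-6, 1e-3, 0.5):
    print(prior, posterior_forcing(prior))
```

For a very small prior, the five matches multiply it by roughly 32 and it is still very small: more probable than before, but only by a smidgen.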


  1. Entangled coins ???


    Elias Cardenes

  2. Entangled Coins? Chuckle. In an infinite universe all things are possible.

  3. When my girls were young I developed a board game for us where we bought casinos as in Monopoly, and also gambled on the throw of two dice, for which I worked out a nice table of odds. One game, my big daughter was having a bad time, spending her early money on gambling instead of casinos, and earning salary. We played to a time; when that time came, Senka was almost broke. She put all her money on 'three' (2 and 1, or 1 and 2) THEN double three. Two throws. It came up. Combined p = .0015432. I paid her 648 times her stake. See what a good father I was. She won, but never gambled in her life. She is 42 now. Significant or what!

    By the way, do the statistical implications of an 'infinite universe', that is, infinite matter, worry us? Why did I learn it was impossible, Marc?

  4. That's a nice "black swan".

    Of course, I was referring to the universe of infinite possibilities, which many notable financiers failed to notice.