“Five sigma” is the gold standard for statistical significance in particle-physics discoveries. When New Scientist reported on the putative confirmation of the Higgs boson, they wrote:
‘Five-sigma corresponds to a p-value, or probability, of 3×10⁻⁷, or about 1 in 3.5 million. There’s a 5-in-10 million chance that the Higgs is a fluke.’
Does that mean that p-values can tell us the probability of being correct about our hypotheses? Can we use p-values to decide about the truth (correctness) of hypotheses? Does p < 0.05 mean that there is a smaller than 5% chance that an experimental hypothesis is wrong?
Of course not, and again, the lay press got it right (click here for another example). Here is how the Wall Street Journal put it:
“That is not the probability that the Higgs boson doesn’t exist. It is, rather, the inverse: If the particle doesn’t exist, one in 3.5 million is the chance an experiment just like the one announced this week would nevertheless come up with a result appearing to confirm it does exist.”
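The arithmetic behind the quotes is easy to check. The sketch below (standard-library Python only) converts a sigma level into the one-sided Gaussian tail probability, reproducing the “3×10⁻⁷, or about 1 in 3.5 million” figure; the choice of a one-sided tail is the particle-physics convention.

```python
import math

def one_sided_p(sigma):
    """One-sided Gaussian tail probability for a given sigma level,
    computed via the complementary error function."""
    return 0.5 * math.erfc(sigma / math.sqrt(2))

p = one_sided_p(5)
print(f"p-value at 5 sigma: {p:.1e}")  # ~2.9e-07, i.e. about 3e-7
print(f"about 1 in {1/p:,.0f}")        # roughly 1 in 3.5 million
```

Note what this number is: the probability of data at least this extreme *given* that there is no real signal, which is exactly the conditional the Wall Street Journal spelled out.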
For a more detailed treatment of the public (mis)perception of this issue as it relates to the confirmation of the existence of the Higgs, see the post by the Winton Programme for the Public Understanding of Risk. They run an excellent blog on all matters of statistics and uncertainty.
The point here is that the probability expressed in a p-value relates to the data, but the common misinterpretation is to apply it to the explanation (= the hypothesis)! If a cable is loose, even a p-value an order of magnitude smaller than the five-sigma threshold will not get you closer to the truth about your $100 million project (see previous post)! Let’s hope all cables were well connected when CERN ran the experiments that led up to the 2013 Nobel Prize for Higgs and Englert.
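The loose-cable point can be made concrete with a back-of-the-envelope calculation. All numbers below are invented for illustration: assume some small prior probability that an undetected systematic error (a loose cable, say) would fake the signal. Then the chance that a “discovery” is spurious is bounded below by that error probability, no matter how tiny the statistical p-value is.

```python
# Hypothetical sketch; the two probabilities below are invented numbers.
p_value = 2.9e-7     # P(data at least this extreme | no real signal, no error)
p_systematic = 1e-4  # assumed prior probability of an undetected systematic
                     # error (e.g. a loose cable) that would fake the signal

# Total probability of a spurious result: either the systematic error occurs
# (taken here to fake the signal with certainty), or it does not and we get
# a statistical fluke anyway.
p_spurious = p_systematic * 1.0 + (1 - p_systematic) * p_value
print(f"{p_spurious:.2e}")  # ~1e-4: dominated by the error term, not the p-value
```

Under these made-up numbers the spurious-result probability is about 1 in 10,000, three orders of magnitude larger than the p-value, because it is the plumbing of the experiment, not the statistics, that dominates.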
But how could anyone find out whether the CERN experiment was conclusive? Here is another lesson from particle physics: most experiments, and that includes many in neuroscience, have become so complex (and reporting about them so uninformative, but that is another issue, see a previous entry) that any review process is doomed in principle. This was put very nicely in Michael Nielsen’s blog (2011). Note that he wrote this before the paper on the confirmation of the Higgs was published, and before the Nobel Prize was awarded:
‘No one person in the world will understand in detail the entire chain of evidence that led to the discovery of the Higgs. In fact, it’s possible that very few (no?) people will understand in much depth even just the principles behind the chain of evidence. How many people have truly mastered quantum field theory, statistical inference, detector physics, and distributed computing? What, then, should we make of any paper announcing that the Higgs boson has been found?’
‘It seems to me that one of the core questions the scientific community will wrestle with over the next few decades is what principles and practices we use to judge whether or not a conclusion drawn from a large body of networked knowledge is correct? To put it another way, how can we ensure that we reliably come to correct conclusions, despite the fact that some of our evidence or reasoning is almost certainly wrong?’