Damn! What an effort: Generation of a knockout mouse line, backcrossing into the background strain, littermates, all the genotyping. Followed by a plethora of experiments in a disease model: surgery, magnetic resonance imaging, histology, behavioral studies, and so on. Finally the result: No phenotype! The knockout mouse appears to be a mouse like any other. Not different from the wild-type background strain. But wait, we rather need to phrase it like this: We did not find a statistically significant difference between knockout and wild type. So we cannot even conclude that wild-type mice are like knockout mice, but rather: If there is a difference, it might be smaller than the detectable effect size, which depends on sample size, error levels (alpha and beta) and the variance of our results. But we had planned our experiments well: The sample size was determined a priori, and chosen so that we would have been able to detect a difference on the order of one standard deviation. This is what statisticians call a Cohen’s d of 1, which is considered a substantial effect. We could not have used more animals than we did (34!), because of limited resources, the duration of the PhD thesis, and the timing of the grant. But what now? Write a paper? Reporting a NULL result? How would this look on a résumé? Besides, who cares about NULL results, and which reputable journal would publish them at all? Continue reading
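The sample-size reasoning in the knockout example above can be sketched in a few lines. This is only a rough illustration under assumptions the post does not state: a two-sided alpha of 0.05, a power of 0.80 (beta = 0.20), two equally sized groups (17 + 17 animals), and a normal approximation instead of the exact t-distribution, which would call for slightly more animals. The function names are made up for this sketch.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate animals per group needed to detect Cohen's d.

    Normal approximation for a two-sided comparison of two equal groups.
    alpha and power are assumed defaults, not values from the post.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)           # quantile corresponding to 1 - beta
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)


def detectable_d(n, alpha=0.05, power=0.80):
    """Smallest Cohen's d detectable with n animals per group (same approximation)."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * sqrt(2 / n)


if __name__ == "__main__":
    print("animals per group for d = 1.0:", n_per_group(1.0))
    print("animals per group for d = 0.5:", n_per_group(0.5))
    # Assumption: the 34 animals were split into two groups of 17.
    print("detectable d with 17 per group:", round(detectable_d(17), 2))
```

Under these assumptions, a design of this size is tuned to effects of about one standard deviation. Halving the effect size roughly quadruples the number of animals needed, which is why a non-significant result says little about smaller differences.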
A study in this week’s Nature (Vrselja et al.) has created an immediate media frenzy. Nature puts it like this: ‘Pig brains kept alive outside body for hours after death’ and ‘Revival of disembodied organs raises slew of ethical and legal questions about the nature of death and consciousness.’ The New York Times: ‘In a study that raises profound questions about the line between life and death, researchers have restored some cellular activity to brains removed from slaughtered pigs.’ STAT: ‘The pigs were dead. But four hours later, scientists restored cellular functions in their brains’ etc.
That sounds spectacular. But if one reads the study (and the commentaries), it is easy to spot two main deficiencies: 1) The study lacks novelty, and 2) The assertion that it presents a relevant step towards restoring brain function after a prolonged interruption of cerebral blood flow is not only exaggerated, but simply wrong. Continue reading
‘Unfortunately, we have to inform you that after thorough review [YOUR FAVORITE FUNDING ORGANISATION] must reject your application’. Most of us know this sentence all too well, as most rejection letters for our grant applications contain it in a similar form. From a purely statistical point of view, we receive such letters quite frequently. In German biomedicine, the funding rates are between 5 and 25 %, depending on funder and program. Upon receiving a rejection we often feel personally offended. After all, we have put down our best ideas, often already included some preliminary results and proposed experiments we had already conducted, even beautified the document with a lot of prose, and flattered the most important potential reviewers with strategically placed quotations, etc. And then the rejection! So we have to start over from the beginning, rewrite everything, submit it again, perhaps to another funding agency. This is how we spend a substantial fraction of our days at the office, if we are not reviewing the applications of our colleagues. On average, scientists spend 40 % of their time writing or reviewing applications. Continue reading
Triangulation! The Egyptians used it to build their pyramids. The Greeks developed a branch of mathematics out of it. Until the 19th century whole countries were charted in this way. Far into the 20th century ships determined their position with it. To determine your position by triangulation you only need a set square and a protractor, which the surveyors call a theodolite, as well as the coordinates of two visible landmarks. It’s that simple!
Could it be that triangulation is also an important methodological approach in biology? A cure even for the replication crisis? Munafo and Smith recently postulated this in a commentary in Nature. Sociologists call it triangulation when they use two or more different methods to investigate one particular research question. If the results converge at one point, i.e. lead to the same result, this increases validity and credibility. Don’t we do this routinely in the experimental life sciences? Does the knock-out mouse have the same phenotype as one in which the signalling pathway was pharmacologically blocked? Do transcript and protein expression correlate with the phenotype?
Thus, basic biomedical research is familiar with ‘targeting’ a goal with different methods grounded in already established knowledge (the landmarks of the surveyor!). Are the results converging? Bingo, we have located the biological mechanism! Therefore it leaves many of us cold when spoilsports with grad-school statistics argue that most studies in biomedicine must be false positives despite significant p-values. Because we don’t just rely on ONE result. Instead we triangulate by means of different approaches! As a way to validate results, this might even be superior to replication. If something is simply repeated, it is not unlikely that a systematic error will be repeated too. This would make the result reproducible, but still not correct.
Were the skeptics wrong when calling out a crisis in biomedical research? Are we already doing the right thing? Continue reading
An article entitled “Growth in a Time of Debt” was published in 2010 by the highly respected Harvard economists Carmen Reinhart and Kenneth Rogoff. It dealt with the relationship between national economic growth and national debt. They reported on their discovery of an astonishing, globally observable correlation: As national debt rises, the economic growth of a nation initially also rises. If, however, the national debt exceeds 90 %, this relationship is reversed quite abruptly. Growth turns into contraction, and economic output then declines as debt rises further. The discovery of a “90 % debt threshold” hit like a bomb. Some suspect that the article was the basis for the European austerity policy after the 2008 financial crisis. What is certain, however, is that the paper was enthusiastically used by Western politicians to justify their restrictive fiscal policy. In 2013, Thomas Herndon, a student, reanalyzed the data of the Reinhart-Rogoff paper as part of a semester assignment. After some back and forth, the authors had given him the original Excel spreadsheet. And lo and behold, in a few minutes he found a number of serious errors in it! After correction, the debt threshold disappeared, and the data now appeared to prove the opposite, a steady, positive correlation between government debt and growth across the entire range! What do we learn from this? Apart from the fact that the fundamental error of Reinhart and Rogoff is of course the confusion of correlation with causation: Excel is not suitable for the analysis of complex scientific data. Even more importantly, scientists make mistakes, which can have serious consequences. Continue reading
With a half-page article written about him and his study, a previously unknown Israeli radiologist made it into the New York Times (NYT 2009). Dr. Yehonatan Turner presented computed tomography (CT) scans to radiologists and asked them to make a diagnosis. The catch: Along with the CT, a current portrait photograph of the patient was presented to the physicians. Remember, radiologists very often do not see their patients; they make their diagnosis in a dark room staring at a screen. Dr. Turner used a smart cross-over design in his study: He first showed the CT together with a portrait photograph of the patient to one group of radiologists. Three months later the same group had to make a diagnosis using the same CT, but without the photo. Another group of radiologists was first given only the CT and then, three months later, the CT with photo. A further control group examined only the CTs, as in routine practice. The hypothesis: When a radiologist is exposed to the individual patient, and not only to an anatomical finding on a scan, she will be more conscious of her own responsibility, hence findings will be more thorough and diagnoses more accurate. And in fact, this is what he found. The radiologists reported that they had more empathy with the patient, and that they “felt like doctors”. And they spotted more irregularities and pathological findings when they had the CT and photo in front of them than when they were only looking at the CT (Turner and Hadas-Halpern 2008).
So how about showing researchers in basic and preclinical biomedicine photos of patients with the disease they are currently investigating in a model of the disease? Continue reading
I failed to reproduce the results of my experiments! Some of us are haunted by this horror vision. The scientific academies, the journals and in the meantime the sponsors themselves are all calling for reproducibility, replicability and robustness of research. A movement for “reproducible science” has developed. Sponsorship programs for the replication of research papers are now in the works. In some branches of science, especially in psychology, but also in fields like cancer research, results are now being systematically replicated… or not; thus we are now in the throes of a “reproducibility crisis”.
Now Daniel Fanelli, a scientist who up to now could be expected to side with those who support the reproducible science movement, has raised a warning voice. In the prestigious Proceedings of the National Academy of Sciences he asked rhetorically: “Is science really facing a reproducibility crisis, and if so, do we need it?” So today, on the eve, perhaps, of a budding oppositional movement, I want to have a look at some of the objections to the “reproducible science” mantra. Is reproducibility of results really the foundation of the scientific method? Continue reading