Category: Quality – Predictiveness

Overcoming negative (positive) publication bias

F1000 Research starts an initiative to overcome ‘positive publication bias’ (aka ‘negative publication bias’). Until the end of August, publication fees are waived for submissions of null results.

Only data that are available via publications—and, to a certain extent, via presentations at conferences—can contribute to progress in the life sciences. However, it has long been known that a strong publication bias exists, in particular against the publication of data that do not reproduce previously published material or that refute the investigators’ initial hypothesis. The latter type of contradictory evidence is commonly known as ‘negative data.’ This slightly derogatory term reflects the bias against studies in which investigators were unable to reject their null hypothesis (H0), a tool of frequentist statistics that states that there is no difference between experimental groups.
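For readers less at home with the frequentist machinery, here is a minimal Python sketch of what ‘failing to reject H0’ looks like in practice. The data are simulated, and the group sizes and distribution parameters are illustrative assumptions, not values from any study:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Two groups drawn from the SAME distribution, i.e. H0 happens to be true
control = rng.normal(loc=10.0, scale=2.0, size=12)
treated = rng.normal(loc=10.0, scale=2.0, size=12)

t_stat, p_value = stats.ttest_ind(treated, control)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

if p_value >= 0.05:
    # The 'negative' outcome journals are reluctant to publish: we cannot
    # reject H0 -- which is not the same as proving that no effect exists.
    print("Fail to reject H0: a 'negative' result")
```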

Researchers are well aware of this bias: journals are usually not keen to publish the nonexistence of a phenomenon or treatment effect, and editors have little interest in data that refute, or do not reproduce, previously published work—with the exception of spectacular cases that guarantee the attention of the scientific community, as well as garner extra citations (Ioannidis and Trikalinos, 2005). The authors of negative results are required to provide evidence for failure to reject the null hypothesis under numerous conditions (e.g., dosages, assays, outcome parameters, additional species or cell types), whereas a positive result would be considered worthwhile under any single one of these conditions. Indeed, there is a dilemma: one can never prove the absence of an effect, because, as Altman and Bland (1995) remind us, ‘absence of evidence is not evidence of absence’.
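Altman and Bland’s point can be made concrete with a small simulation. This is a sketch under assumed parameters (group size, effect size); it shows that with typical small samples, even a perfectly real effect yields a ‘negative’ result most of the time:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sim, n_per_group, true_effect = 10_000, 8, 0.5  # effect in SD units

negatives = 0
for _ in range(n_sim):
    control = rng.normal(0.0, 1.0, n_per_group)
    treated = rng.normal(true_effect, 1.0, n_per_group)  # H0 is false here
    if stats.ttest_ind(treated, control).pvalue >= 0.05:
        negatives += 1

# With n = 8 per group and a 0.5 SD effect, power is only ~15%, so roughly
# 85% of such experiments come out 'negative' despite the real effect.
print(f"Share of 'negative' experiments: {negatives / n_sim:.0%}")
```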

Several journals have already opened their pages to ‘negative’ results. For example, the Journal of Cerebral Blood Flow and Metabolism (‘Fighting publication bias: introducing the Negative Results section’) publishes such studies as a one-page summary (maximum 500 words, two figures) in the print edition of the journal, with the accompanying full paper online.

Power failure


In a highly cited paper from 2005, John Ioannidis answered the question ‘Why most published research findings are false’ (PLoS Med. 2, e124). The answer, in one sentence, is ‘because of low statistical power and bias’. A current analysis in Nature Reviews Neuroscience, ‘Power failure: why small sample size undermines the reliability of neuroscience’ (advance online publication; Ioannidis is a coauthor), now focuses on the neurosciences and provides empirical evidence that in a wide variety of neuroscience fields (including imaging and animal modeling) exceedingly low statistical power, and hence very low positive predictive values, are the norm. This explains low reproducibility (see, e.g., the special issue of Exp. Neurol. on the (lack of) reproduction in spinal cord injury research, Exp. Neurol. 2012 Feb;233(2):597-605) and inflated effect sizes. Besides this meta-analysis of power in neuroscience research, the article also contains a highly readable primer on the concepts of power, positive predictive value, type I and type II errors, as well as effect size. Must read.
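The arithmetic linking power to the positive predictive value is compact enough to show directly. Here is a minimal sketch of the relationship used in both papers; the pre-study odds R below is an illustrative assumption, not a figure from either article:

```python
def positive_predictive_value(power: float, alpha: float, r: float) -> float:
    """PPV = power*R / (power*R + alpha), where R is the pre-study odds
    that a probed effect is real (Ioannidis 2005; Button et al. 2013)."""
    return power * r / (power * r + alpha)

alpha = 0.05  # conventional type I error rate
r = 0.25      # assumed: one in five probed effects is real (odds 1:4)

for power in (0.8, 0.5, 0.2):  # from 'adequate' down to typical neuroscience power
    print(f"power = {power:.1f} -> PPV = {positive_predictive_value(power, alpha, r):.2f}")

# power = 0.8 -> PPV = 0.80; power = 0.2 -> PPV = 0.50: at 20% power,
# only half of the 'significant' findings reflect true effects.
```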


Diversity outbred mice


Most rodent models of disease (in stroke research, anyway) use young, healthy, male, inbred mouse strains kept under specific pathogen-free (SPF) conditions, with restricted antigen exposure in their environment, and on a diet optimized for maximum reproduction rates (high in antioxidants, trace elements and other supplements, etc.). It is like studying cohorts of 12-year-old male identical twins kept on an identical health diet in a single sealed room, without any contact with the outside world (the ‘plastic bubble’). What may be good for reproducible modeling is potentially problematic for translational research, as patients often have comorbidities (e.g. hypertension and diabetes in stroke), already take various medicines, are elderly, and include females. Thus, the external validity of the models is often low, which at least partially explains some of the failures when promising new therapeutic strategies in rodents are carried over to real-life patients in randomized clinical trials. Fortunately, external validity can be improved by studying comorbid animals of advanced age and of both sexes. It is trickier in rodents to produce a mature immune system that has had contact with pathogens and multiple antigens. The answer to reduced genetic diversity may be to use populations specifically developed to provide wide genetic variability, such as the Diversity Outbred population or the partially inbred Collaborative Cross strains developed by the Jackson Laboratory. However, in my field (stroke research), which is hit particularly hard by the ‘translational roadblock’, I have not seen a single study making use of these strains.

Genomic responses in mouse models poorly mimic human inflammatory diseases



This PNAS paper has been featured in the lay press, from the New York Times in the US to Der Spiegel in Germany. Its major conclusion is indeed disturbing: ‘Here, we show that, although acute inflammatory stresses from different etiologies result in highly similar genomic responses in humans, the responses in corresponding mouse models correlate poorly with the human conditions and also, one another.’
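What ‘correlate poorly’ means operationally: Seok et al. compared the gene-level responses (changes in expression of orthologous genes) between human conditions and the corresponding mouse models and found very little shared variance. A minimal sketch of that kind of comparison, with made-up fold-change vectors and an assumed weak coupling, purely for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n_genes = 5_000

# Hypothetical log2 fold-changes (disease vs. healthy) for orthologous genes
human_response = rng.normal(0.0, 1.0, n_genes)
# A mouse response only weakly coupled to the human one (assumed for illustration)
mouse_response = 0.2 * human_response + rng.normal(0.0, 1.0, n_genes)

r, _ = stats.pearsonr(human_response, mouse_response)
print(f"Pearson r = {r:.2f}, r^2 = {r * r:.2f}")
# An r^2 near zero means the mouse response explains almost none of the
# variance in the human genomic response -- 'correlates poorly' in practice.
```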

It should be noted that the findings cannot simply be generalized to other models and settings. Importantly, the authors studied the genomic responses of blood cells, not mechanisms directly related to the inflammatory stimulus (burn, trauma, etc.), and they assessed only the genomic response, not the translation of these genes into proteins. In the discussion the authors ignore the existing literature, in which a number of candidate-gene studies have shown congruent (qualitatively, and sometimes even quantitatively) responses in patient material and the corresponding mouse models (compare, for example, http://jem.rupress.org/content/198/5/725.long and http://stroke.ahajournals.org/content/39/1/237.long).

Nevertheless, particularly with respect to immune cells in the blood, there are obvious and drastic differences between mouse and man: sending blood from a healthy one-month-old mouse to a routine clinical laboratory would return a diagnosis of acute lymphatic neoplasia. Laboratory rodents are raised in a pathogen-free environment (SPF); their immune system is ‘untrained’, lymphatic, and immature.

Thus, Seok et al. expose an important caveat in the interpretation of rodent studies. There is an urgent need for translational research to use biomarkers to expose similarities and dissimilarities in the pathobiology of rodents and humans, and to improve the predictiveness of extrapolating from mouse to man with respect to responses to novel therapies. In addition, we need to increase the external validity of our models by using rodents that share the comorbidities, age, and environmental exposure of our patients.