- Let’s get this out of the way: Reproducibility is a cornerstone of science: Bacon, Boyle, Popper, Rheinberger
- A ‘lexicon’ of reproducibility: Goodman et al.
- What do we mean by ‘reproducible’? Open Science collaboration, Psychology replication
- Reproducible – non-reproducible – A false dichotomy: Sizeless science, almost as bad as ‘significant vs non-significant’
- The emptiness of failed replication? How informative is non-replication?
- Hidden moderators – Contextual sensitivity – Tacit knowledge
- “Standardization fallacy”: Low external validity, poor reproducibility
- The stigma of nonreplication (‘incompetence’) – The stigma of the replicator (‘boring science’).
- How likely is strict replication?
- Non-reproducibility must occur at the scientific frontier: Low base rate (prior probability), low hanging fruit already picked: Many false positives – non-reproducibility
- Confirmation – weeding out the false positives of exploration
- Reward the replicators and the replicated – fund replications. Do not stigmatize non-replication, or the replicators.
- Resolving the tension: The Siamese Twins of discovery & replication
- Conclusion: No scientific progress without non-reproducibility: Essential non-reproducibility vs. detrimental non-reproducibility
- Further reading
It is for good reason that researchers are the object of envy. When not stuck with bothersome tasks such as grant applications, reviews, or preparing lectures, they actually get paid for pursuing their wildest ideas! To boldly go where no human has gone before! We poke about in the scientific literature and carry out pilot experiments that, surprisingly, almost always succeed. Then we do a series of carefully planned and costly experiments. Sometimes they turn out well, often not, but they do lead us into the unknown. This is how ideas become hypotheses; one hypothesis leads to those that follow, and voilà, lo and behold, we confirm them! In the end, sometimes only after several years and considerable wear and tear on personnel and material, we then manage to weave a “story” out of them. Through a complex chain of results the story closes with a “happy end”, perhaps in the form of a new biological mechanism, or at least a little piece that fits the puzzle, and it is always presented to the world by means of a publication. Sometimes even in one of the top journals.
Based on research, mainly in rodents, tremendous progress has been made in our basic understanding of the pathophysiology of stroke. After many failures, however, few scientists today deny that bench-to-bedside translation in stroke has a disappointing track record. Here I summarize measures to improve the predictiveness of preclinical stroke research, some of which are currently in various stages of implementation. We must reduce preventable (detrimental) attrition. Key measures for this revolve around improving preclinical study design. Internal validity must be improved by reducing bias; external validity will improve by including aged, comorbid rodents of both sexes in our modeling. False positives and inflated effect sizes can be reduced by increasing statistical power, which necessitates increasing group sizes. Compliance with reporting guidelines and checklists needs to be enforced by journals and funders. Customizing study designs to exploratory and confirmatory studies will leverage the complementary strengths of both modes of investigation. All studies should publish their full data sets. On the other hand, we should embrace inevitable NULL results. This entails planning experiments in such a way that they produce high-quality evidence when NULL results are obtained, and making these results available to the community. A collaborative effort is needed to implement some of these recommendations. Just as in clinical medicine, multicenter approaches help to obtain sufficient group sizes and robust results. Translational stroke research is not broken, but its engine needs an overhaul to yield more predictive results.
Read the full article at the publisher’s site (Stroke/AHA). If your library does not have a subscription, here is the author’s manuscript (Stroke/AHA did not allow me to even pay for open access, as it is ‘a special article…’).
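The link between statistical power and false positives mentioned in the abstract above can be made concrete with the standard positive predictive value formula, PPV = power × prior / (power × prior + alpha × (1 − prior)). The following is a minimal sketch, not taken from the article: the function name and the prior probability of 0.1 are assumptions chosen purely for illustration.

```python
def positive_predictive_value(prior, power, alpha=0.05):
    """Share of 'significant' findings that reflect a true effect.

    prior: assumed probability that a tested hypothesis is true
    power: probability of detecting a true effect
    alpha: probability of a false positive when there is no effect
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical prior of 0.1: one in ten tested hypotheses is true.
    for power in (0.2, 0.5, 0.8):
        ppv = positive_predictive_value(prior=0.1, power=power)
        print(f"prior=0.10, power={power:.1f} -> PPV={ppv:.2f}")
```

Under these assumed inputs, the PPV rises from about 0.31 at 20% power to 0.64 at 80% power, so with low power most 'positive' findings would be false.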
Using meta-analysis and computer simulation, we studied the effects of attrition in experimental research on cancer and stroke. The results were published this week in the new meta-research section of PLOS Biology. Not surprisingly, given the small sample sizes of preclinical experimentation, loss of animals in experiments can dramatically alter results. However, how strongly attrition distorts results was unknown. We used a simulation study to analyze the effects of random and biased attrition. As expected, random loss of samples decreased statistical power, but biased removal, including that of outliers, dramatically increased the probability of false positive results. Next, we performed a meta-analysis of animal reporting and attrition in stroke and cancer. Most papers did not adequately report attrition, and extrapolating from the results of the simulation, we suggest that their effect sizes were likely overestimated.
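The contrast between random and biased attrition can be illustrated with a small simulation. This is a hedged sketch and not the code used in the study: the group size (10), the number of animals dropped per group (2), the effect size (1.2 standard deviations), and all function names are assumptions for illustration. Biased attrition is modeled here as removing the values that most contradict a hypothesized treatment benefit.

```python
import random

# Two-sided 5% critical values of Student's t by degrees of freedom.
# Only the df values in this table are supported by the sketch.
T_CRIT = {10: 2.228, 11: 2.201, 12: 2.179, 13: 2.160, 14: 2.145,
          15: 2.131, 16: 2.120, 17: 2.110, 18: 2.101}


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def significant(a, b):
    """Pooled-variance two-sample t-test, two-sided, alpha = 0.05."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5
    return abs(t) > T_CRIT[df]


def rejection_rate(effect, attrition, n=10, k=2, runs=4000, seed=1):
    """Fraction of simulated experiments that reach significance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        treated = [rng.gauss(effect, 1.0) for _ in range(n)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if attrition == "random":
            # Lose k animals per group regardless of their values.
            treated = rng.sample(treated, n - k)
            control = rng.sample(control, n - k)
        elif attrition == "biased":
            # Drop the k lowest treated and the k highest control values.
            treated = sorted(treated)[k:]
            control = sorted(control)[:n - k]
        elif attrition != "none":
            raise ValueError(f"unknown attrition mode: {attrition}")
        hits += significant(treated, control)
    return hits / runs


if __name__ == "__main__":
    for mode in ("none", "random", "biased"):
        print(mode,
              "false positive rate:", rejection_rate(0.0, mode),
              "power:", rejection_rate(1.2, mode))
```

With no true effect, the "none" and "random" modes should stay near the nominal 5% false positive rate, while the "biased" mode should exceed it by a wide margin. With a true effect, random attrition should lower the share of significant results.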
This has been a week chock-full of bias! First, Nature ran a cover story on it, with an editorial and a very nice introduction to the subject by Regina Nuzzo. Then Malcolm Macleod and colleagues published a perspective in PLOS Biology demonstrating limited reporting of measures to reduce the risk of bias in life sciences publications, and suggesting that there may be an inverse correlation between journal rank, or the prestige of the university from which the research originated, and the presence of measures to prevent bias. At the same time, Jonathan Kimmelman’s group came out with a report in eLife in which they meta-analytically explored preclinical studies of an anticancer drug (sunitinib) to demonstrate that only a fraction of drugs that show promise in animals end up proving safe and effective in humans, partly because of design flaws, such as a lack of measures to prevent bias, and partly because of positive publication bias. Both articles resulted in a worldwide media frenzy, including coverage by Nature and the lay press; here is an example from the Guardian. Retraction Watch interviewed Jonathan, while Malcolm spoke on BBC4.
A Deutschlandfunk broadcast (aired 20 September 2015) by Martin Hubert. From the announcement: ‘In their laboratories, biomedical researchers are supposed to search for, among other things, substances against cancer or stroke. They experiment with cell cultures and laboratory animals, test intended effects and investigate unintended ones. Recent studies show, however, that up to 80 percent of these preclinical studies cannot be reproduced.’ Here is the link to the audio stream and to the transcript.
(German only, sorry!)
The crisis in scientific reproducibility has crystallized as it has become increasingly clear that trust in the majority of high-profile scientific reports rests on little foundation, and that the societal burden of low reproducibility is enormous. In today’s issue of Nature, C. Glenn Begley, Alastair Buchan, and I suggest measures by which academic institutions can improve the quality and value of their research. To read the article, click here.
Our main point is that research institutions that receive public funding should be required to demonstrate standards and behaviors that comply with “Good Institutional Practice”. Here is a selection of potential measures, the implementation of which should be verified, certified and approved by major funding agencies.
Compliance with agreed guidelines: Ensure compliance with established guidelines such as ARRIVE, MIAME, data access (as required by National Science Foundation and National Institutes of Health, USA).
Full access to the institution’s research results: Foster open access and open data; preregistration of preclinical study designs.
Electronic laboratory notebooks: Provide electronic record keeping compliant with FDA Code of Federal Regulations Title 21 (CFR Title 21 part 11). Electronic laboratory notebooks allow data and project sharing, supervision, time stamping, version control, and directly link records and original data.
Institutional Standard for Experimental Research Conduct (ISERC): Establish ISERC (e.g. blinding, inclusion of controls, replicates and repeats, etc.); ensure dissemination, training and compliance with ISERC.
Quality management: Organize regular and random audits of laboratories and departments with reviews of record keeping and measures to prevent bias (such as randomization and blinding).
Critical incident reporting: Implement a system to allow the anonymous reporting of critical incidents during research. Organize regular critical incident conferences in which such ‘never events’ are discussed to prevent them in the future and to create a culture of research rigor and accountability.
Incentives and disincentives: Develop and implement novel indices to appraise and reward research of high quality. Honor robustness and mentoring as well as originality of research. Define appropriate penalties for substandard research conduct or noncompliance with guidelines. These might include decreased laboratory space, lack of access to trainees, reduced access to core facilities.
Training: Establish mandatory programs to train academic clinicians and basic researchers at all professional levels in experimental design, data analysis and interpretation, as well as reporting standards.
Research quality mainstreaming: Bundle established performance measures plus novel institution-unique measures to allow a flexible, institution-focused algorithm that can serve as the basis for competitive funding applications.
Research review meetings: Create a forum for routine assessment of institutional publications with a focus on robust methods: the process rather than the result.