MEDLINE currently indexes 5,642 journals, and PubMed comprises more than 24 million citations for biomedical literature from MEDLINE. My field is stroke research. Close to 30,000 articles were published in 2014 on the topic ‘Stroke’ (clinical and experimental); more than 20,000 of them were peer-reviewed original articles in English (Web of Science). That amounts to more than 50 articles every day. In 2014, 1,700 of them were rodent studies, a mere 5 per day. Does (can) anyone read them? And should we read them? Do researchers worldwide really produce knowledge worth publishing in 50 articles every day?
A number of studies have investigated what fraction of academic papers are cited or, even harder to nail down, how many are actually read. Based on evidence of exceedingly low citation rates for individual publications, and the estimate that in general fewer than 20% of all papers receive more than 80% of all citations (‘Pareto’s law’, which also applies to top-tier journals like Science and Nature), David Hamilton concluded in Science as early as 1990: “New evidence raises the possibility that a majority of scientific papers make negligible contributions to knowledge”. Some have since argued that “No One Really Reads Academic Papers”.
Since 1990, the number of journals and publications has continued to grow exponentially. But is there more recent, quantitative evidence of what happens to published papers? According to a study published in 2009, at least in medicine the picture does not look quite so bleak: within a 5-year window (starting in 2002), around 80% of papers received at least 1 citation (in the humanities the corresponding number was 15%!). The authors did not investigate how many of these were self-citations, and importantly, they were unable to determine whether the cited studies were actually read by those who cited them.
So how do these numbers look for a typical scholarly neuroscience journal? I happen to be chief editor of the Journal of Cerebral Blood Flow and Metabolism (JCBFM), which has an impact factor of around 5, i.e. the average number of citations received in a given year per paper published in the journal during the two preceding years. In JCBFM, after 5 years more than 96% of papers received at least one citation.
Year   Articles published   Non-cited articles
2014   244                  138
2013   245                   21
2012   203                    7
2011   250                   16
2010   202                    8
2009   194                    2
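As a quick sanity check, the cited fractions implied by the table can be computed directly. A minimal Python sketch using the numbers above; the low 2014 figure simply reflects how little time those papers had to accrue citations:

```python
# Cited fractions for the JCBFM table above (numbers taken from the post).
data = {  # year: (articles published, non-cited articles)
    2014: (244, 138), 2013: (245, 21), 2012: (203, 7),
    2011: (250, 16), 2010: (202, 8), 2009: (194, 2),
}
for year, (published, uncited) in sorted(data.items()):
    cited_pct = 100 * (published - uncited) / published
    print(f"{year}: {cited_pct:.1f}% of articles cited so far")
# The oldest cohort (2009) ends up at about 99% cited, consistent with
# the "more than 96% after 5 years" figure in the text.
```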
That doesn’t look too bad, as it suggests that after 5 years most published papers get cited, if only by their own authors… In addition, a journal paper may have impact beyond classical citations, for example by being featured in blogs or other social media (which can be captured by so-called altmetrics). But still, most of us would suspect that many articles are cited but not read. Is there a way to quantify this suspicion? In ‘Read before you cite!’, M. V. Simkin and V. P. Roychowdhury proposed a quantitative and objective method, based on stochastic modeling, for estimating what percentage of people who cited a paper had actually read it. The smart and conservative trick was to follow the distribution of misprints in citations: an identical misprint recurring across many reference lists is far more likely to have been copied than independently reinvented. Their estimate is that only about 20% of citers read the original!
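The logic of the misprint argument can be illustrated with a toy Monte Carlo simulation in the spirit of (but much simpler than) the Simkin–Roychowdhury model; all parameter values here are assumed for illustration, not taken from their paper:

```python
import random

def simulate_citations(n_citers=10_000, p_read=0.2, p_misprint=0.05, seed=1):
    """Toy model: each citer either reads the original (prob p_read) and
    writes the reference fresh, possibly introducing a brand-new misprint,
    or copies the reference verbatim from a randomly chosen earlier citer,
    propagating whatever error that citation carried."""
    rng = random.Random(seed)
    citations = []      # one entry per citer: misprint id carried (None = correct)
    next_misprint = 0
    for _ in range(n_citers):
        if not citations or rng.random() < p_read:
            if rng.random() < p_misprint:
                citations.append(next_misprint)   # a fresh, distinct typo
                next_misprint += 1
            else:
                citations.append(None)            # correct citation
        else:
            citations.append(rng.choice(citations))  # blind copy
    return citations

cites = simulate_citations()
misprints = [c for c in cites if c is not None]
distinct = len(set(misprints))
print(f"misprinted citations: {len(misprints)}, distinct misprints: {distinct}")
# If everyone read the original, every misprint would be an independent,
# distinct typo; heavy repetition of the same few misprints is the
# fingerprint of copying. The ratio distinct/total crudely recovers p_read.
print(f"crude read-fraction estimate: {distinct / len(misprints):.2f}")
```

With the assumed parameters the estimate comes out near the 20% reported by Simkin and Roychowdhury, which is of course by construction here; the point is only to show why repeated misprints reveal copying.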
Regardless of all attempts to quantify what fraction of articles is read (or at least cited), it is clear from the sheer numbers (50 stroke articles per day!) that no one can stay on top of the literature in most biomedical fields. It is also highly likely (see a number of earlier posts) that a substantial fraction of this literature is of dubious quality and therefore does not contain knowledge that deserves to be read or communicated.
So why, then, are there so many publications? The tentative and straightforward answer: they serve as currency in the academic world of careers, grant money, institutional reviews, etc. Cui bono? Clearly, the true profiteers of this system are the publishers. The only sensible way to stem this deluge of papers (which in its wake also floods the poor reviewers refereeing all these articles) is to devalue the currency by which we appraise and reward biomedical research.
Note added June 2016: Siebert et al. argue that the amount of new information in biomedicine exceeds our ability to process it appropriately (‘Overflow in science and its implications for trust’, eLife). Their findings: ‘(1) the overflow in science is leading to concerns about quality of scientific outputs, (2) scientists often use reputation—of their colleagues or of a journal, for example—as a proxy for trustworthiness.’