Research Shows That Published Versions Of Papers In Costly Academic Titles Add Almost Nothing To The Freely-Available Preprints They Are Based On
from the all-that-glitters-is-not-gold dept
The open access movement believes that academic publications should be freely available to all, not least because most of the research is paid for by the public purse. Open access supporters see the high cost of many academic journals, whose subscriptions often run into thousands of dollars per year, as unsustainable for cash-strapped libraries, and unaffordable for researchers in emerging economies. The high profit margins of leading academic publishers -- typically 30-40% -- seem even more outrageous when you take into account the fact that publishers get almost everything done for free. They don't pay the authors of the papers they publish, and rely on the unpaid efforts of public-spirited academics to carry out crucial editorial functions like choosing and reviewing submissions.
Academic publishers justify their high prices and fat profit margins by claiming that they "add value" as papers progress through the publication process. Although many have wondered whether that is really true -- does a bit of sub-editing and design really justify the ever-rising subscription costs? -- hard evidence that could be used to challenge the publishers' narrative has been lacking. A paper from researchers at the University of California and Los Alamos National Laboratory is particularly relevant here. It appeared first on arXiv.org in 2016 (pdf), but has only just been "officially" published (paywall). It does something really obvious but also extremely valuable: it takes around 12,000 academic papers as they were originally released in their preprint form, and compares them in detail with the final versions that appear in the professional journals, sometimes years later, as the paper's own history demonstrates. The results are unequivocal:
We apply five different similarity measures to individual extracted sections from the articles' full text contents and analyze their results. We have shown that, within the boundaries of our corpus, there are no significant differences in aggregate between pre-prints and their corresponding final published versions. In addition, the vast majority of pre-prints (90%-95%) are published by the open access pre-print service first and later by a commercial publisher.
That is, for the papers considered, which were taken from the arXiv.org preprint repository, and compared with the final versions that appeared, mostly in journals published by Elsevier, there were rarely any important additions. That applies to titles, abstracts and the main body of the articles. The five metrics applied looked at letter-by-letter changes between the two versions, as well as more subtle semantic differences. All five agreed that the publishers made almost no changes to the initial preprint, which nearly always appeared before the published version, minimizing the possibility that the preprint merely reflected the edited version.
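To give a rough feel for what a letter-by-letter comparison and a coarser word-level comparison look like in practice, here is a minimal Python sketch. It is only an illustration: the two functions, their names, and the example sentences are assumptions made for this post, not the five measures or the code used in the study. It uses the standard library's difflib.SequenceMatcher for the character-level score and a Jaccard overlap of word sets as a crude stand-in for a content-level score.

```python
from difflib import SequenceMatcher


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; 1.0 means identical strings."""
    return SequenceMatcher(None, a, b).ratio()


def jaccard_similarity(a: str, b: str) -> float:
    """Overlap of the two texts' lowercase word sets, in [0, 1]."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty texts are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


# Invented example sentences (hypothetical, not from the study's corpus):
# the "preprint" contains a duplicated word that the "published" text fixes.
preprint = "We measure the the thermal conductivity of thin films."
published = "We measure the thermal conductivity of thin films."

print(f"character-level: {char_similarity(preprint, published):.3f}")
print(f"word-set Jaccard: {jaccard_similarity(preprint, published):.3f}")
```

In this toy case the word-set score cannot see the fix at all, because both versions use exactly the same vocabulary, while the character-level score dips slightly below 1. Scores that stay close to 1 on both kinds of measure across a large corpus are the sort of result the study describes.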
The authors of the paper point out a number of ways in which their research could be improved and extended. For example, the reference section of papers before and after editing was not compared, so it is possible that academic publishers add more value in this section; the researchers plan to investigate this aspect. Similarly, since the arXiv.org papers are heavily slanted towards physics, mathematics, statistics, and computer science, further work will look at articles from other fields, such as economics and biology.
Such caveats aside, this is an important result that has not received the attention it deserves. It provides hard evidence of something that many have long felt: that academic publishers add almost nothing during the process of disseminating research in their high-profile products. The implications are that libraries should not be paying for expensive subscriptions to academic journals, but simply providing access to the equivalent preprints, which offer almost identical texts free of charge, and that researchers should concentrate on preprints, and forget about journals. Of course, that means that academic institutions must do the same when it comes to evaluating the publications of scholars applying for posts.
If it was felt that more user-friendly formats were needed than the somewhat austere preprints, it would be enough for funding organizations to pay third-party design companies to take the preprint texts as-is, and simply reformat them in a more attractive way. Given the relatively straightforward skills required, the costs of doing so would be far less than paying high page charges, which is the main model used to fund so-called "gold" open access journals, as opposed to the "green" open access based on preprints freely available from repositories.
In theory, gold open access offers "better" quality texts than green open access, which supposedly justifies the higher cost of the former. What the research shows is that when it comes to academic publishing, as in many other spheres, all that glitters is not gold: humble preprints turn out to be almost identical to the articles later published in big-name journals, but available sooner, and much more cheaply.
Follow me @glynmoody on Twitter or identi.ca, and +glynmoody on Google+
Filed Under: academic journals, knowledge, open access, paywalls, pre-prints
Reader Comments
Are there any models where libraries or funding agencies ...
The prerequisite is, of course, to accept that open-source is not free, but requires an (up-front) investment of some of the money saved from paid subscriptions (later).
[ link to this | view in chronology ]
Re: Are there any models where libraries or funding agencies ...
In short, university libraries are paying for *both* the subscription journals and the services and platforms that enable open-access publishing and data sharing. While some have suggested that we cancel subscriptions and channel those funds to more support for OA and open source platforms, to my knowledge nobody has done this in any large-scale way.
[ link to this | view in chronology ]
Somewhat misleading
My most recent paper, which was just accepted for publication in Physical Review B, was sent out to two referees. Referee A did not have a whole lot to say but pointed out that an explanation we had provided for a method was unclear to non-experts. Referee B was perhaps overly thorough but actually pointed out a few instances where specific word choices could lead to incorrect conclusions being drawn. He or she also found a couple of minor stylistic errors that had slipped through our editing.
While the total number of words changed was probably under 5% (maybe even closer to 1 or 2%), the revised manuscript is certainly better than the pre-print version.
The next step is for the journal to copy-edit the manuscript. This step usually consists of changing British English to American English and spelling out some abbreviations or abbreviating other words but can sometimes uncover typos that made it through peer editing. In any case, I agree that this is less useful than the peer review.
It is certainly ridiculous that the public has to triple pay for research (they pay me to do it, they pay for me to access and submit to journals, and they have to pay if they want to access the research). However, I have yet to see an alternative to the current peer review process that is facilitated by the journals.
In many cases, I have encountered papers on the arXiv which are completely incorrect. The papers in question have not been published and likely wouldn't be published without significant changes. I myself have manuscripts on the arXiv that contain small errors which have been fixed in the published versions. Depending on the journal we can usually replace the pre-print version with the published version after some length of time (6 months I believe) but this is not always done.
[ link to this | view in chronology ]
Re: Somewhat misleading
The outstanding problem is the use of publications in prestigious journals as a measure of academic ability when academics seek new posts.
[ link to this | view in chronology ]
The way forward is to have a transparent system to rank the importance of papers that does not depend on the chosen magazine. This way, there can be a lot of papers in any open access repository without having the importance of the few critical articles being watered down.
[ link to this | view in chronology ]
All that is needed is just two things.
First, as already pointed out, is an open platform equivalent to a wiki that has the logistics hammered out and is middleman-proof.
Second is to beat the various admins and funding bodies about the head and shoulders until they get it into their syphilis-addled heads to stop looking at the colour of the covers and actually do their jobs.
[ link to this | view in chronology ]
Re:
What prevents a wiki from being set up for this purpose? Which new software features are required, and has anyone written or requested them?
Would Reddit or StackOverflow-style software be better suited? They have voting, comments etc.
[ link to this | view in chronology ]
What would be more interesting is a study comparing the papers published to those NOT published by some metrics of quality. In other words, measure if the journals are performing a valuable "gatekeeper" function or not.
[ link to this | view in chronology ]
Re: Re: Re: Re: Moody being Moody
But I've seen you explode into paroxysms of homophobia at the slightest provocation, because dicks trigger you.
[ link to this | view in chronology ]
Re: Re: Re: Re: Re: Moody being Moody
If you choose to be triggered by the usage of a common derivative for Richard, that's on you.
[ link to this | view in chronology ]
Re: Re: Scholarly Kitchen perspective
https://twitter.com/Preprints_org/status/974252604338458624
[ link to this | view in chronology ]
One of the ironies of this...
There are people who could do this without even blinking. And while there are numerous other worthy causes, making all academic knowledge free would serve those too -- maybe not today, but certainly in the future.
[ link to this | view in chronology ]
All the journals I paid did a great job on my manuscript
So far, all the journals I paid did a great job on my manuscript, and it really improved the outcome of the study. However, I know there could be some research article websites out there that are set up solely as businesses and may not be effective.
[ link to this | view in chronology ]