Elsevier Says Downloading And Content-Mining Licensed Copies Of Research Papers 'Could Be Considered' Stealing

from the gotta-protect-that-39%-profit-margin dept

Elsevier has pretty much established itself as the most hated company in the world of academic publishing, a fact demonstrated most recently when all the editors and editorial board resigned from one of its top journals to set up their own, open access rival. A blog post by the statistician Chris H.J. Hartgerink shows that Elsevier is still an innovator when it comes to making life hard for academics. Hartgerink's work at Tilburg University in the Netherlands concerns detecting potentially problematic research that might involve data fabrication -- obviously an important issue for the academic world. A key technique he is employing is content mining -- essentially bringing together large bodies of text and data in order to extract interesting facts from them:

I am trying to extract test results, figures, tables, and other information reported in papers throughout the majority of the psychology literature. As such, I need the research papers published in psychology that I can mine for these data. To this end, I started 'bulk' downloading research papers from, for instance, [Elsevier's] Sciencedirect. I was doing this for scholarly purposes and took into account potential server load by limiting the amount of papers I downloaded per minute to 9. I had no intention to redistribute the downloaded materials, had legal access to them because my university pays a subscription, and I only wanted to extract facts from these papers.
He spread out the downloads over ten days so as not to hammer Elsevier's servers -- which in any case are doubtless pretty beefy given the 39% profit margin the company enjoys:
I downloaded approximately 30GB of data from Sciencedirect in approximately 10 days. This boils down to a server load of 35KB/s, 0.0021GB/min, 0.125GB/h, 3GB/day.
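The figures in that quote can be checked with a few lines of arithmetic. The sketch below is only a sanity check on the quoted numbers; it assumes decimal units (1 GB = 1,000,000 KB), and the function name `download_rates` is made up for illustration.

```python
# Figures quoted in the post: roughly 30 GB downloaded over roughly 10 days.
TOTAL_GB = 30
DAYS = 10

# Assumption: decimal units, i.e. 1 GB = 1,000,000 KB.
KB_PER_GB = 1_000_000


def download_rates(total_gb: float, days: float) -> dict:
    """Return the average transfer rate at several time scales."""
    gb_per_day = total_gb / days
    gb_per_hour = gb_per_day / 24
    gb_per_min = gb_per_hour / 60
    kb_per_sec = gb_per_min * KB_PER_GB / 60
    return {
        "GB/day": gb_per_day,
        "GB/h": gb_per_hour,
        "GB/min": gb_per_min,
        "KB/s": kb_per_sec,
    }


rates = download_rates(TOTAL_GB, DAYS)
for unit, value in rates.items():
    print(f"{value:.4f} {unit}")
```

Under those assumptions the averages come out at the same order of magnitude as the quoted ones: a few gigabytes a day, or a few tens of kilobytes per second.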
Elsevier's response to this super-considerate researcher is a classic:
Approximately two weeks after I started downloading psychology research papers, Elsevier notified my university that this was a violation of the access contract, that this could be considered stealing of content, and that they wanted it to stop. My librarian explicitly instructed me to stop downloading (which I did immediately), otherwise Elsevier would cut all access to Sciencedirect for my university.
There are clear parallels with the situation that Aaron Swartz found himself in, but with a key difference. Elsevier is not only stopping Hartgerink from carrying out his research, but threatening to cut off all access to the company's journals and books for everyone working at Tilburg University if he tries to continue. Alicia Wise, Elsevier's Director of Access & Policy, added the following comment on Hartgerink's blog post:
We are happy for you to text mine content that we publish via the ScienceDirect API, but not via screen scraping.
When she was asked why it was necessary to use the API, rather than simply downloading articles, she replied:
The reason that we require miners to use the API is so that we can meet their needs AND ALSO the needs of our human users who can continue to read, search and download articles and not have their service interrupted in any way.
But that doesn't make any sense when Hartgerink had taken such pains to avoid any such adverse effects. Moreover, another commenter noted that Elsevier’s API often fails to work, rendering it useless for content mining. Even when it does work:
In many cases the API returns only metadata in the XML, compared to the fulltext PDF I can access on the website. Simply downloading the paper via the normal web service for readers is easy -- much easier than using the API.
What is really at stake here is control. Elsevier wants to be acknowledged as the undisputed gatekeeper for all possible uses of the research it publishes -- most of which was paid for by the public through taxes. And as far as the company is concerned, daring to use that knowledge in new ways without additional permission is simply "stealing."

Follow me @glynmoody on Twitter or identi.ca, and +glynmoody on Google+

Filed Under: chris hartgerink, downloads, knowledge, research
Companies: elsevier


Reader Comments


  1. Glenn, 18 Nov 2015 @ 4:02am

    Trying to "lock up" *all* research papers--even those belonging to others--could be considered stealing, too.

  2. Anonymous Coward, 18 Nov 2015 @ 4:25am

    Re:

    But when you're stealing from everyone for profit that's called "business".

  3. Anonymous Coward, 18 Nov 2015 @ 4:25am

    this is a case for anonymous,
    just keep publishing their passwords until they have to die (and release the knowledge they have kidnapped)

  4. annonymouse, 18 Nov 2015 @ 4:33am

    This is like the security guard at the bank saying mine.

    Most of the research is taxpayer funded so the odds of our broken governments pillorying this thief is a quantum leap from zero.

    Now if we could just find a couple of self serving corporations inconvenienced by this, then there would definitely be blood in the water.

  5. stimoceiver, 18 Nov 2015 @ 5:16am

    I applaud the hackers behind the Library Genesis project.

    The project makes tens of thousands of peer reviewed journals, books, and other documents freely available through a search engine. The back end repository for all this data is a torrent pool, invisible to the search engine users, but open to participation via seeding or extending the pool.

    It mirrors quite a bit of otherwise paywalled content.

    And it's dedicated to the memory of Aaron Swartz. How cool is that?

  6. DannyB, 18 Nov 2015 @ 5:23am

    Could Be Considered Stealing

    I say that taking research papers, often written by overworked underpaid scientists, and often funded by public money, and locking them up behind a troll gate paywall for your own private enrichment . . .

    COULD BE CONSIDERED STEALING

  7. DannyB, 18 Nov 2015 @ 5:29am

    Re: Re:

    But when it is business, you are 'adding value'.

    Elsevier adds value by 'protecting' the content behind a troll gate paywall. Since you must pay to access the research, it must now (somehow) have become more valuable.

    But I suppose Elsevier is lazy. If they wanted to add even more value, the research papers would have DRM and you would only be able to view them on special viewer software that runs on Windows. (Don't all scientists run only Windows?)

    Copy / Paste and the ability to make screenshots would enable thieving pirates to read the research without paying through the nose.

  8. Agonistes, 18 Nov 2015 @ 5:30am

    I wonder if this purveyor of academically significant words on paper might consider identity fraud to be human cloning using this mindset?

  9. DrZZ, 18 Nov 2015 @ 5:53am

    Peter Murray-Rust has been fighting this for years

    One of the most vocal and detailed critics of these kinds of policies has been Peter Murray-Rust. He has been fighting Elsevier specifically for years (see here where he warns about signing Elsevier's "mining agreement" and here in a 2011 post where he details his "negotiations" with them). Even before that, I believe he got all of Cambridge University's access to Chemical Abstracts shut down because he was using "too much" data. Elsevier might be the worst, but even some scientific societies will crush research if it will make them some money.

  10. Anonymous Coward, 18 Nov 2015 @ 5:58am

    Paging Wikileaks ... Paging Wikileaks ...

  11. Anonymous Coward, 18 Nov 2015 @ 6:14am

    Ever noticed how the less the effort put into acquiring things, the more possessive the attitude to the ownership of those things? Whenever I see Elsevier's name I think of Smaug, and his habit of going on a violent rampage if the treasure is touched by somebody else.

  12. Anonymous Coward, 18 Nov 2015 @ 6:19am

    At this late stage in the game, all the smart academics are already boycotting Elsevier. The ones left over, not boycotting, are the dumb ones. It is time the dummies lost their academic positions.

  13. Anonymous Coward, 18 Nov 2015 @ 6:26am

    Thieves....

    So the thief, Elsevier, is doing all they can to be a classic abuser and accuse the people they're stealing from of being thieves.....

  14. Anonymous Coward, 18 Nov 2015 @ 6:30am

    Re:

    While the smart ones are boycotting Elsevier for the publication of new papers, they still need access to the historical papers that Elsevier controls. The nature of human advancement is that it is built on the works of those who have gone before, and therefore access to Elsevier's trove of existing papers is what Elsevier is leveraging to keep up its income stream.

  15. Anonymous Anonymous Coward, 18 Nov 2015 @ 6:38am

    Re: Re:

    So in reality they are doomed. It's just going to take many decades, or maybe a century or two for them to die off?

  16. Anonymous Coward, 18 Nov 2015 @ 7:01am

    Re: Re:

    So what you are saying is that Elsevier is anti science. They are actively attempting to shut down scientific research. This could be considered breaking any and all contracts they have with the submitters of papers which means they have no standing. Is this correct?

  17. Anonymous Coward, 18 Nov 2015 @ 7:37am

    Of course it can be considered stealing. If you're an asshole.

  18. Alicia Wise, 18 Nov 2015 @ 7:43am

    Elsevier supports content mining, contra your salacious headline

    Hi everyone,

    As I mentioned in the comment thread to the original blog post, the reason that we require miners to use our API is so that we can meet their needs AND ALSO the needs of our human users. Our platforms provide access to 11 million pieces of content, serve millions of researchers, and provide infrastructure for a number of services including ScienceDirect, Scopus, ClinicalKey. We are not alone in providing an API for this sort of high-volume content-intensive service – others including Wikipedia and Twitter take the same approach. We also appreciate that researchers might wish to text mine across publisher platforms, and this is why we also participate in the multi-publisher cross-platform text and data mining service by CrossRef (http://tdmsupport.crossref.org/).

    With kind wishes,
    Alicia

    Dr Alicia Wise
    Elsevier
    Director of Access & Policy
    a.wise@elsevier.com
    @wisealic

  19. Anonymous Coward, 18 Nov 2015 @ 8:08am

    Re: Elsevier supports content mining, contra your salacious headline

    Dr. Wise,

    It's also been stated that there are shortfalls in your API, and that this user took all reasonable actions to use minimal resources. If your alternate service for high-volume use is not usable for whatever reason, then it's fully reasonable to use the normal service for research, so long as you take as much care not to harm it as this user is stated to have taken.

    If using your normal service in an automated manner is a problem, then please explain why, so that we can take appropriate care. Please do not be afraid to give us the technical reason why this use is an issue, as we will likely be able to understand the issue, and possibly propose a fix to this issue.

    Sincerely,
    Anonymous Coward

  20. Quiet Lurcker, 18 Nov 2015 @ 8:18am

    Sounds like we've got a potential BS detector in development, and the publisher is trying to play fast and loose with its policies to prevent that happening.

    Are they maybe, oh, I don't know, hiding something?

  21. Anonymous Coward, 18 Nov 2015 @ 8:19am

    so the subscription fees that were paid by the University amounted to nothing, except for more cash in Elsevier's pocket for doing nothing

  22. Anonymous Coward, 18 Nov 2015 @ 8:20am

    Elsevier should kindly shut the fuck up, after all the knowledge it profiteers from and is often 'stolen' from academic culture.

  23. Anonymous Coward, 18 Nov 2015 @ 8:35am

    11 million pieces of content

    Dear Elsevier,

    11 Million pieces of content could easily fit on a 5 GB DVD or two, or a cheap 64GB usb drive.

    Where is your option for universities to get ALL documents on a stick for internal distribution, mining and other 'approved' purposes? It saves a bundle on server hosting costs too.

    You would have to trust your sole suppliers of pieces of content not to distribute it to the world. But that's the premise of copyright, isn't it?

  24. Bill Jackson, 18 Nov 2015 @ 8:53am

    Nobel Prize Committee

    The Committee could strike a powerful blow against Elsevier et al, by adopting a policy that only scholarly articles submitted to Open Access Academic Publishers would be reviewed by the Committee. If they did this they would repeat and emphasize the creative act that made the Nobel Prize the most important body for advancing Science in history.
    Who could dare stand against them?

  25. Anonymous Coward, 18 Nov 2015 @ 9:27am

    Re: Nobel Prize Committee

    That does nothing to make available 1 to 2 centuries worth of knowledge locked up by the existing academic publishers. Note also it will take at least that long for all the papers they hold to enter the public domain, Mickey Mouse permitting.

  26. Bill Jackson, 18 Nov 2015 @ 9:29am

    Define a successful parasite?

    That is an entity that does nothing except suck the life out of others:-

    Elsevier.

  27. voiceofReason, 18 Nov 2015 @ 10:05am

    Re: Elsevier supports content mining, contra your salacious headline

    I know I'm going to sound like someone working at your GC's office down the hall, but to me, I think all of this boils down to several basic questions:

    1) Did Elsevier obtain this information in accordance with the law and with existing contractual rights and obligations, yes or no?

    2) Does Elsevier have property rights in this information, yes or no?

    3) Do Elsevier's protocols for allowing access to this information comply with law and with existing contractual rights and obligations, yes or no?

    If the answer to all of these questions is "yes," then any rights to access that Elsevier grants in addition to what it is obligated, is irrelevant.

    All of you folks, if you think Elsevier needs to do more based on a "moral imperative," I can think of a lot more things that people don't do voluntarily that they should which have a greater impact on humanity. Go after them first. If I read one more anonymous coward plugging away at what, in his not so humble opinion, some random company "should" or "shouldn't" do, I will barf.

  28. Anonymous Coward, 18 Nov 2015 @ 10:17am

    And this is threatening a contract violation...

    There is no limitation in the sample contract stating a definitive cap to the number of papers you may download. In fact, it says...
    Each Authorized User may:

    * access, search, browse and view the Subscribed Products;
    * print, download and store a reasonable portion of individual items from the Subscribed Products for the exclusive use of such Authorized User;

    While there is a clause
    Except as expressly stated in this Agreement or otherwise permitted in writing by Elsevier, the Subscriber and its Authorized Users may not:
    ...
    * use any robots, spiders, crawlers or other automated downloading programs, algorithms or devices to continuously and automatically search, scrape, extract, deep link, index or disrupt the working of the Subscribed Products;


    But that, too, was not being violated. No evidence has been put forth of the use of automated downloading or of disruption of services.

    So what we have here is a simple case of someone consuming a much larger amount of the services Elsevier provides than normal, while still not violating the contract terms.

    In Comcast terms, "he violated our unannounced bandwidth cap and must be terminated".

  29. Bill Jackson, 18 Nov 2015 @ 10:47am

    Re: Re: Nobel Prize Committee

    True enough, what is needed is to revert the copyright to the authors, so Elsevier is stripped of those rights of ransom.
    In addition, declaring all prior academic publications as open sourced via changes to copyright laws should be done.

  30. DrZZ, 18 Nov 2015 @ 11:09am

    Re: Re: Elsevier supports content mining, contra your salacious headline

    2) Does Elsevier have property rights in this information, yes or no?

    Depends on exactly how you interpret "this information" but almost certainly no. Elsevier has the copyright on the specific expression written by the authors, but it does not have any property rights to the underlying facts and information. The big beef is that they are trying to get such rights by only letting you read their text if you sign a license that has such terms in it.

    3) Do Elsevier's protocols for allowing access to this information comply with law and with existing contractual rights and obligations, yes or no?

    I am not aware of any copyright law, anywhere, that gives the copyright holder control of how the facts and information in the work are used by someone who legally accesses the work, thus there is no copyright law basis for the distinction between reading and mining. They get this distinction in by putting it into the licensing agreement. Is it legal to put it in the licensing agreement? My understanding is that, at least in the UK, it is not legal. Might be legal in the Netherlands.

  31. voiceofReason, 18 Nov 2015 @ 11:45am

    Re: Re: Re: Elsevier supports content mining, contra your salacious headline

    Quote: Depends on exactly how you interpret "this information" but almost certainly no. Elsevier has the copyright on the specific expression written by the authors, but it does not have any property rights to the underlying facts and information. The big beef is that they are trying to get such rights by only letting you read their text if you sign a license that has such terms in it.

    Response: Agreed. Rephrasing the question, does Elsevier have a legal or contractual obligation to provide non-copyrightable facts and information that has been organized in the way that this information is organized?

    If it does, what is the extent of that obligation?

  32. DrZZ, 18 Nov 2015 @ 12:38pm

    Re: Re: Re: Re: Elsevier supports content mining, contra your salacious headline

    There is certainly no question that it is legal to restrict access to copyrighted material to authorized people, so there is no legal or contractual obligation to provide unauthorized people with the facts and information (although see note below). That isn't at issue in this case because there is no question that the researcher in question could read any paper he had. The issue is Elsevier wants to create a right to control how you use your legal access via the license agreement. They want to create a separate category of use from just reading and tell you that anything that falls into the separate category of text mining has to use a separate interface with additional agreements that, at least according to some, essentially force you to acknowledge that you are using THEIR content, not un-copyrightable facts and information. As others have mentioned in this thread, the separation is not clearly spelled out (at least from an end user perspective) and does not seem to be related to server load or other practical issues. Others think that even trying to make the distinction between reading a paper and analyzing a paper via computer is absurd and only makes sense as an attempt to gain control over information that isn't yours. These folks have convinced at least the UK legislature to make such contract terms illegal.

    Note: the one caveat to the first sentence is that Elsevier does have a program where authors can pay a fee to make their paper free and open access. Peter Murray-Rust and others found numerous examples of papers where such fees were paid and yet there still was a charge to access the papers. Elsevier claimed it was due to some bugs (funny how the bugs only go one way) and I don't know how many papers were affected or if the problem persists, but there were certainly concrete cases where Elsevier violated their agreement with the author. Come to think of it, the best way to get stats is to use some kind of web crawler to scan through all the open papers, which of course Elsevier says violates your license agreement. Hmmmm.

  33. tqk, 18 Nov 2015 @ 12:44pm

    The reason that we require miners to use the API is so that we can meet their needs AND ALSO the needs of our human users who can continue to read, search and download articles and not have their service interrupted in any way.

    As a data center sysadmin with ca. thirty years in the trenches, this is bullshit. She's a corporate liar. I'd discount anything she says as corporate PR BS. Elsevier lost the moral high ground long ago, but they're desperate to not learn they're morally and ethically bankrupt. There's too much money at stake for them to acknowledge the facts of reality. She's been told to say this and has no idea what she's talking about. She's saying it because her employer told her to.
    What is really at stake here is control.

    Yes. The corporate bottom line depends on their not accepting the truth of the situation. Elsevier's shareholders should be ashamed for consorting with the likes of this. Some people can ignore anything as long as it's to their financial benefit.

  34. Anonymous Coward, 18 Nov 2015 @ 12:55pm

    'Could Be Considered'

    Doing anything without Elsevier's permission 'could be considered' terrorism!

  35. Bill Jackson, 18 Nov 2015 @ 1:04pm

    Elsevier is 'snake bit and doomed to die'

    Elsevier is facing the decline and fall of its empire, in the way much of bricks and mortar has died.

    Once scientists stop sending them papers, they will wither and die. It is underway now.

  36. Anonymous Coward, 18 Nov 2015 @ 1:36pm

    Re: Re: Re: Re: Re: Elsevier supports content mining, contra your salacious headline

    Elsevier does have a program where authors can pay a fee to make their paper free and open access.

    And the fee that Elsevier considered reasonable is why an entire editorial team has resigned to start an open access journal. See “The editor had requested a price of 400 euros, an APC that is not sustainable”, where according to Elsevier:
    The article publishing charge at Lingua for open access articles is 1800 USD. The editor had requested a price of 400 euros, an APC that is not sustainable. Had we made the journal open access only and at the suggested price point, it would have rendered the journal no longer viable – something that would serve nobody, least of which the linguistics community.


    As far as I know we are discussing a per page charge.

  37. Nomad of Norad, 18 Nov 2015 @ 3:33pm

    What would it take to immediately take the ball away from Elsevier?

    What would it take to retroactively make all the papers public domain or otherwise open and freely distributable? I have seen floated the idea that, since all the research behind the papers, and thus the papers themselves, are paid for out of taxpayer money, that that means the government or governments could presumably pass a law stating that ALL such papers, going back to the start of the collection, are hereby declared open-access and that they MUST BE made publicly available to whoever has need of them.

  38. Bill Jackson, 18 Nov 2015 @ 4:12pm

    Re: What would it take to immediately take the ball away from Elsevier?

    Changes to copyright law, or a declaration that authors' agreements with Elsevier are void and any copyright reverts to the author(s). After all, Elsevier did not pay for them.

  39. Anonymous Coward, 18 Nov 2015 @ 4:20pm

    Copying is not theft. Two plus two doesn't equal five.

  40. Anonymous Coward, 18 Nov 2015 @ 4:28pm

    Re: Re: Elsevier supports content mining, contra your salacious headline

    Ah, so we're back to the "if you can't afford it you don't deserve to be informed" argument.

  41. New Mexico Mark, 18 Nov 2015 @ 5:20pm

    All I hear from Elsevier...

    "You are trying to kidnap what I have rightfully stolen."

    I think they should get used to disappointment.

  42. Anonymous Coward, 19 Nov 2015 @ 2:18am

    Re: What would it take to immediately take the ball away from Elsevier?

    "What would it take to retroactively make all the papers public domain or otherwise open and freely distributable?"

    actually it looks VERY EASY:

    1) hack elsevier
    2) dump it to the net

    the net will then manage to translate everything to searchable open format
    and store it in multiple open repositories
    If we can do this with movies, tv series, software and videogames I do not see why this has not been done with humanity's knowledge

  43. Anonymous Coward, 19 Nov 2015 @ 2:25am

    o.m.g.

    "Alicia Wise
    a.wise@elsevier.com"

    did she just publish her email?

  44. Anonymous Coward, 19 Nov 2015 @ 2:32am

    Re: Could Be Considered Stealing

    "taking research papers, written by overworked underpaid scientists, funded by public money,
    and locking them up behind a paywall for private profit"

    well, that sounds like a very profitable business model!

    where can we find the annual earnings of elsevier?
    can we buy stock or shares?

  45. Daniel Suarez, 19 Nov 2015 @ 2:51am

    Kill Decision

    so,
    you do not want scientists to do a private search
    nor private data mining in their private and secure labs,
    but you want to have in a file each search associated to each account? just to help us?

    hm, that is interesting,
    scary but interesting anyway:

    -is this information safe? exactly how safe?
    -who does have authorized access to this information?
    -can this information be used to find out WHAT you are researching into?
    -can this information be used to find WHO is researching around specific topics?
    -can you think HOW MUCH this information is worth?
    -and how dangerous it is for scientists to be in such a list?

    have you read Daniel Suarez- Kill Decision?

  46. Anonymous Coward, 19 Nov 2015 @ 2:56am

    Re: Re: Re: Re: Re: Re: Elsevier supports content mining, contra your salacious headline

    "MEGA scholar" by kim dot com

    I am sure someone can provide a service where authors can pay a fee to make their paper free and open access.

    let's say for a symbolic 1 cent

  47. Anonymous Coward, 19 Nov 2015 @ 2:59am

    Re:

    you sound like the "safe spaces" kind of sheep
    which university would that be?

  48. Mark B., 19 Nov 2015 @ 7:04am

    Aaron

    FYI s/Schwarz/Swartz/

  49. Susan Reilly, 20 Nov 2015 @ 5:51am

    Speaking out for copyright reform

    It's sad that a researcher downloading content which he has accessed legally and in a responsible manner should have his research stopped in its tracks in this manner. We need more researchers to speak out about this in order to make the case for copyright reform. Policy makers are saying that there is not enough evidence that researchers want to text and data mine and therefore licence solutions, such as the one offered by Elsevier, are sufficient. We're trying to gather such evidence by asking researchers to sign the Hague Declaration on Knowledge Discovery in the Digital Age http://thehaguedeclaration.com/

  50. tqk, 20 Nov 2015 @ 1:06pm

    Re: What would it take to immediately take the ball away from Elsevier?

    ... since all the research behind the papers, and thus the papers themselves, are paid for out of taxpayer money, that that means the government or governments could presumably pass a law stating that ALL such papers, going back to the start of the collection, are hereby declared open-access and that they MUST BE made publicly available to whoever has need of them.

    I don't understand why universities haven't yet banded together to do this. It would be a sweet revenue stream that would fund their students' research and/or university operations. They could charge a tenth of what Elsevier is skimming off just to enrich third party investors, and still make enough to have plenty left over to fund their students' research.

    Letting Elsevier get away with this seems the silliest way possible, or else somebody's getting a sweet unearned free ride for the lousiest return imaginable.

    link to this | view in thread ]

  51. identicon
    Anonymous Coward, 20 Nov 2015 @ 1:32pm

    Re: Re: What would it take to immediately take the ball away from Elsevier?

    I don't understand why universities haven't yet banded together to do this.

    Because they will still need to pay the academic publisher for access to existing papers, and that is a big lever that these publishers wield over the universities.

    link to this | view in thread ]

  52. icon
    Bill Jackson (profile), 20 Nov 2015 @ 2:55pm

    Snake Bit and going to die = Elsevier

    Think magazines, that is what Elsevier sells, and they do not buy the content.
    They now practice 'microkerning', which means that each copy they supply to a college in electronic format has the letter spacing and word spacing changed a little. It is a form of text-based steganography. By this method they police the subscribers by threat of service withdrawal. Every researcher makes scans and sends them to friends by e-mail for free. Whenever Elsevier finds one, they analyze it to see who made the scan = threat.
    That is the club they bear - a product of a forced monopoly that would take government copyright action to rectify.

    What governments should do is enforce zero copyright on publicly financed papers. Other paper financiers should do the same. It is in all their interests that papers all become open ASAP. It is only in Elsevier's monopoly interest that the current systems persist.

    link to this | view in thread ]

  53. icon
    Bill Jackson (profile), 20 Nov 2015 @ 3:09pm

    Universities need to obtain the rights to all papers and publish them. Each University and research body needs to establish an electronic publication format, plus a peer review process among themselves. These reviewers will be paid the same as what Elsevier pays.
    An aggregation service is needed.
    I think I will suggest it to Google.

    link to this | view in thread ]

  54. icon
    tqk (profile), 20 Nov 2015 @ 3:37pm

    Re: Re: Re: What would it take to immediately take the ball away from Elsevier?

    I don't understand why universities haven't yet banded together to do this.

    Because they will still need to pay the academic publisher for access to existing papers, and that is a big lever that these publishers wield over the universities.

    Yeah, it's the same problem as moving to Open Source software. The initial cost is expensive and disruptive short term. Explaining you'll make up that cost big time on the other side doesn't seem to fly for short term profit addicts.

    link to this | view in thread ]

  55. identicon
    Peter Murray-Rust, 22 Nov 2015 @ 5:00am

    Re: Snake Bit and going to die = Elsevier

    Would be interested - offlist - to know more about the details of microkerning. I have written a PDF reader (based on PDFBox) so could investigate this on the digital PDF. I won't say more in public.

    link to this | view in thread ]

  56. icon
    Bill Jackson (profile), 22 Nov 2015 @ 6:56am

    Re: Re: Snake Bit and going to die = Elsevier

    This is an old and mature concept to control leaks in diplomatic circles. Every time they print a document every copy is unique, but to the untrained eye they all look the same, so if a leaker makes a photocopy and hands it out, a scan and some analysis of word and letter gaps will reveal the leaker. It only takes a few changes to cover a group of, say, 10 people. This was developed in the 80's when word processors came out on many desks. You could for years do this with early word processors and even by the letterpress method by inserting slivers of spacers in between certain words/letters, but that was labor intensive.
    It is so common now, that leakers have learned to retype and paraphrase things they want to leak.

    some clues here, https://www.google.ca/search?q=micro-kerning+document+control&oq=micro-kerning+document+control&aqs=chrome..69i57.11966j0j8&sourceid=chrome&es_sm=93&ie=UTF-8

    link to this | view in thread ]

  57. identicon
    Anonymous Coward, 22 Nov 2015 @ 7:36am

    Re: Re: Re: Re: Re: Elsevier supports content mining, contra your salacious headline

    There is certainly no question that it is legal to restrict access to copyrighted material to authorized people,

    Wrong, copyright is the right to control the production of new copies, and not what use is made of the copies once sold. Unfortunately this does not fit well in a digital world, where copyright is being distorted into control over information and the uses that can be made of it.

    link to this | view in thread ]

  58. icon
    voiceofReason (profile), 22 Nov 2015 @ 10:31am

    Re: Re: Re: Re: Re: Re: Elsevier supports content mining, contra your salacious headline

    More like insisting that access to a publisher's copy be unfettered,

    link to this | view in thread ]

  59. icon
    tqk (profile), 22 Nov 2015 @ 12:01pm

    Re: Re: Re: Re: Re: Re: Re: Elsevier supports content mining, contra your salacious headline

    More like insisting that access to a publisher's copy be unfettered,

    You mean insisting that access to research done by scientists and paid for by tuition and grants from taxpayers and philanthropists should be unfettered? I fail to see why anyone needs to suffer the likes of Elsevier sticking their rapaciously greedy, self-entitled noses in there. They've long overstayed their welcome.

    link to this | view in thread ]

  60. icon
    voiceofReason (profile), 22 Nov 2015 @ 1:13pm

    Re: Re: Re: Re: Re: Re: Re: Re: Elsevier supports content mining, contra your salacious headline

    Perhaps they have, but I have not noticed an abundance of corporations in this world who intentionally turn down legal ways of earning money. In fact that is why corporations exist. Perhaps you are confusing them with charities.

    As long as they have legally obtained this, it should not matter from a legal perspective if they obtained these texts from Warren Buffet or from teenage orphans living in a nunnery.

    Come back to me when you see these noble scientists foregoing nicer homes, cars, etc. if an opportunity arises

    link to this | view in thread ]

  61. identicon
    Anonymous Coward, 22 Nov 2015 @ 1:59pm

    Re: Re: Re: Re: Re: Re: Re: Re: Re: Elsevier supports content mining, contra your salacious headline

    As long as they have legally obtained this, it should not matter from a legal perspective if they obtained these texts from Warren Buffet or from teenage orphans living in a nunnery.

    The good old it is the law, a common justification for maintaining the status quo used by those benefiting from the labours of others, from nobles enforcing serfdom, through slavery to modern corporations. Problem is, when those with the money have to fall back on this justification, they are ignoring the winds of change, and will likely lose more by clinging to the old ways than they would if they adapted their business to the changes in society.

    link to this | view in thread ]

  62. icon
    tqk (profile), 22 Nov 2015 @ 2:21pm

    Re: Re: Re: Re: Re: Re: Re: Re: Re: Elsevier supports content mining, contra your salacious headline

    As long as they have legally obtained this, it should not matter from a legal perspective ...

    I'm not one to much care about legal perspectives. There are other, far more important, perspectives besides the legal one, such as morality and ethics. Legality should be the last resort tool you reach for. No, I don't expect corporations to care about morality and ethics (they're ill equipped to do so, and by law constrained from doing so), but we do, and we should. I understand Elsevier wants to enrich its shareholders. That doesn't at all mean it would be smart or correct for us to let them get away with what to me looks like outright theft stirred with slavery.

    Come back to me when you see these noble scientists foregoing nicer homes, cars, etc. if an opportunity arises

    Wow. Think of where Elsevier gets the content it publishes. Yes, those same "noble scientists" whose face you just spit on. They spent years, or decades, learning their chosen field and the tools they need to understand to practice in their field, competing against all those thousands of others who also want in, yet you can dismiss all of that with "they're greedy wanting nice homes and cars." What an asshole!

    I look forward to the day Elsevier enters chapter eleven bankruptcy.

    link to this | view in thread ]

  63. icon
    voiceofReason (profile), 22 Nov 2015 @ 6:45pm

    Who appointed you in charge of deciding how much money Elsevier should forego based on your morals?

    You sicken me, computer programming punk.

    link to this | view in thread ]

  64. icon
    tqk (profile), 22 Nov 2015 @ 7:13pm

    Re:

    Who appointed you in charge of deciding how much money Elsevier should forego based on your morals?

    Who appointed Elsevier in charge of deciding what scientists' published results would cost other researchers to keep up on and continue their research?

    The Jews have a great word for this. It's chutzpah.

    You sicken me moocher, hanger on, know nothing person. I don't want to share a planet with the likes of you. You're a predatory a-hole which none of the rest of us wants to be here. Die screaming in a fire. Consider it an act of humanity. Or, just go away. You won't be missed.

    link to this | view in thread ]

  65. icon
    voiceofReason (profile), 22 Nov 2015 @ 7:52pm

    When I think of your blood pressure, I smile all over.

    Thanks! ;)

    link to this | view in thread ]

  66. identicon
    DrZZ, 23 Nov 2015 @ 10:58am

    Peter Murray-Rust's views.

    I don't think anyone has spent more time and effort working on the problems of text mining than Peter Murray-Rust. Part of the reason he has spent so much time is that he does take laws, licenses, and contracts very seriously, although he does have strong views on how currently many of these are very damaging to the practice of science. He has put together a series of posts that detail his experiences and views related to this matter that can be found starting here

    link to this | view in thread ]

  67. icon
    Bill Jackson (profile), 23 Nov 2015 @ 11:15am

    Peter Murray-Rust's views

    That is a very good link, full of links to others.
    I will spread it around.

    link to this | view in thread ]

  68. identicon
    Peter Murray-Rust, 24 Nov 2015 @ 12:43am

    Re: Re: Re: Snake Bit and going to die = Elsevier

    Bill, Thanks

    I knew about micro-kerning and its purpose - I was specifically interested in the actual algorithms used - was it glyph widths, or heights, was it inter-character spacing, etc.

    If so, let me know.

    (There's also the cruder annotation of the name of the library subscribing. )

    link to this | view in thread ]

  69. icon
    The Wanderer (profile), 29 Dec 2015 @ 10:05am

    Re: Elsevier supports content mining, contra your salacious headline

    As I mentioned in the comment thread to the original blog post, the reason that we require miners to use our API is so that we can meet their needs AND ALSO the needs of our human users.

    Could you explain in what way the access described in this scenario (data transfer amounting to 35 KB / second, sustained over a week and a half) in any way serves to prevent you from meeting the needs of the human users?
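    For reference, the load figures quoted in the article can be rechecked with a few lines of arithmetic. This is only a sketch: it assumes the article's round numbers (about 30 GB over about 10 days) and decimal units (1 GB = 1,000,000 KB); the function name is made up for illustration.

```python
# Figures taken from the article: roughly 30 GB over roughly 10 days.
TOTAL_GB = 30
DAYS = 10


def average_rates(total_gb: float, days: float) -> dict:
    """Return the average transfer rate in several units (decimal GB/KB assumed)."""
    seconds = days * 24 * 60 * 60
    total_kb = total_gb * 1_000_000  # assumption: 1 GB = 1,000,000 KB
    return {
        "KB/s": total_kb / seconds,
        "GB/min": total_gb / (days * 24 * 60),
        "GB/h": total_gb / (days * 24),
        "GB/day": total_gb / days,
    }


rates = average_rates(TOTAL_GB, DAYS)
for unit, value in rates.items():
    print(f"{value:.4f} {unit}")
```

    Under those assumptions the result comes out at just under 35 KB/s, in line with the figure quoted above.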

    link to this | view in thread ]

  70. icon
    Bill Jackson (profile), 29 Dec 2015 @ 11:03am

    Re: Re: Re: Re: Snake Bit and going to die = Elsevier

    There are a number of ways embassies, NSA-type organizations, political parties and organizations like Elsevier can create a uniquely coded PDF download that links the subscriber's identity and the date of the download to an individual downloaded document. Bear in mind, all these documents will look superficially identical: same words, same images etc. A single line of text can probably encode 2-3 bits per 5 letter word by microkerning. This form of document control is used to trap leakers of data, as Elsevier desires. Afterwards the document can be scanned and the same software that created it can inspect the text spacings to identify who sent it. Steganography can also be used with photos.

    To combat this, documents need to be OCR recognised and all words re-word processed to standard kerning. Images can also be stripped of steganographic data via projection and re-photographing with a slightly different resolution.

    As to the precise ways used, it is hard to say. But suppose a number of different subscribers at different locations each downloaded the same document through the Elsevier API, which causes the system to create a uniquely coded document for each of them. With a few of these copies, the differences between them can be analyzed to see what means is used to encode them.
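    To make the general idea concrete, here is a toy sketch of spacing-based marking. It is not a description of what Elsevier or anyone else actually does: the gap widths, the one-bit-per-gap scheme, and every function name are assumptions made up for illustration. A recipient ID is written into the gaps between words by nudging some of them slightly wider, read back by measuring the gaps, and wiped by resetting every gap to a standard width.

```python
NOMINAL_GAP = 3.00  # assumed normal inter-word space, in points
DELTA = 0.05        # assumed tiny widening that marks a 1 bit


def bits_needed(n_recipients: int) -> int:
    """Smallest number of bits that gives each recipient a distinct ID."""
    return max(1, (n_recipients - 1).bit_length())


def encode(words: list[str], recipient_id: int, n_bits: int) -> list[float]:
    """Return one gap width per inter-word space, hiding recipient_id in the first n_bits gaps."""
    n_gaps = len(words) - 1
    if n_bits > n_gaps:
        raise ValueError("text is too short to carry the ID")
    gaps = [NOMINAL_GAP] * n_gaps
    for i in range(n_bits):
        if (recipient_id >> i) & 1:
            gaps[i] += DELTA
    return gaps


def decode(gaps: list[float], n_bits: int) -> int:
    """Recover the ID by checking which gaps are wider than nominal."""
    threshold = NOMINAL_GAP + DELTA / 2
    return sum(1 << i for i in range(n_bits) if gaps[i] > threshold)


def normalize(gaps: list[float]) -> list[float]:
    """Countermeasure: reset every gap to the standard width."""
    return [NOMINAL_GAP] * len(gaps)


words = "Every copy looks the same to the untrained eye".split()
n_bits = bits_needed(10)            # a group of 10 people needs 4 bits
marked = encode(words, 7, n_bits)   # mark the copy for recipient number 7
recovered = decode(marked, n_bits)
scrubbed = decode(normalize(marked), n_bits)
print(n_bits, recovered, scrubbed)
```

    The sketch carries only one bit per gap, which is less than the 2-3 bits per word estimated above, and a real scheme would also have to survive printing and scanning noise.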

    link to this | view in thread ]

  71. identicon
    Thomas Luedeke, 19 Jan 2016 @ 5:49pm

    Elsevier and anti-trust

    Why in the world hasn't an anti-trust lawsuit been brought against Elsevier? Or has it?

    Seems like a class-action lawsuit for anti-competitive behavior would be a no-brainer, given the grotesqueness of their actions.

    link to this | view in thread ]

