Court Says CFAA Isn't Meant To Prevent Access To Public Data, Orders LinkedIn To Drop Anti-Scraper Efforts

from the perverting-a-bad-law dept

Tue, Aug 15th 2017 11:55am — Tim Cushing

Some good pushback against the CFAA (Computer Fraud and Abuse Act) has been handed down by a federal court. LinkedIn, which has frequently sued scrapers under both the CFAA and DMCA, just lost an important preliminary round to a company whose entire business model relies on LinkedIn's publicly-available data.

hiQ Labs scrapes LinkedIn data from users whose accounts are public, repackages it and sells it to third-party recruiters and HR departments, allowing companies to track employee skills and get a read on which employees might be planning to jump ship.

LinkedIn didn't care much for another business piggybacking on its data (and likely cutting back ever so slightly on the number of third parties it sells this data to), so it sued hiQ, alleging the scraping of publicly-available data violated the CFAA. This has completely backfired. hiQ has obtained an injunction preventing LinkedIn from blocking its scraping efforts. [h/t Brad Heath]

In short, the court finds the hardships are all on hiQ's side: if LinkedIn blocks the scraping, the company will likely close. The decision [PDF], importantly, notes this isn't what the CFAA was put in place to guard against. It also adds that if it sided with LinkedIn's arguments, the internet itself would suffer.

In summary, the balance of hardships tips sharply in hiQ's favor. hiQ has demonstrated there are serious questions on the merits. In particular, the Court is doubtful that the Computer Fraud and Abuse Act may be invoked by LinkedIn to punish hiQ for accessing publicly available data; the broad interpretation of the CFAA advocated by LinkedIn, if adopted, could profoundly impact open access to the Internet, a result that Congress could not have intended when it enacted the CFAA over three decades ago.

And there's more bad news for LinkedIn:

Furthermore, hiQ has raised serious questions as to whether LinkedIn, in blocking hiQ's access to public data, possibly as a means of limiting competition, violates state law.

LinkedIn tried to argue continued access by hiQ would threaten its own business, mainly through supposed violations of its customers' privacy. It notes many of its users (50 million to be exact) have deployed LinkedIn's "Do Not Broadcast" option, which limits notifications about changes to accounts. Out of the 50 million users, LinkedIn claims three have alleged harm from third-party data collection. LinkedIn says hiQ's scraped determinations about poachable employees could harm users whose accounts remain public, but are utilizing the "Do Not Broadcast" feature.

The court is not entirely unsympathetic to LinkedIn's arguments. But it is mostly unsympathetic, partially because LinkedIn appears to be vastly overstating the privacy concerns of its users...

These considerations are not without merit, but there are a number of reasons to discount to some extent the harm claimed by LinkedIn. First, LinkedIn emphasizes that the fact that 50 million users have opted into the "Do Not Broadcast" feature indicates that a vast number of its users are fearful that their employer may monitor their accounts for possible changes. But there are other potential reasons why a user may opt for that setting. For instance, users may be cognizant that their profile changes are generating a large volume of unwanted notifications broadcasted to their connections on the site. They may wish to limit annoying intrusions into their contacts.

Second, LinkedIn has presented little evidence of users' actual privacy expectation; out of its hundreds of millions of users, including 50 million using Do Not Broadcast, LinkedIn has only identified three individual complaints specifically raising concerns about data privacy related to third-party data collection. Docket No. 49-1 Exs. A-C. None actually discuss hiQ or the "Do Not Broadcast" setting.

...and partially because LinkedIn doesn't appear to care all that much about its users' privacy.

Third, LinkedIn's professed privacy concerns are somewhat undermined by the fact that LinkedIn allows other third-parties to access user data without its members' knowledge or consent. LinkedIn offers a product called "Recruiter" that allows professional recruiters to identify possible candidates for other job opportunities. LinkedIn avers that when users have selected the Do Not Broadcast option, the Recruiter product respects this choice and does not update recruiters of profile changes. However, hiQ presented marketing materials at the hearing which indicate that regardless of other privacy settings, information including profile changes are conveyed to third parties who subscribe to Recruiter. Indeed, these materials inform potential customers that when they "follow" another user, "[f]rom now on, when they update their profile or celebrate a work anniversary, you'll receive an update on your homepage. And don't worry – they don't know you're following them." LinkedIn thus trumpets its own product in a way that seems to afford little deference to the very privacy concerns it professes to be protecting in this case.

As for the alleged CFAA violations, the court finds nothing to support LinkedIn's legal theory that public information anyone can access somehow turns into unauthorized access when a company accesses it via a scraper.

A user does not "access" a computer "without authorization" by using bots, even in the face of technical countermeasures, when the data it accesses is otherwise open to the public.

But it goes further, laying down in explicit detail how ruling in LinkedIn's favor would severely damage open access on the internet.

Under LinkedIn's interpretation of the CFAA, a website would be free to revoke "authorization" with respect to any person, at any time, for any reason, and invoke the CFAA for enforcement, potentially subjecting an Internet user to criminal, as well as civil, liability. Indeed, because the Ninth Circuit has specifically rejected the argument that "the CFAA only criminalizes access where the party circumvents a technological access barrier," Nosal II, 844 F.3d at 1038, merely viewing a website in contravention of a unilateral directive from a private entity would be a crime, effectuating the digital equivalence of Medusa. The potential for such exercise of power over access to publicly viewable information by a private entity weaponized by the potential of criminal sanctions is deeply concerning...

[T]he CFAA as interpreted by LinkedIn would not leave any room for the consideration of either a website owner's reasons for denying authorization or an individual's possible justification for ignoring such a denial. Website owners could, for example, block access by individuals or groups on the basis of race or gender discrimination. Political campaigns could block selected news media, or supporters of rival candidates, from accessing their websites. Companies could prevent competitors or consumer groups from visiting their websites to learn about their products or analyze pricing. Further, in addition to criminalizing any attempt to obtain access to information otherwise viewable by the public at large, the CFAA would preempt all state and local laws that might otherwise afford a legal right of access (e.g., state law rights asserted by hiQ herein). A broad reading of the CFAA could stifle the dynamic evolution and incremental development of state and local laws addressing the delicate balance between open access to information and privacy – all in the name of a federal statute enacted in 1984 before the advent of the World Wide Web.

The case will still proceed, but the outlook isn't that bright for LinkedIn. It has been ordered to drop, within 24 hours, any anti-circumvention efforts it put in place and to rescind the cease-and-desist letters it sent to hiQ. On top of there being zero chance it will prevail on its CFAA claims, the company will now have to defend itself against state law counterclaims by hiQ. This legal effort -- probably deployed in hopes of achieving a quick settlement -- is going to add up to real dollars in legal fees alone.

Filed Under: cfaa, public data, scraping
Companies: hiq, linkedin


Reader Comments


SirWired (profile), 15 Aug 2017 @ 12:12pm

Yeah, no surprises here
This makes it plainly obvious that LinkedIn didn't actually care that its users' information was being sold, they were just upset that they weren't the ones doing the selling...

Just, again, proves the adage that with "free" services, you are the product.
Mason Wheeler (profile), 15 Aug 2017 @ 12:56pm

And don't worry – they don't know you're following them.

There's a word for that kind of "following:" we call it stalking, and in many other contexts, it's illegal.
Anonymous Coward, 15 Aug 2017 @ 1:02pm

Headline overstates as usual: it's JUST an injunction.
My business model, inspired by your frequent advocacy for stripping "public data", is to A) continuously monitor Techdirt for all changes (even on ancient pages, be interesting since free to me), and B) keep stats on comments of those registered, to C) sell web page access to persons interested, for speedy notification of new posts / comments (outside of TD's system, of course, for the dissent).

Now, I'm no network engineer, demonologist, gastropod, or mathemagician, but let's do some ball park figgers:

50000 pages * 200000 bytes each (probably optimistic) = 10,000,000,000 bytes (10G) per loop.

I've actually tested and looks like can get pages in 3 seconds, so:

(50000 * 3) / 3600 = 41.667 hours per loop, or 4 complete scrapes / week.

Then 10G * 4 per week * 4 weeks = 160G / month.

Calcs are just for article pages, doesn't include monitoring each account, which can run in parallel. And of course there'll be focus on newest pages, so add MANY as possible of those TOO. -- Are you okay with paying for a little extra bandwidth? Bytes and speed may be much higher in practice: I'll have to find how many requests can go in parallel. With your well-known insouciance for cost of bandwidth, I'll just take silence as yes and begin scraping tomorrow, or even tonight, it's a trivial "script" to write. Thanks.
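The ballpark figures above can be checked with a few lines of Python. This is only a sketch of the commenter's own arithmetic: the page count, page size, and seconds per page are the comment's assumptions, not measurements, and the function name `scrape_estimate` is made up for illustration.

```python
def scrape_estimate(pages=50_000, bytes_per_page=200_000, seconds_per_page=3):
    """Reproduce the comment's estimate; all defaults are the commenter's guesses."""
    bytes_per_loop = pages * bytes_per_page            # one full pass over the site
    hours_per_loop = pages * seconds_per_page / 3600   # sequential fetching, no parallelism
    loops_per_week = int((7 * 24) // hours_per_loop)   # complete passes only
    gb_per_month = bytes_per_loop / 1e9 * loops_per_week * 4  # 4 weeks, decimal gigabytes
    return {
        "bytes_per_loop": bytes_per_loop,
        "hours_per_loop": hours_per_loop,
        "loops_per_week": loops_per_week,
        "gb_per_month": gb_per_month,
    }


if __name__ == "__main__":
    for name, value in scrape_estimate().items():
        print(f"{name}: {value}")
```

Under those assumptions the numbers in the comment hold up: roughly 41.7 hours per pass, four complete passes a week, and about 160 GB a month.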
Anonymous Coward, 15 Aug 2017 @ 1:26pm

Re: Headline overstates as usual: it's JUST an injunction.
You're assuming you'd scrape every single page of the website, over and over again, continually? No one does that.
Anonymous Coward, 15 Aug 2017 @ 1:27pm

Re: Headline overstates as usual: it's JUST an injunction.
It's probably a fair bit faster than 3 seconds per page since a crawler doesn't have to actually render the page.
Anonymous Coward, 15 Aug 2017 @ 1:27pm

Re: Headline overstates as usual: it's JUST an injunction.
So, to answer you, "no, you are definitely no engineer."
Anonymous Coward, 15 Aug 2017 @ 1:40pm

Re: Headline overstates as usual: it's JUST an injunction.
If the scraping was a problem for LinkedIn whether it be for bandwidth, server farm capacity, etc., why didn't they allege that in their complaint?
zeiche (profile), 15 Aug 2017 @ 2:27pm

Why an Injunction?
Using CFAA to prevent scraping seems extreme, however I’m confused about the injunction. Why can’t LinkedIn use technical measures to block scraping? Plenty of sites prevent bots from access. That is the whole point of reCaptcha.
Thad, 15 Aug 2017 @ 2:59pm

Re: Why an Injunction?

Why can’t LinkedIn use technical measures to block scraping? Plenty of sites prevent bots from access. That is the whole point of reCaptcha.

No, that's the point of .htaccess files. The point of ReCaptcha is to prevent bots from using forms.
Thad, 15 Aug 2017 @ 3:00pm

Re: Re: Why an Injunction?
I meant robots.txt. (Though .htaccess is for controlling traffic as well.)
Matthew Cline (profile), 15 Aug 2017 @ 3:27pm

Re: Re: Re: Why an Injunction?
I don't think robots.txt has any legal weight to it. Anyone wanting to do scraping could just ignore it.
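To illustrate why robots.txt is advisory rather than a barrier, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `examplebot` user agent, and the example.com URLs are all hypothetical. The check runs entirely on the crawler's side, so a scraper that never calls it is not technically stopped by anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, inlined so the example runs without network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt rules permit this agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite crawler asks first; nothing forces an impolite one to ask.
    print(allowed("examplebot", "https://example.com/public/page"))
    print(allowed("examplebot", "https://example.com/private/page"))
```

Blocking that actually takes effect has to happen server-side (IP bans, logins, rate limits), which is the kind of technical measure the rest of this thread is debating.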
Anonymous Coward, 15 Aug 2017 @ 3:46pm

What I want to know is how can the court order a website or service to stop anti circumvention efforts? For my site, I wouldn't give a rat's ass what the court or a judge ordered, I simply would not comply because it's an unjust order.

I had problems with my site where members from other sites would register on my site and try to get my community to ditch my site for theirs. Not only did I ban their accounts but I also banned their IP addresses, email addresses and blocked their ability to access my site. I'm able to block them from not just my site's forum community software but also through my site's administration tools.
Thad, 15 Aug 2017 @ 3:57pm

Re: Re: Re: Re: Why an Injunction?
Indeed.

I'd have said "maybe CFAA", but per this ruling, nope.

Which is a good result. There are a lot of reasons why robots.txt shouldn't be legally enforceable.
Anonymous Coward, 15 Aug 2017 @ 4:04pm

This detestable company was sued because they went through members' emails and contacts and sent out invites to their members' contacts stating they had been invited by their members to sign up with LinkedIn and join in on discussion when NO SUCH INVITE HAD OCCURRED, LOST THE LAWSUIT AND CONTINUED THE SAME SHITTY PRACTICE.
Mike Masnick (profile), 15 Aug 2017 @ 5:09pm

Re: Why an Injunction?
Using CFAA to prevent scraping seems extreme, however I’m confused about the injunction. Why can’t LinkedIn use technical measures to block scraping? Plenty of sites prevent bots from access. That is the whole point of reCaptcha.

Yeah, that's the part that confuses me about this as well. I think HiQ should be able to scrape without legal concern and I think Linkedin should be free to try to block with technical measures, and HiQ should be free to adjust and respond. But... I'm not sure about a law demanding that Linkedin let someone scrape.
Anonymous Coward, 15 Aug 2017 @ 5:19pm

Re:
https://en.wikipedia.org/wiki/Contempt_of_court
Anonymous Coward, 15 Aug 2017 @ 5:23pm

Should be Finkedin?
Anonymous Coward, 15 Aug 2017 @ 5:23pm

Re: Headline overstates as usual: it's JUST an injunction.
You silence = chicken fucker.
PaulT (profile), 16 Aug 2017 @ 12:38am

Re: Yeah, no surprises here
I've said on previous articles on this subject - the problem wasn't what hiQ were doing, the problem was that LinkedIn wants it both ways. They want all the benefits of information being public, but also wanted control as if the information were private. They just can't have it both ways.

This is the problem. LinkedIn have very easy tools to stop hiQ from being able to access information and ways to permanently ban them from legally accessing data. They just don't want to give up the extra traffic and other benefits that come from public accessibility. Hopefully the courts will find what I think is the correct outcome - LinkedIn are told to choose between public data and control. They can't have both.
PaulT (profile), 16 Aug 2017 @ 12:42am

Re: Headline overstates as usual: it's JUST an injunction.
"figgers"

Really?

"Are you okay with paying for a little extra bandwidth?"

I can't speak for Techdirt, but to speak for myself if your idiotic overblown scenario is to happen:

Yes, if the benefits of having the data publicly accessible outweigh the risk of these costs. If those costs become too burdensome, I will take steps to stop you from accessing the data. I won't be running to the courts whining that the public are accessing the things I put in public.

"it's a trivial "script" to write"

Lol go ahead. If your coding is anything like your English and maths skills, this site will be perfectly safe for the time being.
PaulT (profile), 16 Aug 2017 @ 12:47am

Re: Why an Injunction?
"Why can’t LinkedIn use technical measures to block scraping?"

They can, but won't. I believe the whole point is that they don't just want to stop hiQ in this specific instance, they want a legal precedent to get any competitor using their data shut down, including those that haven't based their entire business on scraping like hiQ seem to have done. They want to claim complete control over everything they have published, including that clearly in the public realm.

In order to do this, they have to pretend the damage is as great as possible, which means not utilising any technical barrier available to them.
Anonymous Coward, 16 Aug 2017 @ 1:40am

Re:
StinkedIn
Anonymous Coward, 16 Aug 2017 @ 2:04am

Re: Why an Injunction?
> Why can’t LinkedIn use technical measures to block scraping.

The judge both raised that exact point and deferred on it when making this ruling.
Stated that as long as it isn't password protected it is public. Might still rule in favor of LinkedIn for technical measures like IP blocking.
Anonymous Coward, 16 Aug 2017 @ 2:07am

This is not over by a long shot.
There is a similar ruling in favor in a case brought by Facebook. Main difference was that that was scraping password protected content. So on appeal this still has a chance of being reversed.
jimbo, 16 Aug 2017 @ 6:52am

CFAA and Aaron Swartz
It's a pity this attitude by the courts wasn't being expressed when Aaron Swartz was being hounded to the point of suicide.
JoeCool (profile), 16 Aug 2017 @ 9:01am

Got what they wanted... sort of

This legal effort -- probably deployed in hopes of achieving a quick settlement -- is going to add up to real dollars in legal fees alone.

They'll probably get their quick settlement, but at this point, it'll probably be the opposite way they thought in the beginning.
Mason Wheeler (profile), 16 Aug 2017 @ 10:25am

Re: Re: Why an Injunction?
The court addressed this: if you can keep people who might be competitors from scraping your site, this is, by definition, anticompetitive behavior, which is not legal. It's not "a law demanding that Linkedin let someone scrape" so much as applying existing laws to this circumstance. And IMO the court's right about that. You can't make data publicly available and then try and put restrictions on its access or use.
Bergman (profile), 16 Aug 2017 @ 12:31pm

Re: Why an Injunction?
I believe the reasoning was that offering public information, blocking scraping of the information, while simultaneously offering a commercial product that bundles that information for sale would give a third party a decent chance of prevailing in an anti-competition/anti-trust act lawsuit.