LinkedIn Appeals Important CFAA Ruling Regarding Scraping Public Info Just As Concerns Raised About Clearview

from the this-could-get-interesting dept

Last fall we were happy to see the 9th Circuit rule against LinkedIn in its CFAA case against HiQ. If you don't recall, the CFAA is the "anti-hacking" law that has been widely abused over the years to try to shut down perfectly reasonable activity. At issue is whether "scraping" information violates a terms of service, and thus, the CFAA. A few years back, the same court ruled in favor of Facebook against Power Ventures, saying that even though Power's users gave permission to Power and handed over their login credentials, Power was violating the CFAA in scraping Facebook, because the information was behind a registration wall -- and because Facebook had sent a cease-and-desist.

In the HiQ case, despite what seemed to be a similar fact pattern, the court ruled against LinkedIn, saying it could not block HiQ's scraping via a CFAA claim, with the main "difference" being that LinkedIn information was publicly viewable, and therefore should be open to scraping. I still don't quite see the difference between the cases -- because in the Facebook situation, once you have a login, the information is effectively available in the same manner, but that is how the courts ruled. After first asking (and not getting) an en banc review (and then asking for more time), LinkedIn has now asked the Supreme Court to weigh in on this issue (hat tip to Media Post). I worry that the court might make things much worse if it does take the case, and block all kinds of scraping.

Of course, one thing that's notable since the 9th Circuit ruling came down -- all of the attention that Clearview AI has received over the last few months, for its frightening facial recognition app, built off of scraping "public" social media images and profiles. This use of scraping has convinced some -- even some who seemed to support the HiQ ruling -- that perhaps there should be limits on scraping. I think that's a kneejerk reaction, and focusing in too narrowly on the wrong issue. The issue there is not with scraping, but with the specific use of the data as an attack on privacy going well beyond the internet itself (i.e., tracking and identifying people out in the real world). It's one thing to focus on that issue; it's quite another to treat it as an argument against free scraping.

At a time when we're so worried about competition, the ability to scrape is incredibly important. It's how competitors can be built in a world with network effects. If other companies can build compatible services, without having to do a deal with Facebook or LinkedIn or YouTube or Twitter, that enables more competition much more easily. And yet, too many efforts are being made to cut off that kind of interoperability. The LinkedIn case is just one example. If the Supreme Court does take it up, let's hope they recognize just how important this kind of adversarial interoperability can be, rather than buying into some nonsense about how scraping must be blocked and not allowed.

As for the petition itself, the question LinkedIn is asking the Court to review is whether or not bots can scrape websites, even after receiving a cease-and-desist letter:

Whether a company that deploys anonymous computer “bots” to circumvent technical barriers and harvest millions of individuals’ personal data from computer servers that host public-facing websites—even after the computer servers’ owner has expressly denied permission to access the data—“intentionally accesses a computer without authorization” in violation of the Computer Fraud and Abuse Act.

While I can understand the Clearview-like horror stories some may put forth about this activity, allowing companies to block all scraping like this would create huge problems for both a functioning internet (hello search...) and competition.

Filed Under: cfaa, interoperability, scotus, scraping, supreme court
Companies: hiq, linkedin


Reader Comments

  • Anon, 13 Mar 2020 @ 6:49am

    Scraping public info with login?

    Isn't this what Aaron Swartz was threatened with 35 years in jail for doing?

    • Anonymous Coward, 13 Mar 2020 @ 7:32am

      Re: Scraping public info with login?

      I don't think Swartz had a login. When using the MIT network (as with those of other major universities), some sites will allow access without a login or paywall.

  • stine, 13 Mar 2020 @ 8:13am

    you should have at least mentioned

    robots.txt

    That's what this file is for.

    • Toom1275, 13 Mar 2020 @ 9:29am

      Re: you should have at least mentioned

      Only if the scraper respects it.

    • Mike Masnick, 13 Mar 2020 @ 11:00am

      Re: you should have at least mentioned

      Yeah, that's not directly related to this issue. But if LinkedIn wins, then robots.txt now becomes legally enforceable, rather than voluntary, and that's NOT a good thing.
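
      To make the "voluntary" point concrete, here is a minimal sketch of what honoring robots.txt looks like from the scraper's side, using Python's standard `urllib.robotparser`. The robots.txt contents, the `example-bot` user agent, the `example.com` URLs, and the `is_allowed` helper are all made up for illustration. The check only happens because the scraper chooses to run it; nothing in the protocol forces it to.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, not taken from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from a string so the example needs no network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/profiles/alice",
        "https://example.com/private/notes",
    ):
        # A polite scraper would skip URLs where this prints False;
        # an impolite one simply never calls is_allowed() at all.
        print(url, is_allowed(ROBOTS_TXT, "example-bot", url))
```

      A scraper that skips this check can still request the disallowed page, which is why the file works as a convention rather than a technical barrier.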

  • Professor Ronny, 13 Mar 2020 @ 2:20pm

    I worry that the court might make things much worse if it does take the case, and block all kinds of scraping.

    As I see it, scraping is taking my stuff off the internet without my permission. Stopping that is a good thing, not a bad thing.

    • Tanner Andrews, 14 Mar 2020 @ 6:00am

      Re:

      As I see it, scraping is taking my stuff off the internet without my permission

      Not a good classification. Viewing your publicly available web page is also taking your stuff off the internet without your permission.

      I may be viewing it immediately rather than at a later time as I review robot scrapings, but I would not want to be the person who must make a workable distinction. There is, after all, some delay involved in receiving and rendering the data. There may be proxies and routers involved, which never actually view the data.

      To add complexity, consider caching web browsers. Some web browsers will try not to re-fetch the same information because they keep it in a local store for potential re-use.

      You will need to make this distinction, and find a way to communicate it, if we are to give weight to your "permission".

    • Mike Masnick, 15 Mar 2020 @ 9:01pm

      Re:

      As I see it, scraping is taking my stuff off the internet without my permission. Stopping that is a good thing, not a bad thing.

      You see it wrong. Under your definition, anyone browsing the web is "taking your stuff off the internet" because it downloads to their computer. That's not how it works.

      Scraping enables all sorts of important web services, including search.

      • Tanner Andrews, 3 May 2020 @ 12:35pm

        Re: Re:

        Scraping enables all sorts of important web services, including search

        So we need some definition for "scraping" that will not sweep in routers, caching browsers, web caches, and web browsers with screen readers.

  • Anonymous Coward, 13 Mar 2020 @ 8:47pm

    LINKEDIN was sued for sending people notices their friends had signed them up for LINKEDIN when they had not. They scraped all those email contacts from people.

    FUCK LINKEDIN
