The Story Behind Facebook Threatening To Sue Developer Into Oblivion For Highlighting Useful Facebook Data
from the how-nice-of-them dept
Facebook's lawyers have been getting pretty nasty lately. We recently covered the company's threats against the creator of a useful Greasemonkey script, and now a developer named Pete Warden has shared the sordid details of his legal run-in with Facebook -- where they threatened to sue him for his activity aggregating publicly available data found on Facebook.
You should read the full story, but basically, he built a simple crawler for public Facebook info, initially for his own purposes. He made sure that Facebook's robots.txt didn't block such crawlers -- and he also emailed someone at Facebook (who he had dealt with before), but didn't hear back from anyone. As his crawler worked, it started collecting a bunch of interesting data, and so he set up a website to let people explore some of this (again, public) data.
After playing with some of the data himself, he started making some interesting maps and charts with the data, and did a simple analysis of geographic locations of Facebook friend connections to show people what you could do with the data. He noted that if others (such as professional researchers) wanted to dig into the data, he would let them access a version of the data set (with identifying info stripped). The chart he released got picked up by a variety of sites and quickly got passed around.
And that's when the lawyers called:
On Sunday around 25,000 people read the article, via YCombinator and Reddit. After that a whole bunch of mainstream news sites picked it up, and over 150,000 people visited it on Monday. On Tuesday I was hanging out with my friends at Gnip trying to make sense of it all when my cell phone rang. It was Facebook's attorney.
He was with the head of their security team, who I knew slightly because I'd reported several security holes to Facebook over the years. The attorney said that they were just about to sue me into oblivion, but in light of my previous good relationship with their security team, they'd give me one chance to stop the process. They asked and received a verbal assurance from me that I wouldn't publish the data, and sent me on a letter to sign confirming that. Their contention was robots.txt had no legal force and they could sue anyone for accessing their site even if they scrupulously obeyed the instructions it contained. The only legal way to access any web site with a crawler was to obtain prior written permission.
Mathew Ingram reported on the data getting forced down, and got a statement from Facebook that seems to miss the point:
Andrew Noyes, manager of public policy communications at Facebook, said in an email that Warden "aggregated a large amount of data from over 200 million users without our permission, in violation of our terms. He also publicly stated he intended to make that raw data freely available to others." Noyes also noted that Facebook's statement of rights and responsibilities says that users agree not to collect users' content or information "using automated means (such as harvesting bots, robots, spiders, or scrapers) without our permission."
But I still don't see what the legal argument is. At best, I could see them terminating his account for disobeying the terms of service -- but even then the whole thing doesn't make much sense. The data is publicly available and, as Pete notes, it's pretty much standard practice for people to aggregate and analyze such data. However, he also pointed out that he couldn't afford to be a legal test case, and so he gave in and negotiated with Facebook to remove the data.
In the end, though, this shows Facebook's rather schizophrenic view towards data and privacy. On the one hand, it tries to push everyone to open up their info, but then if anyone does anything useful with it, they threaten to sue?
Filed Under: crawler, facebook, legal threats, public information
Reader Comments
since forever
Does data that requires you to login constitute "public" data though? Where's the threshold?
Re:
In fact, everyone should log out of Facebook and Google their names to make sure they know what is posted publicly.
I know logic isn't involved, but...
And by "could", I mean "could have". By which I mean that once he made the announcement, it'd be hard to prove that the data in his possession hadn't come from big crowds of helpful Facebookers.
I saw this yesterday and it really irked me
Re: I know logic isn't involved, but...
One problem, with today's standard you are guilty until you prove YOU DID NOT get it by other means.
All about control...
Personally I don't believe they have a legal leg to stand on, but our court system is for the rich as the rest of us can't afford to fight.
The reason is very simple if you ask yourself a simple question, how does facebook make money?
Via two methods. The obvious one is advertising; the second, not-so-obvious method is selling info like what was collected by this guy. He was cutting into their revenue stream, hence the trigger-happy (but imo toothless) lawyers.
This sounds like something for the EFF to handle.
Trespass to Chattels
Why don't they...
Facebook data leak - download all files here
So I have all the original Facebook data, decompressed them, and tested three Windows-based compressors - WinRAR won out (the other contestants were 7-Zip and WinZip).
The original data are merely huge text files, and came in at a hefty 15GB. With WinRAR, I was able to get that to just a bit over 2GB.
If you would like the files, you can download them yourselves from RS, much faster I suspect than from a torrent. Here are the links:
http://rapidshare.com/files/409949014/Facebook.repacked.part01.rar
http://rapidshare.com/files/409947525/Facebook.repacked.part02.rar
http://rapidshare.com/files/409947812/Facebook.repacked.part03.rar
http://rapidshare.com/files/409997211/Facebook.repacked.part04.rar
http://rapidshare.com/files/409997597/Facebook.repacked.part05.rar
YOU MUST DOWNLOAD all five files to get the data. Click FREE USER button if not a Premium Member.
It's ALL public information, so is all legal - kinda fun to peruse through, though not exciting.
Enjoy.