Is It Really That Big A Deal That Twitter Blocked US Intelligence Agencies From Mining Public Tweets?
from the it's-public-info dept
Over the weekend, some news broke about how Twitter was blocking Dataminr, a (you guessed it) social media data mining firm, from providing its analytics of real-time tweets to US intelligence agencies. Dataminr -- which, as everyone is careful to note, has investments from both Twitter and the CIA's venture arm, In-Q-Tel -- has access to Twitter's famed "firehose" API of basically every public tweet. The company already has relationships with financial firms, big companies and other parts of the US government, including the Department of Homeland Security, which has been known to snoop around on Twitter for quite some time.
Apparently, the details suggest, some (unnamed) intelligence agencies within the US government had signed up for a free pilot program, and it was as this program was ending that Twitter reminded Dataminr that part of the terms of their agreement in providing access to the firehose was that it not then be used for government surveillance. Twitter insists that this isn't a change; it's just enforcing existing policies.
Many folks are cheering Twitter on in this move, and given the company's past actions, the stance is perhaps not that surprising. The company was one of the very first to challenge government attempts to get access to Twitter account info (well before the whole Snowden stuff happened). Also, some of the Snowden documents revealed that Twitter was alone among internet companies in refusing to sign up for the NSA's PRISM program, which made it easier for internet firms to supply the NSA with info in response to FISA Court orders. And, while most other big internet firms "settled" with the government over revealing government requests for information, Twitter has continued to fight on, pushing for the right to be much more specific about how often the government asks for what kinds of information. In other words, Twitter has a long and proud history of standing up to attempts to use its platform for surveillance purposes -- and it deserves kudos for its principled stance on these issues.
That said... I'm not really sure that blocking this particular usage really makes any sense. This is public information, rather than private information. And, yes, not everyone has access to "the firehose," so Twitter can put whatever restrictions it wants on usage of that firehose, but seeing as it's public information, it's likely that there are workarounds that others have (though, perhaps not quite as timely). But separately, reviewing public information actually doesn't seem like a bad idea for the intelligence community. Yes, we can all agree (and we've been among the most vocal in arguing this) that the intelligence agencies have a long and horrifying history of questionable datamining of other databases that they should not have access to. But publicly posted tweet information seems like a weird thing for anyone to be concerned about. There's no reasonable expectation of privacy in that information, and not because of some dumb "third party doctrine" concept, but because the individuals who tweet do, in fact, make a proactive decision to post that information publicly.
So, perhaps I'm missing something here (and I expect that some of you will explain what I'm missing in the comments), but I don't see why it's such a problem for intelligence agencies to do datamining on public tweets. We can argue that the intelligence community has abused its datamining capabilities in the past, and that's true, but the concern there is generally over private info. I'm not sure that it's helpful to argue that the intelligence community shouldn't even be allowed to scan publicly available information as well. It feels like it's just "anti-intelligence" rather than "anti-abusive intelligence."
Filed Under: data mining, intelligence, intelligence community, public info, surveillance, tweets
Companies: dataminr, twitter
Reader Comments
I'm in agreement. Public information should be available for the commons, which means not dictating limits on its use. Those uncomfortable with this may want to rethink their use of public sites or more private ones like Facebook where power is ceded to the hosting entity.
You really don't know what the scope is until you know what the scope is.
If not, what is the difference between data-mining "the firehose" and building ALPR record databases?
Re: Re: ALPR databases
Thus, my stance is that while collecting an ALPR database is legal, many of those who do it are maintaining it in ways that I would like to see changed. License plate recognition and tracking, like some other forms of surveillance, was historically limited by the impracticality of personally conducting it on a large scale. No one would seriously expect the police to spend the manpower to build a license plate database by having officers stand at major intersections and enter all those plates by hand, so no one thought to pass a law limiting such surveillance to contexts with an articulable connection to an open investigation. Now they can very cheaply collect such information via ALPRs, and there is still no law telling them not to do it, so they do because it looks neat and might help some day. I want a law that, at minimum, clearly disallows surveillance that has no better basis than "it might be useful some day."
Re: Re: Re: ALPR databases
Given that the hardware will get smaller and cheaper (and storage space bigger and cheaper), it's possible that ALPRs could become ubiquitous: slap tiny little button cameras on any & every street sign, mile marker, and public structure available. Plus, since the devices are leased/sold by private companies that also host the databases that retain captured info, there's nothing to stop mounting them on every bit of private property owned by someone who'll do it for a little compensation.
As the density of ALPRs increases, the resolution of the monitored path traveled by an individual car increases. Eventually, this resolution is effectively GPS tracking of every car that leaves its driveway. The metaphor here goes beyond a cop with a camera on every street corner: it becomes a cop car with a running dash cam following every single person 24/7. We'll all have to switch to self-driving cars, if only to avoid all the traffic fines we'd have to pay (or we'll have to insert credit cards into manually-driven cars to pay the near-real-time penalties as they roll in). Then there's the ol' problem of "this car stops on a street known for activity x every Saturday; maybe we should pull it over for 'failure to signal' this coming weekend." (Thinking that that sort of thing mightn't be legal, the words 'parallel' and 'construction' spring to mind for some reason.)
Even ignoring LEA (warrant-less) access to this sort of information, the private sector control of the data is more than a bit creepy. Imagine health insurance companies being able to ask how often someone stops at McDonalds or Krispy Kreme, auto insurers with black-box access to your driving habits, your boss being able to check if you spend a lot of time at the local watering-hole after work every day, etc. etc.
Privatizing mass surveillance doesn't quite feel like an egalitarian utopia with lots of togas & flyin' cars. Feels more like one of those cases where corporations being nothing more than everyday people (just like you & me!) with everyday rights isn't exactly 'leveling the playing field'.
Re:
In the aggregate, there is a big difference. Twitter posts track what the user chooses to say publicly. The user has total control over that.
But ... aggregated ALPR data can track a person's movements and locations, giving a picture of where and whom one deals with. This implicates privacy concerns (urologist office Wed 2:30; married coworker's house at lunch), freedom of association concerns (stop by local KKK HQ), including false positives (cousin borrows your car for a murder), and so on. Big difference.
Re: Re:
And a driver has total control over where they choose to drive publicly, and even which route they choose to take to get there. What's the difference?
Re: Re: Re:
Tracking someone in public because you have a good reason to do so (probable cause) is morally very different from tracking everyone in public just because you can do it cheaply and you might find something interesting.
Re: Re: Re:
Oh, I dunno ... say you have finally gotten that appointment with that HIV specialist and have no other way to get there but your own car.
Or say you are active in a protest group that the government would like to take you down for, Constitution be damned. Lots of stories out there about The Man surveilling perfectly legal activism. And say an FBI agent asks you about it and you waffle, and he doesn't like it, so he says you lied to a federal agent. Then you can go to jail for lying to a federal agent, just because. And you can't prove otherwise because it's his word/notes against yours as they don't record interviews... that sort of thing.
Re: Re: Re: Re:
in general, OUR lives should be totally opaque to gummint, and gummint should be totally transparent to us, the theoretical owners and bosses of 'our' (sic) gummint...
Re: Re: Re: Re:
Why say that? Show me a scenario in which public transportation and cabs are unavailable, and yet normal businesses are open for business and the roads are clear for normal driving (i.e. not a natural disaster situation) and I will show you a contrived scenario that has no place in a discussion of real-life events.
Re: Re: Re:
"...First Amendment implications as well."
Re:
ALPR is taking public license information that is in many places mandated by government, and capturing its movement in bulk.
Tweets are fully voluntary broadcasts.
And that's the difference.
Re:
Others have already pointed out the choice factor. You actively choose to spread your words on Twitter. You cannot choose to hide your license plate when driving your car, hence forcibly "communicating" your movements to anyone looking at your car.
I would add that there is also a difference in the level of publicity. When you type a tweet, you know it will be available to everyone, and indexed for easy search. When driving your car, another individual would have to follow you around to know where you've gone. ALPR takes this to a level that is 1. not originally expected and 2. beyond what a single individual can match.
As I see it, those two points make ALPRs a completely different issue.
Anti-abusive intelligence
Piping Data has a cost
Posturing for PR Purposes
Part of this may be PR as a company trying to attract job talent ... and getting responses like mine:
A couple/three years ago I interviewed with Dataminr. I ended the interview process after I learned that their clients included various agencies, because I just didn't want to be a part of a company that did analysis for "those people."
The tech industry as a whole has tended to lean towards privacy over surveillance, and Dataminr is likely losing out on a lot of talent because they're perceived as being in the pocket of those agencies.
As a data scientist (see that new buzzword?) I'm interested in their volume of data, and the science behind processing that data. However, until they cease to be a tool of oppression, they'll not be attractive to me. And I don't think I'm alone in making this assessment.
Re: Posturing for PR Purposes
You obviously haven't paid any attention to pretty much anything over the past decade or so if you believe that. Most of Silicon Valley has been built around the for-profit surveillance model.
Re: Re: Posturing for PR Purposes
It's a level of hypocrisy that borders on stunning.
Anyone else trying to do the same would be flagged as abusive by servers.
Spook agencies already slurp data their own way. Maybe they should actually be using data gained by their methods rather than every possible convenience of access ever.
You know they are mostly bothering people who don't do anything other than have an opinion, and only certain sorts, at that. They behave like everyone else who is "vigilant" about "terrorism", freaking out over someone doing algebra, but go ahead and publicly demonstrate for white, christian, right-wing insurrection and that is hunky-dory. Gee, make a lifelong career of it. (Whether they run files on them or not, they aren't the ones who mysteriously end up on no-fly lists, etc.)
More broadly, no one, whether agencies or marketing arms of companies, properly minimizes or anonymizes and protects any data. The Twitter firehose would be neither minimized nor anonymized, by nature. And a very small percentage of the use of this sort of mass data is anything positive. And it could be. So why give access to such data to such poor stewards?
It's my hose
Snowden 60 minutes interview
Crying Wolf
The intelligence community and the Justice Department have abused every piece of good will that Silicon Valley has extended them. Why on earth should Twitter trust them not to attempt to spin any sort of access into a story of Twitter being picky about what it shares and what it doesn't?
Re: Crying Wolf
'They were willing to give us X, but now they're refusing to give us Y because they care more about their bottom line than helping us catch criminals/terrorists/communists.'
I think David T. Macknet above is probably partially right; this is at least partly a PR move, but I imagine part of it is simply to reduce the government's ability to use Twitter's willingness to work with them against them down the road.
False dichotomy though....
1) There's mountains of difference between ten dudes tweeting and what an entire country is tweeting "in aggregate". Or everything those ten dudes have ever tweeted since sign up / the age of consent.
2) Microbloggers living under oppressive governments are on an online hair-trigger. Ever make a copy-paste mistake? Accidentally typed your password in the wrong text field? Regret something u wrote online? Operated a phone when a bit drunk? Heard of someone famous deleting a tweet?
3) Social networks have utterly complicated Public/Private dualism. With every node in the network, and every bloop arriving from your Contacts List, come 50,000 shades of grey.
Do we really want to CC the government on every digital message ever sent that didn't require a password to read?
Will it be to hell with broadcasters? Encrypt it or die?
The problem is not the agencies mining the public data. The nuance is in not providing them an easy way of doing it.
Basically if they want it, they can work for it instead of it being handed to them by the company.