Is It Really That Big A Deal That Twitter Blocked US Intelligence Agencies From Mining Public Tweets?

from the it's-public-info dept

Over the weekend, some news broke about how Twitter was blocking Dataminr, a (you guessed it) social media data mining firm, from providing its analytics of real-time tweets to US intelligence agencies. Dataminr -- which, as everyone is careful to point out, has investments from both Twitter and the CIA's venture arm, In-Q-Tel -- has access to Twitter's famed "firehose" API of basically every public tweet. The company already has relationships with financial firms, big companies and other parts of the US government, including the Department of Homeland Security, which has been known to snoop around on Twitter for quite some time.

Apparently, the details suggest, some (unnamed) intelligence agencies within the US government had signed up for a free pilot program, and it was as this program was ending that Twitter reminded Dataminr that part of the terms of their agreement in providing access to the firehose was that it not then be used for government surveillance. Twitter insists that this isn't a change; it's just enforcing existing policies.

Many folks are cheering Twitter on in this move, and given the company's past actions, the stance is perhaps not that surprising. The company was one of the very first to challenge government attempts to get access to Twitter account info (well before the whole Snowden stuff happened). Also, some of the Snowden documents revealed that Twitter was alone among internet companies in refusing to sign up for the NSA's PRISM program, which made it easier for internet firms to supply the NSA with info in response to FISA Court orders. And, while most other big internet firms "settled" with the government over revealing government requests for information, Twitter has continued to fight on, pushing for the right to be much more specific about how often the government asks for what kinds of information. In other words, Twitter has a long and proud history of standing up to attempts to use its platform for surveillance purposes -- and it deserves kudos for its principled stance on these issues.

That said... I'm not really sure that blocking this particular usage really makes any sense. This is public information, rather than private information. And, yes, not everyone has access to "the firehose," so Twitter can put whatever restrictions it wants on usage of that firehose, but seeing as it's public information, it's likely that there are workarounds that others have (though, perhaps not quite as timely). But separately, reviewing public information actually doesn't seem like a bad idea for the intelligence community. Yes, we can all agree (and we've been among the most vocal in arguing this) that the intelligence agencies have a long and horrifying history of questionable datamining of other databases that they should not have access to. But publicly posted tweet information seems like a weird thing for anyone to be concerned about. There's no reasonable expectation of privacy in that information, and not because of some dumb "third party doctrine" concept, but because the individuals who tweet do, in fact, make a proactive decision to post that information publicly.

So, perhaps I'm missing something here (and I expect that some of you will explain what I'm missing in the comments), but I don't see why it's such a problem for intelligence agencies to do datamining on public tweets. We can argue that the intelligence community has abused its datamining capabilities in the past, and that's true, but that's generally over private info where the concern is raised. I'm not sure that it's helpful to argue that the intelligence community shouldn't even be allowed to scan publicly available information as well. It feels like it's just "anti-intelligence" rather than "anti-abusive intelligence."

Filed Under: data mining, intelligence, intelligence community, public info, surveillance, tweets
Companies: dataminr, twitter


Reader Comments



  • identicon
    Anonymous Coward, 11 May 2016 @ 6:40am

    "I don't see why it's such a problem for intelligence agencies to do datamining on public tweets"

    I'm in agreement. Public information should be available for the commons, which means not dictating limits on its use. Those uncomfortable with this may want to rethink their use of public sites or more private ones like Facebook where power is ceded to the hosting entity.


  • identicon
    Anonymous Coward, 11 May 2016 @ 6:41am

    You really don't know what the scope is until you know what the scope is.

    I'd love to see an actual sample of this data. Presumably the format was documented somewhere?


  • icon
    Andreas (profile), 11 May 2016 @ 6:53am


  • icon
    Mason Wheeler (profile), 11 May 2016 @ 6:55am

    Seems to me it's the same issue as ALPRs. Large amounts of data that, in individual instances, is indisputably public, can still be abused in the aggregate.

    If not, what is the difference between data-mining "the firehose" and building ALPR record databases?


    • identicon
      Anonymous Coward, 11 May 2016 @ 6:59am

      Re:

      There's nothing wrong with ALPR databases either as it's covered by the first amendment. Although it's much easier for an average person not to have a twitter account than a license plate.


      • icon
        Mason Wheeler (profile), 11 May 2016 @ 7:47am

        Re: Re:

        And yet, Techdirt has posted several articles detailing concerns over ALPR abuse, particularly by law enforcement. My question is, how is this (which there's apparently nothing wrong with) any different?


        • identicon
          Anonymous Coward, 11 May 2016 @ 4:51pm

          Re: Re: Re:

          The legality of the databases is separate from their use. A bomb making book is legal, using that knowledge for illegal acts is not.


      • identicon
        Anonymous Coward, 11 May 2016 @ 8:15am

        Re: Re: ALPR databases

        > There's nothing wrong with ALPR databases either as it's covered by the first amendment.
        Creating an ALPR database is currently legal because of the First Amendment connection. However, despite being lawful to create and to read, there are substantial concerns about how such a database can be used. There are a variety of questionable or outright prohibited uses which can be readily conducted once the data is aggregated. Depending on the particular collection and retention policies of the database, there may be few or no desirable uses for the long term data. For example, retaining five years (or more) of historical data has questionable investigative value. How often would a legitimate investigation benefit from knowing where a given vehicle was seen four or five years ago? If the answer is "rarely" or "never", then perhaps the database should not retain enough history to answer that question. There are definitely some illegitimate uses of very old data, and retaining the data for a long period enables all uses, both legitimate and illegitimate.

        Thus, my stance is that while collecting an ALPR database is legal, many of those who do it are maintaining it in ways that I would like to see changed. License plate recognition and tracking, like some other forms of surveillance, was historically limited by the impracticality of personally conducting it on a large scale. No one would seriously expect the police to spend the manpower to build a license plate database by having officers stand at major intersections and enter all those plates by hand, so no one thought to pass a law limiting such surveillance to contexts with an articulable connection to an open investigation. Now they can very cheaply collect such information via ALPRs, and there is still no law telling them not to do it, so they do because it looks neat and might help some day. I want a law that, at minimum, clearly disallows surveillance that has no better basis than "it might be useful some day."


        • identicon
          Anonymous Coward, 11 May 2016 @ 10:20am

          Re: Re: Re: ALPR databases

          ALPR tech is a weird issue that's always bugged me, and I can't figure out if my concerns are ridiculous, paranoid, or just too slippery-slope-ish.

          Given that the hardware will get smaller and cheaper (and storage space bigger and cheaper), it's possible that ALPRs could become ubiquitous: slap tiny little button cameras on any & every street sign, mile marker, and public structure available. Plus, since the devices are leased/sold by private companies that also host the databases that retain captured info, there's nothing to stop mounting them on every bit of private property owned by someone who'll do it for a little compensation.

          As the density of ALPRs increases, the resolution of the monitored path traveled by an individual car increases. Eventually, this resolution is effectively GPS tracking of every car that leaves its driveway. The metaphor here goes beyond a cop with a camera on every street corner: it becomes a cop car with a running dash cam following every single person 24/7. We'll all have to switch to self-driving cars, if only to avoid all the traffic fines we'd have to pay (or we'll have to insert credit cards into manually-driven cars to pay the near-real-time penalties as they roll in). Then there's the ol' problem of "this car stops on a street known for activity x every Saturday; maybe we should pull it over for 'failure to signal' this coming weekend." (Thinking that that sort of thing mightn't be legal, the words 'parallel' and 'construction' spring to mind for some reason.)

          Even ignoring LEA (warrant-less) access to this sort of information, the private sector control of the data is more than a bit creepy. Imagine health insurance companies being able to ask how often someone stops at McDonalds or Krispy Kreme, auto insurers with black-box access to your driving habits, your boss being able to check if you spend a lot of time at the local watering-hole after work every day, etc. etc.

          Privatizing mass surveillance doesn't quite feel like an egalitarian utopia with lots of togas & flyin' cars. Feels more like one of those cases where corporations being nothing more than everyday people (just like you & me!) with everyday rights isn't exactly 'leveling the playing field'.


          • identicon
            Anonymous Coward, 11 May 2016 @ 10:24am

            Re: Re: Re: Re: ALPR databases

            Note to self: refresh & read new comments before posting, dammit.


          • identicon
            Anonymous Coward, 11 May 2016 @ 4:54pm

            Re: Re: Re: Re: ALPR databases

            Ubiquitous ALPR already exists, the best we can hope for now is to make a public database of everyone's data.


    • icon
      BentFranklin (profile), 11 May 2016 @ 7:24am

      Re:

      There is some parallel here and concern with automated facial recognition as well.


    • icon
      GMacGuffin (profile), 11 May 2016 @ 8:34am

      Re:

      "...what is the difference between data-mining "the firehose" and building ALPR record databases?"

      In the aggregate, there is a big difference. Twitter posts track what the user chooses to say publicly. The user has total control over that.

      But ... aggregated ALPR data can track a person's movements and locations, giving a picture of where and whom one deals with. This implicates privacy concerns (urologist office Wed 2:30; married coworker's house at lunch), freedom of association concerns (stop by local KKK HQ), including false positives (cousin borrows your car for a murder), and so on. Big difference.


      • icon
        Mason Wheeler (profile), 11 May 2016 @ 8:37am

        Re: Re:

        > In the aggregate, there is a big difference. Twitter posts track what the user chooses to say publicly. The user has total control over that.

        And a driver has total control over where they choose to drive publicly, and even which route they choose to take to get there. What's the difference?


        • identicon
          Anonymous Coward, 11 May 2016 @ 8:45am

          Re: Re: Re:

          > And a driver has total control over where they choose to drive publicly

          Read the part of GP that you did not quote. It is unrealistic to say that people can freely visit embarrassing places and keep their movement secret from an ALPR database. They might keep their movement secret from a single individual by engaging in counter-surveillance techniques, but the only way to counter an ALPR is not to drive anywhere important. Driving is ubiquitous in modern society.

          Tracking someone in public because you have a good reason to do so (probable cause) is morally very different from tracking everyone in public just because you can do it cheaply and you might find something interesting.


        • icon
          GMacGuffin (profile), 11 May 2016 @ 8:49am

          Re: Re: Re:

          > And a driver has total control over where they choose to drive publicly, and even which route they choose to take to get there. What's the difference?

          Oh, I dunno ... say you have finally gotten that appointment with that HIV specialist and have no other way to get there but your own car.

          Or say you are active in a protest group that the government would like to take you down for, Constitution be damned. Lots of stories out there about The Man surveilling perfectly legal activism. And say an FBI agent asks you about it and you waffle, and he doesn't like it, so he says you lied to a federal agent. Then you can go to jail for lying to a federal agent, just because. And you can't prove otherwise because it's his word/notes against yours as they don't record interviews... that sort of thing.


          • identicon
            Anonymous Coward, 11 May 2016 @ 9:26am

            Re: Re: Re: Re:

            i will go you one better: what if we citizens do some crowd-sourced public observation and tracking of LEOs, what bullshit would they come up with for why that is totally unacceptable ?
            in general, OUR lives should be totally opaque to gummint, and gummint should be totally transparent to us, the theoretical owners and bosses of 'our' (sic) gummint...


          • icon
            Mason Wheeler (profile), 11 May 2016 @ 10:01am

            Re: Re: Re: Re:

            > Oh, I dunno ... say you have finally gotten that appointment with that HIV specialist and have no other way to get there but your own car.

            Why say that? Show me a scenario in which public transportation and cabs are unavailable, and yet normal businesses are open for business and the roads are clear for normal driving (ie. not a natural disaster situation) and I will show you a contrived scenario that has no place in a discussion of real-life events.


            • identicon
              Anonymous Coward, 11 May 2016 @ 10:13am

              Re: Re: Re: Re: Re:

              Try living in a rural area, where public transport is non-existent, and you either have your own transport, or get a friend or family member to transport you to your destination.


            • identicon
              Anonymous Coward, 11 May 2016 @ 10:33am

              Re: Re: Re: Re: Re:

              True, the entire Midwest is a bit contrived.


        • identicon
          Anonymous Coward, 11 May 2016 @ 10:40am

          Re: Re: Re:

          I guess 'space-time' could count as the third-party in the third-party doctrine: where you go, when you go there, and how frequently aren't really anything but spatio-temporal metadata. It's not like the ideas of 3-rd party doctrine and metadata have ever been abused.


        • identicon
          Anonymous Coward, 11 May 2016 @ 4:56pm

          Re: Re: Re:

          If you speak, drive, or walk in public you have no expectation of privacy.


          • identicon
            Anonymous Coward, 11 May 2016 @ 6:33pm

            Re: Re: Re: Re:

            What about stalking laws & restraining orders? There seem to be some recognized exceptions. Thank god for HIPAA, or colonoscopes would have 'To Protect and To Serve' written on 'em. (Or more likely 'We're the Government, and Fuck You :)'.)


      • icon
        HegemonicDistortion (profile), 11 May 2016 @ 9:59am

        Re: Re:

        And it has First Amendments as well, particularly the right of association. Ubiquitous license plate readers can be used to essentially collect association information, which, the Supreme Court has ruled, can violate one's free association rights (when the government does/uses it).


        • icon
          HegemonicDistortion (profile), 11 May 2016 @ 10:16am

          Re: Re: Re:

          Good Lord, "First Amendments as well"?? What are you smoking, beagle?

          "...First Amendment implications as well."


    • identicon
      Anonymous Coward, 11 May 2016 @ 11:41am

      Re:

      The difference between the firehose and ALPR, is that ALPR record databases are transformative, whereas with twitter, you're intending for people to be able to find and read your messages.

      ALPR is taking public license information that is in many places mandated by government, and capturing its movement in bulk.

      Tweets are fully voluntary broadcasts.

      And that's the difference.


    • icon
      Wyrm (profile), 12 May 2016 @ 9:15am

      Re:

      I'd say there are two main differences.

      Others have already pointed out the choice factor. You actively choose to spread your words on Twitter. You cannot choose to hide your license plate when driving your car, hence forcibly "communicating" your movements to anyone looking at your car.

      I would add that there is also a difference in the level of publicity. When you type a tweet, you know it will be available to everyone, and indexed for easy search. When driving your car, another individual would have to follow you around to know where you've gone. ALPR is taking this to a level that is 1. not originally expected and 2. on a level that a single individual cannot match.

      As I see it, those two points make ALPRs a completely different issue.


  • identicon
    Anonymous Coward, 11 May 2016 @ 7:55am

    Anti-abusive intelligence

    > It feels like it's just "anti-intelligence" rather than "anti-abusive intelligence."
    Show me an intelligence agency that's not abusive and I'll show you an agency that deserves convenient access to public information. Currently, every relevant intelligence agency is either (a) known to be abusive or, (b) not known to make substantial efforts to avoid abuses (both at the individual and organizational levels). Thus, for now, there is no difference between penalizing all intelligence agencies and penalizing only the abusive ones. Penalizing all of them is easier than specifying who can and cannot have access, since that would mean building an access control list and then dumping all of them on the blacklist. For now, penalizing all of them targets exactly the right set.


  • icon
    theBlueSage (profile), 11 May 2016 @ 8:02am

    Piping Data has a cost

    Coming from a pure firehose perspective, having a consumer of the firehose comes with a cost. It is not like the hose is pouring info into the ether and people stand under the spray. When someone attaches to the firehose it creates a separate hose connection. The data flowing through that hose connection to the consumer has a network bandwidth cost. How much that cost is will depend on the total amount of firehose consumed. IMHO, I would assume that the snoops will have that hose in wide open, 24/7 mode. Depending on the size of the draw (one listening process, 5000 listening processes) the bandwidth cost can get huge.
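
A rough back-of-the-envelope sketch of that scaling, in Python. Every number here (tweets per second, bytes per tweet) is a made-up placeholder rather than a real Twitter figure, and the function name is hypothetical. The only point is that if each listener gets its own full copy of the stream, outbound traffic grows linearly with the number of connections.

```python
def firehose_egress_bytes_per_day(tweets_per_second, avg_bytes_per_tweet, connections):
    """Bytes sent per day when every connection receives its own full copy of the stream."""
    seconds_per_day = 24 * 60 * 60
    return tweets_per_second * avg_bytes_per_tweet * seconds_per_day * connections


# Hypothetical inputs, chosen only to illustrate the scaling (not real Twitter numbers).
TWEETS_PER_SECOND = 5_000
AVG_BYTES_PER_TWEET = 2_500

one_listener = firehose_egress_bytes_per_day(TWEETS_PER_SECOND, AVG_BYTES_PER_TWEET, 1)
many_listeners = firehose_egress_bytes_per_day(TWEETS_PER_SECOND, AVG_BYTES_PER_TWEET, 5_000)

print(f"1 listening process:     {one_listener / 1e12:.2f} TB/day")
print(f"5000 listening processes: {many_listeners / 1e15:.2f} PB/day")
```

Under those assumed inputs the jump from one listening process to 5000 is exactly a factor of 5000, which is the "can get huge" part.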


  • icon
    Dr. David T. Macknet (profile), 11 May 2016 @ 8:04am

    Posturing for PR Purposes

    Part of this may be PR as a tech company to say, "hey, we're not in the government's pocket!" They're trying to spin it as if they still don't take whatever contracts come their way, in other words, when in reality they haven't ended those relationships.

    Part of this may be PR as a company trying to attract job talent ... and getting responses like mine:

    A couple/three years ago I interviewed with Dataminr. I ended the interview process after I learned that their clients included various agencies, because I just didn't want to be a part of a company that did analysis for "those people."

    The tech industry as a whole has tended to lean towards privacy over surveillance, and Dataminr is likely losing out on a lot of talent because they're perceived as being in the pocket of those agencies.

    As a data scientist (see that new buzzword?) I'm interested in their volume of data, and the science behind processing that data. However, until they cease to be a tool of oppression, they'll not be attractive to me. And I don't think I'm alone in making this assessment.


    • icon
      HegemonicDistortion (profile), 11 May 2016 @ 10:01am

      Re: Posturing for PR Purposes

      Maybe they mine in the web to find all the damn missing 'e's in company names.


    • identicon
      Anonymous Coward, 12 May 2016 @ 2:33pm

      Re: Posturing for PR Purposes

      > The tech industry as a whole has tended to lean towards privacy over surveillance

      You obviously haven't paid any attention to pretty much anything over the past decade or so if you believe that. Most of silicon valley has been built around the for-profit surveillance model.


      • icon
        John Fenderson (profile), 13 May 2016 @ 7:31am

        Re: Re: Posturing for PR Purposes

        The tech industry has this bizarre attitude: privacy is sacred when it comes to "sharing" data outside of the tech industry. Inside of the tech industry, this data "sharing" is considered a tremendous virtue -- because there's money in that pile of data.

        It's a level of hypocrisy that borders on stunning.


  • icon
    orbitalinsertion (profile), 11 May 2016 @ 8:39am

    The problem. Hmmm, not exactly huge, but why should they have special treatment?

    Anyone else trying to do the same would be flagged as abusive by servers.

    Spook agencies already slurp data their own way. Maybe they should actually be using data gained by their methods rather than every possible convenience of access ever.

    You know they are mostly bothering people who don't do anything other than have an opinion, and only certain sorts, at that. They behave like everyone else who is "vigilant" about "terrorism", freaking out over someone doing algebra, but go ahead and publicly demonstrate for white, christian, right-wing insurrection and that is hunky-dory. Gee, make a lifelong career of it. (Whether they run files on them or not, they aren't the ones who mysteriously end up on no-fly lists, etc.)

    More broadly, no one - not agencies or marketing arms of companies - properly minimizes or anonymizes and protects any data. Twitter firehose would be neither, by nature. And a very small percentage of the use of this sort of mass data is anything positive. And it could be. So why give access to such data to such poor stewards?


  • identicon
    Anonymous Coward, 11 May 2016 @ 8:39am

    What is concerning here is that the intelligence agencies are forcing their way into all the data streams on the Internet. They are not targeting their data collection, or rather are collecting data that is of political significance, rather than genuine national security interest. This goes double when it is data on foreigners that they are collecting, especially with the US propensity to interference in foreign countries when it looks like the internal politics are going against their political and commercial interests.


  • identicon
    cecil, 11 May 2016 @ 8:47am

    It's my hose

    It's Twitter's hose and their right to demand that the contract be honored. Regardless of any other issues, requesting specific performance on a contract is _always_ an option.


  • icon
    Slinky (profile), 11 May 2016 @ 10:03am

    Snowden 60 minutes interview

    As far as I remember, when Ed Snowden was interviewed by 60 minutes, he said something about a government program that was able to not only datamine what you type, but HOW you type. The program would be able to recognize or "fingerprint" a user from his/her characteristics/patterns. This way it was possible to profile people. (Please check out the 60 minutes interview with Snowden, I may remember it wrong :/)


    • icon
      John Fenderson (profile), 13 May 2016 @ 7:35am

      Re: Snowden 60 minutes interview

      I can't speak to what Snowden said, but it is absolutely possible to identify individuals based on their typing patterns. It is also eerily accurate under the right circumstances. This technique has been in use to varying degrees for at least 30 years.
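
For anyone wondering what "identifying people by typing patterns" can look like in its simplest form, here is a toy Python sketch. It is an assumption-laden illustration, not a description of any real program: the names, timestamps and stored profiles are invented, and real systems use far richer features than one average gap between key presses.

```python
from statistics import mean


def mean_key_gap(key_times):
    """Reduce a list of key-press timestamps (in seconds) to the mean gap between presses."""
    gaps = [later - earlier for earlier, later in zip(key_times, key_times[1:])]
    return mean(gaps)


def closest_typist(sample_times, known_profiles):
    """Return the profile name whose stored mean gap is nearest to the sample's mean gap."""
    sample_gap = mean_key_gap(sample_times)
    return min(known_profiles, key=lambda name: abs(known_profiles[name] - sample_gap))


# Hypothetical stored profiles: typist name -> typical seconds between key presses.
known = {"alice": 0.12, "bob": 0.30}

# Hypothetical timestamps from an unidentified typing sample.
sample = [0.0, 0.28, 0.61, 0.90]

print(closest_typist(sample, known))
```

The sample's gaps average about 0.30 seconds, so this toy matcher picks the slower of the two stored profiles.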


  • identicon
    Annonimus, 11 May 2016 @ 10:03am

    Crying Wolf

    The intelligence agencies have abused every angle they can get their hands on to expand their information haystack. Why on earth should Twitter trust them not to bring up that the firehose is something Twitter shares with them, but they are refusing other stuff in their program?

    The Intelligence community and the Justice Department have abused every piece of good will that the Silicon Valley has extended them. Why on earth should Twitter trust them not to attempt to spin any sort of access into a story of Twitter being picky about what it shares and what it doesn't?


    • icon
      That One Guy (profile), 11 May 2016 @ 12:35pm

      Re: Crying Wolf

      Yeah, the DOJ's actions in the recent Apple case made it clear that helping out government agencies is only going to hurt you down the line if you ever refuse a request.

      'They were willing to give us X, but now they're refusing to give us Y because they care more about their bottom line than helping us catch criminals/terrorists/communists.'

      I think David T. Macknet above is probably partially right, this is at least part PR move, but I imagine part of it is simply to reduce the government's ability to use Twitter's willingness to work with them against them down the road.


  • identicon
    @b, 12 May 2016 @ 2:57am

    False dichotomy though....

    Mason already has it covered above but

    1) There's mountains of difference between ten dudes tweeting and what an entire country is tweeting "in aggregate". Or everything those ten dudes have ever tweeted since sign up / the age of consent.

    2) Microbloggers living under oppressive governments are on an online hair-trigger. Ever make a copy-paste mistake? Accidentally typed your password in the wrong text field? Regret something u wrote online? Operated a phone when a bit drunk? Heard of someone famous deleting a tweet?

    3) Social networks have utterly complicated Public/Private dualism. With every node in the network, and every bloop arriving from your Contacts List, comes 50,000 shades of grey.

    Do we really want to CC the government on every digital message ever sent that didn't require a password to read?

    Will it be to hell with broadcasters? Encrypt it or die?


  • identicon
    Anonymous Coward, 12 May 2016 @ 3:17pm

    Mike,

    the problem is not the agencies mining the public data. The nuance is not providing them an easy way of doing it.

    Basically if they want it, they can work for it instead of it being handed to them by the company.


  • icon
    M. Alan Thomas II (profile), 12 May 2016 @ 8:52pm

    Open-source intelligence is quite neat and can give a much more nuanced idea about what's going on than paranoid theorizing based on secret information. We should do more of it.


