Latest Revelations Show How Collecting All The Haystacks To Find The Needle Makes The NSA's Job Harder
from the and-makes-us-all-less-safe dept
Yet another post about the latest NSA revelations about collecting buddy lists and email contacts. As we'd mentioned in the original post, the story noted that this data collection was at times overwhelming. Here's the Washington Post's report on this point:
The volume of NSA contacts collection is so high that it has occasionally threatened to overwhelm storage repositories, forcing the agency to halt its intake with “emergency detasking” orders. Three NSA documents describe short-term efforts to build an “across-the-board technology throttle for truly heinous data” and longer-term efforts to filter out information that the NSA does not need.
Here's a slide from the leaked NSA presentation, in which it urges people to be more careful about what kind of data it collects via this program, saying they're trying to "store less of the wrong data" and "shift the collection philosophy at the NSA" to "memorialize what you need" from "order one of everything off the menu and eat what you want."
Spam has proven to be a significant problem for NSA — clogging databases with data that holds no foreign intelligence value. The majority of all e-mails, one NSA document says, “are SPAM from ‘fake’ addresses and never ‘delivered’ to targets.”
In fall 2011, according to an NSA presentation, the Yahoo account of an Iranian target was “hacked by an unknown actor,” who used it to send spam. The Iranian had “a number of Yahoo groups in his/her contact list, some with many hundreds or thousands of members.”
The cascading effects of repeated spam messages, compounded by the automatic addition of the Iranian’s contacts to other people’s address books, led to a massive spike in the volume of traffic collected by the Australian intelligence service on the NSA’s behalf.
After nine days of data-bombing, the Iranian’s contact book and contact books for several people within it were “emergency detasked.”
Of course, the claim that you need the whole haystack to find the needle is bogus, and the data deluge discussed in this program demonstrates why. Collecting it all makes it harder to find the right information. Piling more hay on the haystack doesn't make it easier to find the needle; it makes it harder. That's one of many reasons why we're so concerned about these bulk data collection programs. Not only do they rarely seem to turn up useful information, but they also seem to obscure important information by flooding the system with bogus data.
Filed Under: haystacks, keith alexander, needles, nsa, nsa surveillance, too much information
Reader Comments
Sounds like some plot from some weird Japanese animation where some wacko from some city aims to catch weird monsters for no reason other than having them.
Hi! I'm Alexander Keith from Syracuse and I'm gonna be a Surveillance master!
*Alexander throws PRISM at DATA*
*Alexander caught PENIS ENLARGEMENT spam*
Yep. We need a parody.
Need a parody? Was: Re:
http://keiths.ca/
Mmmm, Master Plan
Re: Mmmm, Master Plan
So, you (and others below) accept the notion that your email provider can recognize spam and filter it out, but NSA can't?
Yeah, let's all laugh at poor old incompetent NSA! Don't even have spam filters!
No. Didn't mention this in my first post as seems obvious, but you're being channeled into seeing NSA as just another clueless ineffective gov't agency, lessening its crimes in your mind, and that has to be exactly as they wish.
Just more diversion from NSA crimes. The small but real opening Snowden gave us is being frittered away -- in a manner that's actually to NSA benefit.
Re: Re: Mmmm, Master Plan
I remember posts from you before calling Snowden an NSA lackey, that his whole leak is a sham staged by the NSA. What changed?
Re: Re: Mmmm, Master Plan
Apparently they can't. And if they could you just need to drive a sheer volume of mails to get out of their collection and sneak some coded message as if it was some spam.
you're being channeled into seeing NSA as just another clueless ineffective gov't agency
You need to check your sarcasm detector.
Re: Re: Mmmm, Master Plan
Have a DMCA vote.
1. It turns out that the NSA does not have the world's most awesome spam filter. Indeed, it seems likely they get more spam sent to an account than the account itself might receive once the mail host's spam filters get done with the traffic.
2. The best way for terrorists to avoid NSA scrutiny of their email is to become massive spammers. Ironically, they would likely cause more problems for the USA with their spamming activities than with their terrorist activities.
3. Emails that look like mass spam and phishing attacks will be the best way for terrorists to send emails in the future. Emails with seemingly random text that contain links to obvious phishing attacks could easily contain coded messages that the NSA would ignore because they aren't storing phishing emails promising penis enlargement.
Re:
Just sayin
Re:
Pretty soon they will be pissed at the Nigerians for taking their terrorist funds.
Scammers vs Terrorists!
Deluge
Also, for most people, traffic analysis will lead to nothing important for the NSA because they are doing nothing of interest to them. Vacuuming all traffic threatens to overwhelm the analyst with useless data that will never be of any value.
Since I do not knowingly hang out with terrorists of any type, the chances that any of my conversations, emails, etc. will lead to an intelligence breakthrough are about 0. Collecting my data only clutters up the disk and will never provide any useful information.
The real risk is that the NSA angers enough people like me and the resulting political and commercial pressure forces an overreaction that hurts US businesses and the ability of the NSA to actually monitor foreign enemies.
Re: Deluge
You may not knowingly do so, but that doesn't mean you don't actually do so. Between the rather loose definition of "terrorist" and the fact that you don't know who the people you hang out with also hang out with, I'm guessing that the odds that you're linked with a terrorist are higher than you might think.
Re: Deluge
When you find a weed in your garden, you dig down to the roots and yank the whole thing out. You don't prune and/or nurture it.
Logically, then, Google can't work!
[* More accurately, as soon as the system is informed that there IS a needle to be found, meaning that for whatever reason an individual (dissident) is focused on, then it can be entirely effective. I think it obvious that the national surveillance system, including the mega-corporations, is to control the populace, not protect them.]
Google's ability to target you for advertising is EXACTLY what NSA needs to target you as political dissident, NOT coincidentally.
Re: Logically, then, Google can't work!
Google? They help their end users find whatever it is they're looking for. Google isn't just indexing the internet and looking for one thing, it's looking for tens of billions of things, based on search engine queries.
Oh wait...I forgot. You're an idiot who can't bother to use logic and reason.
Re: Re: Logically, then, Google can't work!
The NSA is ostensibly looking for one thing (terrorists) in the overwhelming sea of haystacks.
First: said you'd stop replying to me, but you can't.
Listen, sonny. You obviously didn't even read mine, are just yapping at sight of my screen name. Here's what I wrote above that directly addresses the needle bit: "Of course, [Mike] has as premise that NSA is looking for needles instead of broad trends." -- SO WHY DO YOU REPEAT THAT PLUS SOME AD HOM? -- Cause you're just a nasty little kid trolling this fine site.
Re: Re: Re: Logically, then, Google can't work!
It's true. I didn't read your post. That's how, in my reply, I was able to respond to a point you made and countered it...wait what?
Re: Re: Re: Re: Logically, then, Google can't work!
>>> "Listen, sonny. You obviously didn't even read mine,"
It's true. I didn't read your post. That's how, in my reply, I was able to respond to a point you made and countered it...wait what?
So long as you're OFF-TOPIC, it's FINE with me! You've not countered my points.
So long as "The Market" (if not NSA directly) rewards Google for spying, do you expect it to do LESS of it?
Re: Re: Re: Re: Re: Logically, then, Google can't work!
You do realize what you've just said, don't you? You don't want a discussion on the topic of the article, you want to derail it. Oh, and yes, I did counter your point. You said that because Google is able to find things, then obviously the NSA should be able to as well, which I countered by pointing out that Google and the NSA search for entirely different things and in completely different ways. Not to mention the fact that, at best, NSA staff are incompetent, what with their Utah datacenter catching on fire at least ten times.
Re: Logically, then, Google can't work!
The search engine part wants to know everything that's publicly available so you can search for it. Sort of a PRISM of websites, except that they don't go into what is locked down or closed.
I know you have wet dreams of going Rambo and entering Google headquarters with machine guns firing an impossibly large amount of bullets while looking and smelling definitely macho. But you see, it's not what you are thinking...
Re: Re: Logically, then, Google can't work!
Actually Google gives a shit to what you do.
Darn right it does: that's how it gets money, by tracking me all over the net. It's as invasive and controlling as it can be, but has plans for more.
I'd already written a new tag line that answers your drivel:
So long as "The Market" (if not NSA directly) rewards Google for spying, do you expect it to do LESS of it?
Re: Re: Re: Logically, then, Google can't work!
That's the difference ;)
Strange times
Of COURSE most email is spam
If you accept the "collect it all" position, just for the purpose of argument, then the problems with this become obvious, both in terms of scale and searchability. If you don't accept that position, then this leaves the NSA (and friends) with the problem of discerning -- prior to collection -- which traffic is and isn't junk. Either way, these alternatives pose serious technical problems, even before we get to questions of legality, ethics, long-term benefit to the nation, etc.
And as a side note, let me add that in recent years spammers have gotten quite crafty about individualizing their messages to achieve traceability: for example, per-message differences in whitespace (often at the end of lines, where it's hard to notice) have been used in order to figure out which abuse victim is the one reporting them and thus which one should be retaliated against. This same mechanism could also be used to bury useful information in massive spam runs: to the casual observer, it would look like another 300-million message incident. But to the single recipient within those 300M, it could be a coded message.
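For anyone who hasn't seen the whitespace trick, here is a rough sketch of how it might work. This is only an illustration under my own assumptions, not any particular spammer's scheme: the function names `tag_message` and `read_tag`, the 16-bit tag size, and the space-versus-tab encoding are all hypothetical.

```python
def tag_message(body: str, recipient_id: int, bits: int = 16) -> str:
    """Hide recipient_id in trailing whitespace (hypothetical scheme).

    Line i of the body gets a trailing space for a 0 bit or a trailing
    tab for a 1 bit, using the low `bits` bits of recipient_id.
    """
    lines = body.split("\n")
    if len(lines) < bits:
        raise ValueError("message needs at least `bits` lines")
    if not 0 <= recipient_id < 2 ** bits:
        raise ValueError("recipient_id does not fit in `bits` bits")
    tagged = []
    for i, line in enumerate(lines):
        line = line.rstrip(" \t")  # drop any existing trailing whitespace
        if i < bits:
            line += "\t" if (recipient_id >> i) & 1 else " "
        tagged.append(line)
    return "\n".join(tagged)


def read_tag(body: str, bits: int = 16) -> int:
    """Recover the hidden ID from a message produced by tag_message."""
    value = 0
    for i, line in enumerate(body.split("\n")[:bits]):
        if line.endswith("\t"):
            value |= 1 << i
    return value


if __name__ == "__main__":
    message = "\n".join(f"Amazing offer, line {i}" for i in range(20))
    copy_for_victim = tag_message(message, recipient_id=12345)
    print(read_tag(copy_for_victim))
```

Every copy looks identical on screen, but whoever receives a forwarded abuse report can read the ID back out of the line endings. Swap the ID for a short payload and the same trick carries a message to the one recipient who knows to look.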
Of course it's making their job harder! And any first year CS student could spot the flaw a mile away.
They should be collecting hay in heaps instead of stacks. Much easier to search.
It's shameful that the NSA doesn't know of this.
/It's a programming joke...move along. And please don't punch the nerd.
overwhelming spam
Spam seems to be a logical counteraction to the NSA's dataslurping. I will be surprised if we don't end up with overloaded internet infrastructure shortly.
http://rt.com/news/governments-businesses-evading-nsa-196/
Statistics...
I think that because of this mass of statistical data the NSA has, it does a lot of bad math for "good" intentions...
I see where they could easily think that the metadata collected can easily create a psych profile for pretty much every smart phone user on Earth.
Bear in mind I'm putting rights aside for a moment as it is fairly obvious that the 4th Amendment was violated...I'm here to talk about math:
The mathematical flaw, it seems, isn't in the data collected about us creating a good psych profile...it's the quantity in which it was collected. When collecting metadata (variables in statistics and marketing) it's always good to have good quality data in small organized chunks...The NSA gets an F in Statistics and Marketing...which deals in predictable outcomes...too much data can cause issues when data is actively being sought out...
In short...mathematically, the NSA had good quality data...but they now have so much data that a super computer has trouble finding correlation data...
Now in my profession, when a correlation study is run, we try to do a narrowed search about the specific data we need...the term "terrorist" is extremely broad in my opinion as psych classification metadata because there are so many things to look at and collect for to match that profile...The NSA failed so badly to narrow the data they needed that the abnormalities could not be spotted sufficiently...
I can now conclude that the entirety of the NSA completely failed in preventing Benghazi and Boston.