NSA Personnel: Search For Needles Not Being Helped By Continual Addition Of Hay To The Stacks
from the can't-use-intelligence-if-you-can't-locate-it dept
The NSA's desire to harvest as much data as possible lies at the root of its defense of the nearly-dead Section 215 collection. Although mostly useless, it is still being defended as a data collection the NSA needs to have around "just in case." Data -- lots and lots of it -- is good and useful and helps locate terrorists. And, according to those running these collection programs, the only thing better than data is more data. Hence: "collect it all." Hence, also: a gargantuan data center in Utah that is still in danger of losing its water supply.
"Collect it all," proclaimed Keith Alexander, as the NSA amassed haystack after haystack with the needles seemingly little more than an afterthought. "Collect it all," the analysts yelled back, frantically running haystacks through their analytic spinning wheels in hopes of appeasing King Alexander with the occasional production of counter-terrorist gold.
The Intercept's cache of documents reveals not everyone in the NSA is so enthralled with haystack-building. Adding haystacks doesn't aid in intelligence efforts. It just adds more hay. Sooner or later, everything bottlenecks at the analytic point. Worse, it adds to the amount of cleanup that must be done before the data can even be analyzed, as well as possibly removing "signal" while filtering out "noise."
These (leaked) informal documents contain conversational discussions of intelligence topics that come from about as "everyman" a perspective as spooks sitting in a sea of servers can actually have.
From "Too Many Choices," by the "SIGINT Philosopher":
"Analysis paralysis" isn't only a cute rhyme. It's the term for what happens when you spend so much time analysing a situation that you ultimately stymie any outcome. It's what happens inside your grandfather's brain while you wait endlessly for him to make his move on the chessboard. It's what happens when I stand in front of the jams and jellies at the supermarket. And it's what happens in SIGINT when we have access to endless possibilities, but we struggle to prioritize, narrow, and exploit the best ones.
A.k.a. the Netflix problem, for those more prone to stream entertainment than purchase jams and/or jellies. If nothing immediately stands out, the tendency to cycle through list after list of possible choices results in more time spent looking for something to watch than actually watching something.
When lives are potentially on the line, adding more data makes it harder to find what you're looking for in a timely fashion. Stack up enough hay, and more time will be spent examining and discarding false positives and negligible intelligence than will be spent looking at useful data that might point analysts towards an impending threat.
The SIGINT mission is far too pressing for many team-building activities or brain-storming sessions aiming to improve our organizational approach to analysis. At the same time, the SIGINT mission is far too vital to unnecessarily expand the haystacks while we search for the needles. Prioritization is key.
But this doesn't seem to fit in with the NSA's general approach to intelligence gathering. Nearly every program it runs is an effort to gather even more data than it already has. Every exploit it plants gives it another source for intel. Every new agreement it makes with foreign countries' intelligence services gives it another set of haystacks to dig through. There is no apparent prioritization inherent in its intel gathering. Everything is potentially significant, but its significance can only be determined after it is collected and analyzed. The agency prefers collecting in bulk to targeting. It has been this way for years. So, it's no surprise that those questioning this approach may find themselves doing the following: [Side note: this paragraph says some interesting things about the Section 215 program capabilities and comprehensiveness.]
Recently I tried to answer what seemed like a relatively straightforward question about which telephony metadata collection capabilities are the most important in case we need to shut something off when the metadata coffers get full. By the end of the day, I felt like capitulating with the white flag of, "We need COLOSSAL data storage so we don't have to worry about it," [...] because getting the metrics for empirical evidence to review was so very difficult and, frankly, I'm still a little scarred by the experience.
The emphasis is "more hay," not "better targeting." And no one seems to know which collections are actually returning useful intel -- at least not in an agency-wide sense.
There's a running joke in the S3 community that we'll only know if collection is important by shutting it off and seeing if someone screams.
And that screaming may only be because someone thinks their particular haystack-gatherer is useful, rather than it actually being useful.
Despite all of this incoming intel, terrorists are still evading the worldwide surveillance net cast by the NSA and its global partners. Officials tend to blame this on leaks, encryption, "going dark" -- anything that doesn't raise the uncomfortable possibility that the needles the agency is looking for are already buried in its massive haystacks. This isn't because the NSA doesn't know what it's looking for. It's because it can't find what it's looking for.
Snowden... noted in an interview with the Guardian that the men who committed recent terrorist attacks in France, Canada and Australia were under surveillance -- their data was in the haystack yet they weren’t singled out. “It wasn’t the fact that we weren’t watching people or not,” Snowden said. “It was the fact that we were watching people so much that we did not understand what we had. The problem is that when you collect it all, when you monitor everyone, you understand nothing.”
Those in the analytic trenches seem to feel the NSA collects too much. Upper-level officials seem far less concerned. The NSA collects to collect. It collects "just in case." This saves intelligence officials from the unlikely event of having to explain how a gap in coverage resulted in a terrorist attack. It's CYA by massive data centers. The massive, overlapping collections are just as likely to result in an unthwarted terrorist attack, but it very pointedly won't be because the NSA didn't try.
Filed Under: collect it all, haystacks, needles, nsa, surveillance
Reader Comments
Hay scales.
If you scale your work to your monetary resources, it's much easier to just process proportionally more input. This just requires expanding your machinery, not the analysts being allowed to view security and privacy sensitive and classified information.
So, of course, you also get proportionally more output.
And if we take our own word redefinitions seriously, the moment human analysts take a look at individual data sets, we are "starting" to conduct a search requiring judicial warrants. Unfortunately, judges don't scale. So there is not much of a point to scale up the human resources for per-case analysis since we can't keep up the flow of warrants required to let them do their job.
So instead, computers have to do the "non"-searches. And they'll turn up what looks suspicious to a computer. Which are patterns of behavior/communication that are so braindead that those programming the searches anticipate them.
Re: Re: Hay scales.
Actually those two things would be the effect of having decentralized data storage at private companies too, but having to deal with probable cause is such a pain.
Overall, what I am getting from what is being written by SIGINT Philosopher is that a lot of money is used on practical projects while in reality they need to expand research to actually gain meaningful results from such data. Since the type of research needed does not rely on continuous data streams, it would seem much more effective for the NSA to deprecate the mass collection and rely on research to determine the data they can use.
Since research is pretty expensive in man-hours, I am sure that scaling that department up will be able to fill the current budget anyway.
Do they have a union?
Re:
"Dear Congresskritters, you're sending too much funding to my employer who is then able to command me to do foolish, unproductive stuff. Please stop."
I don't think you've thought this through. Besides, we're already demonizing too many whistleblowers. The few who have gotten away with complaining their bosses are breaking the law only barely show up on the radar.
Reward results not progress.... ?[Re: Re:]
Maybe that would direct the focus on only doing those things that achieve results.
[There's no guarantee that we'll end up liking the results we may end up getting... of course.]
We should ask ourselves
That is the root of the problem. So long as you believe that then "collect it all" seems justified as the only way to go.
Ineffective
Eventually shit is going to blow exactly because of the "corrective" measures. And in the meantime the subjects are getting a rash.
Because it's much better to be seen as incompetent because there was too much information, than incompetent because there was not enough information?
Either way the NSA has an excuse, and our elected officials are dumb enough to swallow it.
Ever get the feeling that we're just not firing enough people?
Re:
Start with politicians. Use what I'll call the 'Rabid Dog Method'. If a candidate foams at the mouth when discussing terrorism or crime or for that matter anything, vote for someone else. Anyone so invested in only one set of issues cannot be focused on what is good for everything, or anything.
But then I think of the way political parties and money in politics work and I get to...oh......wait....
It's just right for the proper mission
For catching terrorists, where terrorists fit the traditional definition of outsiders trying to use terror to influence/change our government or society, "collect it all" is a useless strategy.
Given that most honest people agree with that position, in what situation does "collect it all" make sense?
The only one I can think of, off the top of my head, is where the government wants to protect itself from its citizens.
We have seen this situation play out countless times throughout history and continuing into today. You need to look no further than Nazi Germany, Soviet Russia, Communist China, Islamist Iran, and (I'm not sure how to categorize it) North Korea. In modern times, sadly, we can add Great Britain and the United States to that dismal array of countries in fear of their citizenry.
Here in the US of A, the Constitution, and especially the Fourth Amendment, serves as a bulwark against the oppressive mass surveillance of the common man by the government, or at least it used to. Unfortunately, too many people are too cowed by the fear of terrorists to think straight. The odds of dying in a terrorist attack on US soil are so far down on the list that personally I don't even think about it. I am far more likely to die in a car accident, by drowning in a swimming pool, from the flu, or, heck, from being struck by lightning while winning the lottery. Have people died in a terrorist attack? Yes. Will people do so in the future? Most definitely. Will it be me, or someone I know? Not bloody likely.
Nothing is ever totally safe.
The ultimate goal of our government is to safeguard our freedoms. Therefore whenever you hear a government agent or politician say that their primary goal is to keep you safe, be afraid. Not of whatever boogeyman they have currently dragged up, but of their motives. What they are really trying to do is scare you into letting them strip you of your liberty and freedom.
As it's been said by better men than me over the years;
[Benjamin Franklin https://en.wikiquote.org/wiki/Benjamin_Franklin]
[Patrick Henry https://en.wikipedia.org/wiki/Give_me_liberty,_or_give_me_death!]
[Emiliano Zapata Salazar https://en.wikiquote.org/wiki/Emiliano_Zapata]
Bad for finding terrorists. Great for finding dirt on people.
Effectiveness Or Lack Thereof Is Not The Main Thrust
1. It is ethically wrong.
2. It goes against the Bill of Rights
3. It is illegal
4. It may not work well.
The main reason bullet 4 is a weak one is because I can make a very strong counter-argument to the article above:
OK, so the data is just bigger haystacks today. But we don't want to be like the IBM CEO who estimated a market for maybe 6 computers in the world. The reality is that Moore's, Kryder's, and Nielsen's laws are all in effect, and it's only a matter of time before Big Data analytics tools actually manage to make sense of this massive haystack.
While we maybe can't make sense of the haystack today, having data that goes back many years will prove "useful" in the future when we have greater analytical compute capacity. With years of data, not only is there more information to mine, but trend or panel data can be derived, as opposed to just "snapshot in time" data.
So, I'm not convinced the NSA is stupid to want all that data. I just think they are forward-looking. Unflappably insidious, for sure, but not stupid.
Re: Effectiveness Or Lack Thereof Is Not The Main Thrust
Ed C. is correct. The same things that improve the ability to collect and analyze huge amounts of data also increase the amount and complexity of the data to be analyzed.
It's a bit like crypto: increased computing power makes breaking crypto easier, but it also makes it possible to build even stronger crypto. It's a perpetual race.
The ineffectiveness argument kind of drives me nuts. Not because it's wrong (it isn't wrong) but because it's allowing the terms of the debate to be derailed from the real argument (ubiquitous surveillance is wrong) to one of technical capabilities.
Just declare everyone a terrorist.