Just So We're Clear: More Data Isn't Better Data

from the it's-just-more-work dept

New data-retention policies went into effect in the UK this week, forcing ISPs to store details of all user emails and VoIP calls for a year, just in case law enforcement or the security services want to thumb through them. The government's intent is to mine the data to try to recognize patterns in relationships and contacts that will help it find terrorists and criminals. The idea that all of this data is being stored by ISPs makes privacy activists shudder, and their worry is not unfounded. But it's also important to understand that the idea that capturing all this data will let the government easily root out terrorists is bunk. More data doesn't equal better data; it just makes it a hell of a lot more work to dig out useful information. It also raises the possibility of discovering false patterns that waste law enforcement's time and suck in innocent people. Recently, a guy in Wales found himself in the middle of an armed anti-terror raid on his home after somebody told police that they thought he might be a terrorist because he had soundproofing gear and wiring. He wasn't a terrorist, but rather a musician with a home recording studio. If police will go to such lengths based on unverified, anonymous tips, the thought of the conclusions they'll draw from having an entire country's email and VoIP records at their fingertips should raise a few eyebrows.

Filed Under: data retention, europe


Reader Comments



  1. EnricoSuarve, 8 Apr 2009 @ 3:54am

    To be even clearer...

    ...as the article you actually link to mentions, this is NOT a UK directive - it comes from the EU

    A good article can be found at http://www.guardian.co.uk/commentisfree/libertycentral/2009/apr/06/internet-houseofcommons (I got this from http://www.opendemocracy.net/ourkingdom/thomas_ash/isp_data_retention)

    My guess is that it was indeed the British who brought this forward (we can't get it past our own houses of parliament so have used the EU as a sort of backdoor)

    Interestingly this directive was apparently brought up as some sort of corporate rather than legal directive, thereby bypassing even more possibilities to vote on it within the EU itself - sneaky bastards

    It is by no means popular throughout Europe; the Swedes have already stated that they will refuse to follow the directive at all

    The whole thing, who raised it, how it was tabled and who voted on it deserves more investigation


  2. Anonymous Coward, 8 Apr 2009 @ 4:45am

    While requiring ISPs and telephony operators to log everything is bad, the even more creepy part is what you find by reading the only existing draft statutory instrument about data retention: http://www.opsi.gov.uk/si/si2009/draft/ukdsi_9780111473894_en_1

    Cutting it short, the problem lies in point 2(e)(iii), which states the following:
    "In these Regulations 'public communications provider' means-
    (i) a provider of a public electronic communications network, or
    (ii) a provider of a public electronic communications service
    and 'public electronic communications network' and 'public electronic communications service' have the meaning given in section 151 of the Communications Act 2003(a)."

    Off we go to check out section 151 of the Communications Act 2003(a) (which can be found at http://www.opsi.gov.uk/ACTS/acts2003/ukpga_20030021_en_15#pt2-ch1-pb28-l1g151):
    "'public electronic communications network' means an electronic communications network provided wholly or mainly for the purpose of making electronic communications services available to members of the public;
    'public electronic communications service' means any electronic communications service that is provided so as to be available for use by members of the public;"

    These definitions are rather vague and they could be easily interpreted in such a way that would make sharing your Internet connection with your neighbour or running a Tor relay fall under the jurisdiction of the data retention directive. If that becomes a reality, then characterizing this as "overreaching" is just an understatement.


  3. Anonymous Coward, 8 Apr 2009 @ 5:15am

    more data is better data

    it just requires the right kind of analysis.

    unless you are attempting to say that there is absolutely zero value in the data, the more of it you have the better. In fact, the more of it there is, the better it is for privacy as well, because it means that most of the data will be used/parsed/viewed through automated tools.


  4. Weirdness Herald, 8 Apr 2009 @ 5:31am

    Re: more data is better data

    No, more data is NOT better. More GOOD data is better.

    If I want to know how many people prefer Coke to Pepsi, having a database of migratory swallow patterns doesn't help AT ALL.

    More data is not better. More good data is better.


  5. Anonymous Coward, 8 Apr 2009 @ 5:35am

    Re: more data is better data

    Actually, no, more data isn't better data and the kind of analysis doesn't matter. The reason is that the data you are adding, unrelated to what you're looking for, is almost pure noise, and it comes in such great volumes that it precludes deep analysis by even automated means.


  6. Paul G, 8 Apr 2009 @ 5:38am

    So, following the logic of AC's post (comment 2), it would appear that all the numbskulls who install a shiny new ADSL wireless router out of the box and are amazed that their PC connected straight away will be open to prosecution because they don't know that their wireless connection is unencrypted.

    Try a quick wardrive in most areas and you will find lots of open routers out there. I did a test and found SIX on a one mile stretch of road. ALL of those people are potential court cases waiting to happen as ignorance is not a defence that stands up in court.

    A large percentage of the population don't know about/understand wireless security and are happy that it 'just works' when they install it.


  7. SteveS, 8 Apr 2009 @ 6:25am

    Anyone know of an ISP actually doing this?

    So the rules are in place, but is anyone following them? I got this from a friend who runs a small ISP for business users:

    "At current most ISPs don't have the required equipment in place to log user activity...

    ...It's probably going to stay that way for quite a while - unless the government starts paying for the required storage and processing (or issuing huge fines to companies who aren't complying)."


  8. NullOp, 8 Apr 2009 @ 6:49am

    Mo' Data

    Seriously, more GOOD data is better. Think again! All the data is saved so there can be NO BETTER DATA! They will be working with all there is. Their main trouble will be the volume of data and picking out useful, meaningful patterns. It will, for the most part, all boil down to the value of "p", that is the probability that the results are meaningful. I wish them luck.


  9. TheStuipdOne, 8 Apr 2009 @ 8:53am

    Coming Soon!

    A terrorist attack will kill a dozen people in London. The massive amount of data will be mined and worldwide anyone within 6 degrees of separation of the terrorist will be arrested and promptly executed.

    Hmmm .... does getting spammed by the same spammer qualify as a 1st degree separation?


  10. Anonymoose, 8 Apr 2009 @ 10:10am

    When looking for a needle in a haystack...

    it's always more efficient to make the haystack as large as is technically possible...


  11. femtobeam, 9 Apr 2009 @ 5:53am

    Data Retention and Network Security

    Finally, the EU gets it. They need time to find a sorting mechanism to extract useful data and catch the "wrongdoers". They will need more time than one year. If they find any of mine, will they please send it to the FBI so they can correct their falsified records. I can't seem to get in touch with them, indirectly. Thanks.
    "In the future, we will all die from hearsay"


