There Is No Such Thing As Anonymized Data, Google

from the barely-appeasing dept

With the news out that Google and Viacom have come to an agreement to "anonymize" the data a judge ordered Google to hand over, it's worth remembering a simple but important point: there's no such thing as a truly anonymized dataset. While anonymizing the data may protect some users, it's still likely to reveal others and what they surfed. Given all of this, it's still quite unclear why Viacom needs this data in the first place. The legal question is whether Google infringed on copyright. Why should Google's log files be necessary to determine that?

Filed Under: anonymized data, logfiles
Companies: google, viacom, youtube


Reader Comments


  • icon
    Zaven (profile), 15 Jul 2008 @ 4:41pm

    Confused by your statement

    Wouldn't anonymizing the data just mean giving it to Viacom without the IP addresses? That is, the sheet with the information would have columns for various things, and the IP address column would be blanked out.

    • identicon
      Brian, 15 Jul 2008 @ 5:02pm

      Re: Confused by your statement

      Confused? See:

      http://www.techdirt.com/articles/20071130/114005.shtml
      http://www.techdirt.com/articles/20060807/0219238.shtml

      Glance over the comments in each article for additional explanations.

      Figuring out who information is associated with isn't terribly difficult.

    • identicon
      Paul, 15 Jul 2008 @ 5:03pm

      Re: Confused by your statement

      He's saying that it's still possible to find out who a lot of people are in the logs. Remember the AOL search results fiasco? The data supposedly didn't contain identifiable information, yet a lot of people were able to figure out who some of the search results belonged to.
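
      A minimal sketch of why that can happen, assuming a log in which each record keeps a stable pseudonymous ID in place of the real identity. The records, the IDs, and the function name `build_profiles` are made up for illustration; none of this comes from the AOL data or from Google's logs.

```python
from collections import defaultdict


def build_profiles(records):
    """Group logged items by the pseudonymous ID attached to each record."""
    profiles = defaultdict(list)
    for pseudonym, item in records:
        profiles[pseudonym].append(item)
    return dict(profiles)


# Hypothetical "anonymized" records: (pseudonymous_id, search_or_video_title).
records = [
    ("viewer_0001", "high school reunion class of 1988 exampletown"),
    ("viewer_0001", "plumbers near 12 example street"),
    ("viewer_0001", "videos by jane q. example"),
    ("viewer_0002", "funny cat videos"),
]

profiles = build_profiles(records)

# No name or IP address appears anywhere, yet everything tied to one
# pseudonym can be read together as a single profile.
for pseudonym, items in profiles.items():
    print(pseudonym, items)
```

      Each item on its own says little, but the combination under "viewer_0001" narrows down to a town, a street, and possibly a name, which is roughly the kind of linkage the comment above describes.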

    • identicon
      Anonymous Coward, 15 Jul 2008 @ 6:05pm

      Re: Confused by your statement

      It is doubtful that they would remove the IP addresses. Instead it is likely that they would create a hash of the individual IPs or something similar. That way the IP address would not be revealed but Viacom can still differentiate between unique viewers.
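
      One way such a scheme could look, as a sketch only: it assumes a keyed hash (HMAC-SHA256) with a secret held by the data holder, which is a guess at a reasonable design and not a description of what Google and Viacom actually agreed to. The function name `pseudonymize_ip`, the key, and the log rows are hypothetical.

```python
import hashlib
import hmac


def pseudonymize_ip(ip: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for an IP address using HMAC-SHA256.

    A keyed hash is assumed here because an unkeyed hash of an IPv4
    address can be reversed by simply hashing every possible address.
    """
    digest = hmac.new(secret_key, ip.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability


# Hypothetical log rows: (ip_address, video_id). The addresses come from
# ranges reserved for documentation.
log = [
    ("192.0.2.10", "video_a"),
    ("192.0.2.10", "video_b"),
    ("198.51.100.7", "video_a"),
]

key = b"example-secret-known-only-to-the-data-holder"  # assumption
anonymized_log = [(pseudonymize_ip(ip, key), video) for ip, video in log]

# The same IP always maps to the same pseudonym, so distinct viewers can
# still be counted without the raw addresses being handed over.
unique_viewers = len({viewer for viewer, _ in anonymized_log})
print(unique_viewers)
```

      The recipient can count and distinguish viewers, but every row from one address still shares one pseudonym, so the viewing history behind each pseudonym stays linked together.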

  • identicon
    Paul, 15 Jul 2008 @ 5:05pm

    well

    Not that I agree with it, but the supposed reason Viacom wants the logs is to prove that a majority of YouTube's use (well, maybe not majority, not sure what portion they have to prove) is for their copyrighted material. Even if it is, I'm still not sure it means anything, but it will mean a lot for them especially if the judge isn't too technologically savvy.

    • identicon
      Paul, 15 Jul 2008 @ 5:08pm

      Re: well

      Also, I just want to add that I've also looked at this from the point of view that Viacom knows they're wrong, but they want the money anyway. I think they're going to try to exploit technicalities and loopholes. In all honesty, I think it's clear that YouTube is protected by the safe harbor clause, but Viacom for whatever reason is apparently trying to one-up the RIAA in suing the wrong people and ruining their reputation. The big shots are too short-sighted and see all the money they could make off of this quickly, but they don't see how they're ruining their business. It seems like a common methodology these days from new business school folks. Maximize the amount of money you can make now, regardless of how much you potentially could make in the long run.

  • identicon
    SVContrarian, 15 Jul 2008 @ 5:23pm

    It's called a subpoena

    If it's unclear to TechDirt why Viacom needs this data, I suggest your unfailing love of Google has clouded your judgement. It's called a subpoena, read about them. They happen occasionally in lawsuits. When Microsoft had to turn over their email, logs, etc. to the Justice Department, haters everywhere were delighted. But heaven forbid someone accuse our beloved Google! Truth is, there's a pretty good chance Google's making money on other people's copyrighted content. Ahem... you may not LIKE the copyright laws, but be prepared to face the consequences when you violate them on a massive scale. If the court finds in favor of Viacom (and this data will be key to proving just how deep the infringement was and how complicit Google employees were), then cha-ching Viacom.

    • identicon
      Ljlego, 15 Jul 2008 @ 6:08pm

      Re: It's called a subpoena

      Before I start this, I must confess that I am not a lawyer. Chances are, you aren't either. One imagines that the extent of your law training is watching the occasional JAG or Law & Order (because who watches JAG anyway?). I could be wrong; I am also not clairvoyant.

      That said, I find your statement to be incredibly unintelligent. To say that Google (the entity) violated copyright laws is absurd, as Google does not provide the videos directly but instead provides a place to host them. To insinuate that Google was complicit in the actual copyright violation makes very little sense, as YouTube hosts at least 80 million videos.

      To say that Google (the individuals who make up the entity) violated copyright laws is even more absurd. Criminal law requires motive and intent to prosecute for a crime (ostensibly). Civil courts are much more open in that regard, but there still must be some proof. To prove that someone within the Google company, or that many individuals at high levels within Google conspired to infringe on copyright, would require proof that most likely isn't there to begin with.

      You are correct. Google is legally bound by the subpoena to turn over whatever the subpoena asks for. Techdirt doesn't question that, unless they're much stupider than I believe they are. They are questioning whether the subpoena was legal, or necessary, or grounded in proof.

      ADDENDUM: How in the hell will finding out which videos I watched prove how complicit Google employees were? In fact, without a copyright notice, there is no way to know whether the copyright is being infringed from a simple YouTube video. Sure, one can guess, but then, one can guess who a murderer is too.

      • identicon
        SVContrarian, 15 Jul 2008 @ 9:25pm

        Re: Re: It's called a subpoena

        Why is it so hard for all the fanboys to grasp that Google just might have violated copyright laws? That's what this case will uncover. Seems like the court already found enough evidence to grant the Viacom subpoena. So they're giving Viacom some leeway to fish around.

        The article naively asks why logs are relevant. Get real. Yes, the legal issue is copyright, so yes, Google's own logs are needed to prove the case. Download establishes how they're monetizing/profiting from viewership of copyrighted material, upload by employees shows they're complicit. Logs are needed to tell the story. Duh...

        Can't we just stop with the fake privacy outrage?

        Google = good and trustworthy? Viacom = evil because they're asserting ownership of their content and protecting their business model? That's a load.

        BTW - the bar for whether Google's complicit is quite low - if Google employees posted anything that Viacom owned (again, back to the logs, this time with Google employee data), they're pretty much toast. No, Sergey doesn't have to sign off on each SouthPark episode for them to be liable. If Viacom finds evidence of Google employees seeding the site, rest assured Google will be writing a big fat check, and Viacom will deserve every penny. And then Google can kiss their DMCA protection goodbye too. Should be fun to watch.

        • identicon
          Brooks, 16 Jul 2008 @ 1:19am

          Re: Re: Re: It's called a subpoena

          What law school did you go to, again? Because they seem to have a fairly shoddy program, in that they have miseducated you about IP laws *and* didn't manage to train you out of the childish habit of name-calling.

          Fact is, if Google systematically posted copyrighted content, or turned a blind eye to an employee doing so, there could be serious repercussions. But you've taken that fact and extrapolated out so far that it's just silly.

          Also, DMCA protections don't exist in some concrete sense; they can't be revoked in general. In any particular case, they can be attacked, as when the plaintiff alleges that the defendant was acting as more than a conduit and host for the material in question. Which, absolutely, is what this case is about. But even if Viacom wins, it doesn't mean Google in general loses all benefit of the DMCA safe harbor provisions. It just doesn't work that way.

          Please do let us all know what law school you graduated from. 'Cause I, for one, am never working with anyone who went there.

        • identicon
          Paul, 16 Jul 2008 @ 8:22am

          Re: Re: Re: It's called a subpoena

          Viacom can assert ownership and HAS. It's called a takedown notice. However, they've even gone so far as to break the law by serving takedown notices for things they do *not* hold the copyright to. That's actually against the law. No one has said anything because it's always the little people who get caught in the net incorrectly. Google is completely within the terms of safe harbor. I've actually read the law itself, not just interpretations. It's absolutely CLEAR. YouTube meets every requirement. This lawsuit is a farce. That's why people think Viacom is evil. It's tying up court systems and getting people like you to think that things that are legal are actually illegal.

        • identicon
          Anonymous Coward, 16 Jul 2008 @ 10:11pm

          Re: Re: Re: It's called a subpoena

          Download establishes how they're monetizing/profiting from viewership of copyrighted material, upload by employees shows they're complicit.
          And so just how are "anonymized" logs supposed to show who was uploading?

    • icon
      Mike (profile), 16 Jul 2008 @ 1:04am

      Re: It's called a subpoena

      If it's unclear to TechDirt why Viacom needs this data, I suggest your unfailing love of Google has clouded your judgement. It's called a subpoena, read about them.

      Um, we know quite a bit about how subpoenas work. The point was that it seemed like an unreasonable subpoena. Since you claim to know so much about subpoenas that the rest of us do not, I would imagine you would know that you don't just get to subpoena anything you want and automatically get it. You have to show a good reason for it. My point was that it's unclear what that "good reason" is.

      When Microsoft had to turn over their email, logs, etc. to the Justice Department, haters everywhere were delighted.

      Well, that was quite a different situation -- and, actually, contrary to your assertion, I wasn't delighted.

      But, more to the point, that was information that was directly relevant to the case. It is still unclear why user log data was relevant to the case.

      But heaven forbid someone accuse our beloved Google!

      Um. I'm hardly a Google-lover. I've come down pretty hard on the company when it does stupid things. But why let a little thing like reality come between you and a good rant?

      Truth is, there's a pretty good chance Google's making money on other people's copyrighted content.

      No. Google is making money providing a service. The fact that some *users* make use of that service to infringe on copyright is an issue between Viacom and those users.

      Ahem... you may not LIKE the copyright laws, but be prepared to face the consequences when you violate them on a massive scale.

      Sure. We agree. But that would require Google to have violated the law.

      If the court finds in favor of Viacom (and this data will be key to proving just how deep the infringement was and how complicit Google employees were), then cha-ching Viacom.

      If the point of the data were merely to show how complicit Google employees are, then the subpoena should have *only* covered the user accounts of Google employees.

      Why does it need everyone else's data?

      Furthermore, the complicity of Google employees alone may not be enough. If it's the Google janitor who is uploading The Colbert Report, it's unlikely that that is evidence that Google itself was complicit.

  • icon
    GeneralEmergency (profile), 15 Jul 2008 @ 5:35pm

    I can't stand the pressure...must confess.

    Yes, Viacom, it was -me- who watched that one episode of Beverly Hillbillies, "Elly Comes Out", 19,478 times.

    There. I said it. Now everyone knows.

    By the way, that title is -very- misleading.

  • icon
    Allen (profile), 15 Jul 2008 @ 7:53pm

    Since when

    I've seen other articles suggesting that they may want to demonstrate that Google (or Google employees) knew some copyrighted content was there and that Google did not act to remove it. If they can show it in some cases, they can extrapolate that there were others (they're good at that). All of which might lead to some kind of settlement.

    Not losing is as good as winning.

  • identicon
    Dr.A, 15 Jul 2008 @ 8:02pm

    Uploaders

    Was it just me who noticed that piece of news where Viacom asked to see specifically the videos uploaded by people@google? Maybe they want to prove that much of the material was uploaded by Google to attract people. If the videos are not user generated but Google provided ...

  • identicon
    scotus, 15 Jul 2008 @ 8:04pm

    United States v Rumely (1953)

    MR. JUSTICE FRANKFURTER delivered the opinion of the Court.

    The respondent Rumely was Secretary of an organization known as the Committee for Constitutional Government, which, among other things, engaged in the sale of books of a particular political tendentiousness. He refused to disclose to the House Select Committee on Lobbying Activities the names of those who made bulk purchases of these books for further distribution, and was convicted under R. S. 102, as amended, which provides penalties for refusal to give testimony or to produce relevant papers "upon any matter" under congressional inquiry. The Court of Appeals reversed, one judge dissenting. It held that the committee before which Rumely refused to furnish this information had no authority to compel its production. Since the Court of Appeals thus took a view of the committee's authority contrary to that adopted by the House in citing Rumely for contempt, we granted certiorari....

    . . . .

    MR. JUSTICE DOUGLAS, with whom MR. JUSTICE BLACK concurs, concurring....

    If the present inquiry were sanctioned, the press would be subjected to harassment that in practical effect might be as serious as censorship. A publisher, compelled to register with the Federal Government, would be subjected to vexatious inquiries. A requirement that a publisher disclose the identity of those who buy his books, pamphlets, or papers is indeed the beginning of surveillance of the press. True, no legal sanction is involved here. Congress has imposed no tax, established no board of censors, instituted no licensing system. But the potential restraint is equally severe. The finger of government leveled against the press is ominous. Once the government can demand of a publisher the names of the purchasers of his publications, the free press as we know it disappears. Then the spectre of a government agent will look over the shoulder of everyone who reads. The purchase of a book or pamphlet today may result in a subpoena tomorrow. Fear of criticism goes with every person into the bookstall. The subtle, imponderable pressures of the orthodox lay hold. Some will fear to read what is unpopular, what the powers-that-be dislike. When the light of publicity may reach any student, any teacher, inquiry will be discouraged. The books and pamphlets that are critical of the administration, that preach an unpopular policy in domestic or foreign affairs, that are in disrepute in the orthodox school of thought will be suspect and subject to investigation. The press and its readers will pay a heavy price in harassment. But that will be minor in comparison with the menace of the shadow which government will cast over literature that does not follow the dominant party line. If the lady from Toledo can be required to disclose what she read yesterday and what she will read tomorrow, fear will take the place of freedom in the libraries, book stores, and homes of the land. Through the harassment of hearings, investigations, reports, and subpoenas government will hold a club over speech and over the press. Congress could not do this by law....

    United States v Rumely (1953)

  • identicon
    Louise, 16 Jul 2008 @ 3:06am

    Privacy On The Web

    Whatever the outcome, the Google/Viacom case is opening people's eyes to both copyright infringement on the web and, dare I say it, privacy.
    Here's a blogpost on just that:

    http://passpack.wordpress.com/2008/07/15/defining-privacy-on-the-web/

    Louise

  • identicon
    Twinrova, 16 Jul 2008 @ 4:13am

    So much fuss over "privacy".

    I'm going to put this as bluntly as possible: The moment you signed up to YouTube is the moment you gave up your privacy.

    Giving any website any information, including a username, is no different than walking up to someone you don't even know and saying "Hi, here's my address!"

    Any internet user should always assume the info submitted to any website will have a chance to be released to parties outside of the venue to which they've signed.

    Even here at Techdirt, your IP and username information is subject to loss should anyone apply a court order for it.

    People are upset about the uncertainty of Viacom's use of the data, but shouldn't the question to ask yourself be "Why should I worry?", especially if you've done nothing but post videos about your cat's recent antics?

    Please don't reply stating "But I trusted Google's privacy statement!" because it's a moot point. Google didn't force you to sign up to use their free service, sell your information, or violate their own policy.

    A stupid judge made a stupid decision and now everyone's acting even more stupid.

    You would think people would understand PRIVATE DATA IS NO LONGER PRIVATE. Data "leaks", court ordered turnovers, or even malicious attacks to retrieve it only prove privacy online is a misconception and why data can never be anonymized.

    Disagree if you must, but deep down, you know this to be true.

    • identicon
      Anonymous Coward, 16 Jul 2008 @ 10:26pm

      Re: So much fuss over "privacy".

      Even here at Techdirt, your IP and username information is subject to loss should anyone apply a court order for it.
      Which is why I use Tor.

