from the data-data-data dept
As part of Google's ongoing Transparency Report efforts, today the company has released a whole new section on copyright takedowns, containing a huge amount of information on the many takedown requests Google receives. It focuses specifically on the takedowns for search links, but I wouldn't be surprised to see them add other areas later. As you may recall, we were among those who were victimized by a bogus takedown, and a key post about SOPA that we had written was missing from Google search for about a month.
The new transparency platform lets you dig in and see quite a few details about exactly who is issuing takedowns and what they're removing from search. It uses data going back to last July (when Google set up an organized web form, so the data is consistent). It may be a bit surprising, but at the top of the list? Microsoft, which has apparently taken down over 2.5 million URLs from Google's search results. Most of the others in the top 10 aren't too surprising. There's NBC Universal at number two. The RIAA at number three (representing all its member companies). BPI at number five. Universal Music at number seven. Sony Music at number eight. Warner Music doesn't clock in until number 12.
There's also data on which sites are most frequently targeted, which (not surprisingly) lists out a bunch of torrent search sites and file lockers and such. Don't be surprised to see some try to claim that this is an accurate list of "rogue sites" that Google should block entirely. However, if you look carefully at the data, Google also highlights the percentage of pages on those sites for which they've received takedowns, and the vast majority of them are well below 1%. In other words, no one has complained about well over 99% of the pages on these sites. It seems pretty drastic to suggest that these sites are obviously nothing but evil, when so many of their pages don't seem to receive any complaints at all.
Perhaps more important, however, is that Google is also revealing the incredible deluge of takedown requests it receives in search, each of which it tries to check to make sure it's legitimate. As it stands now, Google is processing over 250,000 such requests per week -- which is more than it got in the entire year of 2009. For all of 2011, Google received 3.3 million copyright takedowns for search... and here we are in just May of 2012, and they're already processing over 1.2 million per month. And while we've heard reports from the usual Google haters that Google is slow to respond to takedowns, it says that its average turnaround time last week was 11 hours. Think about that for a second. It's reviewing each one of these takedowns, getting 250,000 per week... and can still process them in less than 12 hours. That's pretty impressive.
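For anyone who wants to check how those figures relate to each other, here is a rough back-of-the-envelope sketch. It uses only the numbers quoted above. The 52-weeks-over-12-months conversion and the function names are my own assumptions for illustration, not anything Google publishes. Note that "over 250,000" is a floor: at exactly 250,000 per week the monthly average comes out a little under 1.1 million, so the 1.2 million monthly figure implies a weekly pace closer to 277,000.

```python
# Figures quoted in the post. The weekly number is a floor ("over 250,000").
REQUESTS_PER_WEEK = 250_000       # weekly takedown requests, May 2012
REQUESTS_PER_MONTH = 1_200_000    # monthly figure quoted in the post
TOTAL_2011 = 3_300_000            # all search takedowns received in 2011

# Assumption for illustration: an average month is 52/12 weeks long.
WEEKS_PER_MONTH = 52 / 12


def weekly_to_monthly(per_week: float) -> float:
    """Convert a weekly rate into an average monthly rate."""
    return per_week * WEEKS_PER_MONTH


def monthly_to_weekly(per_month: float) -> float:
    """Convert a monthly rate into an average weekly rate."""
    return per_month / WEEKS_PER_MONTH


def weeks_to_reach(total: float, per_week: float) -> float:
    """How many weeks at the given pace it takes to reach a total."""
    return total / per_week


monthly_floor = weekly_to_monthly(REQUESTS_PER_WEEK)
implied_weekly = monthly_to_weekly(REQUESTS_PER_MONTH)
weeks_for_2011 = weeks_to_reach(TOTAL_2011, REQUESTS_PER_WEEK)

print(f"250,000/week is about {monthly_floor:,.0f} per month")
print(f"1.2 million/month is about {implied_weekly:,.0f} per week")
print(f"2011's total would arrive in about {weeks_for_2011:.1f} weeks")
```

In other words, at the current pace Google sees the equivalent of all of 2011's search takedowns in roughly a quarter of a year.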
It's also interesting to hear that these reviews catch some pretty flagrant bogus takedown requests:
At the same time, we try to catch erroneous or abusive removal requests. For example, we recently rejected two requests from an organization representing a major entertainment company, asking us to remove a search result that linked to a major newspaper’s review of a TV show. The requests mistakenly claimed copyright violations of the show, even though there was no infringing content. We’ve also seen baseless copyright removal requests being used for anticompetitive purposes, or to remove content unfavorable to a particular person or company from our search results.
It's good to see Google catch these, as plenty of other sites would automatically take such content down, just to avoid any question of liability. Of course, it doesn't catch them all. Some get through -- as we ourselves discovered a few months ago. That led us to wonder if this tool could drill down and find the details about takedowns targeting Techdirt, but unfortunately at the moment there doesn't seem to be any way to actually search the list. Hopefully that will change soon.
Update: The search function is not currently advertised anywhere, but you can access it by using a URL:
http://www.google.com/transparencyreport/removals/copyright/domains/yourdomain.com/
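If you want to check several domains, a small helper can fill in that URL pattern for you. This is only a sketch: the base URL is copied from the update above, the pattern is unadvertised and may change or stop working, and `domain_report_url` is a hypothetical name of my own. It only builds the link and does not fetch anything.

```python
from urllib.parse import quote

# Base path taken from the URL in the update above. It is not an
# advertised interface, so treat it as something that may change.
BASE_URL = "http://www.google.com/transparencyreport/removals/copyright/domains/"


def domain_report_url(domain: str) -> str:
    """Build the per-domain report URL for a bare domain name.

    Hypothetical helper: it tidies up the input and appends it to BASE_URL.
    """
    cleaned = domain.strip().lower().strip("/")
    if not cleaned or "/" in cleaned:
        raise ValueError("expected a bare domain such as 'example.com'")
    # quote() percent-encodes any characters that are unsafe in a URL path.
    return f"{BASE_URL}{quote(cleaned)}/"


for name in ["techdirt.com", "example.com"]:
    print(domain_report_url(name))
```

Paste the printed link into a browser to see whatever the report lists for that domain.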
Of course, this is also a good reminder -- as they note in the Google blog post -- that if you run a website, you should absolutely sign up to use Google's Webmaster Tools, which will quickly inform you when one of your URLs is targeted by such a takedown, allowing you to easily file a counternotice.
Either way, this is really fascinating data and an interesting platform, shedding some significant light on just how often copyright holders are trying to take links out of Google, who's doing it and who they're targeting.
Filed Under: data, dmca, takedown, transparency
Companies: bpi, google, microsoft, nbc universal, riaa, sony music, universal music, warner music