Upload Filters And The Internet Architecture: What's There To Like?
from the absolutely-nothing dept
In August 2012, YouTube briefly took down a video uploaded by NASA. The video, which depicted a landing on Mars, was flagged by YouTube’s Content ID system as potential copyright infringement, even though, like everything else NASA creates, it was in the public domain. Then, in 2016, YouTube’s automated algorithms removed another video, this time a lecture by a Harvard Law professor that included snippets of various songs ranging from 15 to roughly 40 seconds. Of course, the use of copyrighted material for educational purposes of this kind is perfectly lawful. Examples of unwarranted takedowns are not limited to these two. Automated algorithms have been responsible for taking down perfectly legitimate content relating to marginalized groups, political speech, or information documenting war crimes.
But the over-blocking of content through automated filters is only one part of the problem. A few years ago, automated filtering was used by only a handful of companies; over the years, however, it has increasingly become the go-to technical tool for policy makers wanting to address any content issue, whether it involves copyrighted material or other forms of objectionable content. In the last few years in particular, Europe has been championing upload filters as a solution for the management of content. Although never explicitly named, upload filters started appearing in various Commission documents as early as 2018 and became a tangible policy tool in 2019 with the adoption of the Copyright Directive.
Broadly speaking, upload filters are technology tools that platforms such as Facebook and YouTube use to check whether content published by their users falls within any of the categories of objectionable content. They are not new: YouTube’s Content ID system dates back to 2007. They are not cheap: Content ID has reportedly cost more than $100 million to build. And they are not effective: automated tools will always over-block or under-block content.
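To make the mechanics concrete, here is a deliberately minimal sketch, in Python, of the core decision an upload filter makes. It is not Content ID or any real system: the fingerprint function, the blocklist contents and the exact-hash lookup are all simplifying assumptions, and production filters rely on perceptual fingerprinting and machine-learning classifiers instead, which is exactly why they can over-block and under-block.

```python
# Minimal, hypothetical sketch of an upload filter's core logic.
# Real systems (e.g. YouTube's Content ID) use perceptual audio/video
# fingerprints and ML classifiers, not the exact SHA-256 match shown here.
import hashlib

# Placeholder blocklist: fingerprints of works a platform has been asked to block.
KNOWN_FINGERPRINTS = {
    "3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b": "example claimed work",
}

def fingerprint(uploaded_bytes: bytes) -> str:
    # Stand-in for a perceptual fingerprint; here just a cryptographic hash.
    return hashlib.sha256(uploaded_bytes).hexdigest()

def check_upload(uploaded_bytes: bytes) -> str:
    # Return "block" if the upload matches a known fingerprint, "allow" otherwise.
    # Nothing in this decision can weigh context such as fair use,
    # public-domain status or newsworthiness.
    if fingerprint(uploaded_bytes) in KNOWN_FINGERPRINTS:
        return "block"
    return "allow"

print(check_upload(b"newly uploaded video data"))  # -> "allow" (no match in the placeholder list)
```

The point of the sketch is the shape of the decision: a match against a privately maintained database triggers an automated outcome, with no step at which a human weighs context such as fair use or public-domain status.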
But, even with these limitations, upload filters continue to be the preferred option for content policy making. Partly, this is because policy makers depend on online platforms to offer technology solutions that can scale and can moderate content en masse. Another reason is that removing content through takedowns is perceived as easier and as having an instant effect. In a world where more than 500 hours of content are uploaded to YouTube every minute and 350 million photos are posted to Facebook every day, technology solutions such as upload filters appear more desirable than the alternative of leaving the content up. A third reason is the computer-engineering bias of the industry: typically, when you build software systems, you follow a pretty much predetermined route. You identify a gap, build something to fill that gap (and, hopefully, make money in the process), and then you iteratively fix bugs in the program as they are uncovered. Notice that in this process, the question of whether the problem is best solved through building software is never asked. This has been the case with upload filters.
As online platforms become key infrastructure for users, however, the moderation practices they adopt are about more than content removal. Through such practices, online platforms undertake a governance function, one that must ensure the productive, pro-social and lawful interaction of their users. Governments have depended on platforms carrying out this function for quite some time, but over the past few years they have become increasingly interested in setting the rules of social network governance themselves. To this end, a growing number of new regional and national policies mandate upload filters for content moderation.
What is at stake?
The use of upload filters, and the legislative efforts to promote them and make them compulsory, is having a major effect on Internet infrastructure. One of the core properties of the Internet is that it is based on an open architecture of interoperable and reusable building blocks. Within this open architecture, the building blocks work together to provide services to end users, while each building block delivers a specific function. All of this allows for fast and permissionless innovation everywhere.
User-generated-content platforms are now inserting automated filtering mechanisms deep in their networks in order to deliver services to their users. Platforms with significant market power have convened a forum called the Global Internet Forum to Counter Terrorism (GIFCT), through which approved participants (but not everyone) collaborate to create shared upload filters. The idea is that these filters are interoperable amongst platforms, which, prima facie, is good for openness and inclusiveness. But, allowing the design choices of filters to be made by a handful of companies turns them into de facto standards bodies. This provides neither inclusivity nor openness. Against this background, it is worrisome that some governments appear keen to empower, and perhaps anoint, this industry consortium as a permanent institution for anyone who accepts content from users and republishes it. In effect, this makes an industry consortium, with its design assumptions, a legally required and permanent feature of Internet infrastructure.
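To picture what "shared upload filters" means in practice, GIFCT's best-known tool is a shared hash database that member platforms check their own uploads against. The sketch below is a rough, assumption-laden illustration of that idea only; the class, the method names and the exact-hash matching are invented for this example and do not reflect GIFCT's actual systems or interfaces.

```python
# Hypothetical illustration of a consortium-run shared hash database.
# All names here are invented for the sketch; GIFCT's real tooling differs.
import hashlib

class SharedHashDB:
    """A shared set of hashes of content that member platforms have agreed to block."""

    def __init__(self):
        self._hashes = set()

    def contribute(self, content: bytes) -> None:
        # One member platform adds a hash; every other member will now filter it too.
        self._hashes.add(hashlib.sha256(content).hexdigest())

    def is_blocked(self, content: bytes) -> bool:
        return hashlib.sha256(content).hexdigest() in self._hashes

# Two different platforms consulting the same database: a removal decision
# made by one member propagates, unreviewed, to uploads on all the others.
db = SharedHashDB()
db.contribute(b"content platform A decided to remove")
print(db.is_blocked(b"content platform A decided to remove"))  # True for every member platform
```

Whoever controls what goes into that database, and on what criteria, is effectively setting the standard for everyone who queries it.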
Convening closed consortiums like the GIFCT, combined with governments’ urge to make upload filters mandatory, can violate some of the most important principles of the Internet’s architecture: ultimately, upload filters are based not on collaborative, open, voluntary standards but on closed, proprietary ones owned by specific companies. Therefore, unlike traditional building blocks, these upload filters end up not being interoperable. Smaller online platforms will need to license them. New entrants may find the barriers to entry too high. This, once again, tilts the scales in favor of large, incumbent market players and disadvantages innovators with new approaches to these problems.
Moreover, mandating GIFCT tools, or any other particular technology, locks in the design assumptions underpinning that upload filter framework. Upload filters function as a sort of panopticon operated by social media companies. But if the idea is to design a social media system that is inherently resistant to this sort of surveillance, for instance one in which communications are shielded from the platform itself, then upload filters cannot work, because the platform cannot filter what it cannot see. In effect, mandating GIFCT tools also determines what sort of system design is acceptable and what is not. This makes the regulation invasive: it undermines the "general purpose" nature of the Internet, because some purposes simply get ruled out under this approach.
The current policy objective behind upload filters is twofold: regulating content and taming the dominance of certain players. These are legitimate objectives. But, as technology tools, upload filters fail on both counts: not only are they limited in how effectively they can moderate content, they also cement the dominant position of big technology companies. Given the cost of creating such a tool and the requirement for online platforms to have systems that ensure the fast, rigorous and efficient takedown of content, a trend is emerging in which smaller players depend on the systems of bigger ones.
Ultimately, upload filters are imperfect and are not an effective solution to our Internet and social media governance problems. They do not reduce the risk of recidivism: they remove individual pieces of content without addressing why that content keeps appearing, so the problems recur. Aside from the fact that upload filters cannot solve societal problems, mandated upload filters can adversely affect the Internet’s architecture. That architecture can be harmed by unnecessary technology tools, like deep packet inspection, DNS blocking or upload filters. These tools produce consequences that run counter to the benefits we expect from the Internet: they compromise its flexibility and prevent it from continuing to serve a diverse and constantly evolving community of users and applications. Instead, they require significant changes to networks in order to support their use.
Overall, there is a real risk that upload filters become a permanent feature of the Internet architecture and online dialogue. This is not a society that any of us should want to live in - a society where speech is determined by software that will never be able to grasp the subtlety of human communication.
Konstantinos Komaitis is the Senior Director, Policy Strategy at the Internet Society.
Farzaneh Badiei is the Director of the Social Media Governance Initiative at Yale Law School.
Filed Under: article 17, contentid, copyright directive, eu, overblocking, upload filters
Reader Comments
"Notice that in this process, the question of whether the problem is best solved through building a software is never asked"
This is never asked because it's an impossible and irrelevant question. Whether someone else with different tools and experience than you might hypothetically be able to solve the problem in a better way if they ever get off their thumbs is not a productive question.
if particular filters are mandated
Normally, a mandate takes the form of requiring some set of acts to be taken or not taken. This may agree with the acts involved in the GIFCT filters, but in order to be mandated consistent with due process, those particular acts ought to be spelled out.
Requiring "Microsoft Bob" or "Microsoft Office" compatibility is opaque. Likewise, mandating "GIFCT Filtering" is opaque: how is someone to know if his filters comply?
Correction
There's a mistake in the article -- it says "500 hours of content are uploaded hourly", whereas it should be "each minute".
Re:
Are you saying "all we know how to do is write software, so we will attempt to solve every problem by writing software whether that will work or not" is an appropriate approach? Or have I misunderstood?
Content moderation? Say it ain't so...
The idea is that these filters are interoperable amongst platforms, which, prima facie, is good for openness and inclusiveness. But, allowing the design choices of filters to be made by a handful of companies turns them into de facto standards bodies. This provides neither inclusivity nor openness.
Just like Facebook and Twitter picking and choosing who is allowed to post opinions online provides neither inclusivity nor openness.
Got it.
Re: Content moderation? Say it ain't so...
Just like Facebook and Twitter picking and choosing who is allowed to post opinions on their own platforms and nowhere else is protected by the first amendment right to freedom of expression and private property rights.
FTFY
I can’t believe some people think Twitter and Facebook either are the entire Internet or control the entire Internet.
Re:
I named only 2 platforms, not the entire internet.
FTFY
Re: Re:
Yes, and you said that those two platforms are "picking and choosing who is allowed to post opinions online."
While posting your opinion online without the approval of either of those two platforms.
Re: Re: Re:
Wow. Context really eludes you, doesn't it? Your vacuous trolling knows no bounds. But hey, you do you. :)
You could’ve added more context if you didn’t want us to think you meant “the Internet” when you said “online”. But alas, no takebacks here.
Re: Re: Re: Re:
It's not my fault words have meanings. If you want to be understood, use the words that mean the things that you want others to understand.
It's still going on.
Last month I put up a NASA panel discussion and guide about moon photography (I dabble in it a fair bit) and got hit by an Adrev claim. I filed the counternotice, but it took the full 30 days to expire (last week, on the 5th) rather than being cancelled outright.
Very annoying.
Re: Re:
If the best you can come up with will improve the situation and no one else is doing anything, it doesn't matter whether it's the best solution. "The best solution" is an unreachable goal. Someone will always come up with a better solution eventually; maybe it will be you, after you try the best one you can come up with right now and learn more about the issue.
If the only one bringing anything to the table is a software company, don't blame the software company that we only have software ideas on the table.
Re: Re:
I'm saying you don't know what you don't know. If no one else is stepping up, the best way you can think of and accomplish with your knowledge and skills is the best you are going to get, and it doesn't matter whether it's the best way to do it or not.
Re: Dumb it down to a third grade level so nasch can understand.
Duly noted.