from the transparency-reporting-on-content-moderation dept
On February 2nd, Santa Clara University is hosting a gathering of tech platform companies to discuss how they actually handle content moderation questions. Many of the participants have written short essays about the questions that will be discussed at this event -- and over the next few weeks we'll be publishing many of those essays, including this one.
In the wake of ongoing concerns about online harassment and harmful content, continued terrorist threats, changing hate speech laws, and the ever-growing user bases of major social media platforms, tech companies are under more pressure than ever before with respect to how they treat content on their platforms—and often that pressure is coming from different directions. Companies are being pushed hard by governments and many users to be more aggressive in their moderation of content, to remove more content and to remove it faster, yet they are also consistently coming under fire for taking down too much content or lacking adequate transparency and accountability around their censorship measures. Some on the right, like Steve Bannon and FCC Chairman Ajit Pai, have complained that social media platforms are pushing a liberal agenda via their content moderation efforts, while others on the left are calling for those same platforms to take down more extremist speech. Meanwhile, free expression advocates are deeply concerned that companies' content rules are so broad as to impact legitimate, valuable speech, or that overzealous attempts to enforce those rules are accidentally causing collateral damage to wholly unobjectionable speech.
Meanwhile, there is a lot of confusion about what exactly the companies are doing with respect to content moderation. The few publicly available insights into these processes, mostly from leaked internal documents, reveal bizarrely idiosyncratic rule sets that could benefit from greater transparency and scrutiny, especially to guard against discriminatory impacts on oft-marginalized communities. The question of how to address that need for transparency, however, is difficult. There is a clear need for hard data about specific company practices and policies on content moderation, but what does that look like? What qualitative and quantitative data would be most valuable? What numbers should be reported? And what is the most accessible and meaningful way to report this information?
Part of the answer to these questions can be found by looking to the growing field of transparency reporting by internet companies. The most common kind of transparency report that companies voluntarily publish gives detailed numbers about government demands for information about the companies’ users—showing, for example, how many requests were received, from what countries or jurisdictions, what kind of data was requested, and whether the company complied. As reflected in this history of the practice published by our organization, New America’s Open Technology Institute (OTI), transparency reporting about government demands for data has exploded over the past few years, so much so that projects like the Transparency Reporting Toolkit, from OTI and Harvard’s Berkman Klein Center for Internet & Society, have emerged to try to define consistent standards and best practices for such reporting. Meanwhile, a decent number of companies have also started publishing reports about the legal demands they receive for the takedown of content, whether copyright-based or otherwise.
However, almost no one is publishing data about what we're talking about here: voluntary takedowns of content by companies based on their own terms of service (TOS). Yet especially now, as private censorship grows even more aggressive, the need for transparency only increases. This need has led to calls from a variety of corners for companies to report on content moderation. For example, a working group of the Freedom Online Coalition, composed of representatives from industry, civil society, academia, and government, called for meaningful transparency about companies’ content takedown efforts, complaining that “there is very little transparency” around TOS enforcement mechanisms. The 2015 Ranking Digital Rights Corporate Accountability Index found that every company surveyed received a failing grade with respect to reporting on TOS-based takedowns; companies fared only slightly better in the 2017 Index. Finally, David Kaye, the United Nations Special Rapporteur on the promotion and protection of the right to freedom of opinion and expression, called for companies to “disclose their policies and actions that implicate freedom of expression.” Specifically, he observed that “there are … gaps in corporate disclosure of statistics concerning volume, frequency and types of request for content removals and user data, whether because of State-imposed restrictions or internal policy decisions.”
The benefits to companies of issuing such transparency reports around their content moderation activities would be significant: For those companies under pressure to “do something” about problematic speech online, such a report is an opportunity to outline the lengths to which they have gone to do just that; for companies under fire for “not doing enough,” a transparency report would help them convey the size and complexity of the problems they are addressing, and explain that there is no magic artificial intelligence wand they can wave to make online extremism and harassment disappear; and finally, public disclosure about content moderation and terms of service practices would go a long way toward building trust with users—a trust that has crumbled in recent years. Putting aside the benefits to companies, though, there is the even more significant need of policymakers and the public. Before we can have an intelligent conversation about hate speech, terrorist propaganda, or other worrisome content online, or formulate fact-based policies about how to address that content, we need hard data about the breadth and depth of those problems, and about the platforms' current efforts to solve them.
While there have been calls for publication of such information, there has been little specificity with respect to what exactly should be published. No doubt this is due, in great part, to the opacity of individual companies’ content moderation policies and processes: It is difficult to identify specific data that would be useful without knowing what data is available in the first place. Anecdotes and snippets of information from companies like Automattic and Twitter offer a starting point for considering what information would be most meaningful and valuable. Facebook has said it is entering a new era of transparency for the platform. Twitter has published some data about content removed for violating its TOS, Google followed suit for some of the content removed from YouTube, and Microsoft has published data on “revenge porn” removals. While each of these examples is a step in the right direction, what we need is a consistent push across the sector for clear and comprehensive reporting on TOS-based takedowns.
Looking to the example of existing reports about legally mandated takedowns, data that shows the scope and volume of content removals, account removals, and other forms of account or content interference/flagging would be a logical starting point. Information about content that has been flagged for removal by a government actor—such as the U.K.’s Counter Terrorism Internet Referral Unit, which was granted “super flagger” status on YouTube, allowing the agency to flag content in bulk—should also be included, to guard against undue government pressure to censor. More granular information, such as the number of takedowns in particular categories of content (whether sexual content, harassment, extremist speech, etc.), or specification of the particular term of service violated by each piece of taken-down content, would provide even more meaningful transparency. This kind of quantitative data (i.e., numbers and percentages) would be valuable on its own, but would be even more helpful if paired with qualitative data to shed more light on the platforms’ opaque content moderation practices and tell users a clear story about how those processes actually work, using compelling anecdotes and examples.
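To make that kind of granular, quantitative reporting concrete, here is a minimal sketch, in Python, of what a single takedown record and a per-category summary might look like. Every field name and category label in it is a hypothetical illustration, not an actual reporting standard or any platform's real schema.

```python
# A purely illustrative sketch of what one record in a TOS-takedown
# transparency report might contain, plus the kind of per-category
# aggregation a published report would show. All field names and
# category labels are hypothetical examples.

from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class TakedownRecord:
    reporting_period: str                      # e.g. "2017-H2"
    content_category: str                      # e.g. "harassment", "extremist speech"
    tos_provision: str                         # which term of service was violated
    action_taken: str                          # e.g. "content removed", "account suspended"
    flag_source: str                           # e.g. "user report", "automated", "government referral"
    government_referrer: Optional[str] = None  # e.g. the referring agency, if applicable


def summarize(records: List[TakedownRecord]) -> Dict[Tuple[str, str], int]:
    """Count takedowns by (content category, flag source) -- the aggregate
    numbers a report would publish, rather than individual records."""
    counts: Dict[Tuple[str, str], int] = {}
    for r in records:
        key = (r.content_category, r.flag_source)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

The particular fields matter less than the principle: if each removal is tied to a content category, a specific TOS provision, and a flagging source (including government referrals), then the kinds of aggregates described above can be computed and published consistently across platforms.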
As has often happened with existing transparency reports, this data will help keep companies accountable. Few companies will want to be demonstrably the most or least aggressive censor, and anomalous data such as huge spikes around particular types of content will be called out and questioned by one stakeholder group or another. It will also help ensure that overreaching government pressure to take down more content is recognized and pushed back on, just as current reporting has helped identify and put pressure on countries making outsized demands for users’ information. And most importantly, it will help drive policy proposals that are based on facts and figures rather than on emotional pleas or irrational fears—policies that hopefully will help make the internet a safer space for a range of communities while also better protecting free expression.
Unquestionably, the major platforms have become our biggest online gatekeepers when it comes to what we can and cannot say. Whether we want them to have that power or not, and whether we want them to use more or less of that power in regard to this or that type of speech, are questions we simply cannot answer until we have a complete picture of how they are using that power. Transparency reporting is our first and best tool for gaining that insight.
Kevin Bankston is the Director of the Open Technology Institute at New America. Liz Woolery is Senior Policy Analyst at the Open Technology Institute at New America.
Filed Under: censorship, content moderation, due process, filters, platforms, transparency