Implementing Transparency About Content Moderation
from the not-as-easy-as-it-looks dept
On February 2nd, Santa Clara University is hosting a gathering of tech platform companies to discuss how they actually handle content moderation questions. Many of the participants have written short essays about the questions that will be discussed at this event -- and over the next few weeks we'll be publishing many of those essays, including this one.
When people express free speech-based concerns about content removal by platforms, one type of suggestion they generally offer is -- increase transparency. Tell us (on a website or in a report or with an informative "tombstone" left at the URL where the content used to be) details about what content was removed. This could happen lots of different ways, voluntarily or not, by law or industry standard or social norms. The content may come down, but at least we'll have a record and some insight into what happened, at whose request, and why.
In light of public discussions about platform transparency, especially in the past year, this post offers a few practical thoughts about transparency by online UGC platforms. First, it looks at some of the challenges platforms face in figuring out how to be transparent with users and the public about their content moderation processes. Second, it considers the industry practice of transparency reports and what might be done to make them as useful as possible.
Content Moderation Processes & Decisions
So, why not be radically transparent and say everything? Especially if you're providing a service used by a substantial chunk of the public and have nothing to hide. Just post all takedown requests in their entirety and all correspondence with people asking you to modify or remove content. The best place to start answering this is by mentioning some of the incentives a platform faces here and the legitimate reasons it might say less than everything (leaving aside self-interested reasons like avoiding outside scrutiny and saving embarrassment over shortcomings such as arguably inconsistent application of moderation rules or a deficient process for creating them).
First, transparency is sometimes in tension with the privacy of not just users of a service, but any person who winds up the subject of UGC. Just as the public, users, regulators, and academics are asking platforms to increase transparency, the same groups have made equally clear that platforms should take people's privacy rights seriously. The legal and public relations risks of sharing information in a way that abridges someone's privacy are often uncertain and potentially large. This does not mean they cannot be outweighed by transparency values, but I think that in order to weigh them properly, this tension has to be acknowledged and thought through. In particular, however anonymized a given data set is, the risks of de-anonymization increase with time as better technologies come to exist. Today's anonymous data set could easily be tomorrow's repository of personally identifiable information, and platforms are acting reasonably when they choose to safeguard these future and contingent rights for people by sometimes erring on the side of opacity around anything that touches user information.
Second, in some cases, publicizing detailed information about a particular moderation decision risks maintaining or intensifying the harm that moderation was intended to stop or lessen. If a piece of content is removed because it violates someone's privacy, then publicizing information about that takedown or redaction risks continuing the harm if the record is not carefully worded to exclude the private information. Or, in cases of harassment, it may provide information to the harasser or the public (or the harasser's followers, who might choose to join in) for that harassment to continue. In some cases, the information can be described at a sufficiently high level of generality to avoid harm (e.g., "a private person's home address was published and removed" or "pictures of a journalist's children were posted and removed"). In other cases, it may be hard or impossible (e.g., "an executive at small company X was accused of embezzling by an anonymous user"). Of course, generalizing at too high a level may frustrate those seeking greater transparency as not much better than not releasing the information at all.
Finally, in some cases publicizing the details of a moderation team's script or playbook can make the platform's rules easier to break or hack by bad-faith actors. I don't think these are sufficient reasons to perpetuate existing confidentiality norms. But, if platform companies are being asked or ordered to increase the amount of public information about content moderation and plan to do so, they may as well try to proceed in a way that accounts for these issues.
Transparency Reports
Short of the granular information discussed above, many UGC platforms already issue regular transparency reports. Increasing expectations or commitments about what should be included in transparency reports could wind up being an important way to move confidentiality norms while also ensuring that the information released is structured and meaningful.
With some variation, I've found that the majority of UGC platform transparency reports cover information along two axes. The first is the type of request: requests to remove or alter content, and requests for information. The second is whether a given request comes from a private person or a government actor. A greater push for transparency might mean adding categories to these reports with more detail about the content of requests and the procedural steps taken along the way, rather than just the usually binary output of "action taken" or "no action taken" that one finds in these reports -- for example, the law or platform rule that is the basis for removal, or more detail about what relevant information was taken into account (such as "this post was especially newsworthy because it said ..." or "this person has been connected with hate speech on [other platform]"). As pressure to filter or proactively filter platform content increases from legislators in places like Europe and from Hollywood, we may also want to add a category for removals that happened based on a content platform's own proactive efforts, rather than on a complaint.
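To make that structure concrete, here is a minimal sketch in Python -- purely hypothetical, not any platform's actual reporting schema -- of how a single transparency report entry might be coded along the two axes described above, with the proposed additional fields included. All of the names and categories below are assumptions for illustration only.

```python
# Hypothetical data model for one transparency report entry. The two enums
# correspond to the two axes described in the text; the optional fields reflect
# the additional detail proposed there. None of this mirrors any real platform's
# reporting schema.
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class RequestType(Enum):
    CONTENT_REMOVAL_OR_ALTERATION = "content_removal_or_alteration"
    INFORMATION_REQUEST = "information_request"


class RequesterType(Enum):
    PRIVATE_PERSON = "private_person"
    GOVERNMENT_ACTOR = "government_actor"
    PLATFORM_PROACTIVE = "platform_proactive"  # proposed: removal with no outside complaint


@dataclass
class TransparencyReportEntry:
    request_type: RequestType
    requester_type: RequesterType
    outcome: str  # today usually just "action taken" or "no action taken"
    legal_or_policy_basis: Optional[str] = None  # proposed: the statute or platform rule relied on
    factors_considered: List[str] = field(default_factory=list)  # proposed: e.g. newsworthiness


# Example entry: a government demand to remove a post, granted under a (hypothetical) local law.
entry = TransparencyReportEntry(
    request_type=RequestType.CONTENT_REMOVAL_OR_ALTERATION,
    requester_type=RequesterType.GOVERNMENT_ACTOR,
    outcome="action taken",
    legal_or_policy_basis="local defamation statute (hypothetical)",
    factors_considered=["newsworthiness of the post", "identity of the subject"],
)
```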
Nevertheless, transparency reports as they are currently done raise questions about how to meaningfully interpret them and what can be done to improve their usefulness.
A key question I think we need to address moving forward: are the various platform companies' transparency reports apples-to-apples in their categories? Being able to someday answer yes would involve greater consistency in terminology across the industry (e.g., are platforms using similar terms to mean similar things, like "hate speech" or "doxxing," irrespective of their potentially differing policies about those types of content).
Relatedly, is there a consistent framework for classifying and coding the requests received by each company? Doing more to articulate and standardize this coding, though perhaps unexciting, will be crucial infrastructure for providing meaningful classes and denominators for what types of actions people are asking platform companies to take and on what grounds. Questions here include: is there relative consistency in how each company codes a particular request, or the type of action taken in response? For example, how would each code a demand email with some elements of a DMCA notice, a threat of suit based on trademark infringement, an allegation of a rules/TOS violation based on harassment, and an allegation that the poster has acted in breach of a private confidentiality agreement? What if a user modifies their content of their own volition based on a DMCA or other request? What if a DMCA notice is received for one copy of a work posted by a user account, but in investigating, a content moderator finds 10 more works that they believe should be taken down based on their subjective judgment of the existence of possible red flag knowledge?
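As one way of picturing the coding problem, here is a purely hypothetical sketch -- not an existing standard or any company's actual practice -- of how a shared scheme might record the kind of mixed demand described above as a set of standardized grounds rather than a single label, so that counts from different platforms stay comparable.

```python
# Hypothetical shared coding scheme for the grounds asserted in a single demand.
# Tagging one request with multiple standardized grounds (instead of forcing one
# label per request) is one way to keep cross-platform report counts comparable.
from enum import Enum


class RequestGround(Enum):
    DMCA_COPYRIGHT = "dmca_copyright"
    TRADEMARK = "trademark"
    TOS_HARASSMENT = "tos_harassment"
    PRIVATE_AGREEMENT_BREACH = "private_agreement_breach"


# The mixed demand email described in the text, coded as one record.
mixed_demand_record = {
    "grounds": {
        RequestGround.DMCA_COPYRIGHT,
        RequestGround.TRADEMARK,
        RequestGround.TOS_HARASSMENT,
        RequestGround.PRIVATE_AGREEMENT_BREACH,
    },
    "items_named_in_notice": 1,
    "items_actioned_after_review": 11,  # e.g. the 10 additional works a moderator found while investigating
    "user_self_modified": False,        # whether the user altered the content of their own volition
}
```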
Another question is how to ensure the universe of reporting entities is complete. Are we missing some types of companies and as a result lacking information on what is out there? The first type that comes to mind is nominally traditional online publishers, like the New York Times or Buzzfeed, who also host substantial amounts of UGC, even if it is not their main line of business. Although these companies focus on their identity as publishers, they are also platforms for their own and others' content. (Section 3 of the Times' Terms of Service spells out its UGC policy, and Buzzfeed's Community Brand Guidelines explain things such as the fact that a post with "an overt political or commercial agenda" will likely be deleted.)
Should the Times publish a report on which comments they remove, how many, and why? Should they provide (voluntarily, by virtue of industry best practices, or by legal obligation) the same level of transparency major platforms already provide? If not, why not? (Another interesting question: based on what we've learned about the benefits of transparency into the processes by which online content is published or removed, should publisher/platforms perhaps be encouraged to also provide greater transparency into non-UGC content that is removed, altered, or never published by virtue of what has traditionally been considered editorial purview, such as a controversial story that is spiked at the last minute due to a legal threat, or factual allegations removed from a story for the same reason? And over time, we can expect that more companies will exist that cannot be strictly classified as publisher or platform, but which should nevertheless be expected to be transparent about their content practices.) Without thinking through these questions, we may lack a full data set of online expression and lose our ability to aggregate useful information about practices across types of content environments before we've started.
Alex Feerst is the Head of Legal at Medium
Filed Under: content moderation, free speech, hard choices, privacy, security, transparency