Techdirt's think tank, the Copia Institute, is working with the Trust & Safety Professional Association and its sister organization, the Trust & Safety Foundation, to produce an ongoing series of case studies about content moderation decisions. These case studies are presented in a neutral fashion, not aiming to criticize or applaud any particular decision, but to highlight the many different challenges that content moderators face and the tradeoffs they result in. Find more case studies here on Techdirt and on the TSF website.

Content Moderation Case Study: Twitter's Algorithm Misidentifies Harmless Tweet As 'Sensitive Content' (April 2018)

Content Moderation

from the content-moderation-isn't-easy dept

Fri, Sep 25th 2020 3:30pm — Copia Institute

Summary: While some Twitter users welcome the chance to view and interact with "sensitive" content, most do not. Twitter utilizes algorithms to detect content average users would like to avoid seeing, especially if they've opted in to Twitter's content filtering via their user preferences.

Unfortunately, software can't always tell what's offensive and what just looks offensive to the programmable eye that constantly scans uploads for anything that should be hidden from public view unless the viewer has expressed a preference to see it.

A long-running and well-respected Twitter account that focused on the weirder aspects of Nintendo's history found itself caught in Twitter's filters. The tweeted image featured an actor putting on his Princess Peach costume. It focused on the massive Princess Peach head, which apparently contained enough flesh color and "sensitive" shapes to get it -- and the Twitter account -- flagged as "sensitive."

The user behind the account tested Twitter to see if it was its algorithm or something else setting off the "sensitive" filter. Dummy accounts tweeting the image were flagged almost immediately, indicating it was the image -- rather than other content contained in the user's original account -- that had triggered the automatic moderation.

Unfortunately, the account was likely followed by several users who never expected it to suddenly shift to "sensitive" content. Thanks to the algorithm, the entire account was flagged as "sensitive," possibly resulting in the account losing followers.

Twitter ultimately removed the flag, but the user was never directly contacted by Twitter about the alleged violation.

Decisions to be made by Twitter:

Are false positives common enough that a notification process should be implemented?
Should the process be backstopped by human moderators? If so, at what point does double-checking the algorithm become unprofitable?
Would a challenge process that involved affected users limit collateral damage caused by AI mistakes?
Does sensitive content negatively affect enough users that over-blocking/over-moderation is acceptable?
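
The tradeoffs in the questions above can be made concrete with a minimal sketch. It assumes a hypothetical classifier that assigns each upload a "sensitive" score between 0 and 1. The thresholds, the `Decision` type, and the `route` function are illustrative assumptions, not a description of Twitter's actual system.

```python
from dataclasses import dataclass

# Hypothetical thresholds chosen for illustration only.
AUTO_FLAG_THRESHOLD = 0.95  # at or above this score, flag without human review
REVIEW_THRESHOLD = 0.60     # from here up to the auto-flag threshold, ask a human


@dataclass
class Decision:
    action: str        # "allow", "human_review", or "flag"
    notify_user: bool  # whether the uploader is told about the decision


def route(score: float) -> Decision:
    """Route an upload based on an assumed classifier score in [0, 1]."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= AUTO_FLAG_THRESHOLD:
        # Automatic flag: notifying the user gives them a chance to challenge it.
        return Decision("flag", notify_user=True)
    if score >= REVIEW_THRESHOLD:
        # Uncertain range: a human moderator double-checks the algorithm.
        return Decision("human_review", notify_user=False)
    return Decision("allow", notify_user=False)


if __name__ == "__main__":
    for s in (0.10, 0.72, 0.97):
        print(s, route(s))
```

Lowering `REVIEW_THRESHOLD` sends more uploads to human moderators, which catches more false positives at a higher cost. Raising it leaves more decisions to the algorithm alone.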

Questions and policy implications to consider:

Should Twitter change its content rules to further deter the posting of sensitive content?
Given Twitter's reputation as a porn-friendly social media platform, would stricter moderation of sensitive content result in a noticeable loss of users?
Should Twitter continue to remain one of the only social media outlets that welcomes "adult" content?
If users are able to opt out of filtering at any point, is Twitter doing anything to ensure younger users aren't exposed to sensitive material?

Resolution: Twitter removed the flag on the user's account. According to the user behind the account, it took the work of an employee "behind the scenes" to remove the "sensitive content" warning. Since there was no communication between Twitter and the user, it's unknown if Twitter has implemented any measures to limit future mischaracterizations of uploaded content.

Filed Under: content moderation, misidentification, offensive content
Companies: twitter

Reader Comments

Samuel Abram (profile), 25 Sep 2020 @ 3:47pm

"flesh color"
The thing is, not all flesh color is pasty white; there is also flesh color that is darker in complexion, and unfortunately, many places in Silicon Valley fail to take that into account. This is relevant to the article because darker skin tones don't always get tested while lighter skin tones get over-tested. Over-testing results in many false positives, whereas false negatives can permeate when it comes to people with darker skin tones: the lack of diversity in skin color at Silicon Valley's firms means that people with more melanin don't get true positives, let alone false ones.
Anonymous Coward, 25 Sep 2020 @ 4:05pm

My question is: If they are using "AI", why does it never seem to get any better at its job?

I've watched people put video and images through Google's API for their cloud machine learning wotsit, and it is frequently incredibly awful.

(I don't know where the interface lives, or if it still exists, or whether one needs an account, but see https://cloud.google.com/vision/docs/how-to and in particular the Safe Search bit.)
Anonymous Coward, 25 Sep 2020 @ 8:25pm

Re:
Because "AI" isn't some kind of magical label that you can slap onto something to make it better. As of now, anything labeled "AI" is either (1) one of several variations on optimization algorithms which are not explicitly coded or (2) labeled incorrectly for marketing reasons.

Assuming that it's (1), it still runs into the same problems as any other optimization algorithm... access to additional training data does not guarantee improvement, and is routinely either wholly or partially detrimental.

This is exacerbated in many cases due to the sparsity of labeled inputs relative to all potential inputs, that labeled inputs are not representative (it's often not even possible to define what such representative sets would look like with our current understanding of the field), and that significant portions of those labeled inputs are internally inconsistent due to disagreement among human moderators, changes in strategy over time, accidental separation from context, etc.

You will find in particular that nobody has yet found a way to generally recognize the contents of an image (though some progress has been made on recognizing specific types of images, e.g., human faces and landmarks). What algorithms there are simply don't "see" images in any way similar to how humans see images; most image algorithms still struggle to reliably ignore compression artifacts in otherwise identical images... something that many humans don't even notice.
PaulT (profile), 26 Sep 2020 @ 12:26am

Re:
"My question is: If they are using "AI", why does it never seem to get any better at its job?"

It does. It's just that identifying subjective content will never be perfect, and the best it can ever do is the same as a human being. Who will never be perfect at such a task, given that you can sit 100 human beings in a room and they will never agree on a subject. I'll guarantee that if you did so, one of the 100 people would have flagged the above image.

The advantages of AI in this setting are speed and volume of processing. If you want accuracy surrounding subjective material, you want magic.
Anonymous Coward, 26 Sep 2020 @ 4:42am

It might be easier to understand moderation if you think about something that is universally despised. Like politics, but less complicated. Say ... SPAM.

Everyone knows what spam is. Everyone agrees that in a perfectly just world spammers would be slow-cooked while their skin was being removed by an acid mist--before their bones were ground up to make latrine bricks.

Gmail does an extremely good job of filtering out spam. And yet, and yet--who hasn't (very occasionally) seen important email show up in their spam folder? And the spammers are still operating, so apparently enough spam is getting through the filters to make the abhorrent habit profitable.

How do you react?

Well, if you're an insane egocentric idiot, you immediately go across the web, posting that Google has it in for your bank, or nonprofit org, or second-cousin-once-removed, because THEIR email was deprecated, whereas some other parties' email did not get filtered. You get your congresscritter (whichever side of the aisle they lair and liar in) to fulminate and spray froth all over the Sacred Halls of Our Republic. And you wrap yourself in a putrescent cloak of victimhood.

If you are sane, or less stupid than yeast, or you have any consideration at all for the difficulties other people are having in their quest to make your online experience less painful, then you try a different approach. In fact, even if you're a vicious spammer, you take the different approach!

You look carefully at the deprecated email, looking for words or word-parts that could appear (to a stupid computer, not that there is any other kind) to be commercial/promotional. You look at the email address and sending server and linked-to sites to see if they show patterns that are commonly associated with spam. You remove your second cousin, bank, and charitable org from the blacklist and add them to the whitelist. And, if you're a spammer, you try to recraft your spam so as not to LOOK like spam to the stupid computer. YOU DO NOT TAKE IT PERSONALLY, BECAUSE THE STUPID COMPUTER IS NOT A PERSON; IT IS CONTROLLED BY AN ARITHMETIC EXPRESSION, NOT A SOUL.

And if life is still sometimes complicated, frustrating, and inexplicable--welcome to the human condition.
Samuel Abram (profile), 26 Sep 2020 @ 6:56am

Re:

YOU DO NOT TAKE IT PERSONALLY, BECAUSE THE STUPID COMPUTER IS NOT A PERSON; IT IS CONTROLLED BY AN ARITHMETIC EXPRESSION, NOT A SOUL.

It reminds me of what I asked my father when I was young:

Me: "Daddy, are computers perfect?"
My father: "Computers aren't perfect, but they expect us to be perfect."

If anything, computers are only as good as what we make out of them.
PaulT (profile), 26 Sep 2020 @ 8:43am

Re: Re:
"My father: "Computers aren't perfect, but they expect us to be perfect.""

The better way of explaining this is the old adage GIGO - Garbage In, Garbage Out. Computers perfectly do what they're told to do. But a human operator needs to tell them what to do, and humans are far from perfect. If someone gives them a bad instruction, whether that's the coder who created the programs they run or a user not using the program correctly, they will perfectly follow the bad instruction.
Rocky, 26 Sep 2020 @ 8:59am

Re: Re: Re:

"Stupid computer! It doesn't do what I want, only what I tell it to do!!"

Anonymous Coward, 26 Sep 2020 @ 12:14pm

Re: Re:
The best thing about computers is that they do exactly what you tell them to do. You can't say that about a lot of people. The worst thing about computers is that they do exactly what you tell them to do. No matter how stupid that is.
SpaceLifeForm, 26 Sep 2020 @ 12:36pm

The Paradox of Insanity
Einstein:

The definition of insanity is doing the same thing over and over and expecting different results.

Programmer:

I keep running the same program over and over and I keep getting different results!
Tin-Foil-Hat, 26 Sep 2020 @ 2:03pm

There should be some rules
YouTube is notorious. They encourage users to create content, and when they quit their day job to create content, YouTube demonetizes them for some unknown reason. It's difficult to be reinstated too.

There really should be some obligation. They want business owners to use the service but when the business' communication is shut down the platform is 100% void of responsibility even though the business is harmed.

Youtube, Twitter and Facebook should be lower priority methods of communication if you care about consistency and reliability.
Stephen T. Stone (profile), 26 Sep 2020 @ 2:58pm

They want business owners to use the service but when the business' communication is shut down the platform is 100% void of responsibility even though the business is harmed.

Why should we hold YouTube responsible for the decision of a third party to rely on one service so heavily that getting the boot from said service can fuck up their entire business model?
Samuel Abram (profile), 26 Sep 2020 @ 4:11pm

Re: There should be some rules
Had you said this on Twitter, I would screencap this comment (or tweet) and send it to the @badsec230takes account.

But since it's on TechDirt, I'll just show you this.
BernardoVerda (profile), 26 Sep 2020 @ 4:20pm

Twitter is "porn-friendly"?

That's... odd.
I personally see very little, if any, porn on Twitter.
But I have actually turned off the so-called "sensitive content" filter -- simply because it was blocking so much stuff that wasn't porn, wasn't sensitive, and wasn't even NSFW.
Anonymous Coward, 26 Sep 2020 @ 5:41pm

Re: Re:
Well sure, no one will ever agree on subjective matters, but I do not see any improvement in the putative machine learning for identifying even nudity or "raciness". Fine, it will of course be based on someone's operant definition of "racy" or whatever, but lol a black and white photo of a cartoonish costume head (among endless other things)? No, no improvements there.
Anonymous Coward, 26 Sep 2020 @ 5:55pm

Re: Re:
Labels: Yeah, that's why "AI" is in quotes. AI is a field of study, not a product, and certainly not even a working model anywhere. Machine learning is central to AI, and still super loosey-goosey as to whether any of that works in any of the domains where people apparently really really want to use it. (Remember expert systems? I guess that term is like time-sharing is to the cloud.)

Anyway, my starting assumption is that AI and machine learning are indeed not what they are portrayed to be. I suppose I should have directly asked "if they are so bad at rating, classification, or identification, and do not seem to improve over years of input*, why are they being used at all?" (OK, the answer to that is the do-something pressure and the fact that people like releasing pre-alpha crap into production - the good-enough philosophy. I should not have bothered writing anything, I suppose.)

*Input: If there isn't manual review more often, the negative feedback is never input to the system...
Anonymous Coward, 26 Sep 2020 @ 6:04pm

Re:
I would expect that one would have to know where to look, and once you found a few instances, Twitter would then recommend them to you in the future.
PaulT (profile), 27 Sep 2020 @ 11:30pm

Re: There should be some rules
"The encourage users to create content and when they quit their day job to create content YouTube demonetizes them for some unknown reason."

Perhaps the problem isn't YouTube, but the idiot who decided to base his entire business on a single supplier, then violated the T&Cs of that supplier's contract?

"They want business owners to use the service but when the business' communication is shut down the platform is 100% void of responsibility even though the business is harmed."

There's not zero recourse. But the user is not their customer, and if the user decides to violate YouTube's policies in a way that puts off their paying customers (i.e. advertisers), YouTube do not have an obligation to throw free money at people who are costing it customers.

"Youtube, Twitter and Facebook should be lower priority methods of communication if you care about consistency and reliability."

Perhaps true. So why, in your example, is the user who decided to base their entire business on an unreliable and inconsistent platform not responsible for that decision?
Anonymous Coward, 28 Sep 2020 @ 8:22am

Banned from FB for violating unknowable "community standards"
4 times, banned, for posting images that violated "community standards". Several of the images were also posted by political groups who weren't banned. One was "Don't wash your MAGA hat with your klan outfit". I cannot post any image which suggests the GOP are similar to Nazis. No swastikas, etc. Yet I'm banned now for 30 days. Nice, right?