The Scunthorpe Problem, And Why AI Is Not A Silver Bullet For Moderating Platform Content At Scale
from the what's-in-a-name dept
Maybe someday AI will be sophisticated, nuanced, and accurate enough to help us with platform content moderation, but that day isn't today.
Today it prevents an awful lot of perfectly normal and presumably TOS-abiding people from even signing up for platforms. A recent tweet from someone unable to sign up to use an app because it didn't like her name, as well as many, many, MANY replies from people who've had similar experiences, drove this point home:
Been there
— Matt Cummings (@MattCummingsDB) August 29, 2018
As a person named James Butts, I know these problems.
— James (@justjames8) August 28, 2018
As a Dickman I know the struggle is real
— Mike Dickman (@TheMikeDickman) August 29, 2018
I get this a lot surprisingly
— Kyle Medick (@medick32) August 28, 2018
We have quite similar circumstances here
— Jacob Cockrill (@jacob_cockrill) August 29, 2018
Tom Hiscock reporting in.
— aWildWatermelon (@aWildWatermelon) August 29, 2018
Uhm, my name is Analise. That’s exact spelling. Been through this many of times.
— AP (@aannpp23) August 29, 2018
Join the club
— Craig Cockburn (@siliconglen) August 29, 2018
Oh! Am I too late to join this club?
— James Ho (@IndieVideoJames) August 29, 2018
Happens to me often, as you can imagine.
— MatthewDicks (@MatthewDicks) August 29, 2018
Facebook, despite its insistence on users using real names, seems particularly bad at letting people actually use their real names.
A large part of my family uses a shortened form of our last name because many places, including Facebook, don't think Buckmaster is a real last name.
But Buck, Buckbuck, Bucker, Bucky and many more are all "real" >.>
— The Autistech (@theAutistech) August 28, 2018
Yeah - Facebook won't allow my real name of "Talks" so had to come up with something else. Although my wife's account is okay ...
It gets better because when Collette Talks put "in a relationship with Mike Torkelson", basically the family gossip went into overdrive!
— Mike Talks 💚💚💚 (@TestSheepNZ) August 29, 2018
My last name is Player and Facebook still won’t let me have that as a last name because it’s a “street name.”
— Sav (@TheSavannahOW) August 29, 2018
I have family members who use alternate names on Facebook because it wouldn't accept Lick
— Chris Hannas (@cjhannas) August 29, 2018
But of course, Facebook is not the only instance where censorship rules based on bare pattern matching interfere not just with speech but with speakers' ability to even get online to speak.
Can’t even create my own player in a Madden franchise. Smh.
— Ben Schmuck (@benschmuck13) August 28, 2018
Ha! I had the same damn thing happen to me today when I tried to RSVP for a webinar.
— Jen Dick (@Jennifer_Dick) August 28, 2018
You're right in there with Alan Cumming, the actor, whose name was autocensored by the late City of Heroes MMO's official forums. (The COH forums also auto-nixed Dick Grayson, which was... amusing... on a forum where superheroes got discussed a lot.)
— The Phantom of the Ottoman (@zgryphon) August 28, 2018
This dynamic is what's known as the Scunthorpe Problem. Scunthorpe is a town in the UK whose residents have had an appallingly difficult time using the Internet due to a naughty word being contained within the town name.
The Scunthorpe problem is the blocking of e-mails, forum posts or search results by a spam filter or search engine because their text contains a string of letters that are shared with another (usually obscene) word. While computers can easily identify strings of text within a document, broad blocking rules may result in false positives, causing innocent phrases to be blocked.
The problem was named after an incident in 1996 in which AOL's profanity filter prevented residents of the town of Scunthorpe, North Lincolnshire, England from creating accounts with AOL, because the town's name contains the substring cunt. Years later, Google's opt-in SafeSearch filters apparently made the same mistake, preventing residents from searching for local businesses that included Scunthorpe in their names.
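To make the mechanism concrete, here is a minimal Python sketch of the kind of bare substring check described above, next to a slightly stricter whole-word check. The blocklist, the function names, and the sample inputs are assumptions chosen to reproduce examples from this post; this is not a reconstruction of AOL's, Google's, or Facebook's actual filters.

```python
import re

# Hypothetical blocklist, chosen only to reproduce the examples in this post.
BLOCKLIST = ["cunt", "cock", "dick", "butt"]


def naive_is_blocked(text: str) -> bool:
    """Flag the text if any blocklisted string appears anywhere inside it."""
    lowered = text.lower()
    return any(bad in lowered for bad in BLOCKLIST)


# Whole-word variant: \b only lets a match start and end at a word boundary.
_WORD_PATTERN = re.compile(
    r"\b(?:" + "|".join(map(re.escape, BLOCKLIST)) + r")\b",
    re.IGNORECASE,
)


def whole_word_is_blocked(text: str) -> bool:
    """Flag the text only if a blocklisted string appears as a whole word."""
    return _WORD_PATTERN.search(text) is not None


if __name__ == "__main__":
    for name in ["Scunthorpe", "Craig Cockburn", "Mike Dickman", "Jen Dick"]:
        print(name, naive_is_blocked(name), whole_word_is_blocked(name))
```

The substring check rejects all four sample inputs. The whole-word check lets the first three through but still rejects "Jen Dick", because her surname is itself a blocklisted word: tightening the pattern reduces false positives without giving the software any idea that it is looking at a person's name.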
(A related dynamic, the Clbuttic Problem, creates issues of its own when, instead of blocking outright, software automatically replaces the allegedly naughty words with ostensibly less-naughty ones. People attempting to discuss such non-prurient topics as Buttbuttin's Creed and the Lincoln Buttbuttination find this sort of officious editing particularly unhelpful…)
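The replacement variant can be sketched the same way. The single rule below (swap "ass" for "butt") and the function name are assumptions inferred from the garbled examples above, not any particular product's configuration.

```python
import re

# Hypothetical replacement rule, inferred from the "Clbuttic" examples.
REPLACEMENTS = {"ass": "butt"}

_PATTERN = re.compile("|".join(map(re.escape, REPLACEMENTS)), re.IGNORECASE)


def naive_clean(text: str) -> str:
    """Replace every occurrence of a listed string, even inside longer words."""

    def swap(match: re.Match) -> str:
        found = match.group(0)
        replacement = REPLACEMENTS[found.lower()]
        # Keep a leading capital so the output looks superficially plausible.
        return replacement.capitalize() if found[0].isupper() else replacement

    return _PATTERN.sub(swap, text)


if __name__ == "__main__":
    for phrase in ["classic", "Assassin's Creed", "Lincoln Assassination"]:
        print(phrase, "->", naive_clean(phrase))
```

Because the pattern has no notion of where a word starts or ends, every innocent word that happens to contain the listed string gets rewritten along with the intended targets.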
While examples of these dynamics can be amusing, each is also quite chilling to speech, and to speakers wishing to speak.
With the last name ‘Dicks’, I have to remind people to check their spam folder more often than a Nigerian prince.
— Chain of Lynx (@chainoflynx) August 28, 2018
The word Spam is literally in my last name. My husband’s family warned me that my last name can/will be marked as spam.
— Angela Spampata (@bird5445) August 29, 2018
Used to work with a lady whose last name is Wang, and it took us a few days to add exceptions to all the email filters
— Destroyer of Jeeps (@NewKindOfClown) August 29, 2018
It's not something we should be demanding more of, but every time people call for "AI" as a solution to online content challenges, these are the censoring problems the call invites.
A big part of the problem is that calls for "AI" tend to treat it like some magical incantation, as if just adding it will solve all our problems. But in the end, AI is just software. Software can be very good at doing certain things, like finding patterns, including patterns in words (and people's names…). But it's not necessarily good at knowing what to make of those patterns.
— michelle 💞 (@tenderdamie) August 29, 2018
Our net Nanny at work flagged a co-worker for offensive language. He dealt with a lot of crane contractors. Net nanny told his boss he was sending lots of emails with the word erection. Lol.
— GoGoATL (@GoGoATL) August 28, 2018
More sophisticated software may be better at understanding context, or even sometimes learning context, but there are still limits to what we can expect from these tools. They are at best imperfect reflections of the imperfect humans who created them, and it's a mistake to forget that they have not yet replicated, or replaced, human judgment, which itself is often imperfect.
Which is not to say that there is no role for software to help in content moderation. The things that software is good at can make it an important tool to help support human decision-making about online content, especially at scale. But it is a mistake to expect software to supplant human decision-making. Because, as we see from these accruing examples, when we over-rely on it, it ends up being real humans that we hurt.
Had this on a website for the kids, the kids demanded to know why, our last name is ‘Clithero’ Interesting conversation. 😳
— DougHero 🇬🇧 (@ClitheroDoug) August 29, 2018
I know that feel pic.twitter.com/nMbjfTKGcZ
— Nazi Paikidze-Barnes (@NaziPaiki) August 29, 2018
Filed Under: ai, artificial intelligence, content moderation, language, natalie weiner, scunthorpe
Reader Comments
I remember the story, from around 7-8-ish years ago, of a guy named Mark Zuckerberg who had a heck of a time signing up for a Facebook account, because its automated filters kept flagging him as fraudulently attempting to impersonate their founder, despite multiple manual interventions and appropriate documentation provided that yes, this was in fact his real, legal name.
Senator Chris Coon and company
Raccoon enthusiasts must also be annoyed.
Re: Senator Chris Coon and company
Why do you think they started referring to those little disease vectors as “trash pandas”?
Duplicate Problem
But dayum, don't you think I should be able to sign my name Tom, Dick, or Harry?? lol (or *Blue*, here's grinning at TD!)
And Facebook, grow the fuck up, or I'll have to shove something in someone's Scunthorpe, just like in a Philip K Dick novel involving Wang computers, or was that an ee cummings poem?
Re: Duplicate Problem
There was a story I saw online about someone who found two records in their student database, differing only by sex. Same name, birthdate, address. It ended up being two married students—last name and address shared due to marriage, and shared birthdates happen when most people start at the same age.
Handles are probably better than "real" names at avoiding these problems.
Re: Re: Duplicate Problem
One twerp on Twitter told me I should change it, but hell, no. It's my name and it's up to all the stupid little weenies to grow the hell up. Then go look up British place names to find more things to be artificially offended about. The seaside ones are the funniest.
Re:
If you don't care for that, fair enough, but it's no mystery why people who want to talk to family and friends they may have previously lost contact with wish to make themselves easy to find.
Re: Re:
That's not a fact, it's an opinion. Many people use it partially or entirely to converse with people they've never met.
Re: Re: Re:
I have a very common name and you couldn't find me on Google very easily, but if you search for my name on Facebook you will see me listed along with a recognisable photo. I'll probably come up fairly early in the list if we were to share some contacts. I've caught up with a lot of lost acquaintances I made pre-social media that way, which may not have happened had I used some kind of unique pseudonym (since people who had lost contact wouldn't know what to search for).
I do also know people who use pseudonyms exclusively on there, but they tend to be the people deliberately trying to keep old friends away from them, which is not the majority in my experience.
I have a cousin with the spampata last name
AI Is Not A Silver Bullet
Re: Re: AI Is Not A Silver Bullet
Nah, those are "power points"; don't you know anything?
-- Micro Soft (which name is also banned as derogatory member dissing)
Re: AI Is Not A Silver Bullet
Our singing group could never exist with today's P.C. filters.
-- (They Say We're) The Monkees
Re: Re: AI Is Not A Silver Bullet
"... we're too busy singing (lipsyncing ?)
To put anybody down."
Old but gold: http://www.cracked.com/blog/5-reasons-diablo-iii-represents-gamings-annoying-future/
Ppl need to stop being stupid moralists. Dicks, pussies and other bodily functions should have stopped being taboo for a long time now. Facebook and other platforms overmoderating are just a symptom of our stupid moralism.
Some people's parents!
Re:
Aaaaaand, ironically my comment filled with all those words got held for moderation. Laughing like a maniac here lmao
Ha! Hilarious. I just cleared it... Sorry about that, but... yeah.
Ai is interesting..
The better it is, the longer it is, the SLOWER it is..
There are ways to make things faster, but then we ADD to the AI, and make it even slower..
Re:
Cheap and easy..
-- Ben Dover
Also, remember that story about some Christian-oriented browsing/publishing filter that changed well-known runner Tyson Gay's name to Tyson Homosexual and actor Dick van Dyke's name to Penis van Lesbian?
And who could forget the kerfuffle over the naming of the Harry Baals Government Center? https://en.wikipedia.org/wiki/Harry_Baals