No, Google Isn't Hiding Elizabeth Warren's Emails To Promote Mayor Pete
from the not-how-it-works dept
Content moderation at scale is impossible. This time, it's email content moderation. This week a new publication called The Markup launched. It's a super smart group of folks who are doing deep data-driven investigative reporting of companies in and around the tech space -- and I'm very excited to see what they do. I was going to write about the project overall and its goals, but instead I'm going to write about one of its first stories, done in partnership with the Guardian, entitled Swinging the Vote?, and which looks at Gmail's filtering system, specifically as it regards political emails from Presidential candidates.
A few years back, Google added the "Promotions" tab to Gmail, as a way of hopefully, automagically sorting not-quite-spam emails, but general promotional emails that you probably don't want cluttering up your inbox. Personally, I don't use it, as I use a different filtering setup entirely that overrides Gmail's defaults. However, for many people it's proven to be quite useful. The reporters at The Markup conducted a worthwhile experiment:
The Markup set up a new Gmail account to find out how the company filters political email from candidates, think tanks, advocacy groups, and nonprofits.
We found that few of the emails we’d signed up to receive —11 percent—made it to the primary inbox, the first one a user sees when opening Gmail and the one the company says is “for the mail you really, really want.”
Half of all emails landed in a tab called “promotions,” which Gmail says is for “deals, offers, and other marketing emails.” Gmail sent another 40 percent to spam.
Very interesting! What was perhaps even more interesting was the chart -- which quickly rocketed around social media -- showing that some candidates had their emails go into the Primary Inbox at a much, much higher rate than others:
You'll notice a few standouts. 63% of Pete Buttigieg's emails made it to the Inbox, as did 47% of Andrew Yang's. Everyone else was much closer to 0% with quite a few -- including both Elizabeth Warren and Joe Biden -- at 0%.
The reporters at The Markup also published a companion piece that gives the details of how they went about doing this research and (yes!) they even provide the data and the code on Github. This is a fantastic and transparent way of doing such journalism -- and I applaud them for that.
However, the very framing of the original story itself... is problematic. It's one thing to be open about how you conducted the research. But starting with a title like "Swinging the vote" and highlighting the chart above almost immediately resulted in lots of people on Twitter assuming (or suggesting) that Google was doing this deliberately, and that they were purposely making the decision to tilt the playing field towards Buttigieg. This includes vocal big tech critic Roger McNamee, who declared this was evidence that "Gmail has its thumb in the scale." Another Google critic, who is fond of misleading conspiracy theories about the company, called it "election meddling" and claims that Google was giving certain candidates "special treatment."
Except... that's almost certainly not the case. No one at Google on the Gmail spam team is thinking about promoting one Presidential candidate over another. Instead, this is just yet another example of Masnick's Impossibility Theorem, but applied to email moderation, rather than social media. Content moderation at scale is impossible to do well and will always piss off some people.
Indeed, looking over the data, the most obvious and most likely explanation is simply this: Buttigieg and Yang hired competent email marketers who know how to craft emails that are (1) less likely to trigger the algorithm, and (2) less likely to be marked as spam by users (an important signal that feeds back into the algorithm). The rest of the candidates... did not. And thus, their emails went to the promotions tab and the spam folder because they had characteristics that are more closely associated with promotions and spam. And, yet, The Markup story doesn't bother to get into any of that -- and thus leaves the speculation wide open, allowing plenty of folks to leap in.
Again, I'm super excited about The Markup as a project and believe it will put out plenty of important and impactful journalism in the days, weeks, months and years to come. I recommend people read over The President's Letter from the site's President Nabiha Syed (a past podcast guest) and Editor's Letter from Julia Angwin -- both of which present a compelling vision of what The Markup will be.
But this story shows how important context is in presenting a story. This is a data driven story -- which is great. But if the necessary context is not provided, especially on a topic so fraught with speculation, people are going to rush in and jump to conclusions. The Markup itself did not directly say that Google was doing this deliberately, but its total failure to suggest why this might be happening, along with a cringe-worthy headline, opened the door for others to jump in and assume as much -- and that's a shame.
Filed Under: andrew yang, content moderation, content moderation at scale, elizabeth warren, email, gmail, pete buttigieg, political emails, promotions, spam
Companies: google
Reader Comments
Deeper problem than moderation at scale
Really these attitudes are far worse than trying to moderate a large collection automatically and consistently without errors. It is worse - entitled politicians and plebs who believe that if you do not promote their world view you are conspiring against them. This is Roko's Basilisk, but instead of one there are millions of different insane intelligences who wish to punish you for not doing the impossible, which you had no way of knowing.
[ link to this | view in chronology ]
"No one at Google on the Gmail spam team is thinking about promoting one Presidential candidate over another."
I really hope you're not that naive, but it seems you've hung your argument on this obviously false premise.
[ link to this | view in chronology ]
Re:
Please prove that it is false. It should be easy since it's so obvious...
[ link to this | view in chronology ]
Re:
It's not a false premise. If you actually talked to people who work on those teams, you'd know it's true. But delusional, conspiratorial people assume otherwise. They're wrong, as are you.
I can assure you that literally no one on the Gmail spam team is thinking "how do we promote Pete and harm other candidates." It's not a thing.
[ link to this | view in chronology ]
Re: Re:
I don't know... there could be one person on the team thinking that way... but they'll get filtered out pretty quickly.
At the management level, that's definitely not a driver, and speaking as someone who's worked on one of those teams, the entire UI and workflow is set up in such a way as you'd never even consider applying that thought process to the filtering. It just doesn't make sense.
Could someone train the algorithms to have a bias? Yes... but that would mess up so much other stuff that nobody would want to throw THAT spanner in the works.
[ link to this | view in chronology ]
Re: Re:
It may not necessarily be a false premise, but you appear to have created an unfalsifiable one.
What do I mean? Well, let's suppose, hypothetically for the purposes of this discussion, that Google did have their thumb on the scale. What evidence of such manipulation would have to exist in order to get past Masnick's Impossibility Theorem? Would anything short of a confession (or leak of internal Google documents) work? If not, you have an unfalsifiable premise, and therefore (given the subject matter) a rather dangerous one.
[ link to this | view in chronology ]
Re: Re: Re:
"What evidence of such manipulation would have to exist in order to get past Masnick's Impossibility Theorem?"
Oh, I don't know, how about...any evidence?
[ link to this | view in chronology ]
Re: Re: Re: Re:
I'd want to hold Mike to that standard too, though, when we see broad statements like "nobody is thinking about it". I see no reason to believe Google's trying to promote any candidate, but I find it hard to believe these thoughts would not have occurred to literally anyone on the Gmail spam team. People should be thinking of ways these systems could be abused or could have unintended effects. Especially people working on moderation tools.
BTW, does Google provide any data on how often people mark particular messages as spam?
[ link to this | view in chronology ]
The phrasing maybe could’ve been better, sure, but the broader point still stands: Gmail’s staff isn’t actively trying to favor one candidate over another and anyone who makes that extraordinary claim needs to back it up with some extraordinary evidence.
[ link to this | view in chronology ]
Re: Re: Re: Re: Re:
"I find it hard to believe these thoughts would not have occurred to literally anyone on the Gmail spam team"
That's good, since the use of a colloquial term does not mean that it's never occurred to anyone ever. It simply means that nobody is consciously having the issue at the forefront of their mind when managing a system that has millions of uses outside of these campaigns.
"People should be thinking of ways these systems could be abused or could have unintended effects. Especially people working on moderation tools"
...and candidates' teams should be thinking about that when designing their campaigns, rather than (as is one suggestion here) simply whining when it turns out their competition is better managed.
"BTW, does Google provide any data on how often people mark particular messages as spam?"
Which messages? If you mean the specific emails sent by these campaigns, it's likely that few of them were sent to the spam box manually.
[ link to this | view in chronology ]
Re: Re: Re:
The alternative premise Mike draws is that the content of Buttigieg emails is crafted to look like email users really want, while Warren emails appear more 'promotional', as not-quite-spam, rather than Gmail providing special treatment to Buttigieg.
There are a number of ways to demonstrate either that Gmail is providing special treatment to Buttigieg or that the contents of these emails are a factor. For instance, sending the contents of an allowed Buttigieg campaign email and a promotional Warren campaign email from generic corporate email addresses could produce data on whether the contents were a factor over the FROM header.
But Mike doesn't need to prove his thesis. He is only producing a reasonable alternative explanation. He is not saying "this is how it is". He is saying the conclusions drawn from The Markup are half baked. They have not proved the implication that Google has their finger on the scale any more than Mike has proven his case. And Mike's alternative idea makes a significant amount of sense: rather than a shadowy coder assigning political emails to different filters to benefit Pete Buttigieg, the 2 younger candidates (Buttigieg and Yang) hired email marketers that understand how to get an email around Promotions. And if there are no hidden hands helping Buttigieg, it makes sense he has one of the highest SPAM rates.
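The test proposed above amounts to crossing two factors, message body and sending address, so that each can be varied while the other is held fixed. A minimal sketch of building that 2x2 set of test messages follows, using only Python's standard library. Every body, address, and the `X-Test-Cell` tracking header is a hypothetical placeholder, not anything taken from the campaigns or from The Markup's data, and the sketch only constructs the messages; it does not send them or read back where Gmail files them.

```python
from email.message import EmailMessage
from itertools import product

# Hypothetical placeholders: none of these bodies or addresses come from the article.
BODIES = {
    "body_a": "Placeholder text standing in for a message that reached the primary inbox.",
    "body_b": "Placeholder text standing in for a message that was filed under promotions.",
}
SENDERS = {
    "sender_x": "news@sender-x.example",
    "sender_y": "news@sender-y.example",
}


def build_test_matrix(bodies, senders, recipient):
    """Return one EmailMessage per (body, sender) pair, i.e. a full 2x2 design."""
    messages = []
    for (body_id, body), (sender_id, sender) in product(bodies.items(), senders.items()):
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = recipient
        # The subject is held constant so that it is not a third varying factor.
        msg["Subject"] = "Test message"
        # Assumed tracking header so each delivered copy can be matched to its cell.
        msg["X-Test-Cell"] = f"{body_id}:{sender_id}"
        msg.set_content(body)
        messages.append(msg)
    return messages


matrix = build_test_matrix(BODIES, SENDERS, "test-inbox@recipient.example")
for msg in matrix:
    print(msg["X-Test-Cell"], "from", msg["From"])
```

If placement followed the body regardless of sender, that would point toward content; if it followed the sender regardless of body, that would point toward the FROM side. Two factors are far fewer than a real filter weighs, so a result from a design this small would be suggestive at best.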
[ link to this | view in chronology ]
Re: Message content vs. other factors
Yes, but it's quite hard to perform such a test because you'd need to vary dozens of factors, including DKIM/SPF/DMARC configuration, source domains, IP addresses in Received headers, encryption, user-agent, HTML formatting, message IDs and references, volume and timing of emails across all of these, previous interaction between the sender and the recipient etc. etc. All of which is a moving target because you cannot know what the actual spammers are doing at any given moment which may change the impact of any of those variables.
There are a few companies which claim to be measuring the deliverability in the wild but it's not clear how they do it or how reliable it might be.
[ link to this | view in chronology ]
Re: Re: Re:
"What evidence of such manipulation would have to exist in order to get past Masnick's Impossibility Theorem?"
What evidence do you have that can't be explained with simple application of Occam's Razor?
[ link to this | view in chronology ]
Re: Re:
Unless you personally asked each member of the GMail Team their political preference, you would have been better off to have started that sentence with "I don't believe that any of the GMail Team ..."
While you may be correct, I'm going to take your statement with a grain of salt.
[ link to this | view in chronology ]
If I'm Mayor Pete's email marketing manager I asked for a very large raise yesterday.
[ link to this | view in chronology ]
Simpler explanations
That was my guess before looking at the data, but after looking briefly at it I'd question whether this is really the case. The content and features of the emails might actually be rather similar; maybe Yang and Buttigieg used less popular software, not as well known to Gmail's filter as NationBuilder and ActBlue which are used by many of the others.
[ link to this | view in chronology ]
Re: Simpler explanations
So in other words... Yang and Buttigieg developed emails that 1) are less likely to trigger the algorithm...
[ link to this | view in chronology ]
Re: Re: Simpler explanations
That suggests intentionality, which the data doesn't seem to confirm. Also, if you look at the spam rates the Yang campaign is quite bad (maybe all that talk about $1000 sounds too similar to Nigerian scams? :-) ) and Buttigieg isn't so good either.
[ link to this | view in chronology ]
Re: Re: Simpler explanations
"Yang and Buttigieg developed emails that 1) are less likely to trigger the algorithm..."
That's what Occam's razor provides for an answer.
Now if I designed an anti-spam filter the very first thing I'd put in it would be some of the key statements usually found in political boilerplate. If Yang and Buttigieg ended up NOT drawing on established campaign procedure, their campaign emails are far less likely to contain the specific trigger phrases the anti-spam algorithm has been programmed to register as "junk".
[ link to this | view in chronology ]
Google is absolutely manipulating the results. Those political emails from every candidate deserve to be sent straight to the trash can.
[ link to this | view in chronology ]
Really, anyone should just look at what gets tossed into the Promotional (or any other) category, and what does not. It's pretty obvious how crap the sorting is.
[ link to this | view in chronology ]
how google works aint open source, hunny.
[ link to this | view in chronology ]
Re:
Relevance, sweetiebuns?
[ link to this | view in chronology ]
Re:
Funny thing - the people who hate Google in the RIAA and government? They're usually not fans of Open Source, either.
[ link to this | view in chronology ]
BUT BUT BUT THEY WERE ALL CAMPAIGN EMAILS!
The problem is not all campaign emails are the same.
Some follow a "tried & true" pattern that they have always used & OMG those get flagged as promotions b/c it looks like 5000 other promotion emails.
Sometimes the "tried & true" pattern is adopted by scammers to look more legit. (Side note 45 emails about my apple account sent to an email I rarely use... some look convincing).
Everyone wants to find a pattern of abuse to tie together all of the crazy ideas & make them be true... and not just unrelated things that sometimes happen.
People believe stupid things.
When the No Call List was suggested, they quickly moved to make sure political calls didn't get stopped.... because everyone loves politicians so much. I'm sure there were evil thoughts about how 1 party would block the other.
Why did they never consider making political calls opt-in?
Why not let people who don't want 300 surveys & campaign calls not get them??
If politics is so beloved of course people would opt-in!!
Oh and we need to keep those robocallers in business, even if in doing this we cleared the path for bad actors to get confirmed lists of numbers to abuse.
[ link to this | view in chronology ]
Cool experiment, Crappy journalism based on a Shitty assumption
Yes, MM. Nice article (by you) and cool experiment (by them).
But the journalism is sensational rubbish. What fraction of a fraction of a percentage of US voters will make their voting decision from the subscribed emails?
Eh? Batshit crazy.
And the crazier bit is that people are actually taking this seriously. It's like RussiaGate (another hot tub of rubbish) has blown people's minds apart.
[ link to this | view in chronology ]