No, Google Isn't Hiding Elizabeth Warren's Emails To Promote Mayor Pete

from the not-how-it-works dept

Thu, Feb 27th 2020 12:14pm — Mike Masnick

Content moderation at scale is impossible. This time, it's email content moderation. This week a new publication called The Markup launched. It's a super smart group of folks who are doing deep data-driven investigative reporting of companies in and around the tech space -- and I'm very excited to see what they do. I was going to write about the project overall and its goals, but instead I'm going to write about one of its first stories, done in partnership with the Guardian, entitled Swinging the Vote?, and which looks at Gmail's filtering system, specifically as it regards political emails from Presidential candidates.

A few years back, Google added the "Promotions" tab to Gmail as a way of (hopefully) automagically sorting out emails that aren't quite spam, but are general promotional emails that you probably don't want cluttering up your inbox. Personally, I don't use it, as I use a different filtering setup entirely that overrides Gmail's defaults. However, for many people it's proven to be quite useful. The reporters at The Markup conducted a worthwhile experiment:

The Markup set up a new Gmail account to find out how the company filters political email from candidates, think tanks, advocacy groups, and nonprofits.

We found that few of the emails we’d signed up to receive —11 percent—made it to the primary inbox, the first one a user sees when opening Gmail and the one the company says is “for the mail you really, really want.”

Half of all emails landed in a tab called “promotions,” which Gmail says is for “deals, offers, and other marketing emails.” Gmail sent another 40 percent to spam.

Very interesting! What was perhaps even more interesting was the chart -- which quickly rocketed around social media -- showing that some candidates had their emails go into the Primary Inbox at a much, much higher rate than others:

You'll notice a few standouts. 63% of Pete Buttigieg's emails made it to the Inbox, as did 47% of Andrew Yang's. Everyone else was much closer to 0% with quite a few -- including both Elizabeth Warren and Joe Biden -- at 0%.

The reporters at The Markup also published a companion piece that gives the details of how they went about doing this research and (yes!) they even provide the data and the code on Github. This is a fantastic and transparent way of doing such journalism -- and I applaud them for that.
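
For a sense of what the tally behind a chart like that involves, here is a minimal sketch in Python. To be clear, this is not The Markup's code: the function name `folder_shares`, the sender labels, and every record in `SAMPLE` are made up for illustration, and their actual method is in the companion piece and repository.

```python
from collections import Counter, defaultdict

# Hypothetical records: (sender, folder the message landed in).
# These rows are invented for illustration; they are NOT The Markup's data.
SAMPLE = [
    ("campaign_a", "primary"),
    ("campaign_a", "primary"),
    ("campaign_a", "promotions"),
    ("campaign_a", "spam"),
    ("campaign_b", "promotions"),
    ("campaign_b", "promotions"),
    ("campaign_b", "spam"),
    ("campaign_b", "spam"),
]

FOLDERS = ("primary", "promotions", "spam")


def folder_shares(records):
    """Return {sender: {folder: share of that sender's emails}}."""
    counts = defaultdict(Counter)
    for sender, folder in records:
        counts[sender][folder] += 1

    shares = {}
    for sender, by_folder in counts.items():
        total = sum(by_folder.values())
        # A Counter returns 0 for folders a sender never landed in.
        shares[sender] = {f: by_folder[f] / total for f in FOLDERS}
    return shares


if __name__ == "__main__":
    for sender, share in sorted(folder_shares(SAMPLE).items()):
        line = ", ".join(f"{f}: {share[f]:.0%}" for f in FOLDERS)
        print(f"{sender} -> {line}")
```

Note what a tally like this can and can't tell you: it describes where each sender's mail ended up, not why it ended up there.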

However, the very framing of the original story itself... is problematic. It's one thing to be open about how you conducted the research. But starting with a title like "Swinging the vote" and highlighting the chart above almost immediately resulted in lots of people on Twitter assuming (or suggesting) that Google was doing this deliberately, and that they were purposely making the decision to tilt the playing field towards Buttigieg. This includes vocal big tech critic Roger McNamee, who declared this was evidence that "Gmail has its thumb in the scale." Another Google critic, who is fond of misleading conspiracy theories about the company, called it "election meddling" and claimed that Google was giving certain candidates "special treatment."

Except... that's almost certainly not the case. No one at Google on the Gmail spam team is thinking about promoting one Presidential candidate over another. Instead, this is just yet another example of Masnick's Impossibility Theorem, but applied to email moderation, rather than social media. Content moderation at scale is impossible to do well and will always piss off some people.

Indeed, looking over the data, the most obvious and most likely explanation is simply this: Buttigieg and Yang hired competent email marketers who know how to craft emails that are (1) less likely to trigger the algorithm, and (2) less likely to be clicked on as spam by users (an important signal that feeds back into the algorithm). The rest of the candidates... did not. And thus, their emails went to the promotions and spam folders because they had characteristics that are more closely associated with promotions and spam. And, yet, The Markup story doesn't bother to get into any of that -- and thus leaves the speculation wide open, allowing plenty of folks to leap in.

Again, I'm super excited about The Markup as a project and believe it will put out plenty of important and impactful journalism in the days, weeks, months and years to come. I recommend people read over The President's Letter from the site's President Nabiha Syed (a past podcast guest) and Editor's Letter from Julia Angwin -- both of which present a compelling vision of what The Markup will be.

But this story shows how important context is in presenting a story. This is a data-driven story -- which is great. But if the necessary context is not provided, especially on a topic so fraught with speculation, people are going to rush in and jump to conclusions. The Markup itself did not directly say that Google was doing this deliberately, but its total failure to suggest why this might be happening, along with a cringe-worthy headline, opened the door for others to jump in and assume as much -- and that's a shame.

Filed Under: andrew yang, content moderation, content moderation at scale, elizabeth warren, email, gmail, pete buttigieg, political emails, promotions, spam
Companies: google

26 Comments

Reader Comments

Anonymous Coward, 27 Feb 2020 @ 12:56pm

Deeper problem than moderation at scale
Really these attitudes are far worse than trying to moderate a large collection automatically and consistently without errors. It is worse: entitled politicians and plebs who believe that if you do not promote their world view you are conspiring against them. This is Roko's Basilisk, but instead of one, there are millions of different insane intelligences who wish to punish you for not doing the impossible, which you had no way of knowing.
Anonymous Coward, 27 Feb 2020 @ 12:57pm

"No one at Google on the Gmail spam team is thinking about promoting one Presidential candidate over another."

I really hope you're not that naive, but it seems you've hung your argument on this obviously false premise.
- Rocky, 27 Feb 2020 @ 1:13pm
  
  Re:
  Please prove that it is false. It should be easy since it's so obvious...
- Mike Masnick (profile), 27 Feb 2020 @ 1:13pm
  
  Re:
  It's not a false premise. If you actually talked to people who work on those teams, you'd know it's true. But delusional, conspiratorial people assume otherwise. They're wrong, as are you.
  
  I can assure you that literally no one on the Gmail spam team is thinking "how do we promote Pete and harm other candidates." It's not a thing.
  - Anonymous Coward, 27 Feb 2020 @ 1:21pm
    
    Re: Re:
    I don't know... there could be one person on the team thinking that way... but they'll get filtered out pretty quickly.
    
    At the management level, that's definitely not a driver, and speaking as someone who's worked on one of those teams, the entire UI and workflow is set up in such a way as you'd never even consider applying that thought process to the filtering. It just doesn't make sense.
    
    Could someone train the algorithms to have a bias? Yes... but that would mess up so much other stuff that nobody would want to throw THAT spanner in the works.
  - Anonymous Coward, 27 Feb 2020 @ 1:26pm
    
    Re: Re:
    It may not necessarily be a false premise, but you appear to have created an unfalsifiable one.
    
    What do I mean? Well, let's suppose, hypothetically for the purposes of this discussion, that Google did have their thumb on the scale. What evidence of such manipulation would have to exist in order to get past Masnick's Impossibility Theorem? Would anything short of a confession (or leak of internal Google documents) work? If not, you have an unfalsifiable premise, and therefore (given the subject matter) a rather dangerous one.
    - Anonymous Coward, 27 Feb 2020 @ 1:29pm
      
      Re: Re: Re:
      "What evidence of such manipulation would have to exist in order to get past Masnick's Impossibility Theorem?"
      
      Oh, I don't know, how about...any evidence?
      - Anonymous Coward, 27 Feb 2020 @ 1:56pm
        
        Re: Re: Re: Re:
        I'd want to hold Mike to that standard too, though, when we see broad statements like "nobody is thinking about it". I see no reason to believe Google's trying to promote any candidate, but I find it hard to believe these thoughts would not have occurred to literally anyone on the Gmail spam team. People should be thinking of ways these systems could be abused or could have unintended effects. Especially people working on moderation tools.
        
        BTW, does Google provide any data on how often people mark particular messages as spam?
        
        Stephen T. Stone (profile), 27 Feb 2020 @ 4:21pm
        
        The phrasing maybe could’ve been better, sure, but the broader point still stands: Gmail’s staff isn’t actively trying to favor one candidate over another and anyone who makes that extraordinary claim needs to back it up with some extraordinary evidence.
        
        PaulT (profile), 27 Feb 2020 @ 11:02pm
        
        Re: Re: Re: Re: Re:
        "I find it hard to believe these thoughts would not have occurred to literally anyone on the Gmail spam team"
        
        That's good, since the use of a colloquial term does not mean that it's never occurred to anyone ever. It simply means that nobody is consciously keeping the issue at the forefront of their mind when managing a system that has millions of uses outside of these campaigns.
        
        "People should be thinking of ways these systems could be abused or could have unintended effects. Especially people working on moderation tools"
        
        ...and candidates' teams should be thinking about that when designing their campaigns, rather than (as is one suggestion here) simply whining when it turns out their competition is better managed.
        
        "BTW, does Google provide any data on how often people mark particular messages as spam?"
        
        Which messages? If you mean the specific emails sent by these campaigns, it's likely that few of them were sent to the spam box manually.
    - James Burkhardt (profile), 27 Feb 2020 @ 1:44pm
      
      Re: Re: Re:
      The alternative premise Mike draws is that the content of Buttigieg's emails is crafted to look like email users really want, while Warren's emails look more 'promotional', like not-quite-spam, rather than Gmail providing special treatment to Buttigieg.
      
      There are a number of ways to demonstrate either that Gmail is providing special treatment to Buttigieg or that the contents of these emails are a factor. For instance, sending the contents of an allowed Buttigieg campaign email and a promotional Warren campaign email from generic corporate email addresses could produce data on whether the contents were a factor rather than the FROM header.
      
      But Mike doesn't need to prove his thesis. He is only producing a reasonable alternative explanation. He is not saying "this is how it is". He is saying the conclusions drawn from The Markup are half baked. They have not proved the implication that Google has their finger on the scale any more than Mike has proven his case. And Mike's alternative idea, that rather than a shadowy coder assigning political emails to different filters to benefit Pete Buttigieg, the 2 younger candidates (Buttigieg and Yang) hired email marketers that understand how to get an email around Promotions, makes a significant amount of sense. And if there are no hidden hands helping Buttigieg, it makes sense he has one of the highest SPAM rates.
      - Federico (profile), 28 Feb 2020 @ 1:55am
        
        Re: Message content vs. other factors
        Yes, but it's quite hard to perform such a test because you'd need to vary dozens of factors, including DKIM/SPF/DMARC configuration, source domains, IP addresses in Received headers, encryption, user-agent, HTML formatting, message IDs and references, volume and timing of emails across all of these, previous interaction between the sender and the recipient etc. etc. All of which is a moving target because you cannot know what the actual spammers are doing at any given moment which may change the impact of any of those variables.
        
        There are a few companies which claim to be measuring the deliverability in the wild but it's not clear how they do it or how reliable it might be.
    - PaulT (profile), 27 Feb 2020 @ 10:41pm
      
      Re: Re: Re:
      "What evidence of such manipulation would have to exist in order to get past Masnick's Impossibility Theorem?"
      
      What evidence do you have that can't be explained with simple application of Occam's Razor?
  - stine, 28 Feb 2020 @ 8:30pm
    
    Re: Re:
    Unless you personally asked each member of the GMail Team their political preference, you would have been better off to have started that sentence with "I don't believe that any of the GMail Team ..."
    
    While you may be correct, I'm going to take your statement with a grain of salt.
Chris ODonnell (profile), 27 Feb 2020 @ 12:59pm

If I'm Mayor Pete's email marketing manager I asked for a very large raise yesterday.
Federico (profile), 27 Feb 2020 @ 1:31pm

Simpler explanations

The rest of the candidates... did not. And thus, their emails went to the promotions and spam folder because they had characteristics that are more closely associated with promotions and spam.

That was my guess before looking at the data, but after looking briefly at it I'd question whether this is really the case. The content and features of the emails might actually be rather similar; maybe Yang and Buttigieg used less popular software, not as well known to Gmail's filter as NationBuilder and ActBlue which are used by many of the others.
- Anonymous Coward, 27 Feb 2020 @ 8:22pm
  
  Re: Simpler explanations
  So in other words... Yang and Buttigieg developed emails that 1) are less likely to trigger the algorithm...
  - Federico (profile), 28 Feb 2020 @ 1:48am
    
    Re: Re: Simpler explanations
    That suggests intentionality, which the data doesn't seem to confirm. Also, if you look at the spam rates the Yang campaign is quite bad (maybe all that talk about 1000 $ sounds too similar to Nigerian scams? :-) ) and Buttigieg isn't so good either.
  - Scary Devil Monastery (profile), 2 Mar 2020 @ 3:47am
    
    Re: Re: Simpler explanations
    "Yang and Buttigieg developed emails that 1) are less likely to trigger the algorithm..."
    
    That's what Occam's razor provides for an answer.
    
    Now if I designed an anti-spam filter, the very first thing I'd put in it would be some of the key statements usually found in political boilerplate. If Yang and Buttigieg ended up NOT drawing on established campaign procedure, their campaign emails are far more likely to not contain the specific trigger phrases the anti-spam algorithm has been programmed to register as "junk".
Norahc (profile), 27 Feb 2020 @ 2:16pm

Google is absolutely manipulating the results. Those political emails from every candidate deserve to be sent straight to the trash can.
Anonymous Coward, 27 Feb 2020 @ 2:18pm

Really, anyone should just look at what gets tossed into the Promotional (or any other) category, and what does not. It's pretty obvious how crap the sorting is.
Anonymous Coward, 27 Feb 2020 @ 5:28pm

how google works aint open source, hunny.
- Anonymous Coward, 27 Feb 2020 @ 5:58pm
  
  Re:
  Relevance, sweetiebuns?
- Anonymous Coward, 27 Feb 2020 @ 6:45pm
  
  Re:
  Funny thing - the people who hate Google in the RIAA and government? They're usually not fans of Open Source, either.
That Anonymous Coward (profile), 28 Feb 2020 @ 12:18am

BUT BUT BUT THEY WERE ALL CAMPAIGN EMAILS!

The problem is not all campaign emails are the same.

Some follow a "tried & true" pattern that they have always used & OMG those get flagged as promotions b/c it looks like 5000 other promotion emails.

Sometimes the "tried & true" pattern is adopted by scammers to look more legit. (Side note 45 emails about my apple account sent to an email I rarely use... some look convincing).

Everyone wants to find a pattern of abuse to tie together all of the crazy ideas & make them be true... and not just unrelated things that sometimes happen.

People believe stupid things.
When the No Call List was suggested, they quickly moved to make sure political calls didn't get stopped.... because everyone loves politicians so much. I'm sure there were evil thoughts about how 1 party would block the other.
Why did they never consider making political calls opt-in?
Why not let people who don't want 300 surveys & campaign calls not get them??
If politics is so beloved of course people would opt-in!!
Oh and we need to keep those robocallers in business, even if in doing this we cleared the path for bad actors to get confirmed lists of numbers to abuse.
Anonymous Coward, 28 Feb 2020 @ 3:17am

Cool experiment, Crappy journalism based on a Shitty assumption
Yes, MM. Nice article (by you) and cool experiment (by them).

But the journalism is sensational rubbish. What fraction of a fraction of a percentage of US voters will make their voting decision from the subscribed emails?

Eh? Batshit crazy.

And the crazier bit is that people are actually taking this seriously. It's like RussiaGate (another hot tub of rubbish) has blown people's minds apart.