Google Books Data Mining Reveals Mad Men's Big Historical Flaw: Business Lingo
from the keep-a-low-profile dept
The TV show Mad Men has quite a reputation for going to great lengths to be as authentic as possible. The clothes, the props, the scenarios are all supposedly thought out in great detail. While some who were actually in the business at the time quibble with certain aspects of the show, it cannot be denied that the show's producers go well beyond other period pieces in trying to keep everything accurate for the time period. However, it turns out that there's one area where it appears the writers have completely flopped: period-specific language. On The Media ran an absolutely fascinating clip about a researcher who has shown how frequently Mad Men uses words or phrases that were not in popular usage at the time, but only came into the lexicon at a later date.
This is actually a cross-broadcast of another podcast, Lexicon Valley, and it's covering the work of Ben Schmidt, who has produced a software algorithm that compares the Mad Men scripts... to a searchable database of language from Google's book scanning project. Schmidt's algorithm compares the language from the show with scanned books from the same period. Schmidt has a website, Prochronism, which covers his findings. I can't quite explain why, but it's really quite fascinating.
Schmidt has found that the show is pretty good about getting language about technology right (with one exception). It knows that there aren't fax machines and computers and stuff. The one area where it gets things wrong is the phone. For example, it uses the phrase "on hold." He notes that phones had hold buttons, but there wasn't yet a concept of the state of being "on hold." That showed up in the 70s.
What Schmidt has also found is that the show is absolutely terrible about getting "business" terms correct in a period-specific way. That same post about "on hold" also chides the show for using "defining moment," another phrase that showed up in the 70s, but was basically stuck in academia until the late 80s or early 90s when it became a popular phrase.
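For the curious, here is a rough sketch of how that kind of comparison might work. To be clear, this is not Schmidt's actual code: the function names, the threshold of 10, and the frequency numbers are all made up for illustration, and a real version would pull its counts from something like the Google Books n-gram data rather than a hand-typed dictionary. The idea is just to split a line of dialogue into two-word phrases and flag any phrase that is far more common in modern text than in text from the show's period.

```python
import re

def two_word_phrases(text):
    """Split text into lowercase two-word phrases (bigrams)."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + 2]) for i in range(len(words) - 1)]

def flag_anachronisms(script, period_counts, modern_counts, ratio=10):
    """Return (phrase, modern/period ratio) pairs for phrases that are at
    least `ratio` times more common in the modern counts.

    Both count tables map a phrase to occurrences per billion words.
    Phrases missing from the modern table are skipped; a missing period
    count is treated as 1 to avoid dividing by zero.
    """
    flagged = []
    for phrase in sorted(set(two_word_phrases(script))):
        now = modern_counts.get(phrase, 0)
        if now == 0:
            continue
        then = max(period_counts.get(phrase, 0), 1)
        if now / then >= ratio:
            flagged.append((phrase, now / then))
    # Most suspicious phrases first.
    return sorted(flagged, key=lambda item: -item[1])

# Hypothetical counts, invented purely for this example.
PERIOD = {"on hold": 10, "the phone": 5000}
MODERN = {"on hold": 1000, "the phone": 6000}
LINE = "She put him on hold and picked up the phone."

print(flag_anachronisms(LINE, PERIOD, MODERN))  # [('on hold', 100.0)]
```

With those invented numbers, "on hold" gets flagged because it is 100 times more common in the modern table, while "the phone" barely changes and passes. A ratio test like this is only one plausible way to score a phrase.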
Honestly, Ben's site is really fascinating. I could spend hours on it (and actually had to stop going through it post by post to finish this post). There are also discussions on phrases like "focus groups" and "leverage." Ben also has one more awesome chart, covering the use of both "moral high ground" and "consumerism," both of which were barely in use until much later.
On the podcast, they discuss how part of the reason that the show gets the language about technology right, but not business, is that we know that technology rapidly evolves and we're more attuned to it. But people don't pay nearly as much attention to how business changes and especially how the language of business changes over time. I guess that's true, though it doesn't surprise me that "consumerism" and "moral high ground" are both more recent phenomena. "Defining moment" and "on hold" are a bit more surprising to me.
Either way, I also wanted to highlight something else about all of this that I find fascinating. For all the talk by some about just how evil Google's book scanning project is, this kind of effort and research wouldn't be possible without large scale scanning of books. While this particular example may appear (on its face) to be a frivolous (even if fascinating) area of research, it does highlight just how collecting certain data can open up vast datasets that can be mined in useful ways. When people freak out about new technologies and services, they almost always focus on how they impact the old offerings. So most of the talk was about book scanning and its impact on book sales. But what almost no one talks about is how it enables new things that simply weren't possible before -- such as being able to build an algorithm like the one Ben built. Those kinds of innovations -- the unexpected "externalities" of projects like the Google book scanning project -- shouldn't be ignored, because there's tremendous value that can come out of them.
Filed Under: ben schmidt, data mining, google books, history, language, mad men, television
Reader Comments
http://thetrichordist.wordpress.com/2012/06/18/letter-to-emily-white-at-npr-all-songs-considered/
This essay blows a giant whole in the entire premise of this blog. I'm not surprised you're unable to address it. Bottom line: copying music is unethical.
[ link to this | view in chronology ]
Re:
2. Incorrect homonym.
3. You apparently don't understand the premise of this blog.
[ link to this | view in chronology ]
Re:
"Bottom line: copying music is unethical."
Well, that is a compelling and well thought out argument. /s
[ link to this | view in chronology ]
Re:
There's a submit a story link at the bottom of the page.
Personally I'm hoping Mike does pick up on this because Lowery's letter, whilst being considerably more polite than most of his rants, has so many holes in it that it's pretty excruciating reading.
[ link to this | view in chronology ]
Re:
Unethical to whom? Sorry, you are not *my* moral authority.
[ link to this | view in chronology ]
Re: Re:
Go ahead Emily, keep on sharing. Stand up for your natural rights and tell the government-made rights to stuff it.
[ link to this | view in chronology ]
Re:
Since you asked for a response, how about you respond to the economic issue...
During this time period when the 'average income of a musician' was 35K a year (this is from your article, you find the reference and period), What was the average 'record executive's' salary? How about the average 'label executive'? How about the average **AA cut?
Please show us how well the musicians were doing compared to the 'fat middlemen' in this era so that we can properly judge today's artists and middlemen...
[ link to this | view in chronology ]
Re:
Meanwhile have fun with your trichorder. If you play an instrument as well as you construct an argument I want to know where you're playing live near me so I can get as far away from the site as I can.
[ link to this | view in chronology ]
/s
[ link to this | view in chronology ]
Language of Today
While it is a show about the past, it is also a show FOR the present. To a degree, it must be written in the language of today. Too much adherence to past language would make the show indecipherable to a lot of people. If the language leaves you confused, that's as much a breach of the fourth wall as an egregious anachronism.
The job of a fiction writer is to be convincing, not to be accurate.
The site is definitely an interesting analysis of language evolution. Comparing the real past to our fictional ideas of the past is a fascinating and worthwhile exercise. However, to use that analysis to chide a fictional show for being inaccurate is silly and pointless.
[ link to this | view in chronology ]
Re: Re: Language of Today
Again, it's about balance. The language must be genuine enough to be convincing. But as fiction, it does not bear a responsibility to be real-world accurate.
Therefore, accuracy that serves the story and the atmosphere is important. Essential even. Same goes for accuracy on items that are familiar to modern culture. But "accuracy" that gets in the way of telling the story does the viewer a disservice. It's not a documentary. It's fiction. Story and immersion are paramount. Accuracy is far down the list, as long as the "mistakes" are not something that stands out to the viewer.
Let me ask you this: prior to the podcast, did you ever think, "I'd enjoy Mad Men a lot more if the business lingo sounded more like stuff I've never heard and had to decipher"?
[ link to this | view in chronology ]
Maybe that should be was, as with all useful projects / technology, the gate keepers have broken it. It still works a little, but when you think of what might have been.....
Humanity as a whole could have benefited so much, at an admitted price to the gate keepers, but those 19th century folks need to move aside.
[ link to this | view in chronology ]
negative space
But what about the mastodons? The phrases that should be there but aren't, and we don't notice because they're extinct today? The "-wise" suffix (used to turn nouns into adverbs, coinage-wise) was a bit of newspeak that was so common on Madison Avenue it was made fun of at the time. It should be easy enough for a writer to scan some old magazines for good examples, and they could serve a useful purpose in a script, underscoring a character's immersion in vogue, or sad devotion to a movement that has no future. (Here's hoping that the young audiences of 2060 don't notice the lack of "war on terror" references.)
[ link to this | view in chronology ]
P.S. hey Phil, shut your "whole".
[ link to this | view in chronology ]
Google n-gram
What we learned from 5 million books
and they also mentioned that this is just metadata - not actual data from books and therefore not copyrightable (correct me here; not sure about this).
[ link to this | view in chronology ]
I agree, but in one particular use of business language I did notice some of the changes - the reasons for layoffs:
- "middle management reduction"
- "downsizing"
- "re-engineering"
- "outsourcing"
- "cost cutting"
and lately - "the economy"
[ link to this | view in chronology ]
I'd like to see a similar study of "Deadwood"
Apparently the producers did their best to keep all the other parts of the episodes as close to that time period as possible, but unfortunately for them, the words that were shockers back in those days aren't shocking anymore, so they had to "update" the cussing.
[ link to this | view in chronology ]
Re: I'd like to see a similar study of "Deadwood"
Sort of like the shocked response of young male English speakers from North America when the nice daughter of the family at the B&B asks when you want her to knock you up in the morning. ;-) Translated that means "wake you up".
[ link to this | view in chronology ]
Forever morphable English -- And it's free!!
I'd never thought of using books as a way of preserving the lexicon of the time but it makes perfect sense. For a story to sell well it needs to speak to English speakers in the lexicon being used at the time.
Even more so for the popular songs of any given period, which is why I'd love to see a similar study done on song lyrics from, oh, say, 1940 to 1980 to see how words and phrases changed in that period.
We're so used to Shakespeare that we forget that he invented words as he went along if he couldn't find one that fit. Which may be why, 500 years later, studying Shakespeare by reading his plays is hopeless, while when we see a performance we suddenly understand a whole lot of things we didn't just by reading it.
I do wonder about the phrase "on hold" slowly becoming "on ignore" mostly when confronted by one of those damnable choice trees that take forever to get through if we wait that long.
And let's not get into dialect in this language. :-)
[ link to this | view in chronology ]
Copying Not So Easy
Copying is difficult. Innovators should stop worrying about someone else swooping in and copying what they are doing. Big companies, who are the threatening copiers that everybody is worried about, cannot copy successfully. They suffer from a "not invented here" syndrome. They never put in enough work on the details. They are arrogantly convinced that they have got something right, when there are still errors. Innovators should just concentrate on getting their own execution right and ignore the possibility of copiers. That is why the patent system is misguided. It prevents copying, at huge expense, when copying is a non-problem in practice.
[ link to this | view in chronology ]