Content Moderation Case Study: Amazon Alters Publishing Rules To Deter Kindle Unlimited Scammers (April 2016)
from the it's-always-the-scammers dept
Summary: In July 2014, Amazon announced its "Netflix, but for ebooks" service, Kindle Unlimited. Kindle Unlimited allowed readers access to hundreds of thousands of ebooks for a flat rate of $9.99/month.
Amazon paid authors from a subscriber fee pool. Authors were paid per page read by readers -- a system that was meant to reward more popular writers with a larger share of the Kindle Unlimited payment pool.
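As a rough illustration of the per-page model described above, here is a minimal sketch of splitting a fixed pool in proportion to pages read. The function name, the pool size, and the page counts are hypothetical, and Amazon's actual formula is not given in this post.

```python
def split_pool(pool_dollars, pages_read_by_author):
    """Divide a fixed payout pool among authors in proportion to pages read.

    Illustrative only: the real Kindle Unlimited formula is not public here.
    """
    total_pages = sum(pages_read_by_author.values())
    if total_pages == 0:
        # Nobody read anything, so nobody is paid.
        return {author: 0.0 for author in pages_read_by_author}
    rate_per_page = pool_dollars / total_pages
    return {
        author: pages * rate_per_page
        for author, pages in pages_read_by_author.items()
    }


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the proportional split.
    print(split_pool(1000.0, {"novelist": 300, "page_stuffer": 700}))
```

Because the pool is fixed, every inflated page credited to one author lowers the per-page rate paid to everyone else.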
This system was abused by scammers once it became clear Amazon wasn't spying on Kindle users to ensure books were actually being read -- i.e., keeping track of time spent on pages of text by readers or total amount of time spent reading. Since Amazon had no way to verify if readers were actually reading the content, scammers deployed a variety of tricks to inflate their unearned payouts.
Part of the scam relied on Amazon's willingness to pay authors for partially-read books. If only 100 pages of a 500-page book were read, the author still got credit for the 100 pages read by an Unlimited user. Scammers inflated "pages read" counts by moving the table of contents to the end of the book or offering dozens of different languages in the same ebook, relying on readers skipping hundreds of pages into the ebook to access the most popular translation. Other scammers offered readers chances to win free products and gift cards via hyperlinks that brought readers to the end of the scammers' ebooks -- books that sometimes contained thousands of pages.
The other part of the scam equation was Amazon's hands-off approach to self-publishing. Amazon has opened its platform and appears to do very little to police the content of ebooks, other than requiring authors to follow certain formatting rules. Amazon is neither a publisher nor an editor, which has created a market for algorithmically-generated content as well as a home for writers seeking a distribution outlet for their bigoted and hateful writing.
Once Amazon realized the payout system was being gamed, it altered the way Kindle Unlimited operated. It began removing scammers, notifying authors and customers that it was doing this in response to Unlimited readers' complaints.
"Some in the community have contacted us about the activities of a small minority of publishers who may attempt to inflate sales or pages read through the use of various techniques, such as adding unnecessary or confusing hyperlinks, misplacing the TOC [table of contents] or adding distracting content."
Unfortunately, Amazon's moderation efforts did affect a very small number of legitimate authors. Writer Walter Jon Williams was blocked from selling his ebook because his table of contents was located near the end of his book. Williams pointed out he had done this to maximize the amount of content prospective readers/purchasers could access using Amazon's "Look Inside" feature. After some back-and-forth, Williams' book and buy button were restored by Amazon.
Amazon continues to work to minimize abuse of the Kindle Unlimited system. The most noticeable and major change has been to cap earnings at 3,000 pages per ebook per reader. This limits the amount of money scammers can pull from the Unlimited payout pool. It also limits the number of times Kindle Unlimited readers will find themselves scrolling through ebooks solely designed to inflate page counts.
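To see why the cap matters, here is a small sketch of how credited pages might be computed with and without it. It assumes, as the scams described above suggest but this post does not confirm, that credit is based on the furthest page a reader reaches in a book. The function name and the sample data are hypothetical; only the 3,000-page figure comes from the text.

```python
PAGE_CAP = 3000  # per ebook per reader, the figure cited above


def author_credit(reads, cap=PAGE_CAP):
    """Sum credited pages over (reader_id, book_id, furthest_page) records.

    Assumption: credit follows the furthest page reached, so a single
    jump to the end of a book counts every page before it.
    """
    furthest = {}
    for reader_id, book_id, page in reads:
        key = (reader_id, book_id)
        furthest[key] = max(furthest.get(key, 0), page)
    # Apply the cap once per (reader, book) pair.
    return sum(min(page, cap) for page in furthest.values())


if __name__ == "__main__":
    # Hypothetical reads: one reader follows a link to page 10,000.
    reads = [("r1", "b1", 10000), ("r1", "b1", 200), ("r2", "b1", 120)]
    print(author_credit(reads))               # with the 3,000-page cap
    print(author_credit(reads, cap=10**9))    # effectively uncapped
```

The cap does not stop the trick itself; it only bounds how much a single reader's skip to the end can be worth.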
Decisions to be made by Amazon:
- Can automated moderation alone determine whether an uploaded ebook is a legitimate offering?
- Does altering the payout rules for Kindle Unlimited negatively affect legitimate authors?
- Does the ongoing abuse of various Amazon ebook programs justify more data collection on customers and their reading habits?
- Should authors be notified ahead of changes to Amazon services or would more transparency result in more abuse by scammers?
- Does the flat rate subscriber fee cover the costs of policing an ebook publishing ecosystem of this size?
- Who deserves more protection? Sellers/writers or customers? How do you strike the correct balance that provides more value to both sides of the transaction?
- Is more vetting needed on the front end (ID verification, etc.) to prevent further abuse?
Filed Under: content moderation, kindle, kindle unlimited, scammers
Companies: amazon
Reader Comments
Is the Amazon payment model really relevant to this case?
It seems that whatever method was used to allocate the money would be gamed by scammers (as with Goodhart's law), and it might even influence how honest authors shape their published work to squeeze out a bit more money.
For example, if payment were per book read (whatever counts as read), both scammers and authors would offer more and shorter books to inflate the variable used to distribute the money.
At least, they tried to keep it simple at the beginning and then started dealing with bad behavior after it became rampant, instead of just trying to make complex rules and spy harder on users' data.
Bayesian filtering
I'd hope Amazon was applying this already, but Bayesian filtering worked pretty well (still works pretty well, in fact) for separating spam from non-spam email. The Kindle store should be able to provide good-quality large samples of both actual books (pull from known authors and books which have been published on dead trees) and generated content. I'd honestly start by taking those samples and using them to initialize bogofilter, then feed it a selection of test books and see how accurate its classification was.
After classification, there are some heuristics that can be applied. If an author account is long-standing and doesn't have any scam-content flags, it's probably safe to just list any new books regardless of what the filter says. If they're uploading a lot of works over a short time-frame, check whether that author's got hardcopy-published works. If they do (or they don't and the filter says they're mostly or all scam content) it's probably safe to just go with the filter results, otherwise flag the lot for manual review because it's anomalous behavior.
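The approach in this comment could be sketched roughly as follows. This is not bogofilter and not anything Amazon is known to run: it is a toy naive Bayes word classifier written from scratch, followed by the review heuristics from the comment. All class, function, and parameter names, the 0.9 threshold, and the training snippets are hypothetical, and the default of following the filter when there is no upload burst is an assumption the comment does not spell out.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy two-class naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.counts = {"book": Counter(), "scam": Counter()}
        self.docs = {"book": 0, "scam": 0}

    def train(self, label, text):
        self.counts[label].update(text.lower().split())
        self.docs[label] += 1

    def scam_probability(self, text):
        # Requires at least one training document per label.
        vocab_size = len(set(self.counts["book"]) | set(self.counts["scam"]))
        total_docs = sum(self.docs.values())
        log_score = {}
        for label in ("book", "scam"):
            n_tokens = sum(self.counts[label].values())
            score = math.log(self.docs[label] / total_docs)
            for token in text.lower().split():
                # Laplace (add-one) smoothing handles unseen words.
                seen = self.counts[label][token]
                score += math.log((seen + 1) / (n_tokens + vocab_size))
            log_score[label] = score
        # Normalize in log space to avoid underflow on long texts.
        top = max(log_score.values())
        weight = {k: math.exp(v - top) for k, v in log_score.items()}
        return weight["scam"] / (weight["book"] + weight["scam"])


def decide(trusted_author, burst_upload, has_print_books, scam_prob,
           threshold=0.9):
    """Apply the comment's heuristics on top of the filter score."""
    if trusted_author:
        # Long-standing account with no scam flags: list regardless.
        return "list"
    verdict = "reject" if scam_prob >= threshold else "list"
    if burst_upload and not has_print_books and verdict == "list":
        # Many uploads, no print history, filter says clean: anomalous.
        return "manual_review"
    return verdict


if __name__ == "__main__":
    clf = TinyBayes()
    clf.train("book", "the rain fell on the quiet harbor town")
    clf.train("scam", "click here to win a free gift card click here")
    p = clf.scam_probability("win a free gift card")
    print(decide(False, True, False, p))
```

A real deployment would need far larger training samples and tokenization suited to book-length text; the sketch only shows how the classifier score and the account-level heuristics fit together.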
The spammer/troll's motive is relevant. In this case, it seems to be greed. Where it's hatred or one of the other notoriously-deadly motives, some other approach to disencentivizationalizing the contemptible behavior might be appropriate.
That's why responding to a troll is generally the worst thing you can do (volunteers for porcine mud-wrestling, anyone?) Or harassing a sociopath with a victim-complex (come see the violence inherent in the system!)
This may seem strange, but it's almost as if you had to look on each one as a person ....