Kindle Spam Is A Filter Issue, Not A Spam Issue
from the filter-away dept
Via Slashdot, we learn that spammers have discovered the ability to publish cheap "ebooks":
Thousands of digital books, called ebooks, are being published through Amazon’s self-publishing system each month. Many are not written in the traditional sense.
Instead, they are built using something known as Private Label Rights, or PLR content, which is information that can be bought very cheaply online then reformatted into a digital book.
These ebooks are listed for sale – often at 99 cents – alongside more traditional books on Amazon’s website, forcing readers to plow through many more titles to find what they want.
The article makes it sound like this is a big problem, calling it "the dark side" of self-publishing, but I don't get it. Assuming no one wants this crap, then it seems likely that Amazon will start to filter it out of any search results or top lists.
There is some slightly more legitimate concern about outright plagiarism, where some of these "spammers" are merely copying other books and then re-branding them and selling them as ebooks. But, once again, this seems like a filter problem more than anything else. In fact, I'm a bit surprised that Amazon doesn't do a basic check to make sure the content of an ebook hasn't already been offered by someone else, and do a further investigation if that's the case. Others have suggested that Amazon charge a small fee to upload a book, as that might prevent spammers from going crazy with such copies, and that could make sense as well. I just have trouble believing that this is such a serious "problem" that it can't easily be stopped.
Reader Comments
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
[ link to this | view in chronology ]
Re:
also: that's what filters are for. give each entry a bunch of filter tags for various things and let the person looking for stuff browse by filter category as well as searching. or even search within a category. in a lot of contexts good categorisation is more useful than a search engine for finding what you want. especially when you've got a less than complete understanding of what that is.
[ link to this | view in chronology ]
Re: Re:
In fact, requiring a fee would be the "first filter". If you don't think a book is good enough to earn back $99, it's probably not good enough to be on the store in the first place.
Some guy has spammed the store with over 8,000 titles ripped off from hither and yon and selling for a buck a head. Would he have still done that if doing so would have cost him $80,000 up front?
[ link to this | view in chronology ]
Re: Re: Re:
[ link to this | view in chronology ]
Re: Re: Re:
Just looking at the raw numbers, 8000 titles @ a buck a piece, seems like that could provide a pretty decent revenue stream even at a low percentage.
[ link to this | view in chronology ]
Re: Re:
Me: I don't think so. It is a question of ratio. If you have 100 good works, and 10 bad ones, your signal to noise ratio is high and you have no problems. But if that shifts to 100 good works and 1000 bad ones, the chance that you find the signal (same level as before) is very low.
In a world where anyone can publish anything at any time with little or no real cost, you will get more noise. Spammers, jammers, and scammers will figure out how to make money in the noise, and the noise increases.
One only has to look at the number of twitter bots, automated posters, automated follow bots, and auto-retweeters to see where the noise comes from.
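The ratio argument above can be put in rough numbers. This is only a sketch: `signal_fraction` is a hypothetical helper, and it assumes a reader picks titles at random with no filtering at all, which is the worst case.

```python
def signal_fraction(good: int, bad: int) -> float:
    """Share of listings that are worth reading, assuming an unfiltered, random pick."""
    return good / (good + bad)


# 100 good works and 10 bad ones: almost every pick is a good one.
print(round(signal_fraction(100, 10), 3))    # 0.909

# 100 good works and 1000 bad ones: fewer than one pick in ten is good.
print(round(signal_fraction(100, 1000), 3))  # 0.091
```

The number of good works is the same in both cases; only the amount of noise around them changed.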
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
Hmmm, so you expect Amazon to have a searchable list of all ebook content that submissions are compared to. But you regularly ridicule people who expect YouTube or torrent sites to have the same thing for copyrighted materials.
[ link to this | view in chronology ]
Re:
Head shot
[ link to this | view in chronology ]
Re: Re:
Oh, wait, YOU CAN'T.
[ link to this | view in chronology ]
Re:
He doesn't expect it to be a government imposed requirement, just a good business model suggestion. Also, Youtube already does such basic checks that attempt to identify infringing materials, but there are almost always ways around these checks. MM suggests that Amazon should do more to reduce plagiarism, not that it will ever be stopped completely.
"But you regularly ridicule people who expect YouTube or torrent sites to have the same thing for copyrighted materials."
Maybe copyright shouldn't exist to begin with.
[ link to this | view in chronology ]
Re:
"do a further investigation".
In Youtube v Viacom, Viacom was expecting all their content to be removed without investigation.
Can you spot the difference between the two?
I do agree, however, that it would be silly to police on the behalf of content 'creators', as that would be a dangerous precedent (if not legally, then rationally).
[ link to this | view in chronology ]
Re:
YouTube is also perfectly in the right if they choose not to police their uploads either, because it is ALSO a business choice to pre-police their content.
The problem here, that you don't seem to understand, is that no one has the right to demand that Amazon or YouTube make the choice one way or another. However, if the problem really is so bad for Amazon that it drives customers away, then it is a problem that a smart business would address.
[ link to this | view in chronology ]
Re:
No. I expect Amazon to have the database of existing ebooks in its store, because it already has that. That's all.
And then I'm not expecting them to compare for copyright issues and automatically block, but as I said (I thought clearly, but perhaps not), all it should trigger is further investigation.
Not sure where the confusion comes in. Apologies if I wasn't clear.
[ link to this | view in chronology ]
Re: Re:
Exactly, there is a difference between knowing whether you have duplicate information and knowing what information does and doesn't infringe. It's much easier to know that you have duplicate information than it is to magically know that some segment of information infringes. The latter requires a psychic, the former simply requires some comparison tools.
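For what "some comparison tools" might look like, here is a minimal sketch that flags a new upload for further investigation when it overlaps heavily with a book already in the catalog. It is not a description of anything Amazon actually does: the function names, the five-word shingle size, and the 0.8 threshold are all assumptions chosen for illustration.

```python
def shingles(text: str, k: int = 5) -> set:
    """Break text into overlapping runs of k words (assumed size: 5)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str) -> float:
    """Overlap between two texts: shared shingles divided by all shingles."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def needs_review(new_text: str, catalog: dict, threshold: float = 0.8) -> list:
    """Return titles of existing books that the new upload closely matches.

    A match only means "a human should look at this", not "this infringes".
    """
    return [title for title, text in catalog.items()
            if jaccard(new_text, text) >= threshold]
```

Note that this only answers the duplicate question. It says nothing about who holds the rights to either copy, which is the part that needs an actual investigation.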
[ link to this | view in chronology ]
Re:
Implementing a filter to improve search results for customers is a business decision. It's not about copyright - just profits. I find it hard to believe Mike seriously thinks Amazon should do any sort of policing over the content itself, only that they should verify whether the content in question is indeed what it's labeled as. If for no other reason than to avoid false advertising repercussions.
If you're trying to find an e-book and you get 40 false results for every positive it's going to be pretty annoying pretty quick to look for what you want. Find a different vendor who has only positive results and you'll probably just shop from there instead and give them any future business.
The biggest question is: how much money is Amazon making off the sale of all the spam? I would have to guess not very much, but if it's somehow generating revenue for them then no point shutting it down.
Asking third-party aggregate sites such as youtube or torrents to police and enforce copyright law is first and foremost granting them too much authority to declare what is or is not infringing. Second, the cost of implementing any form of workforce to go over the amount of data being uploaded to these sites would make it impossible for any sustainable service of the sort.
You cannot reasonably scrutinize thousands of terabytes of data without creating digitized signatures of the files that are being uploaded. If the file has a copyright on it then you are thereby violating that copyright by using an unauthorized copy of the work in your filtering software without express written permission by the content holder.
Either way - it's all a matter of profits. Do whatever makes you the most.
[ link to this | view in chronology ]
Re:
No need for an "official list". Just hash the content of every book at upload and block any whose hash is not unique.
You can be pretty sure that among these 1000's of PLR books there will be a lot of duplication.
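A minimal sketch of that idea, assuming the book arrives as plain text. `UploadFilter` and `content_hash` are hypothetical names, and normalizing case and whitespace before hashing is an added assumption so that trivial reformatting does not change the hash. An exact hash still misses copies with even one word changed.

```python
import hashlib


def content_hash(text: str) -> str:
    """SHA-256 of the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class UploadFilter:
    """Remembers the hash of every accepted book and rejects repeats."""

    def __init__(self) -> None:
        self.seen = {}  # hash -> title of the first upload with that content

    def submit(self, title: str, text: str) -> bool:
        """Return True if the upload is accepted, False if its content is a repeat."""
        digest = content_hash(text)
        if digest in self.seen:
            return False
        self.seen[digest] = title
        return True
```

Usage: create one `UploadFilter` and call `submit(title, text)` for each upload. The second copy of the same PLR text is rejected no matter what title it is given.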
[ link to this | view in chronology ]
Why should someone have some exclusive right to publish a book just because they wrote it? If you wrote a book and someone does a better job of publishing it than you, why do you deserve to make any money from it at all? Execution matters, not the idea. Besides which, if someone sees your book published by someone else and likes it, and you can't figure out how to make money that way, whose fault is that? They are giving you free promotion by selling your book for you. Get with the times, copyright maximalists!
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
This is not a new problem...
Unsurprisingly, the same scumbags who are engaged in this are also engaged in other abuse: link farming, content farming, SEO, Usenet spam, email spam, IM spam, text spam, etc. And one of their current strategies seems to be to combine these modalities into integrated "campaigns" designed to annoy as many people as possible.
[ link to this | view in chronology ]
Amazon filtering based on content
I get the impression that we are in the famous "I don't know how to distinguish it, but I know it when I see it" (not an exact quote, I am sure) territory.
Wouldn't it be nice if we had consistency? Too much to ask, I assume.
[ link to this | view in chronology ]
Re: Amazon filtering based on content
[ link to this | view in chronology ]
The problem is that it is like sticking your finger in a dam. Another hole quickly opens up. I applaud Amazon for tightening up their publishing standards and at least removing the PLR books people were submitting. But they are running uphill against this epidemic....
[ link to this | view in chronology ]