Why Is Google Punishing Sites That Publish Full RSS Feeds? [UPDATED]
from the not-good-at-all dept
Last year, we explained why full text RSS feeds make sense. You can read the whole thing, but the short version is that full text feeds make stories easier to read, and that means more people actually read the full stories and are willing to discuss them, share them and get others interested in reading as well. It just makes the reading experience that much better. We've always had full text RSS feeds, and we're not about to change that. However, it appears that Google may be punishing sites that have full text feeds. A concerned reader pointed us to the news that the magazine Mental Floss has reluctantly ditched its full text feeds because Google banned the site and told them the only way to get back in was to get rid of the full text feeds.
Update: Matt Cutts from Google has responded in the comments and explained what happened. Turns out, despite the original post, the removal had nothing to do with full text RSS feeds: the site was hacked. I'm glad that's been cleared up now (and thanks to the multiple Google employees who quickly responded to this post).
I could understand if the deletion of Mental Floss from the index was simply a mistake, and upon being alerted to it, they restored the site. But the fact that Google's response was to tell Mental Floss to ditch the full text feeds is worrisome. What makes this even more ridiculous is that Feedburner, which is owned by Google, tells people that full text feeds are better. So, you have part of Google telling people to use full text feeds, and another part of Google punishing them for doing so.
Filed Under: duplicator sites, full feeds, partial feeds, rss
Companies: google
Reader Comments
Google doesn't like "duplicate content." Publishing full RSS feeds saves the spammers the trouble of having to use scrapers to get all your content and throw it up on their sites, thus creating duplicate content. They weren't being penalized for having full RSS feeds; it's just a by-product of being an easy target.
Google does try to identify the "real site" (the origin of the content) and give that site the credit. The problem is it's done algorithmically and that it's not perfect. Shedding light on cases like this is good, because it gives Google the opportunity to improve on their errors. TechDirt.com as a domain is probably strong enough a source that Google has no trouble figuring out the origin of the content. Mental Floss obviously wasn't as fortunate.
[ link to this | view in chronology ]
Re:
The only news here is that Google's algorithm isn't perfect and through a fluke it banned a perfectly legit site. Google is still not evil.
[ link to this | view in chronology ]
what part of RSS do they not get?
[ link to this | view in chronology ]
The site was hacked. RSS has nothing to do with it.
Here's some of the email that we sent on July 7th to this site owner:
Dear site owner or webmaster of mentalfloss.com,
While we were indexing your webpages, we detected that some of your pages were using techniques that are outside our quality guidelines, which can be found here: http://www.google.com/webmasters/guidelines.html. This appears to be because your site has been modified by a third party. Typically, the offending party gains access to an insecure directory that has open permissions. Many times, they will upload files or modify existing ones, which then show up as spam in our index.
The following is some example hidden text we found at eg:http://www.mentalfloss.com/blogs/archives/2192:
economics times india
The application fee is collected by the JUPAS economics times india on behalf of the 9 participating institutions and is not refundable or transferable to another year.
free 2004 income tax forms
Request for use of Accumulated Surplus must be signed by the Hon Fin Sec/Treasurer and countersigned by the President of the Union/Club and submitted to OSA for approval. According to the agreement, Castrol will use Deutsche Bank's complete end-to-end payment and collection solution, as well as db-eBills - the Bank's innovative electronic invoice presentment and payment (EIPP) solution. The Internet's largest source of legitimate, copyrighted 100% digital sheet music since 1997, we now have over 10,000 songs for instant download! For extremely poor families, free 2004 income tax forms provides emergency assistance, while the conditionalities promote longer-term investments in human capital. Australia order viagra online clinic uk in Australia order viagra without a prescription in Australia order generic viagra and other prescription drugs online in Australia viagra order by phone in Australia viagra order on line in Australia order cheap viagra in Australia levitra cialis viagra comparison online order in Australia buy online order viagra in Australia order generic viagra in Australia order viagra overnight in Australia order by phone generic viagra in Australia viagra no prescr
chase mastercard rewards program
A device which forms a digitised image of a human fmger print for the purpose of biometric authentication. T subject to search without a warrant while on prison property, according to the lawsuit. It is rare to find an amateur player using this move in a poker game, so if your opponents see you using this move they can be fairly sure you know how to play good poker, and may think twice about bluffing you out of future pots. Download one of listed teens for chase mastercard rewards program taylor torrents or choose from category bit torrent downloads listed here to download your favorite torrent at torrentz. ACI Worldwide Eastern Europe Development is the fast-growing Romanian branch of ACI Worldwide.
bad credit personal finance loans
[...]
In order to preserve the quality of our search engine, we have temporarily removed some of your webpages from our search results. Currently pages from mentalfloss.com are scheduled to be removed for at least 30 days.
We would prefer to have your pages in Google's index. If you wish to be reconsidered, please correct or remove all pages (may not be limited to the examples provided) that are outside our quality guidelines. One potential remedy is to contact your web host technical support for assistance. For more information about security for webmasters, see http://googlewebmastercentral.blogspot.com/2007/09/quick-security-checklist-for-webmasters.html.
When you are ready, please visit https://www.google.com/webmasters/tools/reinclusion?hl=en to learn more and submit your site for reconsideration.
Sincerely,
Google Search Quality Team
[ link to this | view in chronology ]
Matt has the answer!
There was another article I read that is related to this one. Basically, the idea is that there is some sort of authentication for a publisher with Google. Then Google knows that this source is the authority on that subject, and that site will get the higher ranking. I think this is a good idea too.
[ link to this | view in chronology ]
By the way, full-text RSS feeds are great
[ link to this | view in chronology ]
Unlikely
Google's algo for duplicate content is usually very good at detecting the original source and weeding out the scraper sites. Now, this could be a mistake from Google not getting the right site, but I did a very, very cursory check and noticed one glaring thing:
CNN.com has quite a few of their articles published in full in what looks like some sort of cross promotional thing. Google gives a lot of weight to CNN and other big news sites as trusted originating sources. I think it is much more likely that Google's algo is viewing CNN as the original and Mental Floss as the duplicate.
As I said, I did only a cursory look, but it is highly unlikely that scraper sites are causing the delisting. The cause is much more likely to be sites like CNN carrying dupe content, or even some other violation of Google's guidelines.
Unless you start selling links or doing something else against Google's guidelines, Techdirt has nothing to worry about in publishing full feeds.
[ link to this | view in chronology ]
oooooh
[ link to this | view in chronology ]
A note from mentalfloss.com
Once we realized we were no longer in Google's natural search, we immediately began taking steps to try and figure out what was going on. After asking a few others with experience in this area, it was suggested to us that we make sure no one was lifting our content from our RSS feed and publishing it in full on their site. We discovered another site that was and decided to tweak our RSS feed just in case that was the cause.
We are continuing to look into this and will resolve the problem Matt has pointed out.
It's very important to us that we are included in Google's index again so we'll work quickly to get this fixed. It's unfortunate because we run a clean operation so I hate that this has happened.
But again, this is not Google's fault. They've simply recognized a problem and we'll work to fix it.
Matt, if you'd be willing to discuss, I would love to have a conversation with you. Thanks for your attention to these matters.
Thanks,
Will
[ link to this | view in chronology ]
Re: A note from mentalfloss.com
Best wishes,
Matt
P.S. Mike, thanks for the quick update on this story.
[ link to this | view in chronology ]
Interesting
[ link to this | view in chronology ]
one other note from mental_floss
[ link to this | view in chronology ]
Search Team Letter
For those of us who are just trying to get people who want to see our content to our website, without wasting the time of people who don't want to see our content, meaningful search results are in the best interest of ourselves, our visitors, and Google. For example, we're still getting way, way too much Nina Hartley traffic. It's not valuable to us, and probably an annoyance to people searching for [nina hartley] and ending up at the website of an erotic documentary company.
I wonder if there are other ways that Google could send e-mail to site owners to help them identify problems or otherwise tune their websites. Yes, I know about all the documentation available from Google, but letters like the above that address site-specific problems could be helpful to everyone.
[ link to this | view in chronology ]
I am facing this problem
[ link to this | view in chronology ]
Is linking back the answer?
To ensure that the full RSS text movement is not slowed, is it correct to assume that you can avoid duplicate content or the situation @venkatakrishna is facing, provided that anyone using your content links back to you?
[ link to this | view in chronology ]
Linking back helps
http://googlewebmastercentral.blogspot.com/2008/06/duplicate-content-due-to-scrapers.html
http://www.vanessafoxnude.com/2008/05/14/ranking-as-the-original-source-for-content-you-syndicate/
Hope that helps,
Matt
[ link to this | view in chronology ]
We Love Republishing TECH DIRT
[ link to this | view in chronology ]
Kingdom of Romance forums for romance lovers
[ link to this | view in chronology ]
health
[ link to this | view in chronology ]
Google Ranking Policy
[ link to this | view in chronology ]