Are Yahoo & The AP Manipulating Comments? Or Are They Just Really Bad At The Internet? [Updated]
from the do-you,-uh,-yahoo? dept
Someone who prefers to remain anonymous sent over this story about how Associated Press stories hosted on Yahoo News appear to have tons of comments from old stories. It's not entirely clear what's happening, though I have my suspicions (explained further down), but it appears that when new stories are showing up on certain topics, Yahoo is simply copying over older comments from previous stories on similar or related topics. The comments look as if they're about the story posted -- and the only way you can tell they're not is if you notice the date:
I'd go from one Yahoo article to another and notice that regardless of the subject matter, the first user comment was always the same -- at least on AP articles covering the Israeli/Palestinian conflict. The comment that kept reappearing was posted by "Robert" and it was a one liner. "Hamas is now in control of the Gaza Strip after winning an election there against Abbas Palestinian Authority." That was it. Fair enough -- I've got no quarrel with the messenger or the message. But somehow that one comment generated an incredible 184 responses and, last I checked, readers had given it 3212 thumbs up and 2525 thumbs down.
I got a little curious about why Robert's one liner had generated so much controversy. I've written hundreds of articles and never got anywhere near that kind of attention. Frankly, I was full of envy. How did 'Robert' pull this off with one miserly line? Then I noticed the strangest thing: it was dated March 09, 2010. The comment was two months old and was the lead comment of 40,000 responses. That seemed a little high considering the fact that the AP article I was reading had only been posted for thirty minutes.
What were Yahoo and AP up to? The answer is simple; they were porting comments from one article to another and, in this particular case, they've been doing it for two months.
Oddly (and inexplicably) the author of that post, Ahmed Amr, does not link to Yahoo to show this, but it's not hard to find. Here's a story published on June 3rd, 2010 at 9:19pm. Yet, there's that same first comment, from March 9th, at 12:47am. And here's a story published on May 6th at 1:09 pm with the identical comments, also beginning with the March 9th comment. To let you see what they both look like before they change (and I'll explain in a second why I think they'll change) I've turned both of those pages into PDFs, which you can see below (you may have to either download or view at full screen and scroll to see the "comments" at the bottom):
Take a look at the two links I put above to the Yahoo stories. The URLs (as found by a quick search for the comment string Amr mentioned in his post) are as follows:
- http://news.yahoo.com/s/ap/ml_israel_palestinian
- http://news.yahoo.com/s/ap/20100506/ap_on_re_mi_ea/ml_israel_palestinians
After the date of publication, breaking the basic principle of a link to a news story being a link to that news story alone, Yahoo moves the story to a new date-defined directory, and the original URL is freed up for the next story on that particular topic. If this seems stupid and confusing to users and destructive to the very idea of the "link economy" or valuing earned or passed links, you're right. But take that up with Yahoo and the Associated Press.
Of course, here's where the real level of tech incompetence comes in: It appears that Yahoo News' comment system doesn't understand that Yahoo does this. So, it associates the comments to that last bit of the URL string "/ml_israel_palestinian" and the same comments will appear every time that string is used as the final part of a URL string. It's bizarre that Yahoo would do this, but apparently, that's how Yahoo rolls.
Amr suggests that this is part of a planned bit of "corporate fraud" by Yahoo and the AP, perhaps to make it look like certain stories are getting a hell of a lot more comments than they are. He also suggests other conspiracy theories involving pro-Israeli operatives, saying that as far as he can tell, this only happens on AP stories concerning the Israeli/Palestinian crisis. I believe Amr didn't try very hard to find alternatives. On my very first attempt to find an example related to something entirely different, I found the identical behavior. I just picked a popular story that likely would have multiple stories over multiple days: the BP oil spill in the Gulf. Then I looked for an AP story hosted by Yahoo News... Bingo.
The first news story I found was published on June 3rd at 2:28 pm, but the first comment on the story? Why it's from May 1st at 2:06am. And the URL? The string ends with "us_gulf_oil_spill_947." You can find the identical comments on this story which was published May 21st, but ends with the string "us_gulf_oil_spill" suggesting that Yahoo's comment system also ignores numbers at the end of that final URL part in replicating its comments.
And here's another story about the White House's response to the oil spill. Published June 3rd at 11:57 pm. First comment? May 10, 2010 12:58 pm. URL string? "us_gulf_oil_spill_washington_9". And here's a story from May 17th with the identical comments at the end, with the closing URL string "us_gulf_oil_spill_washington_1." Yup, Yahoo seems to just match up comments with pretty simple URL hashes.
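To make the suspected behavior concrete, here is a minimal sketch of a comment store that keys comments on the final part of the URL and ignores a trailing number. This is a guess at the logic based only on what the pages show, not Yahoo's actual code: the function names, the `example.com` host, and the dated directory in the sample URLs are all made up for illustration. Only the slug strings come from the stories above.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse


def comment_key(url: str) -> str:
    """Hypothetical key: last path segment with any trailing _<digits> removed."""
    slug = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    return re.sub(r"_\d+$", "", slug)


# Comments are stored per key, so every URL that maps to the same key
# shares one comment thread.
comments_by_key = defaultdict(list)


def add_comment(url: str, text: str) -> None:
    comments_by_key[comment_key(url)].append(text)


def comments_for(url: str) -> list:
    return list(comments_by_key.get(comment_key(url), []))


# Made-up URLs; only the final slug strings are taken from the article.
old_story = "http://example.com/s/ap/20100521/us_gulf_oil_spill"
new_story = "http://example.com/s/ap/us_gulf_oil_spill_947"

add_comment(old_story, "comment left on the older story")
print(comments_for(new_story))  # the new story shows the older story's comment
```

Under these assumptions, a story that has been live for thirty minutes inherits every comment ever attached to the same slug, which matches what the pages show.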
You can see all of that below as embedded PDFs:
Update: The AP got in touch to make it entirely clear that this is entirely Yahoo's incompetence and not its own:
The Associated Press distributes news content to Yahoo! News, but the display of AP stories and the curating of comments are entirely up to Yahoo!
While undoubtedly true, in the comments we've heard from multiple people who work at news sites that license AP content, and they note that AP has a weird feed process, whereby it gives a simple slug like the ones used above, so that it can force update stories, often leading people to see stories totally change over the course of the day. This is clearly a Yahoo issue, but AP's policies don't help.
Filed Under: comments, news
Companies: associated press, yahoo
Reader Comments
Props
Ha!
Mike you get an all around A for figuring out how Yahoo's system works and everything. Nice bit of figuring things out and explaining them. I will definitely attribute Yahoo's flaws here to bad coding and not malice. I think there's a razor for that?
[ link to this | view in chronology ]
Re: Props
; P
[ link to this | view in chronology ]
Re: Re: Props
It seems that Napoleon said "Never ascribe to malice that which is adequately explained by incompetence."
But the wording that went through my mind was Hanlon's Razor:
Never attribute to malice that which can be adequately explained by stupidity.
[ link to this | view in chronology ]
Re: Re: Re: Props
When competing hypotheses are equal in other respects, the principle recommends selection of the hypothesis that introduces the fewest assumptions and postulates the fewest entities while still sufficiently answering the question.
[ link to this | view in chronology ]
Re: Re: Re: Re: Props
The two are similar, but Hanlon's razor is specific to questioning whether somebody's actions are intentionally malicious or unintentionally stupid. Although, I suppose you could make a case that a conspiracy theory is an introduction of an unnecessary element in a theory, and thus could fall under Occam's razor as well...
[ link to this | view in chronology ]
Nice catch!
[ link to this | view in chronology ]
Re: Nice catch!
She just recently said that it's hard to change things at a big company.
http://venturebeat.com/2010/05/24/carol-bartz-techcrunch/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed:+Venturebeat+(VentureBeat)&utm_content=Google+Feedfetcher
[ link to this | view in chronology ]
stories that are heavily commented garner more traffic.
rather than:
stories with heavy traffic garner more comments.
and this was their misguided solution to pad comment numbers.
[ link to this | view in chronology ]
Fail
[ link to this | view in chronology ]
What bothers me is the website where the guy begins to name people and trying to call for action.
I mean really? Do you have to think it's a conspiracy when your documented evidence consists of screenshots that they may fix? Who are you trying to pull on your side? Here's hoping they change it soon.
[ link to this | view in chronology ]
http://www.google.com/search?q=site:news.yahoo.com+%22Hamas+is+now+in+control+of+the+Gaza+Strip+after+winning+an+election+there+against+Abbas%27+Palestinian+Authority,%22&hl=en&safe=strict&start=0&sa=N
http://www.google.com/search?hl=en&safe=strict&q=site%3Anews.yahoo.com+%22It+is+a+shame+Oboma%27s+top+lawyer+had+to+leave+his+job+with+Obama+to+defend+Goldman%2C+%22&btnG=Search&aq=f&aqi=&aql=&oq=&gs_rfai=
http://www.google.com/search?hl=en&safe=strict&q=site%3Anews.yahoo.com+%22BP.+Bad+track+record+for+12+or+so+years+at+least.+Explosions+in+the+Houston+metropolitan+area%2C+leaks%2C+now+this.+Can%27t+keep+up+with+damage+control.+Tragic.%22&aq=f&aqi=&aql=&oq=&gs_rfai=
[ link to this | view in chronology ]
Incompetence is a strong word
As the single programmer for several small projects, I make "assumptions" about what the client MIGHT want, all the time. In fact, it's usually the only way I can even stay current with what clients are asking for; that is, to try to guess at what features they will ask for later on. Thus while I'm in the code (getting to the point of actual coding can be the most time-consuming piece of an application) I try to add what is asked for, and I also try to stub in as many "nice to haves" or "guess they will ask for" items as possible. Thus when the client inevitably asks for such functionality I can just flip a switch and the functionality is there.
Granted, an experienced programmer codes in "switches" where a less experienced programmer [or a rushed programmer] might just make the functionality available without the client asking for it; but I would say this was more "exuberance" than "incompetence".
I cannot comment on whether this specific issue is something nefarious, incompetence, or just programmer exuberance; but when an individual's "competence" is on the line, I would urge you to lean towards giving them the benefit of the doubt and at least ALSO point out alternatives other than just "incompetence".
[ link to this | view in chronology ]
Re: Incompetence is a strong word
And usually, you'll have to recode something for [blank] cause the nth time you coded it to their EXACT specifications, it turns out they didn't know what they wanted that time either.
Dealing with user specifications sucks--why can't people learn to accurately specify exactly what they want?
[ link to this | view in chronology ]
Re: Re: Incompetence is a strong word
Judging by your words, it's probably because you never asked them to do it in the first place. When you receive a specification that's unclear to you, send it back for clarification, instead of working with what you THINK is what the user wants.
When you stop making assumptions about what you saw, the users will stop assuming they have created a clear specification.
[ link to this | view in chronology ]
Re: Incompetence is a strong word
Thus, I, personally, find it very difficult to give the "benefit of the doubt" in this case, though I cannot directly say that it's the programmer's fault. It could just as easily be the project manager, or full blown management incompetence within Yahoo!, or it could be that somewhere down the line, they are trying to "pad the numbers" of their active users with some sleight-of-hand (mainly for advertising).
[ link to this | view in chronology ]
Re: Re: Incompetence is a strong word
Either way, it is incompetence or fraud on someone's part at Yahoo and needs to change.
[ link to this | view in chronology ]
Re: Incompetence is a strong word
Fair enough... though, I can't see how Yahoo wouldn't notice this broken functionality and let it live on for so long.
[ link to this | view in chronology ]
Re: Incompetence is a strong word
http://www.repeater-builder.com/humor/what-the-customer-actually-wanted.jpg
[ link to this | view in chronology ]
Re: Incompetence is a strong word
If there had been any thought behind this, why wouldn't they at least have it default to "Newest comment first"? No sane person wants to read a 6 month old comment that was for some other article. The first thing I do when I read one of these articles is click "Newest". Next time I read an article there, I have to click "Newest" again, i.e., it isn't sticky. Once again a sign of incompetence.
Why not just call a duck a duck?
[ link to this | view in chronology ]
Re: Re: Incompetence is a strong word
Just like this discussion thread -- I read them in order, and read the messages, then replies to the messages.
[ link to this | view in chronology ]
Re: Incompetence is a strong word
What he does is suggest the performance of the entire company indicates incompetence.
It is Yahoo, not Joe Programmer that gets the blame. Why would you read it any other way...was some individual named in Masnick's post?
[ link to this | view in chronology ]
As a programmer, I disagree. This is out and out bad design.
Comments should be linked to a unique internal ID for a page/report/story, regardless of the URL - the URL is an end user concern, not a programmatic identifier. When the URL changes for a page/report/story, the comments, linked via a unique, internal ID, move with the page.
To use a URL in a system as a unique ID for a page/report/story, when URLs are a flexible/changeable system, is pure bad design.
Sorry about all the slashes/lines/clarifications... ;)
I have this in a system I use (but didn't design). Instead of incremental arbitrary database ID's, some nugget decided to use the concept of "page ID's" - page ID's are strings, and are effectively names and identifiers for the users. Everything is linked to each other via page ID's. When page ID's change - EVERYTHING that links to it has to be updated. It's very, very naive database design.
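The design this commenter describes can be sketched in a few lines. This is only an illustration under assumed names (`Story`, `CommentStore`, and the in-memory ID counter are hypothetical, and a real system would use a database key): comments hang off an internal ID that never changes, and the URL is just a mutable attribute of the story.

```python
import itertools
from dataclasses import dataclass, field

# Stand-in for an auto-incrementing database key (assumption for this sketch).
_next_story_id = itertools.count(1)


@dataclass
class Story:
    url: str  # user-facing and free to change
    story_id: int = field(default_factory=lambda: next(_next_story_id))


class CommentStore:
    """Comments are keyed by the internal story ID, never by the URL."""

    def __init__(self):
        self._by_story_id = {}

    def add(self, story: Story, text: str) -> None:
        self._by_story_id.setdefault(story.story_id, []).append(text)

    def for_story(self, story: Story) -> list:
        return list(self._by_story_id.get(story.story_id, []))


store = CommentStore()

# An older story first lives at the short URL and collects a comment.
older = Story(url="/s/ap/ml_israel_palestinian")
store.add(older, "comment on the older story")

# The older story is moved to a dated URL; its comments follow it.
older.url = "/s/ap/20100506/ap_on_re_mi_ea/ml_israel_palestinians"

# A new story reuses the short URL but gets a fresh ID and an empty thread.
newer = Story(url="/s/ap/ml_israel_palestinian")
```

With this layout, reusing or renaming a URL cannot move comments between stories, because nothing about the comment lookup depends on the URL.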
[ link to this | view in chronology ]
Do you... uh... Yahoo?
It's bizarre that Yahoo would do this, but apparently, that's how Yahoo rolls.
Well, what did you expect? They're yahoos! ;-)
[ link to this | view in chronology ]
Doesn't Rule Out "Malice"
[ link to this | view in chronology ]
Much is bandied about concerning who is a "journalist" in this new age where the internet can so easily supplant hitherto conventional journalism. As I skimmed over the article I was struck by a level of detail and research that in my opinion supports the proposition that the concept of "journalism" should be viewed in a much broader sense.
Clearly, investigative journalism is being shown to transcend the "Woodward and Bernstein" model of old.
[ link to this | view in chronology ]
How AP works
Our newsroom constantly struggled with how to deal with this, however, because for every update that was 99% the same, there were others that were only 50% or 20% or even completely different. Suddenly, a story would go from being about a rocket attack to being about a diplomat visiting the area, and we'd lose the previous story. Photos and audio were slugged the same way, so we constantly had to try to keep up with every breaking story to make sure any side content was actually still relevant to the story. I cannot tell you how many times we'd have photos that were radically inappropriate for what the story evolved into.
I cannot speak to the competence of the programmers at Yahoo! (or at my old job), but I know it was endlessly frustrating for those of us who wanted to provide information that was relevant, up-to-date, and accurate.
[ link to this | view in chronology ]
Re: Re: How AP works
In the end, they laid off the entire newsroom and let everything automate, so it's clear where the company's priorities were.
[ link to this | view in chronology ]
CBC does this also
Here on Techdirt, the authors generally reply to comments that offer corrections (even on typos) and tag the articles as being updated, but that is fairly rare among mainstream news sites.
[ link to this | view in chronology ]
Re: Re: CBC does this also
Things like small font italics at the bottom don't count.
Besides your point about refreshing the page is stupid. You see, if the page weren't refreshed, then the user's comments noting an error would still appear correct, and nobody would say "read the article dummy".
[ link to this | view in chronology ]
It's nice to be optimistic
"I'm going to guess that this is more typical (embarrassing) incompetence on the part of Yahoo, rather than malice."
AP omits sign had Osama w/Bush on puppet strings
http://www.youtube.com/watch?v=8r823xx71uc
BBC does it too
BBC CENSORS Benazir Bhutto AFTER HER DEATH
http://www.youtube.com/watch?v=rctRdq4rB30
They even get to go back on TV and lie about what they said while reporting on a live event as it happened:
Traitor to the USA
http://www.youtube.com/watch?v=3zcczeuu7iA
Does not surprise me one bit.
[ link to this | view in chronology ]
great work
[ link to this | view in chronology ]
A corollary to Napolean's "Never attribute to malice that which can be explained by incompetence."
[ link to this | view in chronology ]
Re:
It's "Napoleon" for chrissakes. Putain de merde.
http://en.wikipedia.org/wiki/Napoleon_I
[ link to this | view in chronology ]
URLS not identical
http://news.yahoo.com/s/ap/ml_israel_palestinian
http://news.yahoo.com/s/ap/20100506/ap_on_re_mi_ea/ml_israel_palestinians
[ link to this | view in chronology ]
All you programmers out there
(It's rhetorical, no need to answer.)
No testing was done? No one checked? Incompetence.
[ link to this | view in chronology ]
yah, phony comments
[ link to this | view in chronology ]
Don't rule out a mix of motives.
[ link to this | view in chronology ]
Post updated with AP response
[ link to this | view in chronology ]
Re: Post updated with AP response
Then again, Yahoo might actually take responsibility for its missteps, kinda like Google, whereas the AP will never accept responsibility for what it does wrong. The mainstream media is never wrong on anything, they can't be.
[ link to this | view in chronology ]
Standard fare with digital papers
[ link to this | view in chronology ]
Because the AP fails to provide a unique identifier for an article that can be correlated to all revisions of that article, you get stupid implementations like the one Yahoo! was likely forced into using to prevent duplicate article posts from every spelling mistake, title write-through, etc. It's easy for the AP to claim Yahoo! failed here - but the reality is that the AP's feed format is for wire purposes and not for updating web pages.
The only recommendation to Yahoo! I might have is force a cut off of the slug as a unique ID by scoping updates to only 2 day windows. Consider it a new story past 2 days. Then point the finger back at the AP and say "fail".
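One way to read that suggestion is sketched below. It is a rough illustration, not anything Yahoo or the AP is known to run: the `SlugResolver` name is invented, and measuring the two days from the first time the current story was seen (rather than from its latest update) is an assumption, since the comment doesn't say which.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=2)  # the "2 day window" from the suggestion above


class SlugResolver:
    """Maps a feed slug to an internal story ID, scoped to a time window."""

    def __init__(self):
        self._current = {}  # slug -> (story_id, first_seen)
        self._next_id = 1

    def resolve(self, slug: str, received_at: datetime) -> int:
        entry = self._current.get(slug)
        if entry is not None and received_at - entry[1] <= WINDOW:
            # Same slug inside the window: treat it as an update to the story.
            return entry[0]
        # First sighting, or the window has lapsed: treat it as a new story.
        story_id = self._next_id
        self._next_id += 1
        self._current[slug] = (story_id, received_at)
        return story_id


resolver = SlugResolver()
first = resolver.resolve("ml_israel_palestinian", datetime(2010, 5, 6, 13, 9))
update = resolver.resolve("ml_israel_palestinian", datetime(2010, 5, 7, 9, 0))
later = resolver.resolve("ml_israel_palestinian", datetime(2010, 6, 3, 21, 19))
```

Here `first` and `update` resolve to the same story, while `later` gets a new ID, so comments keyed on that ID would no longer carry over from a weeks-old story that happened to share the slug.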
[ link to this | view in chronology ]
Yahoo's technical incompetence goes beyond this
You can look at the comment guidelines, including word limits, and make 100 comments in a day that fit within those guidelines, and maybe 50 of them will actually post, and 25 of them will actually be browsable later from your Yahoo profile. Moreover, some that post will disappear after a refresh, reappear, then disappear. Maybe this is some sort of "eventual consistency" caching architecture, but it just looks like crap.
Reddit has a much more usable commenting model as one example.
This and other issues in the past just convince me that Yahoo's developers are not the best devs in the world. Google regularly impresses me more just for their technical prowess.
[ link to this | view in chronology ]
I believe censorship is going on. Spread the word!
[ link to this | view in chronology ]
Yahoo!
Comments are available when I come through the backdoor using something other than IE, giving me at least one comment before I'm blocked again.
Is speech on the Internet entitled to as much protection as speech in more traditional media?
Yes, the U.S. Supreme Court ruled in Reno v. ACLU (1997) that speech on the Internet receives the highest level of First Amendment protection. The Supreme Court explained that “our cases provide no basis for qualifying the level of First Amendment scrutiny that should be applied to this medium.”
[ link to this | view in chronology ]
posting comments on yahoo on bottom of articles
[ link to this | view in chronology ]
Yahoo Comments
[ link to this | view in chronology ]
Was actually a scripting error
Yahoo has an anti-spam mechanism that will auto-block comments if X number are made in X time, if the same exact text is posted more than X times by the same person, or if most URLs are included.
The situation with comments appearing was a bug in the comment daemon. It was fixed.
If you don't want to see Yahoo News comments at all, and you have Adblock Plus for either Chrome or Firefox, make a filter for:
*news.yahoo.com*
This will NOT block the entire Yahoo News website as one would guess. As far as I can tell, it only blocks the comments. They'll appear as "Loading" and will never load. Everything else works fine.
[ link to this | view in chronology ]
Yahoo and amazon blocking my view
They both have stopped my comments on ongoing daily affairs!! They are very Democratic supporters, so it seems! And when a Republican makes a view they block it by pushing your statement into another area which you cannot reply to!! All you can do is start over again and the same damn thing happens!!! Help
[ link to this | view in chronology ]