Microsoft Highlights Why Google's 'Cheater' Accusations Ring Hollow

from the good-for-them dept

Fri, Feb 4th 2011 4:10am — Mike Masnick

We had a long discussion recently about Google's response to discovering that Microsoft used clickstream data from users to help improve the relevance of their own search. Microsoft's Yusuf Mehdi has now written up a much more detailed response from Microsoft's point of view, in which it again clarifies that contrary to Google's statements, Microsoft is not "copying" Google's search results, but merely using clickstream data as one of many (Microsoft says approximately 1,000) variables in improving search relevance. Microsoft does take one cheap shot: noting that, technically, the "honeypot" trick that Google used to uncover this certainly appears to be a form of "clickfraud." That is, it was a trick designed specifically to manipulate Bing's search results.

But the key point is made towards the end:

We have brought a number of things to market that we are very proud of -- our daily home page photos, infinite scroll in image search, great travel and shopping experiences, a new and more useful visual approach to search, and partnerships with key leaders like Facebook and Twitter. If you are keeping tabs, you will notice Google has "copied" a few of these. Whether they have done it well we leave to customers. But more importantly, we take no issue and are glad we could help move the industry to adopt some good ideas.

That's the point that I tried to make in the original post. History has shown that innovation occurs via competition, and part of that competition often involves competitors building on each other's work. A few months back, I wrote a review of the excellent book Copycats by Oded Shenkar, which makes this point very, very clear. Innovation happens when companies build on each other's work. But, what you learn is that it's not just about "copying," it's about all of the players learning, innovating and expanding the overall market. Just straight up copying rarely does enough to make a difference (in fact, we've discussed this problem in the form of cargo cult copying, where companies just copy some superficial aspect, and discover that it's meaningless). That's clearly not what Microsoft was doing here.

In the comments to our original post, someone argued, in defense of Google, that if what Microsoft did was okay, then he could just go out and say "I've got a billion dollar search engine idea!" and then copy Google's results. But, of course, if anyone actually thinks this through, they'd realize that copying Google's search results is not a billion dollar search idea. Assuming that, tomorrow, we launched a "new search engine" that gave the identical results to Google, almost no one would use it. Why would you? There's no real advantage to doing so. And for people who already use Google, it's probably much more integrated into their lives, with Gmail, Google Docs and more. The search results themselves are not the "billion dollar idea." It's the overall execution.

Hopefully Google learns from this and realizes that it has learned plenty from watching Microsoft as well, and complaining about Microsoft using clickstream data is a waste of time. Focus on continuing to innovate, Google, which'll probably mean learning more things from Microsoft, in addition to what you're doing yourself.

To be fair, Matt Cutts also has put together a decent response, where he points out that the real issue here may be disclosure -- in that Microsoft did not clearly disclose that it was using clickstream data (and especially how it was using that data). That's a perfectly reasonable point, but it was not the original point that Google raised. I agree that Microsoft could and should be much clearer in its disclosure -- but that's a totally separate issue. Cutts also explains why he thinks that Microsoft really is "copying," but again, even if we grant that premise (which I don't think is accurate), I still don't see why that matters. Copying and improving is a part of the innovative process. Google should embrace it.

Filed Under: bing, copying, innovation, search
Companies: google, microsoft

38 Comments

Reader Comments

Shawn (profile), 4 Feb 2011 @ 4:24am

Google would have been much better served to take a snarky approach. Off-the-cuff comments in a blog pointing out that Google's engine is so good that when other search engines can't find the results they learn from Google would have made the point without coming off as accusatory.

Google does not really do PR that well on a bunch of things like this.
[ link to this | view in chronology ]
Anonymous Coward, 4 Feb 2011 @ 4:29am

I'm glad Bing uses Google, since Google blocks some searches done by proxies I'm redirected immediately to Bing :)
[ link to this | view in chronology ]
abc gum, 4 Feb 2011 @ 4:31am

Microsoft crying foul - that's rich
[ link to this | view in chronology ]
- Marcus Carab (profile), 4 Feb 2011 @ 6:25am
  
  Re:
  Google cried foul - Microsoft just defended themselves. Quite eloquently I might add.
  [ link to this | view in chronology ]
DenisVi, 4 Feb 2011 @ 4:39am

This word you keep using, Google does not think it means what you think it means
Copying, as defined by Microsoft folks over and over again these past few days, is taking an innovation from company A and re-building it.

Copying, as defined by Google folks over and over again these past few days, is taking results of an innovation and literally copying them.

While one is common in the tech world, and Bing staff makes a valid point of creating multiple innovations that Google has "adopted", the second one is problematic. When Bing says that clickstream (with apparent tailoring for Google links) is only one of the signals, it basically means that when the other signals aren't returning normal data, Bing relies *only* on Google. Moreover, it sometimes exploits Google's proprietary autocorrect mechanisms to increase its relevance in that way. Meanwhile, Microsoft hasn't indicated that Bing is learning something from it, just parroting, and if Google were to disappear one day, a Bing that relies on Google so much would be effectively crippled.
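
The "only one of the signals" argument can be illustrated with a toy weighted score. This is a minimal sketch under stated assumptions, not Bing's actual ranking: the signal names, the weights, and the example URLs are all hypothetical, since Microsoft has only said it uses roughly 1,000 variables without publishing them. It shows that a low-weight clickstream signal barely matters for a common query, yet decides the order outright when no other signal is available.

```python
from typing import Dict, List

# Hypothetical signal weights (assumption: the real ones are not public).
WEIGHTS: Dict[str, float] = {
    "text_match": 0.5,
    "link_authority": 0.3,
    "clickstream": 0.2,
}


def score(signals: Dict[str, float]) -> float:
    """Weighted sum over whichever signals exist for a (query, page) pair."""
    return sum(WEIGHTS[name] * value for name, value in signals.items())


def rank(candidates: Dict[str, Dict[str, float]]) -> List[str]:
    """Return candidate URLs ordered from highest to lowest score."""
    return sorted(candidates, key=lambda url: score(candidates[url]), reverse=True)


# Common query: every signal has data, so clickstream is a minor input.
common_query = {
    "a.example": {"text_match": 0.9, "link_authority": 0.8, "clickstream": 0.1},
    "b.example": {"text_match": 0.4, "link_authority": 0.3, "clickstream": 0.9},
}

# Rare or nonsense query: clickstream is the only signal with any data.
rare_query = {
    "a.example": {"clickstream": 0.0},
    "b.example": {"clickstream": 1.0},
}

print(rank(common_query))
print(rank(rare_query))
```

In the first case the text and link signals outvote the clicks; in the second the clicks are all there is, which is the situation the honeypot queries were built to create.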
[ link to this | view in chronology ]
- vivaelamor (profile), 4 Feb 2011 @ 5:59am
  
  Re: This word you keep using, Google does not think it means what you think it means
  "Microsoft hasn't indicated that Bing is learning something from it, just parroting, and if Google was to disappear one day, Bing that relies on Google so much would be effectively crippled."
  
  Actually, they appear to record data from their Bing toolbar, which would still be in their system if Google were to disappear. Although their learning would stop if Google disappeared, what they have already learnt would still apply.
  [ link to this | view in chronology ]
  - cc (profile), 4 Feb 2011 @ 6:11am
    
    Re: Re: This word you keep using, Google does not think it means what you think it means
    Problem is, the internet is not a fixed document set. Yesterday's most relevant result could be today's least relevant result.
    
    Google has shown that they have the right heuristics to keep their mappings updated, but I'm not sure if Bing can work well enough without "borrowing" Google's...
    [ link to this | view in chronology ]
- cc (profile), 4 Feb 2011 @ 6:05am
  
  Re: This word you keep using, Google does not think it means what you think it means
  That's more or less what I was arguing in the comments of the previous article, even though my thoughts focused only on innovation in search quality and not on presentation.
  
  The result of my discussion with Marcus Carab can be summarised thus: It depends on how much Microsoft is (indirectly) using Google's results, and we won't know that until more data is made available.
  
  My hunch is, a lot. They must be getting massive amounts of Google data, seeing how many people use Google, and it's all in pure query->document format, no less. In my view, instead of coming up with a better way to analyse the data it already has, Bing is trying to replicate Google's existing semantic links* between terms, which is possibly the hardest thing to tweak when you're making advanced document retrieval systems.
  
  That they say they use "over 1000 variables" is irrelevant, because as any statistician will tell you it's not the number of variables that counts but their weighting. If Bing is aiming to "become Google" because that's the search engine people want, they'll use the query->document data they get from Google to directly reinforce the query->document mappings in their system, which makes the other sources mostly irrelevant...
  
  And that's why this is cheating, in my opinion. Perhaps that's not necessarily a "bad" thing and their technology will eventually and inevitably catch up, but it leaves a bad taste in my mouth all the same.
  
  * For instance, Google may have decided to use a thesaurus (or even automatically learned a thesaurus!) to create a link between the terms "cat" and "feline", so when a user searches for cats, they also get documents about felines. This is not an obvious link for a computer, but it very likely improves retrieval performance. If Bing didn't think to do the same, and they only start showing documents about felines because they saw Google do the same, then their technology is still inferior, so in my book this cannot possibly count as innovation or as science. They are giving the illusion that they are competing with Google, but they are simply giving a "counterfeited" version of their competitor's results that they couldn't recreate by their own means.
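
The "cat"/"feline" footnote describes what is usually called query expansion. A minimal sketch, assuming a tiny hand-written thesaurus and three made-up documents (neither reflects how Google or Bing actually store synonyms), shows how the expanded query retrieves a document the literal query misses:

```python
from typing import Dict, List, Set

# Hypothetical thesaurus and document set, for illustration only.
THESAURUS: Dict[str, Set[str]] = {"cat": {"feline"}, "feline": {"cat"}}

DOCS: Dict[str, str] = {
    "doc1": "my cat sleeps all day",
    "doc2": "feline nutrition guide",
    "doc3": "dog training basics",
}


def expand(terms: Set[str]) -> Set[str]:
    """Add every known synonym of each query term."""
    expanded = set(terms)
    for term in terms:
        expanded |= THESAURUS.get(term, set())
    return expanded


def search(query: str, use_thesaurus: bool) -> List[str]:
    """Return ids of documents sharing at least one term with the query."""
    terms = set(query.lower().split())
    if use_thesaurus:
        terms = expand(terms)
    return sorted(
        doc_id for doc_id, text in DOCS.items() if terms & set(text.lower().split())
    )


print(search("cat", use_thesaurus=False))
print(search("cat", use_thesaurus=True))
```

An engine that only observes the second result list from outside sees that "doc2" is relevant to "cat" without ever learning the synonym link that produced it, which is the distinction the footnote draws.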
  [ link to this | view in chronology ]
  - cc (profile), 4 Feb 2011 @ 6:30am
    
    Re: Re: This word you keep using, Google does not think it means what you think it means
    And yes, "counterfeited" is a loaded word, but I can't think of a word that applies to this situation.
    
    It's not copying in the traditional sense, it's not counterfeiting and it's definitely not stealing.. "Cheating" and "plagiarism" are the only words that I can think of that sound harmless enough to describe this, but even they are overkill.
    [ link to this | view in chronology ]
    - Avatar28 (profile), 4 Feb 2011 @ 5:17pm
      
      Re: Re: Re: This word you keep using, Google does not think it means what you think it means
      Imitation?
      
      In any case, I have to disagree. What you have described IS innovation. Take an idea that someone else had and improve on it. Based on your logic Google's image search is inferior to Bing's because MS had the idea for the infinitely scrolling search and then Google copied the idea.
      
      That's also not what I believe happened here. Rather, MS is looking at user behavior. User searches for a word or phrase in Google or any other search engine and then clicks on links A, B, and F (having decided that C, D, and E are just blog spam). When the search is done on Bing it takes into account that people were clicking on A, B, and F but only a few were clicking on C, D, and E and they didn't stay if they did. When it ranks the results C, D, and E are ranked lower as a result.
      
      Basically, it brings humans into the ranking process to provide more useful results. Digital computers are not nearly as good at recognizing patterns (and thus filtering out junk sites) as the human brain. In some ways, it is sort of like what Yahoo did in its early days. Also bear in mind that even after Google engineers fed Bing lots of fake data and fake clickthroughs on nonsense words, they still only managed to get Bing to show the site they wanted something like 6 times out of 100 attempts. In other words, using a bullshit scenario that would never happen in real life they were only able to trick Bing a whopping 6% of the time.
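
The A/B/F versus C/D/E scenario in this comment can be sketched as a re-ranking by observed click-through rate. This is an illustration under assumptions, not a description of Bing's pipeline: the function name `rerank`, the click counts, and the use of raw click-through rate alone (ignoring the dwell time the comment also mentions) are all hypothetical.

```python
from typing import Dict, List


def rerank(results: List[str], clicks: Dict[str, int], impressions: int) -> List[str]:
    """Order results by click-through rate; ties keep their original order."""

    def ctr(url: str) -> float:
        return clicks.get(url, 0) / impressions if impressions else 0.0

    # sorted() is stable, so equally clicked results stay in their given order.
    return sorted(results, key=ctr, reverse=True)


# Made-up numbers: out of 100 observed searches, users mostly chose A, B and F.
original_order = ["A", "B", "C", "D", "E", "F"]
observed_clicks = {"A": 40, "B": 35, "C": 2, "D": 1, "E": 1, "F": 30}

print(rerank(original_order, observed_clicks, impressions=100))
```

With these numbers F moves ahead of C, D and E, which matches the outcome the comment describes: the rarely clicked results sink without the engine needing to know why users skipped them.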
      [ link to this | view in chronology ]
      - cc (profile), 4 Feb 2011 @ 5:50pm
        
        Re: Re: Re: Re: This word you keep using, Google does not think it means what you think it means
        You missed my point, I think. Perhaps you want to read the conversation between me and Marcus Carab in the previous related article.
        
        "User searches for a word or phrase in Google or any other search engine and then clicks on links A, B, and F (having decided that C, D, and E are just blog spam)."
        
        But what miraculous process put relevant results in positions A, B and F? Google's algorithm, we can presume. If another search engine copies the results of the algorithm, it means they can fake improved search performance but don't know how it was actually done. They improve their search, but contribute nothing to the users or to search engine technology -- not innovation, in my opinion.
        
        "using a bullshit scenario that would never happen in real life they were only able to trick Bing a whopping 6% of the time."
        
        Which means Bing couldn't absorb all the data the 20 engineers were feeding it and nothing else. As to why, it's anybody's guess. My guesses are, it's either to keep the sparse document vectors smaller (by ignoring rarer terms) and thus cut costs, or maybe they were clever enough to have a safeguard so spammers can't exploit their exploit and Google-bomb them (literally) with fake/dangerous websites for common terms.
        [ link to this | view in chronology ]
Anonymous Coward, 4 Feb 2011 @ 4:51am

And let the IP nuclear warfare begin. How much do you bet Google is gonna go sue-happy?
[ link to this | view in chronology ]
- Nick Coghlan (profile), 4 Feb 2011 @ 5:05am
  
  Re:
  Why would they? They've already made the PR point that Google is good enough for Bing to use as a fallback when Bing's own results aren't getting anything.
  [ link to this | view in chronology ]
Ed, 4 Feb 2011 @ 5:17am

Bing (Powered by Google)
I haven't seen Google make any claims that what Bing has done is illegal or even immoral. All I have seen is Google pointing out that Bing is using Google. I don't see how Google pointing this out is wrong.

Much like a songwriter, after hearing his song played by a musician coming out and saying "I wrote that." Not trying to stop the musician, just interested in claiming the credit.

I haven't seen Google begin formal action, legal or otherwise. They only seem interested in pointing out the behavior to the press and embarrassing a competitor.
[ link to this | view in chronology ]
Meoip, 4 Feb 2011 @ 5:22am

Say what you want
Say what you want, but Google bested Microsoft on this. Microsoft looks foolish and underhanded, a reputation they are trying so hard to shed (and are having some success at). I mean, that whole click fraud thing? Seriously, Google was searching on Google, not Bing. I'm not sure I can defraud myself.
[ link to this | view in chronology ]
- Anonymous Coward, 4 Feb 2011 @ 9:14am
  
  Re: Say what you want
  Microsoft's Bing toolbar customers send them the search terms they use and the resulting links they click on. In this case their customers were Google engineers that submitted their results to Microsoft to help them make search better.
  
  Google's engineers shouldn't be submitting their clickstream data to Microsoft if they don't want them to use it to better their search results.
  
  In this case Google engineers intentionally manipulated their search results (they said this is impossible during congressional hearings - obviously incorrect) and then intentionally and in an organized manner attempted to use clickstream data to influence Bing search results. That is one form of click fraud.
  
  I get it that you like Google and don't like Microsoft. However, these types of arguments don't make any actual sense.
  [ link to this | view in chronology ]
Zimzat (profile), 4 Feb 2011 @ 5:37am

Two headed hydra that can't agree
I feel that there is a two-headed hydra at Techdirt and neither head agrees with the other.

One head posts about how businesses need to learn to innovate, to compete, and accept that marketplace instead of falling back to legal protections.

The other says that when there is something wrong with a copy we should leave it up to social shunning to make it right.

And yet, despite the fact that Google has not, so far at least, fell back to legal protections, and is actually trying to leave it to social shunning, TechDirt posts are now trying to socially shun Google when they're the ones that were copied.

Make up your mind.
[ link to this | view in chronology ]
- vivaelamor (profile), 4 Feb 2011 @ 6:10am
  
  Re: Two headed hydra that can't agree
  "And yet, despite the fact that Google has not, so far at least, fell back to legal protections, and is actually trying to leave it to social shunning, TechDirt posts are now trying to socially shun Google when they're the ones that were copied."
  
  Calling something cheating isn't merely socially shunning, it's implying that they broke the rules. The fact that they haven't sued doesn't mean that what they've said isn't liable to backfire. It would appear that you would rather Techdirt supported Google making inaccurate statements than point out the truth.
  
  That said, Microsoft may have fucked up by how they gathered the data. However, in that case they didn't wrong Google or Google's customers, only their own customers.
  [ link to this | view in chronology ]
  - Zimzat (profile), 4 Feb 2011 @ 7:18am
    
    Re: Re: Two headed hydra that can't agree
    I don't want them to support Google (this doesn't have to be a one side or the other argument), but simply stop bashing Google as the main argument of their reporting.
    
    It's their opinion and they're allowed to it, though. This is my comment and I'm voicing my opinion. *shrug*
    [ link to this | view in chronology ]
    - Marcus Carab (profile), 4 Feb 2011 @ 8:48am
      
      Re: Re: Re: Two headed hydra that can't agree
      It's not about "bashing" Google, it's about pointing out a genuine disagreement with their reasons for bashing Microsoft.
      [ link to this | view in chronology ]
    - crade (profile), 4 Feb 2011 @ 10:14am
      
      Re: Re: Re: Two headed hydra that can't agree
      Ahh, but the point is that society gets to decide who is in the right and then support their choice. I don't know why you would assume everyone must always think the complainant is in the right. If the copying is fair, maybe the complainer warrants some social shunning for trying to weasel out of fair competition.
      [ link to this | view in chronology ]
    - David Liu (profile), 4 Feb 2011 @ 3:48pm
      
      Re: Re: Re: Two headed hydra that can't agree
      Yeah, I don't quite get it either.
      
      It's not like Google really wasted a lot of time, money, and effort to catch Microsoft in the act, and once it did, it tossed up a blog post about it. Honestly to me, it seems like Google's doing exactly what Techdirt says it should, by socially shunning Microsoft for "cheating". Maybe it should've done it a little more snarky to come out quite a bit more ahead, but still, from what I've read, it's good enough.
      
      I don't quite get why Mike says "Google complaining about Microsoft using clickstream data is a waste of time". It isn't. It puts Google in the better light socially, exactly what Mike has set forth in the past. It's been really hard to read these Google vs Bing articles in the past couple days, since it's a glaring hypocrisy in every one of the articles.
      [ link to this | view in chronology ]
      - vivaelamor (profile), 5 Feb 2011 @ 9:27am
        
        Re: Re: Re: Re: Two headed hydra that can't agree
        "Honestly to me, it seems like Google's doing exactly what Techdirt says it should, by socially shunning Microsoft for "cheating". Maybe it should've done it a little more snarky to come out quite a bit more ahead, but still, from what I've read, it's good enough."
        
        The point is that Google are implying that Microsoft wronged them by cheating, which they technically don't appear to have and thus stand to generate more bad publicity for crying wolf than if they'd just left out the accusation of cheating. Flaming Mike for saying so is OK. Flaming Mike for saying so and suggesting that he is somehow going against his opinions on shaming actual wrongdoings when he doesn't think this is an actual wrongdoing is not OK.
        [ link to this | view in chronology ]
aikiwolfie (profile), 4 Feb 2011 @ 5:41am

It's just an excuse to kick Microsoft when it's down!
This whole "Microsoft copied our search results" attack from Google is an excuse to kick Microsoft when it's down. I'd agree Google's time would probably be better spent doing more creative things.

It makes sense Google would try to create bad news stories for Microsoft. This is a small piece of propaganda in a much bigger spat.

Microsoft is involved in several lawsuits against Android, which Google can't be happy about. And at the moment Microsoft looks particularly stale and weak.

Microsoft continues losing money in the search space, it continues to cut projects and product lines, it continues to lay off staffers and it's also still losing top managers. And for the first time in a long time Windows desktop market share is threatening to drop below 90%.

And the recent financial results from Microsoft didn't look good. Even after they tried to explain them away with their own special brand of accounting. The share price for Microsoft stock still fell.

Microsoft are trying to hurt Google at the moment. And Google smells blood. But not its own.
[ link to this | view in chronology ]
Ted Burner, 4 Feb 2011 @ 5:44am

Google and Microsoft Will Settle
I won't worry about these titans. Google and Microsoft are very powerful and they are fighting for territory. I just hope that they make the world wide web a better place.
[ link to this | view in chronology ]
Overcast (profile), 4 Feb 2011 @ 6:27am

Innovation happens when companies build on each other's work.

It has to - it's not like Google wrote the software their services *need* to operate. Microsoft didn't design the CPU that's needed for their OS to operate..

If all innovation in an area was left to a single lateral patent/copyright - we'd still be riding horses if we couldn't afford the buggy from the single producer.

Of course, Microsoft has a long history of just hi-jacking other people's innovations and then boxing them with other software in a vain attempt to make it look like it's 'original'.
[ link to this | view in chronology ]
Anonymous Coward, 4 Feb 2011 @ 7:59am

Hitting Microsoft because it isn't transparent about its search is kind of a joke, don't they all hide how they do this?

I am also sure Microsoft would be perfectly happy if Google just went away, even if it hurt Bing's abilities.

I don't have a problem with Microsoft using Google info to improve their search, and I don't have a problem with companies looking at products on the market and improving them. What I do believe is wrong is flat out copying content. That is what most musicians and artists have a problem with. It's not taking something they have done and redoing it, it is taking a song and, just because it's digital, thinking there is a right to distribute it.
[ link to this | view in chronology ]
Xander C (profile), 4 Feb 2011 @ 8:00am

Missing Backstory
Mike, for your consideration:
http://www.npr.org/2011/02/02/133443201/Google-Bing-Tussle-Over-Search

The "Search Rip-off" came about as Google's lead engineers started noticing that Bing's searches on misspelled words were getting identical fixes and results.

"LAURA SYDELL: When you type a search request into Google, say, Hosni Mubarak, and you're a couple of letters off, Google can usually figure out what you mean.

Mr. AMIT SINGHAL (Software Engineer, Google): And getting these queries right is an incredibly hard task. It's a very challenging algorithm.

SYDELL: That's Amit Singhal. He's the lead of the search team at Google. A few months back, they noticed something strange. A user searched for tarsorrhaphy.

Mr. SINGHAL: It was this real medical procedure that some users generally needed to know about.

SYDELL: The user misspelled it. But Google's algorithms figured out what he needed. Singhal noticed that competitor Bing didn't bring up any results until a few weeks later.

Mr. SINGHAL: Bing started showing the topmost relevant result for that spelling correction to their users.

SYDELL: Hmm.

Mr. SINGHAL: Now, we got suspicious. However, we said, maybe they came up with some clever algorithm and they did it.

SYDELL: But Singhal and his team decided to do a little experiment. They began to do searches for silly made-up words, and they created fake results unrelated to those words. A few weeks later...

Mr. SINGHAL: Microsoft's Bing started showing the same artificial result for the same synthetic query. And this was just conclusive to us at that point."

While Bing has offered great things to searching, there was clearly a copy of services that could not be explained by just creating their own proper code. As noted, Bing was "learning" from people using Google through IE 7/8, sending over data as to what was being searched and what Google returned for those queries. That's a level of shady we've come to expect from MS and needs to be called out.
[ link to this | view in chronology ]
- Mike Masnick (profile), 4 Feb 2011 @ 11:09am
  
  Re: Missing Backstory
  Mike, for your consideration:
  http://www.npr.org/2011/02/02/133443201/Google-Bing-Tussle-Over-Search
  
  All of that was explained in the original story. Not sure what's new in there? Still not sure why there's a problem there.
  [ link to this | view in chronology ]
  - Anonymous Coward, 4 Feb 2011 @ 11:54am
    
    Re: Re: Missing Backstory
    Mike, what will you consider to be wrong? Please give an example?
    
    Say, if Bing returns the same Google page but with Bing logo, and Bing ads, etc. Still ok? Can we call this Bing-google-it-for-you innovation?
    [ link to this | view in chronology ]
    - Mike Masnick (profile), 4 Feb 2011 @ 10:44pm
      
      Re: Re: Re: Missing Backstory
      Mike, what will you consider to be wrong? Please give an example?
      
      I would consider something to be wrong if I said something that was factually incorrect. I haven't seen that in this story yet. There was nothing in that backstory that said anything I had said originally was wrong.
      
      Say, if Bing returns the same Google page but with Bing logo, and Bing ads, etc. Still ok? Can we call this Bing-google-it-for-you innovation?
      
      What do you mean by "ok"?
      [ link to this | view in chronology ]
LT BALL, 4 Feb 2011 @ 8:04am

Beware.. who you fear
Russians were convinced the Czar was bad for them. They got the murdering regimes of Lenin and Stalin.
Microsoft is big and pushy... Google is worse: they are arrogant and snoopy, with no interest in privacy rights etc. Google complaining that someone else is looking at their public data is rich after they have been caught looking at data on wifi networks.
[ link to this | view in chronology ]
- crade (profile), 4 Feb 2011 @ 11:03am
  
  Re: Beware.. who you fear
  the Czar was bad for them.
  [ link to this | view in chronology ]
Aaron Ortiz, 4 Feb 2011 @ 8:29am

Marketing ploy
I think Google simply wanted everyone to know that Bing was copying, to discourage people from using it, and prevent the sort of wild claims Microsoft is fond of making.
[ link to this | view in chronology ]
aikiwolfie (profile), 4 Feb 2011 @ 8:53am

Are other search engines affected?
Something that seems to have been overlooked is the possibility that Microsoft may be doing this to other search engines, and that other search engines using installable toolbars might be doing the same thing to their competitors as well.
[ link to this | view in chronology ]
crade (profile), 4 Feb 2011 @ 10:03am

This would be more interesting if I believed bing was ever going to be a real competitor.
[ link to this | view in chronology ]
Anonymous Coward, 4 Feb 2011 @ 11:13pm

I generally agree with 99.9% of what Mike says, but this is one of the exceptions.

Taking search result X from Google that happens in a Microsoft browser as a result of query Y, and then replicating it with their own search page certainly meets the definition of "copying", and I personally really struggle to see it as "innovation".
[ link to this | view in chronology ]
Hardik Upadhyay (profile), 15 Feb 2011 @ 2:08am

Multiple connections linking to ONE base line
There are multiple references of copying by both the parties. But the baseline is that they both want more of the market share in search engine pie.

Now once they are on each other, they will surely try to co-innovate by this kind of search.

In the end, WE are going to get the better product.
[ link to this | view in chronology ]