stories filed under: "data collection"

Why Google's Street View WiFi Data Collection Was Almost Certainly An Accident

from the technical-details dept

Wed, Jun 23rd 2010 10:30am — Mike Masnick

We've been among those who have believed that Google's collection of WiFi data via its Street View cars was likely an accident -- but some have argued that it is impossible to do such a thing by accident. In fact, in the various lawsuits and legal maneuverings around this mess, many people keep claiming that there's simply no way Google was accidentally collecting this data -- although we've yet to hear a single person explain what Google would possibly want with the data, or to see a single shred of evidence that anything was ever done with the data. However, for those who insist it is impossible for this to have happened by accident, Slashdot points us to a detailed technical analysis of why it almost certainly was an accident, despite all the claims to the contrary.

It explains, in great detail, how and why the collection of data packets would occur, mainly to help triangulate where the WiFi network was located -- something that Google has always admitted to doing. The problem was that some of the junk data (a very tiny amount, again, as explained in the article) got caught and retained, when it should have been dumped:

Although some people are suspicious of their explanation, Google is almost certainly telling the truth when it claims it was an accident. The technology for WiFi scanning means it's easy to inadvertently capture too much information, and be unaware of it.

It then goes on to show how all of this works, using a specific example from within a Panera Bread restaurant that has open WiFi, which the author uses to demonstrate just how easy it is to capture stray data, why it would make sense and also just how useless most of that data really would be. It's pretty convincing, but I doubt it will satisfy the conspiracy theorists who are just absolutely positive Google had something nefarious planned.

The key issue, as has been pointed out repeatedly, is that most people arguing nefarious intent don't seem to understand what Google was actually doing. It was trying to map the location of WiFi base-stations, a perfectly legal activity that a small group of companies have been doing for years. But in order to best figure out the location of the networks, it's helpful to have as much data as possible that is traversing the access point. The system doesn't care or need to know what that data is; it just wants as much data as possible for the purpose of triangulating. The problem was that Google's system "kept" the data that it got, even though there's been no evidence presented that the data was ever used for anything (a key point that those screaming "criminal intent" repeatedly gloss over). On top of that, no one even explains why Google would want such data. The little snippets would be so random it's difficult to come up with any reason why keeping such data would be useful.

Triangulation is a lot harder than you'd think. This is because many things will block or reflect the signal. Therefore, as the car drives by, it wants to get every single packet transmitted by the access-point in order to figure out its location. Curiously, with all that data, Google can probably also figure out the structure of the building, by finding things like support columns that obstruct the signal.
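To make the "more packets means a better fix" point concrete, here is a minimal sketch of one simple way to estimate an access point's position from signal-strength readings taken along a drive: a power-weighted centroid. This is an illustration only, not Google's actual method, which the source does not describe. The `Sample` type, the `estimate_ap_position` function, and the coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One observed packet: where the car was and how strong the signal was."""
    lat: float
    lon: float
    rssi_dbm: float  # received signal strength; closer to 0 means stronger


def estimate_ap_position(samples):
    """Estimate access-point position as a power-weighted centroid.

    Assumption for this sketch: a stronger reading means the car was closer,
    so that reading pulls the estimate toward its own position. Obstructions
    and reflections break this assumption in practice, which is why a real
    system wants as many samples as it can get.
    """
    if not samples:
        raise ValueError("need at least one sample")
    # Convert dBm to linear power (milliwatts) so every weight is positive.
    weights = [10 ** (s.rssi_dbm / 10) for s in samples]
    total = sum(weights)
    lat = sum(w * s.lat for w, s in zip(weights, samples)) / total
    lon = sum(w * s.lon for w, s in zip(weights, samples)) / total
    return lat, lon


# Hypothetical drive-by: three readings along a street, strongest in the middle.
drive_by = [
    Sample(lat=40.0000, lon=-75.0010, rssi_dbm=-80.0),
    Sample(lat=40.0000, lon=-75.0000, rssi_dbm=-50.0),
    Sample(lat=40.0000, lon=-74.9990, rssi_dbm=-80.0),
]
print(estimate_ap_position(drive_by))
```

With only a handful of samples, one reflected or blocked reading can skew the estimate badly; averaging over every packet heard smooths that out, which fits the article's explanation of why the collector listens to everything.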

What's important about this packet is that Google only cares about the MAC addresses found in the header, and the signal strength, but doesn't care about the payload. If you look further down in the payload [in the example data from an open WiFi network in Panera], you'll notice that it's inadvertently captured a URL.

Take a look again. Even though the access-point MAC address is highlighted, there's extra data in the packet. These extra data will include URLs, fragments of data returned from websites (like images), the occasional password, cookies, fragments of e-mails, and so on. However, the quantity of this information will be low compared to the total number of packets sniffed by Google.

That's the core of this problem. Google sniffed packets, only caring about MAC addresses and SSIDs, but when somebody did an audit, they found that the captured packets occasionally contained more data, such as URLs and e-mail fragments.
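The difference between "read the header" and "keep the packet" can be sketched in a few lines. The example below builds a synthetic, simplified 802.11 data frame (a 24-byte MAC header followed directly by some HTTP text; a real frame would carry LLC, IP and TCP headers in between, and QoS frames have a slightly longer header) and shows two ways of logging it. The function names and the MAC addresses are hypothetical, and the signal strength is passed in separately because it comes from the capture hardware rather than from the frame itself.

```python
import struct

HEADER_LEN = 24  # frame control, duration, three addresses, sequence control


def format_mac(raw):
    """Render six raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_bssid(frame):
    """Return the access point's MAC (BSSID) from a basic 802.11 header."""
    if len(frame) < HEADER_LEN:
        raise ValueError("frame shorter than a basic 802.11 header")
    frame_control, _duration, addr1, addr2, addr3, _seq = struct.unpack(
        "<HH6s6s6sH", frame[:HEADER_LEN]
    )
    to_ds = bool(frame_control & 0x0100)
    from_ds = bool(frame_control & 0x0200)
    # Which address field holds the BSSID depends on the To DS / From DS bits.
    if not to_ds and not from_ds:
        return format_mac(addr3)
    if to_ds and not from_ds:
        return format_mac(addr1)
    if from_ds and not to_ds:
        return format_mac(addr2)
    raise ValueError("four-address (WDS) frames are not handled in this sketch")


def log_for_mapping(frame, rssi_dbm):
    """What the mapping task needs: the access point's MAC and signal strength."""
    return {"bssid": parse_bssid(frame), "rssi_dbm": rssi_dbm}


def log_whole_frame(frame, rssi_dbm):
    """The 'accidental' variant: same fields, but the raw frame is kept too."""
    record = log_for_mapping(frame, rssi_dbm)
    record["raw"] = frame  # the payload rides along unnoticed
    return record


# Synthetic frame: data type, From DS set (access point -> client).
header = (
    bytes.fromhex("0802")            # frame control
    + bytes.fromhex("0000")          # duration
    + bytes.fromhex("0a1b2c3d4e5f")  # addr1: client (destination)
    + bytes.fromhex("001122334455")  # addr2: BSSID (the access point)
    + bytes.fromhex("66778899aabb")  # addr3: original source
    + bytes.fromhex("0000")          # sequence control
)
payload = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n"
frame = header + payload

print(log_for_mapping(frame, -62))
print(b"example.com" in log_whole_frame(frame, -62)["raw"])
```

Both loggers produce the same MAC address and signal strength, so nothing downstream would look wrong; the only difference is whether the bytes after the header are written to disk, which is the kind of thing that only shows up in an audit.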

I agree with the conclusion to the post. The fact that this was pretty clearly an accident still doesn't make it a good thing. Google clearly should have realized this much earlier and never allowed such data to be captured. But those running around screaming about how this was all pre-meditated by Google are going to have to offer up a lot more evidence.

Filed Under: data collection, triangulation, wifi

123 Comments

As Google Hands Over Collected WiFi Data In Germany, France And Spain, Ireland Tells Google To Destroy It

Privacy

from the mixed-messages dept

Fri, Jun 4th 2010 4:56pm — Mike Masnick

We recently covered how the data that Google collected via its Street View WiFi efforts was caught in this weird legal limbo, between privacy laws, data retention laws and rules about destroying evidence. However, it looks like that's getting settled... but in very different ways in different countries. Somehow, Google has worked out a way to hand over the data in Germany, France and Spain... but over in Ireland, Google has been ordered to destroy the data. Not quite sure how this squares with the privacy laws in Germany, France and Spain... but hopefully we'll find out that the data collected by this system was mostly meaningless and we can get over the hype surrounding this whole thing.

Filed Under: data collection, data retention, europe, france, germany, ireland, privacy, spain, wifi
Companies: google

9 Comments

Senate Guts Broadband Data Bill

Politics

from the not-so-special-after-all dept

Fri, Oct 10th 2008 1:53pm — Mike Masnick

You may have heard recently about the new Broadband Data Improvement Act, passing through Congress, as it basically put into law what the FCC had already decided: the cutoff for what should be considered broadband needed to be raised, and the data collection methods for broadband penetration needed to be updated, from the clearly bogus methodology it currently uses.

Sounds good, right?

Except, as Broadband Reports lets us know, in moving from the House to the Senate, some Senators took the opportunity to gut the bill of most of its important parts. That is, it took away all funding for the FCC to actually measure broadband penetration in the US and took away the mandate to create a broadband penetration mapping solution. In other words, the Broadband Data Improvement Act has removed the ability for the FCC to improve broadband data.

Now, you could argue that the FCC shouldn't be wasting money on measuring this sort of stuff, but if you happen to believe that broadband is critical infrastructure these days, and an enabler of many other industries that drive economic growth, you can make a reasonable argument for why the government should have accurate data on broadband penetration, to make sure that we're not falling too far behind other countries.

Filed Under: broadband, broadband penetration, data collection, fcc
Companies: fcc

5 Comments

