DailyDirt: Big Data Isn't Necessarily Better
from the urls-we-dig-up dept
The old GIGO principle -- Garbage In, Garbage Out -- originated during the early days of computing, but it may be even more applicable today. With the explosion of data that can be collected, there's a temptation to assume that analyses and meta-analyses can make sense of all that data and produce incredible insights. However, we should probably have some skepticism before we jump into the deep end of data and expect miraculous results.
- Microsoft researchers report that they think they can diagnose internet users with pancreatic cancer -- just by analyzing a large number of search requests. It's not clear what can be done with this research since the data was anonymized (and therefore no one can be contacted), and if users know their searches are being monitored for serious health issues -- will they continue to search using search engines that might creepily diagnose them? [url]
- Google Flu Trends attempted to predict flu seasons based on people's searches, but it didn't end up doing such a great job. Future versions of this project could be better, but after its initial success at predicting flu trends, the project suffered acutely from "big data hubris" -- a treatable (hopefully) ailment. [url]
- A widely told parable about the pitfalls of neural nets tells how the US Army once tried to train software to detect camouflaged tanks in various images. The complex software didn't learn how to detect tanks at all, but instead focused on the clouds, which the algorithms determined correlated well with tanks. Oops. Who wants to trust AI to make life-or-death decisions if we can't understand what the machines are thinking? [url]
Filed Under: ai, artificial intelligence, big data, big data hubris, cancer, ed fredkin, flu trends, gigo
Companies: google, microsoft
Reader Comments
Big Data
Now this is all fine and well, but there is another terrible, very terrible side effect to this that a lot of people do not consider. Just like in science, a researcher will actually create a setup where the data points in the direction they want to be true to begin with. You just cannot breed unconscious bias out of a manager any more than you can a baby.
If Big Data does not show management what they want to hear, then it's back to the drawing board, or worse, Big Data is only used to research topics they care about while ignoring subjects they "feel" have already been addressed, because they are afraid that big data might bite them in the ass and reveal just how terrible they are at making decisions... well, just like every other fucking person on the planet, it's just that their egos will not allow much of it.
AI for Driverless Cars
I know of at least 10 different areas within 25 miles of my house where the GPS maps are HIGHLY incorrect, and are DETERMINED to have you 'turn here' (right into the lake, the river, or the condemned 5-story building).
Yeah, who 'ya gonna sue, when your driverless car drowns your family in the local lake? I'll bet there's an app for that, too (or a detractor that will be written into law, saying you can't sue the map company, the car company, or the electronics company for their untimely death).
Re: pancreatic cancer
"the researchers declined to offer specific details"
Of course. Why save lives when you can keep a proprietary algorithm proprietary?