DailyDirt: Big Data Isn't Necessarily Better
from the urls-we-dig-up dept
The old GIGO principle -- Garbage In, Garbage Out -- originated during the early days of computing, but it may be even more applicable today. With the explosion of data that can be collected, there's a temptation to assume that analyses and meta-analyses can make sense of all that data and produce incredible insights. However, we should probably have some skepticism before we jump into the deep end of data and expect miraculous results.
- Microsoft researchers report that they think they can identify internet users who have pancreatic cancer -- just by analyzing a large number of search requests. It's not clear what can be done with this research since the data was anonymized (and therefore no one can be contacted), and if users know their searches are being monitored for serious health issues -- will they continue to search using search engines that might creepily diagnose them? [url]
- Google Flu Trends attempted to predict flu seasons based on people's searches, but it didn't end up doing such a great job. Future versions of this project could be better, but after its initial success at predicting flu trends, the project suffered acutely from "big data hubris" -- a treatable (hopefully) ailment. [url]
- A widely told parable about the pitfalls of neural nets tells how the US Army once tried to train software to detect camouflaged tanks in various images. The complex software didn't learn how to detect tanks at all, but instead focused on the clouds, which the algorithms had determined correlated well with tanks. Oops. Who wants to trust AI to make life-or-death decisions if we can't understand what the machines are thinking? [url]
Filed Under: ai, artificial intelligence, big data, big data hubris, cancer, ed fredkin, flu trends, gigo
Companies: google, microsoft