Reporters Find Exposed Personal Data Via Google, Threatened With CFAA Charges
from the sounds-familiar dept
In a story that sounds mighty similar to the Andrew "weev" Auernheimer situation, two reporters from the Scripps News service have been told that they may be hit with Computer Fraud and Abuse Act (CFAA) charges after a Google search they did turned up personal data on 170,000 customers that two telcos left exposed. At issue are low-income customers of YourTel and TerraCom, which provide service for the FCC's Lifeline, a phone service for people who are enrolled in state or federal assistance programs. Apparently, the real issue was a company called Vcare, to which the two telcos outsourced certain services. The Scripps reporters noted that they did nothing more than a Google search:
The unprotected TerraCom and YourTel records came to light through the simplest of tools: a reporter’s Google search of TerraCom.
The records include 44,000 application or certification forms and 127,000 supporting documents or “proof” files, such as scans or photos of food-stamp cards, driver’s licenses, tax records, U.S. and foreign passports, pay stubs and parole letters. Taken together, the records expose residents of at least 26 states.
The application records, drawn from 18 of those states and generally dated from last September through November, list potential customers’ names, signatures, birth dates, home addresses and partial or full Social Security numbers. The proof files, from last September through April, include residents of at least eight remaining states.
Of course, rather than be thankful to the reporters for letting them know about a huge security lapse, or be apologetic for revealing all sorts of key data on their customers, they decided to sue.
However, Vcare and the two telecom companies assert that the reporters "hacked" their way into the data using "automated" methods to access the data.
And what was this malicious hacking tool that penetrated the security of Vcare's servers? In a letter sent to Scripps News by Jonathan D. Lee, counsel for both of the cell carriers, Lee said that Vcare's research had shown that the reporters were "using the 'Wget' program to search for and download the Companies' confidential data." GNU Wget is a free and open source tool used for batch downloads over HTTP and FTP. Lee claimed Vcare's investigation found the files were bulk-downloaded via two Scripps IP addresses.
I'm not sure how anyone could claim that the mere use of Wget constitutes a form of hacking, even under the extremely loose interpretations of the CFAA. However, as mentioned, the story does have similarities to the weev case -- except this time we're talking about reporters for a well-known news service, rather than someone with a reputation as an internet troll. Hopefully, if the telcos do decide to actually file a lawsuit, it gets laughed out of court.
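For readers wondering what a Wget download amounts to at the protocol level, here is a minimal Python sketch. It assumes a made-up URL on example.com and illustrative User-Agent strings, builds two request objects without sending anything over the network, and prints the request line each one would produce. It is not a reconstruction of what the reporters or Vcare actually ran.

```python
from urllib.request import Request

# Hypothetical URL, for illustration only. Nothing is sent over the network.
URL = "https://example.com/files/report.pdf"


def describe(request: Request) -> str:
    """Return the HTTP method and path this request would ask the server for."""
    return f"{request.get_method()} {request.selector}"


# The same document requested by a browser-like client and a Wget-like client.
# Both User-Agent values are illustrative assumptions.
browser_style = Request(URL, headers={"User-Agent": "Mozilla/5.0"})
wget_style = Request(URL, headers={"User-Agent": "Wget/1.14"})

print(describe(browser_style))
print(describe(wget_style))

# The only difference in this sketch is the self-reported client name.
print(browser_style.get_header("User-agent"))
print(wget_style.get_header("User-agent"))
```

In this sketch both clients ask for the same path with the same method, and the server sees a different client name and nothing more.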
Filed Under: cfaa, exposed data, hacking, security vulnerability
Companies: scripps, terracom, vcare, yourtel
Reader Comments
Re: Re: Computer Fraud and Abuse
Vcare seems to be the one crying foul, not Google.
Re: Re: Re: Computer Fraud and Abuse
"While the reporters claim to have discovered the data with a simple Google search, the firms' lawyer claims they used "automated" means..."
Referencing the company Google and then the telco firm without the name can be confusing. High school level reading skills are required to properly understand that without having to read it two or three times.
Pure Half-Assed CYA ...
Because if the telcos did not claim the reporters hacked the information, then they are tacitly admitting they posted the personal info of 100+k people openly online. And that's a pretty big oops.
Re: Pure Half-Assed CYA ...
"Typical usage of GNU Wget consists of invoking it from the command line"
We all know that anyone using the command line instead of a GUI is a dirty hacker.
/s
Isn't there something missing?
For right now, I'm going to hold off judgement on the possible actions against the reporters, but shouldn't there be an extra line to this article? Something along the lines of "TerraCom and YourTel are under investigation for gross negligence."
So here's the situation as I (and likely everyone else, except the asshats who are trying to cover up their gross negligence and extreme incompetence) see it:
1. Company collects data
2. Company is required NOT to hold on to data
3. Company holds on to data in violation of #2
4. Company further shows how retarded they are by making it publicly accessible
5. Someone finds publicly accessible documents (that shouldn't exist in the first place) and lets the retards know
6. Retards sue, crying "Hacking! Hacking!"
What an amazing strategy!
Anyone else just wondering how long it's going to be before someone figures this out for what it really is?
Just this morning, in a cursory review of a prospective audit client's online presence, I did a Google search and discovered over 1,000 PDFs of customer invoices. They had blocked them via a robots.txt file, but since Google now lists the URLs of robots-blocked files and slaps a "description not available due to robots instruction" note on them, that shit is wide open to anyone on the web, no hacking needed.
Companies need to be held accountable for their massive security failings and Google needs to be held accountable as well, even though that shit should have been completely blocked and behind a secure firewall.
The fact that this situation involved a couple reporters gives me little comfort in the notion that asshat companies might eventually be held accountable for causing such massive failings.
We need a comprehensive overhaul of the system, one NOT determined by congress or lobbyists. One that severely penalizes the asshats that cause the problem and rewards the ones who expose it.
Re: Re: Re:
Neither Google nor wget magically know what files are available on a site. For the files to show up in a recursive wget (or to the Google spider, for that matter), it means that the files had to be actively linked to from other accessible pages on the site. Somewhere on their public-facing website, there was a link pointing to all those thousands of insecure, confidential documents that the company wasn't even supposed to be keeping in the first place.
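The point about links can be sketched in a few lines of Python. This is a toy crawler over an in-memory, made-up site (the `PAGES` dictionary, the example.com URLs, and the `crawl` helper are all hypothetical), not how Wget or Google's spider is actually implemented. It only illustrates that a link-following crawler reaches a file when some page it can already see points to it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the <a href="..."> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def crawl(pages, start):
    """Breadth-first walk over an in-memory site (dict of URL -> HTML)."""
    seen, queue = set(), [start]
    while queue:
        url = queue.pop(0)
        if url in seen or url not in pages:
            continue
        seen.add(url)
        collector = LinkCollector(url)
        collector.feed(pages[url])
        queue.extend(collector.links)
    return seen


# Hypothetical site: form-002.pdf exists on the server, but no page links to it.
PAGES = {
    "https://example.com/": '<a href="/apply.html">Apply</a>',
    "https://example.com/apply.html": '<a href="/uploads/form-001.pdf">Form</a>',
    "https://example.com/uploads/form-001.pdf": "",
    "https://example.com/uploads/form-002.pdf": "",
}

found = crawl(PAGES, "https://example.com/")
for url in sorted(found):
    print(url)
```

In this toy site the crawler reaches `form-001.pdf` because a public page links to it and never sees `form-002.pdf`. Real crawlers may also learn URLs from other sources, such as links on other sites, so this is only the simplest case.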
Don't go out of your way to write a scraper program!
2nd point: "firms' lawyer claims" is all I see of the "threat", and yet Mike implies charges are imminent. So far this is just another of his panics.
Re: Don't go out of your way to write a scraper program!
OOTB you are the personification of troll=stupidity
Re: Don't go out of your way to write a scraper program!
Your second point: why would someone imply that someone else is liable to be sued if not to threaten to sue, even if said threat is empty?
Also, I don't think the article implies charges are imminent. It's implied that if the companies decide to sue it will be laughed out of court, so it's unlikely that they actually will.
Re: Don't go out of your way to write a scraper program!
better idea: use some fucking common sense and realize wget is not inherently bad
Re: Don't go out of your way to write a scraper program!
Makes perfect sense to let someone in an eastern bloc country find it and "report" it instead via some IRC room.
("Report" in the above sentence means "Sell for profit")
You fucking dumbass.
Going after the reporters is basically the following:
-Person A compiles a list of over 100,000 customers and their personal data
-Person B gets hired by Person A to manage the data, and leaves the data lying out in the open where anyone can grab it.
-Person C finds the data lying around and grabs it and dumps it in a public area where anyone can still read it, but it's in a place with much more traffic.
-Person D finds the dumped personal data, reports it to Person A, and gets charged with hacking.
Actually...
Also, as evidenced by the DoJ subpoenas of AP and FoxNews' phone records, the one surefire way to make sure "how abusive these laws are" gets into the goldfish-sized attention span of the professional news media is to use those laws against the professional news media.
Re: Actually...
Don't you watch The Daily Show man?
You'd be pretty surprised what you can find by just playing around with them.
I blame boredom.
So now I'm a criminal because I didn't ask my browser to get it?
Google apparently no longer uses its spider crawler to show people the most relevant results possible when they search. Instead, they rank results the way most web advertisers do, by what and how many things an individual clicks on, and move those items up in that individual's search results based on how many times a link is clicked. They identify you via IP address and hold your search data for up to 9 months. If you search on Google again within those 9 months, they don't delete your previous searches at all when the 9-month mark rolls around.
If Google's CEO has the power to internally check any user's e-mails without a password...imagine what these reporters who pointed out the security flaw could have done.
Re:
RTFA again and grasp who is at fault here for lax security.
Re:
WTF are you talking about, Wally?
Why would Google have anything to do with some other company's lack of security on the pages they have facing the web?
Re: Re:
Google claims to be on the side of security, yet they ignore the robots.txt file's disallow instructions, and not only that, but publicly display links found on a site where those links were clearly delineated as "disallow" in that file. As such, they are complicit in the breach.
Re: Re: Re:
Do you have a citation for that? It's not that I don't believe you - just haven't heard that one before.
Re: Re: Re: Re:
They do not show all URLs that are blocked in the robots file; however, if their (extremely flawed) system sees enough "other indicators" to countermand the robots instruction, they ignore that instruction.
"Other indicators" is most often a link to that file somewhere on the site itself, or a link pointing to the URL from another site.
The "learn more" link points to this Google answer page where it states:
Which is complete bullshit, because while they're not actually indexing the CONTENT of the page, they're indexing the URL.
So in the case of a URL that includes variable parameters labeled with "order" or "customerID" or some-such, that opens up the can of WTF for anyone savvy enough to go snooping.
Re: Re: Re: Re: Re:
"So in the case of a URL that includes variable parameters labeled with "order" or "customerID" or some-such, that opens up the can of WTF for anyone savvy enough to go snooping."
You're not suggesting this is a security issue, are you? Because if you're relying on robots.txt to secure sensitive information, you're doing it very, very wrong. So what is the problem with this behavior by Google? I'm not saying there isn't one, I'm just not sure I'm even clear on why you would use robots.txt to keep search engines away. If it's to save bandwidth, then this doesn't cause a problem since Google isn't downloading the page.
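To make the robots.txt point concrete, here is a small Python sketch using the standard library's `urllib.robotparser`. The robots.txt contents, the `ExampleBot` name, and the example.com URLs are invented for illustration. It shows that the Disallow check is something a well-behaved client chooses to run before fetching; nothing in it stops a client that skips the check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /invoices/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def polite_may_fetch(url, agent="ExampleBot"):
    """What a compliant crawler asks itself before requesting a URL."""
    return parser.can_fetch(agent, url)


blocked = "https://example.com/invoices/1001.pdf"
allowed = "https://example.com/about.html"

print(polite_may_fetch(blocked))
print(polite_may_fetch(allowed))

# This check runs entirely on the client side. A client that never calls
# polite_may_fetch() can still request the "blocked" URL, and the server will
# answer unless it enforces its own authentication or authorization.
```

The design point is that robots.txt expresses a preference to crawlers. Access control has to live on the server.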
Re: Re: Re: Re: Re: Re: Re:
Google is saying robots.txt is a security measure? If that's what you're saying, do you have a reference? If not, what do you mean by security backstop?
Re: Re: Re: Re: Re: Re: Re: Re: Re:
But you keep mentioning security in connection with robots.txt. If you acknowledge that it's not a security measure, and Google doesn't say it's a security measure, why are you still talking about security?
Also, why is this a big deal? I'm not trying to defend Google, I just really don't see why it's important. Can you explain it?
Re: Re: Re: Re: Re: Re: Re: Re: Re: Re:
For example - they proactively block sites their system detects that have malware or viruses. They don't have to. It's the responsibility of site owners to ensure their sites don't have malware or viruses baked in. Yet Google has chosen to help.
This is no different.
Re: critical thinking
So you're saying it was never intended as a security measure, it is not appropriate to rely on it for security, Google does not recommend it be so used, and Google should try to make it as effective a security tool as possible? No wonder I was confused. :-)
it's called innovation
Anything other than that understanding is called myopic thinking.
Re: it's called innovation
Yes, but if using it in that way is stupid and ineffective, then that doesn't provide value. Maybe the illusion of value, which is even worse than nothing.
Re:
If you want any data related to you removed from Google, there is a button you can press to scrub yourself from their system. Otherwise, if you don't want to give them access to the data you create, feel free not to use their free services.
Re: Re:
3/4 of Google's revenue in 2010 came from advertising and less than 1/100 came from search technologies. When you do the math, you start to realize Google's priorities have shifted towards catering to advertisers rather than web users.
Re: Re:
Although on second read...I recall that one could do this but only if one is logged onto Google+...which means they are still tracking your every move and can target ads to you with the advertising companies they own.
Re: Re: Re:
-The use of spiders is the first step to setting up a search index of the web. It is the bottom layer here. Necessary, but any kind of page rank algorithm (relevancy?) takes this basic information and tweaks it in its own way. On its own, an index of the web does not determine page rank. This was Google's innovation back in 1998.
-There is a basic issue with user privacy vs Google's business model of using search data as a basis to extract advertising dollars. I have not recently followed their data retention practices, so I accept his statement of 9 months limit or lack of it. However, what has this got to do with the story at hand here about TerraCom's security/privacy issue?
-Someone else (the user in the industry?) thinks that Google is being lax in security because they list the URLs for which the robots.txt file is asking not to be followed. This is not any kind of security failing. I assume that Google's spider is not recursively following the hyperlink that URL represents or executing any code to produce a dynamic web page for that URL. That is the real intention of robots.txt. I think it is a minor issue that Google now lists the URL that begins a blocked branch. As the robots.txt file can be simply ignored, any real attempt to restrict access to the data on that page should require an authentication/authorization step.
Google's security interest in labeling certain sites as dangerous is completely separate from any concern about indexing pages that the owner would rather have private. Such dangerous sites are identified, as best as possible, to contain malware such as a cross site scripting vulnerability.
In Wally's final sentence I don't see the connection between the technical ability of Google to view the content of email on the gmail domain with the security failings of TerraCom/Vcare. A series of statements with varying validity that do not have any obvious connection is, to me at least, the definition of rambling.
Maybe I am being too critical. After all, this is just a forum where people, perhaps with limited time, just throw out thoughts to be consumed and either ridiculed or praised. I am loath to trot out my credentials, but I have worked on network protocols for thirty years and network/computer security for 6 years.
I bet there are a few organizations in Washington DC that would be happy to pontificate on Wget being a dangerous hacker tool used by Chinese cyberhackers to perpetrate cyber-9/11 cyberterrorism on our cybercountry.
I agree, especially with one of the telcos' crappy attempt to fix their lousy security by replacing the entire web form with their phone number. So much for online commerce!
Source: https://www.terracomwireless.com/m/renewals.html
People should not be allowed to use computers to freely access the internet
If this fails, the next step would be to distribute printed copies of approved web pages.
Trusting Any Company to Securely Protect Consumers
It is so tiring to hear of yet more personal information that was entrusted to a company and once again ended up in the wrong hands. There should be a law that brings full liability to bear upon the company's CEO, Vice President and entire Board, as well as all top management, in proportion to the damage it causes or potentially causes.