Reporters Find Exposed Personal Data Via Google, Threatened With CFAA Charges

from the sounds-familiar dept

Wed, May 22nd 2013 11:14am — Mike Masnick

In a story that sounds mighty similar to the Andrew "weev" Auernheimer situation, two reporters from the Scripps News service have been told that they may be hit with Computer Fraud and Abuse Act (CFAA) charges after a Google search they did turned up personal data on 170,000 customers that two telcos left exposed. At issue are low-income customers of YourTel and TerraCom, which provide service under the FCC's Lifeline program, a phone service for people who are enrolled in state or federal assistance programs. Apparently, the real issue was a company called Vcare, to which the two telcos outsourced certain services. The Scripps reporters noted that they did nothing more than a Google search:

The unprotected TerraCom and YourTel records came to light through the simplest of tools: a reporter’s Google search of TerraCom.

The records include 44,000 application or certification forms and 127,000 supporting documents or “proof” files, such as scans or photos of food-stamp cards, driver’s licenses, tax records, U.S. and foreign passports, pay stubs and parole letters. Taken together, the records expose residents of at least 26 states.

The application records, drawn from 18 of those states and generally dated from last September through November, list potential customers’ names, signatures, birth dates, home addresses and partial or full Social Security numbers. The proof files, from last September through April, include residents of at least eight remaining states.

Of course, rather than be thankful to the reporters for letting them know about a huge security lapse, or be apologetic for revealing all sorts of key data on their customers, the companies decided to threaten legal action.

However, Vcare and the two telecom companies assert that the reporters "hacked" their way into the data using "automated" methods to access the data. And what was this malicious hacking tool that penetrated the security of Vcare's servers? In a letter sent to Scripps News by Jonathan D. Lee, counsel for both of the cell carriers, Lee said that Vcare's research had shown that the reporters were "using the 'Wget' program to search for and download the Companies' confidential data." GNU Wget is a free and open source tool used for batch downloads over HTTP and FTP. Lee claimed Vcare's investigation found the files were bulk-downloaded via two Scripps IP addresses.

I'm not sure how anyone could claim that the mere use of Wget constitutes a form of hacking, even under the extremely loose interpretations of the CFAA. However, as mentioned, the story does have similarities to the weev case -- except this time we're talking about reporters for a well known news service, rather than someone with a reputation as an internet troll. Hopefully, if the telcos do decide to actually file a lawsuit, it gets laughed out of court.

Filed Under: cfaa, exposed data, hacking, security vulnerability
Companies: scripps, terracom, vcare, yourtel

68 Comments

Reader Comments

Anonymous Coward, 22 May 2013 @ 11:19am

wget
didn't Aaron Swartz use wget or something very normal like that?
Anonymous Coward, 22 May 2013 @ 11:26am

Computer Fraud and Abuse
CFAA: Anything done with a computer that we don't approve of is a form of hacking, even if no attempt was made to bypass any sort of access-control mechanism.
Anonymous Coward, 22 May 2013 @ 11:28am

Re: Computer Fraud and Abuse
Google seems the ones to be pressing charges.
Watchit (profile), 22 May 2013 @ 11:30am

HACKERS! HACKERS! HOW DARE THEY USE GOOGLE TO HACK THAT INFO!!!1
GMacGuffin (profile), 22 May 2013 @ 11:32am

Pure Half-Assed CYA ...
"I'm not sure how anyone could claim that the mere use of Wget constitutes a form of hacking ..."

Because if the telcos did not claim the reporters hacked the information, then they are tacitly admitting they posted the personal info of 100+k people openly online. And that's a pretty big oops.
Chronno S. Trigger (profile), 22 May 2013 @ 11:37am

Isn't there something missing?
I admit I'm not an experienced web admin, I only run a few IIS web servers. Do other web servers not have the most basic of security that's built directly into IIS? I can set a folder to require a password to access and that would go for any file in that folder even if the file was accessed directly. This would stop anyone from accessing any file including Google's spider. I won't even go into their lack of the basic use of robots.txt.

For right now, I'm going to hold off judgement on the possible actions against the reporters, but shouldn't there be an extra line to this article? Something along the lines of "TerraCom and YourTel are under investigation for gross negligence."
This is bullshit, 22 May 2013 @ 11:38am

The article on Ars Technica states something to the effect that Vcare has a requirement not to retain this data (3rd paragraph).

So here's the situation as I (and likely everyone else, except the asshats who are trying to cover up their gross negligence and extreme incompetence) see it:

1. Company collects data
2. Company is required NOT to hold on to data
3. Company holds on to data in violation of #2
4. Company further shows how retarded they are by making it publicly accessible
5. Someone finds publicly accessible documents (that shouldn't exist in the first place) and lets the retards know
6. Retards sue, crying "Hacking! Hacking!"

What an amazing strategy!

Anyone else just wondering how long it's going to be before someone figures this out for what it really is?
Anonymous Coward, 22 May 2013 @ 11:40am

Re: Re: Computer Fraud and Abuse
Citation?
Anonymous Coward, 22 May 2013 @ 11:40am

Re: Re: Computer Fraud and Abuse
This needs some kind of source. What the hell would Google have to do with this other than indexing publicly accessible files?

Vcare seems to be the one crying foul, not Google.
alanbleiweiss (profile), 22 May 2013 @ 11:42am

Companies that have asshats who don't give a crap about security are the norm - it's inexcusable. The fact they can even file such a suit, or that the police state can bring charges against people who expose such crap is frightening.

Just this morning, in a cursory review of a prospective audit client's online presence, I did a Google search and discovered over 1,000 PDFs of customer invoices they had blocked via their robots.txt file. But since Google now includes the URLs of robots-blocked files and slaps on a "description not available due to robots instruction" note, that shit is wide open to anyone on the web, no hacking needed.

Companies need to be held accountable for their massive security failings and Google needs to be held accountable as well, even though that shit should have been completely blocked and behind a secure firewall.

The fact that this situation involved a couple reporters gives me little comfort in the notion that asshat companies might eventually be held accountable for causing such massive failings.

We need a comprehensive overhaul of the system, one NOT determined by congress or lobbyists. One that severely penalizes the asshats that cause the problem and rewards the ones who expose it.
out_of_the_blue, 22 May 2013 @ 11:43am

Don't go out of your way to write a scraper program!
If there's any data that looks personal, run, don't walk, away. That's just sound advice.

2nd point: "firms' lawyer claims" is all I see of the "threat", and yet Mike implies charges are imminent. So far this is just another of his panics.
Watchit (profile), 22 May 2013 @ 11:43am

Re: Pure Half-Assed CYA ...
It kinda reminds me of the story a while back about the guy who found a simple security loophole on his bank's website and got charged under the CFAA. I can't remember the details exactly though. It pretty much boiled down to "The act of trying to circumvent the website's security counts as hacking, no matter how simple and obviously open the system was."
Watchit (profile), 22 May 2013 @ 11:44am

Re: Re: Re: Computer Fraud and Abuse
Yeah, one would assume it's the telcos threatening to sue, since it was their data they left open...
Anonymous Coward, 22 May 2013 @ 11:45am

If anyone did any 'hacking' shouldn't it be google, where they did the web search to find the personal information?

Going after the reporters is basically the following happening.
-Person A compiles a list of over 100,000 customers and their personal data
-Person B gets hired by Person A to manage the data, and leaves the data lying out in the open where anyone can grab it.
-Person C finds the data lying around and grabs it and dumps it in a public area where anyone can still read it, but it's in a place with much more traffic.
-Person D finds the dumped personal data, reports it to Person A, and gets charged with hacking.
Anonymous Coward, 22 May 2013 @ 11:46am

Re: Don't go out of your way to write a scraper program!
You have never given an ounce of sound advice in your entire life.
Chronno S. Trigger (profile), 22 May 2013 @ 11:47am

Re: Re: Re: Computer Fraud and Abuse
Google isn't involved, but I can understand where AC's confusion comes from.

"While the reporters claim to have discovered the data with a simple Google search, the firms' lawyer claims they used "automated" means..."

Referencing the company Google and then the telco firm without the name can be confusing. High school level reading skills are required to properly understand that without having to read it two or three times.
alanbleiweiss (profile), 22 May 2013 @ 11:48am

Re: Don't go out of your way to write a scraper program!
oh crap there's what appears to be a bomb over there. Fuck I better hide my eyes and not say anything.

OOTB you are the personification of troll=stupidity
kitsune361, 22 May 2013 @ 11:49am

Actually...
I hope the DoJ goes after the reporters for CFAA violations and wins. If the DoJ refuses to go after them, it shows a definite double standard (weev is a troll/AT&T is more important than TerraCom). That and the court battle would be epic, and due to the more sympathetic defendants it has a better shot at appeal than weev's case.

Also, as evidenced by the DoJ subpoenas of AP and FoxNews' phone records, the one surefire way to make sure "how abusive these laws are" gets into the goldfish-sized attention span of the professional news media is to use those laws against the professional news media.
Watchit (profile), 22 May 2013 @ 11:53am

Re: Don't go out of your way to write a scraper program!
I don't understand the first point?

your second point: why would someone imply that someone else is liable to be sued if not to threaten to sue? even if said threat is empty?

Also, I don't think the article implies charges are imminent. It's implied that if the companies decide to sue it will be laughed out of court, so it's unlikely that they actually will.
silverscarcat (profile), 22 May 2013 @ 12:08pm

Re: Actually...
It's a concussed Goldfish attention span, not a normal goldfish's attention span.

Don't you watch The Daily Show man?
Anonymous Coward, 22 May 2013 @ 12:08pm

ROFL I know of at least a few thousand sources that can be abused with a Google search. There is no wall just some clever keywords and search operators.

You'd be pretty surprised what you can find by just playing around with them.
I blame boredom.
Josh in CharlotteNC (profile), 22 May 2013 @ 12:09pm

Re: Pure Half-Assed CYA ...
http://en.wikipedia.org/wiki/Wget
"Typical usage of GNU Wget consists of invoking it from the command line"

We all know that anyone using the command line instead of a GUI is a dirty hacker.
/s
BentFranklin (profile), 22 May 2013 @ 12:27pm

%wget www.terracomonline.com/index.php

So now I'm a criminal because I didn't ask my browser to get it?
Anonymous Coward, 22 May 2013 @ 12:28pm

Re:
Sort of reminds me of rockstar's response to hot coffee.
Anonymous Coward, 22 May 2013 @ 12:31pm

Re: Don't go out of your way to write a scraper program!
Exactly! That way only people who would use it maliciously would get a hold of it.
Watchit (profile), 22 May 2013 @ 12:39pm

Re: Re: Pure Half-Assed CYA ...
But but the command line is complicated and scary looking! If that's not what hackers use, what is?!
Anonymous Coward, 22 May 2013 @ 12:51pm

Re: Re:
Yeah, but the discovery and reactivation through modding Hot Coffee back into GTASA didn't result in CFAA charges.
Anonymous Coward, 22 May 2013 @ 12:55pm

Re:
It's not Google's fault if incompetent companies publish information to the entire world and they happen to stumble over it accidentally. (And anyone using robots.txt as a 'security measure' should be punched in the face with a missile.) The problem here is that companies are greedy, stupid and negligent -- and are looking for scapegoats. Prosecutors, eager to catch themselves a SOOOOPERHACKER and get their names in the press, are happy to oblige.
Anonymous Coward, 22 May 2013 @ 1:00pm

Re: Don't go out of your way to write a scraper program!
so a program that does the same thing as a browser without the html rendering = evul hacker program that you should not write?

better idea: use some fucking common sense and realize wget is not inherently bad
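To illustrate the point about a downloader doing the same thing as a browser minus the rendering, here is a minimal sketch that builds the text of an HTTP GET request for two different clients. The host name, the path, and both User-Agent strings are illustrative assumptions, and real browsers and real Wget send additional headers that are omitted here.

```python
def build_get_request(host: str, path: str, user_agent: str) -> str:
    """Build the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"User-Agent: {user_agent}",
        "Connection: close",
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines)


# Hypothetical clients: the User-Agent values are placeholders, not real version strings.
browser_request = build_get_request("www.example.com", "/index.php", "ExampleBrowser/1.0")
wget_request = build_get_request("www.example.com", "/index.php", "Wget/1.x")

# Collect the header lines that differ between the two requests.
differing = [
    (a, b)
    for a, b in zip(browser_request.split("\r\n"), wget_request.split("\r\n"))
    if a != b
]
print(differing)
```

In this simplified model the only line that differs is the self-reported User-Agent header; the request line asking the server for the page is identical.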
Anonymous Coward, 22 May 2013 @ 1:03pm

you can most definitely thank the government for this sorry state of affairs. had it stuck to what it said about protecting whistle blowers, then turning turtle and crapping all over them just to protect their own ridiculously stupid mistakes, every freakin' industry has jumped on the band wagon! this was one of things that started the ever increasing slide down the shit chute for the USA. no one is safe from their own law enforcement. is it any wonder why people rebel against them when they get all protection for doing the right thing thrown straight out the window? any wonder why they are getting to not care a toss what happens when companies get caught out over things like this? no good deed goes unpunished is a 'truer words spoken in jest' kind of thing
Anonymous Coward, 22 May 2013 @ 1:03pm

Re: Don't go out of your way to write a scraper program!
Right. Just leave it there.

Makes perfect sense to let someone in an eastern bloc country find it and "report" it instead via some IRC room.

("Report" in the above sentence means "Sell for profit")

You fucking dumbass.
Wally (profile), 22 May 2013 @ 1:04pm

I cannot wait to see how long it will take before Google Fanboys to defend Google's lax in security over this.

Google apparently no longer uses its spider crawler to help show people the most relevancy possible when they search...Instead they do relevancy in the way most web advertisers do by what and how many things an individual clicks on and move those items up on said individual's search results based on how many times you click a link. They identify you via IP address and hold your search data for up to 9 months. Within the 9 months if you search on Google, they don't delete previous searches at all when the 9 month mark rolls around.

If Google's CEO has the power to internally check any user's e-mails without a password...imagine what these reporters who pointed out the security flaw could have done.
Anonymous Coward, 22 May 2013 @ 1:20pm

Re: wget
See! Right here! Proof that they were hacking! Aaron Swartz used it!
Chuck Norris' Enemy (deceased) (profile), 22 May 2013 @ 1:24pm

Re:
GOOOOOGLEZZZ!!!1!

RTFA again and grasp who is at fault here for lax security.
Gwiz (profile), 22 May 2013 @ 1:26pm

Re:
I cannot wait to see how long it will take before Google Fanboys to defend Google's lax in security over this.

WTF are you talking about, Wally?

Why would Google have anything to do with some other company's lack of security on the pages they have facing the web?
Anonymous Coward, 22 May 2013 @ 1:41pm

Re:
How google orders searches and what it does with previous search data has nothing to do with this company making personal data it shouldn't have even had searchable.

You want any data related to you removed from google there is a button you can press to scrub yourself from their system. Otherwise if you don't want to give them access to the data you create you can feel free not to use their free services.
alanbleiweiss (profile), 22 May 2013 @ 1:46pm

Re: Re:
While Google is not ultimately responsible for the administration of other sites, they have chosen to take a stand against hacked sites and malware-ridden sites, going so far as to block them from search results pages.

Google claims to be on the side of security, yet they ignore the robots.txt file's disallow instructions, and not only that, but publicly display links found on a site where those links were clearly delineated as "disallow" in that file. As such, they are implicit in the breach.
Anonymous Coward, 22 May 2013 @ 1:47pm

Re: Re:
I just checked the info page for wget, it respects the robots.txt file when recursively downloading a site. Therefore it seems that the site had not even blocked indexing of the documents.
alanbleiweiss (profile), 22 May 2013 @ 1:48pm

Re: Re: Re:
uh complicit? explicit? expletive deleted? #FastRantTypingStrikesAgain
Gwiz (profile), 22 May 2013 @ 2:01pm

Re: Re: Re:
... yet they ignore the robots.txt file's disallow instructions, and not only that, but publicly display links found on a site where those links were clearly delineated as "disallow" in that file.

Do you have a citation for that? It's not that I don't believe you - just haven't heard that one before.
Anonymous Coward, 22 May 2013 @ 2:13pm

I'm not sure how anyone could claim that the mere use of Wget constitutes a form of hacking, even under the extremely loose interpretations of the CFAA.

I bet there are a few organizations in Washington DC that would be happy to pontificate on Wget being a dangerous hacker tool used by Chinese cyberhackers to perpetrate cyber-9/11 cyberterrorism on our cybercountry.
Anonymous Coward, 22 May 2013 @ 2:19pm

Re:
You have no idea what you're rambling about. Are you trying to take over OotB's role?
Anonymous Coward, 22 May 2013 @ 2:40pm

Re: Re: Re:
wget respects the robots.txt file when scraping a website. Therefore no evidence that Google is ignoring it either. Much more likely a total admin failure on the TerraCom site that left the data as public in a public directory.
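For readers unfamiliar with how a well-behaved client consults robots.txt, the check can be sketched with Python's standard `urllib.robotparser`. This is only an illustration: the robots.txt content and the example.com URLs below are hypothetical, since the actual file on the TerraCom or Vcare servers is not known.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real site's file is not known.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


allowed_public = is_allowed(ROBOTS_TXT, "Wget", "http://example.com/index.php")
allowed_private = is_allowed(ROBOTS_TXT, "Wget", "http://example.com/private/form.pdf")
print(allowed_public, allowed_private)
```

Note that the check is something the client chooses to run. Nothing on the server side enforces the answer, so a Disallow rule only affects clients that volunteer to honor it.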
alanbleiweiss (profile), 22 May 2013 @ 2:52pm

Re: Re: Re: Re:
A citation for it? yeah half the search marketing industry. As an SEO audit professional I routinely encounter it. They list URLs, but beneath them, where a description of the file would go is a statement

A description for this result is not available because of this site's robots.txt – learn more.

They do not show all URLs that are blocked in the robots file, however if their (extremely flawed) system sees enough "other indicators" to countermand the robots instruction, they ignore that instruction.

"Other indicators" is most often a link to that file somewhere on the site itself, or a link pointing to the URL from another site.

The "learn more" link points to this Google answer page where it states:

While Google won't crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web. As a result, the URL of the page and, potentially, other publicly available information such as anchor text in links to the site, or the title from the Open Directory Project (www.dmoz.org), can appear in Google search results.

Which is complete bullshit. because while they're not actually indexing the CONTENT of the page, they're indexing the URL.

So in the case of a URL that includes variable parameters labeled with "order" or "customerID" or some-such, that opens up the can of WTF for anyone savvy enough to go snooping.
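As a rough illustration of why an indexed URL can be revealing even when the page content is not crawled, the sketch below pulls the query parameters out of a URL using Python's standard `urllib.parse`. The URL, the host name, and the parameter values are all made up for the example.

```python
from urllib.parse import parse_qs, urlsplit


def exposed_parameters(url: str) -> dict[str, list[str]]:
    """Return the query parameters that are visible in the URL itself."""
    return parse_qs(urlsplit(url).query)


# Hypothetical indexed URL; the names mirror the "order" and "customerID" examples above.
indexed_url = "https://shop.example.com/invoice.php?customerID=1234&order=5678"
params = exposed_parameters(indexed_url)
print(params)
```

Anyone who sees the URL in a results page sees those identifiers too, which is why the page behind it needs real access control rather than a crawler hint.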
Sheogorath (profile), 22 May 2013 @ 5:43pm

"Hopefully, if the telcos do decide to actually file a lawsuit, it gets laughed out of court."
I agree, especially with one of the telcos' crappy attempt to fix their lousy security by replacing the entire web form with their phone number. So much for online commerce!
Source: https://www.terracomwireless.com/m/renewals.html
Anonymous Coward, 22 May 2013 @ 6:26pm

Re: Re:
Where is this imaginary button? Unless you're talking about the "Remove Search" or "do not save search" in the options menu Google provides...but then again those are only saved as cookies on YOUR computer so YOUR system is tracked even without that option. The truth is that Google does not provide a magic button to users of their search engine to have their information on where they clicked stricken from Google's servers.

3/4 of Google's revenue in 2010 came from advertising and less than 1/100 came from search technologies. When you do the math, you start to realize Google's priorities have shifted towards catering to advertisers rather than web users.
Anonymous Coward, 22 May 2013 @ 6:30pm

Re: Re:
Are you sure he's rambling, or are you being the blatant fanboy jackass that Wally knew would come to Google's defense. He was even backed up by a user in the industry so I really think Wally knows more than you wish people to believe by your statement towards him.
Anonymous Coward, 22 May 2013 @ 6:35pm

Re: Re:
"You want any data related to you removed from google there is a button you can press to scrub yourself from their system. Otherwise if you don't want to give them access to the data you create you can feel free not to use their free services."

Although on second read...I recall that one could do this but only if one is logged onto Google+...which means they are still tracking your every move and can target ads to you with the advertising companies they own.
Rekrul, 22 May 2013 @ 7:11pm

It's getting to the point where you can be charged under the CFAA for simply looking at someone else's computer screen.
Anonymous Coward, 22 May 2013 @ 9:33pm

Re: Re: Re:
It's actually even worse than that.

Neither Google nor wget magically know what files are available on a site. For the files to show up in a recursive wget (or to the Google spider, for that matter), it means that the files had to be actively linked to from other accessible pages on the site. Somewhere on their public-facing website, there was a link pointing to all those thousands of insecure, confidential documents that the company wasn't even supposed to be keeping in the first place.
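The link-following behavior described here can be sketched as a tiny crawler over an in-memory "site". Everything in the example is hypothetical (the page names, the file names, and the `SITE` dictionary standing in for a web server); it only demonstrates that a recursive crawl reaches what is linked and nothing else.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href values of all <a> tags in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical site: path -> page body. form-002.pdf exists but is never linked.
SITE = {
    "/index.html": '<a href="/apply.html">Apply</a>',
    "/apply.html": '<a href="/uploads/form-001.pdf">Your form</a>',
    "/uploads/form-001.pdf": "",
    "/uploads/form-002.pdf": "",
}


def crawl(site: dict[str, str], start: str) -> set[str]:
    """Breadth-first crawl that follows links, as a spider or recursive download would."""
    seen = {start}
    queue = deque([start])
    while queue:
        path = queue.popleft()
        extractor = LinkExtractor()
        extractor.feed(site.get(path, ""))
        for link in extractor.links:
            if link in site and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


reachable = crawl(SITE, "/index.html")
print(sorted(reachable))
```

The unlinked file never shows up in `reachable`, which matches the point above: for thousands of documents to be found this way, something public had to link to them.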
Donglebert the Needlessly Obtuse, 23 May 2013 @ 3:38am

People should not be allowed to use computers to freely access the internet
They should only be allowed to view passive pages via a screen in a reasonably public place, say, for example, their living rooms. Keyboards, mice, and touchscreens should be banned because they encourage hacking.

If this fails, the next step would be to distribute printed copies of approved web pages.
kitsune361, 23 May 2013 @ 7:41am

Re: Re: Actually...
Unfortunately... not as often as I probably should.
I Forgot, 23 May 2013 @ 9:35am

Trusting Any Company to Securely Protect Consumers
This is systematically problematic with corporations who require extended personal data of their customers. This case does give the appearance of these companies attempting to cover their own arses after they mishandled or neglected to secure even the most simple of data breaches after the fact.

This is so tiring to hear of yet more personal information that was entrusted to a company that once again ends up in the wrong hands. There should be a law that will bring to bear full liability upon the company's CEO, Vice President and entire Board of the corporation when this occurs as well as all top management to the degree of damage it causes or potentially causes.
I Forgot, 23 May 2013 @ 9:52am

Re: gOOGllE dumps data into public searchability
Fine them $1.00
nasch (profile), 23 May 2013 @ 11:25am

Re: Re: Re: Re: Re:
Which is complete bullshit. because while they're not actually indexing the CONTENT of the page, they're indexing the URL.

So in the case of a URL that includes variable parameters labeled with "order" or "customerID" or some-such, that opens up the can of WTF for anyone savvy enough to go snooping.

You're not suggesting this is a security issue, are you? Because if you're relying on robots.txt to secure sensitive information, you're doing it very, very wrong. So what is the problem with this behavior by Google? I'm not saying there isn't one, I'm just not sure I'm even clear on why you would use robots.txt to keep search engines away. If it's to save bandwidth, then this doesn't cause a problem since Google isn't downloading the page.
alanbleiweiss (profile), 23 May 2013 @ 11:57am

Re: Re: Re: Re: Re: Re:
I'm not relying on it. I'm a forensic SEO consultant with a fair amount of digital security experience. What I'm saying is sites need to get their security methods right. At the same time, Google claims to be a security backstop, yet they allow those URLs into their system.
nasch (profile), 24 May 2013 @ 8:59am

Re: Re: Re: Re: Re: Re: Re:
What I'm saying is sites need to get their security methods right. At the same time, Google claims to be a security backstop, yet they allow those URLs into their system.

Google is saying robots.txt is a security measure? If that's what you're saying, do you have a reference? If not, what do you mean by security backstop?
alanbleiweiss (profile), 24 May 2013 @ 10:13am

Re: Re: Re: Re: Re: Re: Re: Re:
No, Google is NOT saying that. Yet their system is more than capable of keeping URLs out of the system that are listed in the robots file so there's no excuse why they, as a supposed security advocate, shouldn't honor robots.txt instructions. "Disallow" is pretty clear in its definition.
nasch (profile), 24 May 2013 @ 11:31am

Re: Re: Re: Re: Re: Re: Re: Re: Re:
No, Google is NOT saying that. Yet their system is more than capable of keeping URLs out of the system that are listed in the robots file so there's no excuse why they, as a supposed security advocate, shouldn't honor robots.txt instructions.

But you keep mentioning security in connection with robots.txt. If you acknowledge that it's not a security measure, and Google doesn't say it's a security measure, why are you still talking about security?

Also, why is this a big deal? I'm not trying to defend Google, I just really don't see why it's important. Can you explain it?
alanbleiweiss (profile), 24 May 2013 @ 1:52pm

Re: Re: Re: Re: Re: Re: Re: Re: Re: Re:
Because it's an opportunity for Google to help improve the securing of private information on the web. Since they already take proactive steps in other areas to improve security online, why not here?

For example - they proactively block sites their system detects that have malware or viruses. They don't have to. It's the responsibility of site owners to ensure their sites don't have malware or viruses baked in. Yet Google has chosen to help.

This is no different.
nasch (profile), 24 May 2013 @ 3:23pm

Re: Re: Re: Re: Re: Re: Re: Re: Re: Re: Re:
I don't get it. Earlier you agreed that robots.txt is not a security measure. Now you're saying that Google should help with security related to robots.txt. It can't be both!
alanbleiweiss (profile), 24 May 2013 @ 4:12pm

critical thinking
it sure as hell can be both. While robots.txt is not by original nature related to search engines, a means of security, Google has the power and resources to respect it for the sake of security. If you don't grasp that, not my problem.
aldestrawk (profile), 24 May 2013 @ 4:22pm

Re: Re: Re:
The use of robots.txt is in no way a security measure. It was never intended to be and definitely should not be used as such. It is simply intended to relieve servers of unnecessary traffic as a result of spiders actions. Any script kiddie can do the same thing as a spider and intentionally ignore the request that robots.txt files represent.
alanbleiweiss (profile), 24 May 2013 @ 4:32pm

it's called innovation
Just because something did not have an original intent to be used in a certain way does not mean it should not be used in a new way if that way is innovative and provides value to the world.

Anything other than that understanding is called myopic thinking.
nasch (profile), 24 May 2013 @ 7:18pm

Re: critical thinking
While robots.txt is not by original nature related to search engines, a means of security, Google has the power and resources to respect it for the sake of security.

So you're saying it was never intended as a security measure, it is not appropriate to rely on it for security, Google does not recommend it be so used, and Google should try to make it as effective a security tool as possible? No wonder I was confused. :-)
nasch (profile), 24 May 2013 @ 7:20pm

Re: it's called innovation
Just because something did not have an original intent to be used in a certain way does not mean it should not be used in a new way if that way is innovative and provides value to the world.

Yes, but if using it in that way is stupid and ineffective, then that doesn't provide value. Maybe the illusion of value, which is even worse than nothing.
aldestrawk (profile), 24 May 2013 @ 9:00pm

Re: Re: Re:
Wally has pointed out several issues with how Google operates. They are somewhat related but not closely enough for me to figure out what point he is trying to make.

-the use of spiders is the first step to setting up a search index of the web. It is the bottom layer here. Necessary, but any kind of page rank algorithm (relevancy?) takes this basic information and tweaks it in their own way. On its own, an index of the web does not determine page rank. This was Google's innovation back in 1998.

-There is a basic issue with user privacy vs Google's business model of using search data as a basis to extract advertising dollars. I have not recently followed their data retention practices, so I accept his statement of 9 months limit or lack of it. However, what has this got to do with the story at hand here about TerraCom's security/privacy issue?

-Someone else (the user in the industry?) thinks that Google is being lax in security because they list the URLs for which the robots.txt file is asking not to be followed. This is not any kind of security failing. I assume that Google's spider is not recursively following the hyperlink that URL represents or executing any code to produce a dynamic web page for that URL. That is the real intention of robots.txt. I think it is a minor issue that Google now lists the URL that begins a blocked branch. As the robots.txt file can be simply ignored, any real attempt to restrict access to the data on that page should require an authentication/authorization step.
Google's security interest in labeling certain sites as dangerous is completely separate from any concern about indexing pages that the owner would rather have private. Such dangerous sites are identified, as best as possible, to contain malware such as a cross site scripting vulnerability.
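To make the contrast with robots.txt concrete, here is a minimal sketch of a server-side authentication check for HTTP Basic credentials. It is an assumption-laden illustration, not how TerraCom, Vcare, or any particular web server implements access control: the function name, the header dictionary, and the example username and password are all hypothetical.

```python
import base64
import hmac


def is_authorized(headers: dict[str, str], expected_user: str, expected_password: str) -> bool:
    """Check an HTTP Basic 'Authorization' header against expected credentials."""
    value = headers.get("Authorization", "")
    scheme, _, encoded = value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return False
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    user, separator, password = decoded.partition(":")
    if not separator:
        return False
    # compare_digest avoids leaking information through comparison timing.
    user_ok = hmac.compare_digest(user.encode(), expected_user.encode())
    password_ok = hmac.compare_digest(password.encode(), expected_password.encode())
    return user_ok and password_ok


# Hypothetical credentials for the example only.
good_header = {"Authorization": "Basic " + base64.b64encode(b"admin:s3cret").decode("ascii")}
bad_header = {"Authorization": "Basic " + base64.b64encode(b"admin:wrong").decode("ascii")}

print(is_authorized(good_header, "admin", "s3cret"))  # request may proceed
print(is_authorized(bad_header, "admin", "s3cret"))   # server would answer 401
print(is_authorized({}, "admin", "s3cret"))           # no credentials at all
```

Unlike a robots.txt rule, this decision is made by the server for every request, so a crawler or script that ignores conventions still gets refused.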

In Wally's final sentence I don't see the connection between the technical ability of Google to view the content of email on the gmail domain with the security failings of TerraCom/Vcare. A series of statements with varying validity that do not have any obvious connection is, to me at least, the definition of rambling.

Maybe I am being too critical. After all this is just a forum where people, perhaps with limited time, just throw out thoughts to be consumed and either ridiculed or praised. I am loath to trot out my credentials, but I have worked on network protocols for thirty years and network/computer security for 6 years.