Reporters Find Exposed Personal Data Via Google, Threatened With CFAA Charges

from the sounds-familiar dept

In a story that sounds mighty similar to the Andrew "weev" Auernheimer situation, two reporters from the Scripps News service have been told that they may be hit with Computer Fraud and Abuse Act (CFAA) charges after a Google search they did turned up personal data on 170,000 customers that two telcos left exposed. At issue are low-income customers of YourTel and TerraCom, which provide service for the FCC's Lifeline, a phone service for people who are enrolled in state or federal assistance programs. Apparently, the real issue was a company called Vcare, to which the two telcos outsourced certain services. The Scripps reporters noted that they did nothing more than a Google search:
The unprotected TerraCom and YourTel records came to light through the simplest of tools: a reporter’s Google search of TerraCom.

The records include 44,000 application or certification forms and 127,000 supporting documents or “proof” files, such as scans or photos of food-stamp cards, driver’s licenses, tax records, U.S. and foreign passports, pay stubs and parole letters. Taken together, the records expose residents of at least 26 states.

The application records, drawn from 18 of those states and generally dated from last September through November, list potential customers’ names, signatures, birth dates, home addresses and partial or full Social Security numbers. The proof files, from last September through April, include residents of at least eight remaining states.
Of course, rather than thanking the reporters for alerting them to a huge security lapse, or apologizing for exposing all sorts of key data on their customers, the companies decided to threaten legal action.
However, Vcare and the two telecom companies assert that the reporters "hacked" their way into the data using "automated" methods to access the data. And what was this malicious hacking tool that penetrated the security of Vcare's servers? In a letter sent to Scripps News by Jonathan D. Lee, counsel for both of the cell carriers, Lee said that Vcare's research had shown that the reporters were "using the 'Wget' program to search for and download the Companies' confidential data." GNU Wget is a free and open source tool used for batch downloads over HTTP and FTP. Lee claimed Vcare's investigation found the files were bulk-downloaded via two Scripps IP addresses.
I'm not sure how anyone could claim that the mere use of Wget constitutes a form of hacking, even under the extremely loose interpretations of the CFAA. However, as mentioned, the story does have similarities to the weev case -- except this time we're talking about reporters for a well-known news service, rather than someone with a reputation as an internet troll. Hopefully, if the telcos do decide to actually file a lawsuit, it gets laughed out of court.

Filed Under: cfaa, exposed data, hacking, security vulnerability
Companies: scripps, terracom, vcare, yourtel


Reader Comments


  1.
    Anonymous Coward, 22 May 2013 @ 11:19am

    wget

    didn't Aaron Swartz use wget or something very normal like that?

  2.
    Anonymous Coward, 22 May 2013 @ 11:26am

    Computer Fraud and Abuse

    CFAA: Anything done with a computer that we don't approve of is a form of hacking, even if no attempt was made to bypass any sort of access-control mechanism.

  3.
    Anonymous Coward, 22 May 2013 @ 11:28am

    Re: Computer Fraud and Abuse

    Google seems to be the one pressing charges.

  4.
    Watchit (profile), 22 May 2013 @ 11:30am

    HACKERS! HACKERS! HOW DARE THEY USE GOOGLE TO HACK THAT INFO!!!1

  5.
    GMacGuffin (profile), 22 May 2013 @ 11:32am

    Pure Half-Assed CYA ...

    "I'm not sure how anyone could claim that the mere use of Wget constitutes a form of hacking ..."

    Because if the telcos did not claim the reporters hacked the information, then they are tacitly admitting they posted the personal info of 100+k people openly online. And that's a pretty big oops.

  6.
    Chronno S. Trigger (profile), 22 May 2013 @ 11:37am

    Isn't there something missing?

    I admit I'm not an experienced web admin, I only run a few IIS web servers. Do other web servers not have the most basic of security that's built directly into IIS? I can set a folder to require a password to access and that would go for any file in that folder even if the file was accessed directly. This would stop anyone from accessing any file, including Google's spider. I won't even go into their lack of the basic use of robots.txt.

    For right now, I'm going to hold off judgement on the possible actions against the reporters, but shouldn't there be an extra line to this article? Something along the lines of "TerraCom and YourTel are under investigation for gross negligence."
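
The folder-level password protection described in this comment can be sketched roughly as follows. This is only an illustration of a server-side HTTP Basic credential check, not how IIS or Vcare's servers work; the names `is_authorized` and `status_for` and the credentials are hypothetical.

```python
import base64
import hmac


def is_authorized(headers, expected_user, expected_password):
    """Return True only if the request carries matching HTTP Basic credentials."""
    value = headers.get("Authorization", "")
    scheme, _, encoded = value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return False
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 and undecodable bytes.
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # Constant-time comparisons avoid leaking information through timing.
    user_ok = hmac.compare_digest(user.encode(), expected_user.encode())
    pass_ok = hmac.compare_digest(password.encode(), expected_password.encode())
    return user_ok and pass_ok


def status_for(headers):
    """Map a request's headers to an HTTP status (hypothetical credentials)."""
    return 200 if is_authorized(headers, "admin", "s3cret") else 401
```

The point of the sketch is that the check runs on every request, so it does not matter whether the file is reached through a link, a search result, or a typed URL.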

  7.
    This is bullshit, 22 May 2013 @ 11:38am

    The article on Ars Technica states something to the effect that Vcare has a requirement not to retain this data (3rd paragraph).

    So here's the situation as I (and likely everyone else, except the asshats who are trying to cover up their gross negligence and extreme incompetence) see it:

    1. Company collects data
    2. Company is required NOT to hold on to data
    3. Company holds on to data in violation of #2
    4. Company further shows how retarded they are by making it publicly accessible
    5. Someone finds publicly accessible documents (that shouldn't exist in the first place) and lets the retards know
    6. Retards sue, crying "Hacking! Hacking!"

    What an amazing strategy!

    Anyone else just wondering how long it's going to be before someone figures this out for what it really is?

  8.
    Anonymous Coward, 22 May 2013 @ 11:40am

    Re: Re: Computer Fraud and Abuse

    Citation?

  9.
    Anonymous Coward, 22 May 2013 @ 11:40am

    Re: Re: Computer Fraud and Abuse

    This needs some kind of source. What the hell would Google have to do with this other than indexing publicly accessible files?

    Vcare seems to be the one crying foul, not Google.

  10.
    alanbleiweiss (profile), 22 May 2013 @ 11:42am

    Companies that have asshats who don't give a crap about security are the norm - it's inexcusable. The fact they can even file such a suit, or that the police state can bring charges against people who expose such crap is frightening.

    Just this morning in a cursory review of a prospective audit client's online presence, I did a Google search and discovered over 1,000 PDFs of customer invoices they blocked via robots.txt file but since Google now includes URLs of robots blocked files and slaps a "description not available due to robots instruction" that shit is wide open to anyone on the web, no hacking needed.

    Companies need to be held accountable for their massive security failings and Google needs to be held accountable as well, even though that shit should have been completely blocked and behind a secure firewall.

    The fact that this situation involved a couple reporters gives me little comfort in the notion that asshat companies might eventually be held accountable for causing such massive failings.

    We need a comprehensive overhaul of the system, one NOT determined by congress or lobbyists. One that severely penalizes the asshats that cause the problem and rewards the ones who expose it.

  11. This comment has been flagged by the community.
    out_of_the_blue, 22 May 2013 @ 11:43am

    Don't go out of your way to write a scraper program!

    If there's any data that looks personal, run, don't walk, away. That's just sound advice.

    2nd point: "firms' lawyer claims" is all I see of the "threat", and yet Mike implies charges are imminent. So far this is just another of his panics.

  12.
    Watchit (profile), 22 May 2013 @ 11:43am

    Re: Pure Half-Assed CYA ...

    It kinda reminds me of the story a while back about the guy who found a simple security loophole on his bank's website and he got charged under the CFAA. I can't remember the details exactly though. It pretty much boiled down to "The act of trying to circumvent the website's security counts as hacking, no matter how simple and obviously open the system was."

  13.
    Watchit (profile), 22 May 2013 @ 11:44am

    Re: Re: Re: Computer Fraud and Abuse

    Yeah, one would assume it's the telcos threatening to sue, since it was their data they left open...

  14.
    Anonymous Coward, 22 May 2013 @ 11:45am

    If anyone did any 'hacking' shouldn't it be google, where they did the web search to find the personal information?

    Going after the reporters is basically the following happening.
    -Person A compiles a list of over 100,000 customers and their personal data
    -Person B gets hired by Person A to manage the data, and leaves the data lying out in the open where anyone can grab it.
    -Person C finds the data lying around and grabs it and dumps it in a public area where anyone can still read it, but it's in a place with much more traffic.
    -Person D finds the dumped personal data, reports it to Person A, and gets charged with hacking.

  15.
    Anonymous Coward, 22 May 2013 @ 11:46am

    Re: Don't go out of your way to write a scraper program!

    You have never given an ounce of sound advice in your entire life.

  16.
    Chronno S. Trigger (profile), 22 May 2013 @ 11:47am

    Re: Re: Re: Computer Fraud and Abuse

    Google isn't involved, but I can understand where AC's confusion comes from.

    "While the reporters claim to have discovered the data with a simple Google search, the firms' lawyer claims they used "automated" means..."

    Referencing the company Google and then the telco firm without the name can be confusing. High school level reading skills are required to properly understand that without having to read it two or three times.

  17.
    alanbleiweiss (profile), 22 May 2013 @ 11:48am

    Re: Don't go out of your way to write a scraper program!

    oh crap there's what appears to be a bomb over there. Fuck I better hide my eyes and not say anything.

    OOTB you are the personification of troll=stupidity

  18.
    kitsune361, 22 May 2013 @ 11:49am

    Actually...

    I hope the DoJ goes after the reporters for CFAA violations and wins. If the DoJ refuses to go after them, it shows a definite double standard (weev is a troll/AT&T is more important than TerraCom). That and the court battle would be epic, and due to the more sympathetic defendants it has a better shot at appeal than weev's case.

    Also, as evidenced by the DoJ subpoenas of AP and FoxNews' phone records, the one surefire way to make sure "how abusive these laws are" gets into the goldfish-sized attention span of the professional news media is to use those laws against the professional news media.

  19.
    Watchit (profile), 22 May 2013 @ 11:53am

    Re: Don't go out of your way to write a scraper program!

    I don't understand the first point?

    your second point: why would someone imply that someone else is liable to be sued if not to threaten to sue? even if said threat is empty?

    Also, I don't think the article implies charges are imminent. It's implied that if the companies decide to sue it will be laughed out of court, so it's unlikely that they actually will.

  20.
    silverscarcat (profile), 22 May 2013 @ 12:08pm

    Re: Actually...

    It's a concussed Goldfish attention span, not a normal goldfish's attention span.

    Don't you watch The Daily Show man?

  21.
    Anonymous Coward, 22 May 2013 @ 12:08pm

    ROFL I know of at least a few thousand sources that can be abused with a Google search. There is no wall just some clever keywords and search operators.

    You'd be pretty surprised what you can find by just playing around with them.
    I blame boredom.

  22.
    Josh in CharlotteNC (profile), 22 May 2013 @ 12:09pm

    Re: Pure Half-Assed CYA ...

    http://en.wikipedia.org/wiki/Wget
    "Typical usage of GNU Wget consists of invoking it from the command line"

    We all know that anyone using the command line instead of a GUI is a dirty hacker.
    /s

  23.
    BentFranklin (profile), 22 May 2013 @ 12:27pm

    %wget www.terracomonline.com/index.php

    So now I'm a criminal because I didn't ask my browser to get it?

  24.
    Anonymous Coward, 22 May 2013 @ 12:28pm

    Re:

    Sort of reminds me of rockstar's response to hot coffee.

  25.
    Anonymous Coward, 22 May 2013 @ 12:31pm

    Re: Don't go out of your way to write a scraper program!

    Exactly! That way only people who would use it maliciously would get a hold of it.

  26.
    Watchit (profile), 22 May 2013 @ 12:39pm

    Re: Re: Pure Half-Assed CYA ...

    But but the command line is complicated and scary looking! If that's not what hackers use, what is?!

  27.
    Anonymous Coward, 22 May 2013 @ 12:51pm

    Re: Re:

    Yeah, but the discovery and reactivation through modding Hot Coffee back into GTASA didn't result in CFAA charges.

  28.
    Anonymous Coward, 22 May 2013 @ 12:55pm

    Re:

    It's not Google's fault if incompetent companies publish information to the entire world and they happen to stumble over it accidentally. (And anyone using robots.txt as a 'security measure' should be punched in the face with a missile.) The problem here is that companies are greedy, stupid and negligent -- and are looking for scapegoats. Prosecutors, eager to catch themselves a SOOOOPERHACKER and get their names in the press, are happy to oblige.

  29.
    Anonymous Coward, 22 May 2013 @ 1:00pm

    Re: Don't go out of your way to write a scraper program!

    so a program that does the same thing as a browser without the html rendering = evul hacker program that you should not write?

    better idea: use some fucking common sense and realize wget is not inherently bad
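
The claim that a command-line fetch is the same kind of request a browser makes can be illustrated with a small sketch. It only builds the text of a minimal HTTP/1.1 GET request and sends nothing; the host name, path, and User-Agent strings are illustrative assumptions, and real clients send more headers than this.

```python
def build_get_request(host, path, user_agent):
    """Return the raw text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"User-Agent: {user_agent}",
        "Accept: */*",
        "Connection: close",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines)


def request_line(raw_request):
    """Return the first line of a raw request (method, path, version)."""
    return raw_request.split("\r\n", 1)[0]


# Hypothetical clients: same resource, different self-reported identity.
from_cli = build_get_request("www.example.com", "/index.php", "Wget/1.14")
from_browser = build_get_request("www.example.com", "/index.php", "Mozilla/5.0")
```

In this sketch the two requests differ only in the User-Agent header; the method, path, and protocol version are identical.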

  30.
    Anonymous Coward, 22 May 2013 @ 1:03pm

    you can most definitely thank the government for this sorry state of affairs. had it stuck to what it said about protecting whistle blowers, then turning turtle and crapping all over them just to protect their own ridiculously stupid mistakes, every freakin' industry has jumped on the band wagon! this was one of things that started the ever increasing slide down the shit chute for the USA. no one is safe from their own law enforcement. is it any wonder why people rebel against them when they get all protection for doing the right thing thrown straight out the window? any wonder why they are getting to not care a toss what happens when companies get caught out over things like this? no good deed goes unpunished is a 'truer words spoken in jest' kind of thing

  31.
    Anonymous Coward, 22 May 2013 @ 1:03pm

    Re: Don't go out of your way to write a scraper program!

    Right. Just leave it there.

    Makes perfect sense to let someone in an eastern bloc country find it and "report" it instead via some IRC room.

    ("Report" in the above sentence means "Sell for profit")

    You fucking dumbass.

  32.
    Wally (profile), 22 May 2013 @ 1:04pm

    I cannot wait to see how long it will take before Google Fanboys to defend Google's lax in security over this.


    Google apparently no longer uses its spider crawler to help show people the most relevancy possible when they search...Instead they do relevancy in the way most web advertisers do by what and how many things an individual clicks on and move those items up on said individual's search results based on how many times you click a link. They identify you via IP address and hold your search data for up to 9 months. Within the 9 months if you search on Google, they don't delete previous searches at all when the 9 month mark rolls around.

    If Google's CEO has the power to internally check any user's e-mails without a password...imagine what these reporters who pointed out the security flaw could have done.

  33.
    Anonymous Coward, 22 May 2013 @ 1:20pm

    Re: wget

    See! Right here! Proof that they were hacking! Aaron Swartz used it!

  34.
    Anonymous Coward, 22 May 2013 @ 1:20pm

    Re: wget

    See! Right here! Proof that they were hacking! Aaron Swartz used it!

  35.
    Chuck Norris' Enemy (deceased) (profile), 22 May 2013 @ 1:24pm

    Re:

    GOOOOOGLEZZZ!!!1!

    RTFA again and grasp who is at fault here for lax security.

  36.
    Gwiz (profile), 22 May 2013 @ 1:26pm

    Re:

    I cannot wait to see how long it will take before Google Fanboys to defend Google's lax in security over this.


    WTF are you talking about, Wally?

    Why would Google have anything to do with some other company's lack of security on the pages they have facing the web?

  37.
    Anonymous Coward, 22 May 2013 @ 1:41pm

    Re:

    How google orders searches and what it does with previous search data has nothing to do with this company making personal data it shouldn't have even had searchable.

    You want any data related to you removed from google there is a button you can press to scrub yourself from their system. Otherwise if you don't want to give them access to the data you create you can feel free not to use their free services.

  38.
    alanbleiweiss (profile), 22 May 2013 @ 1:46pm

    Re: Re:

    While Google is not ultimately responsible for the administration of other sites, they have chosen to take a stand against hacked sites and malware-ridden sites, going so far as to block them from search results pages.

    Google claims to be on the side of security, yet they ignore the robots.txt file's disallow instructions, and not only that, but publicly display links found on a site where those links were clearly delineated as "disallow" in that file. As such, they are implicit in the breach.

  39.
    Anonymous Coward, 22 May 2013 @ 1:47pm

    Re: Re:

    I just checked the info page for wget, it respects the robots.txt file when recursively downloading a site. Therefore it seems that the site had not even blocked indexing of the documents.
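
What "respecting robots.txt" means for a crawler can be sketched with Python's standard `urllib.robotparser`. This is not Wget's implementation, just an illustration of the same idea: before fetching, a polite client checks each URL against the site's rules. The function name `allowed_urls`, the bot name, and the example rules are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def allowed_urls(robots_txt, user_agent, urls):
    """Return the subset of urls that the given robots.txt text permits."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Hypothetical rules: everything under /private/ is off limits to all bots.
EXAMPLE_ROBOTS = "User-agent: *\nDisallow: /private/\n"
EXAMPLE_URLS = [
    "http://example.com/index.html",
    "http://example.com/private/form.pdf",
]
```

Note that the check is voluntary on the client's side. A client that skips it can still request the disallowed URL, which is why a robots.txt entry offers no access control.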

  40.
    alanbleiweiss (profile), 22 May 2013 @ 1:48pm

    Re: Re: Re:

    uh complicit? explicit? expletive deleted? #FastRantTypingStrikesAgain

  41.
    Gwiz (profile), 22 May 2013 @ 2:01pm

    Re: Re: Re:

    ... yet they ignore the robots.txt file's disallow instructions, and not only that, but publicly display links found on a site where those links were clearly delineated as "disallow" in that file.

    Do you have a citation for that? It's not that I don't believe you - just haven't heard that one before.

  42.
    Anonymous Coward, 22 May 2013 @ 2:13pm

    I'm not sure how anyone could claim that the mere use of Wget constitutes a form of hacking, even under the extremely loose interpretations of the CFAA.

    I bet there are a few organizations in Washington DC that would be happy to pontificate on Wget being a dangerous hacker tool used by Chinese cyberhackers to perpetrate cyber-9/11 cyberterrorism on our cybercountry.

  43.
    Anonymous Coward, 22 May 2013 @ 2:19pm

    Re:

    You have no idea what you're rambling about. Are you trying to take over OotB's role?

  44.
    Anonymous Coward, 22 May 2013 @ 2:40pm

    Re: Re: Re:

    wget respects the robots.txt file when scraping a website. Therefore no evidence that Google is ignoring it either. Much more likely a total admin failure on the TerraCom site that left the data as public in a public directory.

  45.
    alanbleiweiss (profile), 22 May 2013 @ 2:52pm

    Re: Re: Re: Re:

    A citation for it? yeah half the search marketing industry. As an SEO audit professional I routinely encounter it. They list URLs, but beneath them, where a description of the file would go is a statement

    A description for this result is not available because of this site's robots.txt – learn more.


    They do not show all URLs that are blocked in the robots file, however if their (extremely flawed) system sees enough "other indicators" to countermand the robots instruction, they ignore that instruction.

    "Other indicators" is most often "a link to that file somewhere on the site itself or pointing to the URL from another site."

    The "learn more" link points to this Google answer page where it states:

    While Google won't crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web. As a result, the URL of the page and, potentially, other publicly available information such as anchor text in links to the site, or the title from the Open Directory Project (www.dmoz.org), can appear in Google search results.


    Which is complete bullshit. because while they're not actually indexing the CONTENT of the page, they're indexing the URL.

    So in the case of a URL that includes variable parameters labeled with "order" or "customerID" or some-such, that opens up the can of WTF for anyone savvy enough to go snooping.
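
The concern about parameter names leaking through an indexed URL can be made concrete with a short sketch. It only inspects a URL string and flags query parameters whose names appear on a watchlist; the function name `exposed_parameters`, the watchlist, and the example URL are all hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def exposed_parameters(url, watchlist=("order", "customerid")):
    """Return query parameters whose names match the watchlist (case-insensitive)."""
    query = urlsplit(url).query
    params = parse_qs(query, keep_blank_values=True)
    return {
        name: values
        for name, values in params.items()
        if name.lower() in watchlist
    }


# Hypothetical URL of the kind the comment describes.
EXAMPLE_URL = "https://example.com/invoice.pdf?customerID=1234&order=77&page=2"
```

Here the identifiers are readable from the URL alone, without fetching the page, which is the situation the comment is worried about when a blocked URL still appears in search results.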

  46.
    Sheogorath (profile), 22 May 2013 @ 5:43pm

    "Hopefully, if the telcos do decide to actually file a lawsuit, it gets laughed out of court."
    I agree, especially with one of the telcos' crappy attempt to fix their lousy security by replacing the entire web form with their phone number. So much for online commerce!
    Source: https://www.terracomwireless.com/m/renewals.html

  47.
    Anonymous Coward, 22 May 2013 @ 6:26pm

    Re: Re:

    Where is this imaginary button? Unless you're talking about the "Remove Search" or "do not save search" in the options menu Google provides...but then again those are only saved as cookies on YOUR computer so YOUR system is tracked even without that option. The truth is that Google does not provide a magic button to users of their search engine to have their information on where they clicked stricken from Google's servers.

    3/4 of Google's revenue in 2010 came from advertising and less than 1/100 came from search technologies. When you do the math, you start to realize Google's priorities have shifted towards catering to advertisers rather than web users.

  48.
    Anonymous Coward, 22 May 2013 @ 6:30pm

    Re: Re:

    Are you sure he's rambling, or are you being the blatant fanboy jackass that Wally knew would come to Google's defense? He was even backed up by a user in the industry so I really think Wally knows more than you wish people to believe by your statement towards him.

  49.
    Anonymous Coward, 22 May 2013 @ 6:35pm

    Re: Re:

    "You want any data related to you removed from google there is a button you can press to scrub yourself from their system. Otherwise if you don't want to give them access to the data you create you can feel free not to use their free services."

    Although on second read...I recall that one could do this but only if one is logged onto Google+...which means they are still tracking your every move and can target ads to you with the advertising companies they own.

  50.
    Rekrul, 22 May 2013 @ 7:11pm

    It's getting to the point where you can be charged under the CFAA for simply looking at someone else's computer screen.

  51.
    Anonymous Coward, 22 May 2013 @ 9:33pm

    Re: Re: Re:

    It's actually even worse than that.

    Neither Google nor wget magically know what files are available on a site. For the files to show up in a recursive wget (or to the Google spider, for that matter), it means that the files had to be actively linked to from other accessible pages on the site. Somewhere on their public-facing website, there was a link pointing to all those thousands of insecure, confidential documents that the company wasn't even supposed to be keeping in the first place.
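
The way a recursive fetch or a spider discovers files, by following links found on pages it already has, can be sketched with the standard library's HTML parser. This is a simplified illustration, not the actual logic of Wget or Google's crawler; `LinkCollector`, `discover_links`, and the example page are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def discover_links(base_url, html):
    """Return every link target found in one page of HTML."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page: a file with no link pointing at it would never be found.
EXAMPLE_HTML = '<a href="/forms/app1.pdf">one</a><a href="about.html">two</a><p>no link</p>'
```

A crawler built this way only ever reaches URLs that some fetched page mentions, which is the basis for the comment's inference that the documents were linked from somewhere public.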

  52.
    Donglebert the Needlessly Obtuse, 23 May 2013 @ 3:38am

    People should not be allowed to use computers to freely access the internet

    They should only be allowed to view passive pages via a screen in a reasonably public place, say, for example, their living rooms. Keyboards, mice, and touchscreens should be banned because they encourage hacking.

    If this fails, the next step would be to distribute printed copies of approved web pages.

  53.
    kitsune361, 23 May 2013 @ 7:41am

    Re: Re: Actually...

    Unfortunately... not as often as I probably should.

  54.
    I Forgot, 23 May 2013 @ 9:35am

    Trusting Any Company to Securely Protect Consumers

    This is systematically problematic with corporations who require extended personal data of its customers. This case does give the appearance of these companies attempting to cover their own arses after they mishandled or neglected to secure even the most simple of data breaches after the fact.

    This is so tiring to hear of yet more personal information that was entrusted to a company that once again ends up in the wrong hands. There should be a law that will bring to bear full liability upon the company's CEO, Vice President and entire Board of the corporation when this occurs as well as all top management to the degree of damage it causes or potentially causes.

  55.
    I Forgot, 23 May 2013 @ 9:52am

    Re: gOOGllE dumps data into public searchability

    Fine them $1.00

  56.
    nasch (profile), 23 May 2013 @ 11:25am

    Re: Re: Re: Re: Re:

    Which is complete bullshit. because while they're not actually indexing the CONTENT of the page, they're indexing the URL.

    So in the case of a URL that includes variable parameters labeled with "order" or "customerID" or some-such, that opens up the can of WTF for anyone savvy enough to go snooping.


    You're not suggesting this is a security issue, are you? Because if you're relying on robots.txt to secure sensitive information, you're doing it very, very wrong. So what is the problem with this behavior by Google? I'm not saying there isn't one, I'm just not sure I'm even clear on why you would use robots.txt to keep search engines away. If it's to save bandwidth, then this doesn't cause a problem since Google isn't downloading the page.

  57.
    alanbleiweiss (profile), 23 May 2013 @ 11:57am

    Re: Re: Re: Re: Re: Re:

    I'm not relying on it. I'm a forensic SEO consultant with a fair amount of digital security experience. What I'm saying is sites need to get their security methods right. At the same time, Google claims to be a security backstop, yet they allow those URLs into their system.

  58.
    nasch (profile), 24 May 2013 @ 8:59am

    Re: Re: Re: Re: Re: Re: Re:

    What I'm saying is sites need to get their security methods right. At the same time, Google claims to be a security backstop, yet they allow those URLs into their system.

    Google is saying robots.txt is a security measure? If that's what you're saying, do you have a reference? If not, what do you mean by security backstop?

  59.
    alanbleiweiss (profile), 24 May 2013 @ 10:13am

    Re: Re: Re: Re: Re: Re: Re: Re:

    No, Google is NOT saying that. Yet their system is more than capable of keeping URLs out of the system that are listed in the robots file so there's no excuse why they, as a supposed security advocate, shouldn't honor robots.txt instructions. "Disallow" is pretty clear in its definition.

  60.
    nasch (profile), 24 May 2013 @ 11:31am

    Re: Re: Re: Re: Re: Re: Re: Re: Re:

    No, Google is NOT saying that. Yet their system is more than capable of keeping URLs out of the system that are listed in the robots file so there's no excuse why they, as a supposed security advocate, shouldn't honor robots.txt instructions.

    But you keep mentioning security in connection with robots.txt. If you acknowledge that it's not a security measure, and Google doesn't say it's a security measure, why are you still talking about security?

    Also, why is this a big deal? I'm not trying to defend Google, I just really don't see why it's important. Can you explain it?

  61.
    alanbleiweiss (profile), 24 May 2013 @ 1:52pm

    Re: Re: Re: Re: Re: Re: Re: Re: Re: Re:

    Because it's an opportunity for Google to help improve the securing of private information on the web. Since they already take proactive steps in other areas to improve security online, why not here?

    For example - they proactively block sites their system detects that have malware or viruses. They don't have to. It's the responsibility of site owners to ensure their sites don't have malware or viruses baked in. Yet Google has chosen to help.

    This is no different.

    link to this | view in thread ]

  62. icon
    nasch (profile), 24 May 2013 @ 3:23pm

    Re: Re: Re: Re: Re: Re: Re: Re: Re: Re: Re:

    I don't get it. Earlier you agreed that robots.txt is not a security measure. Now you're saying that Google should help with security related to robots.txt. It can't be both!

    link to this | view in thread ]

  63. icon
    alanbleiweiss (profile), 24 May 2013 @ 4:12pm

    critical thinking

    It sure as hell can be both. While robots.txt is not by original nature related to search engines, a means of security, Google has the power and resources to respect it for the sake of security. If you don't grasp that, not my problem.

    link to this | view in thread ]

  64. icon
    aldestrawk (profile), 24 May 2013 @ 4:22pm

    Re: Re: Re:

    The use of robots.txt is in no way a security measure. It was never intended to be and definitely should not be used as such. It is simply intended to relieve servers of unnecessary traffic resulting from spiders' actions. Any script kiddie can do the same thing as a spider and intentionally ignore the request that a robots.txt file represents.
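
    To make the "request, not enforcement" point concrete, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt contents, the bot name and the example.com URLs are all made up for illustration; nothing here reflects TerraCom's or Vcare's actual setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the /private/ path is an assumption for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return what robots.txt *asks* of a crawler for this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite crawler consults the file and skips disallowed paths.
disallowed = is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/private/form.pdf")
allowed = is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/index.html")

print(disallowed)  # False: the file asks crawlers to stay out of /private/
print(allowed)     # True
```

    The check runs entirely on the client side. A client that never calls can_fetch, or ignores its answer, can still request the /private/ URL, and the server will answer unless it enforces access control itself.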

    link to this | view in thread ]

  65. icon
    alanbleiweiss (profile), 24 May 2013 @ 4:32pm

    it's called innovation

    Just because something did not have an original intent to be used in a certain way does not mean it should not be used in a new way if that way is innovative and provides value to the world.

    Anything other than that understanding is called myopic thinking.

    link to this | view in thread ]

  66. icon
    nasch (profile), 24 May 2013 @ 7:18pm

    Re: critical thinking

    While robots.txt is not by original nature related to search engines, a means of security, Google has the power and resources to respect it for the sake of security.

    So you're saying it was never intended as a security measure, it is not appropriate to rely on it for security, Google does not recommend it be so used, and Google should try to make it as effective a security tool as possible? No wonder I was confused. :-)

    link to this | view in thread ]

  67. icon
    nasch (profile), 24 May 2013 @ 7:20pm

    Re: it's called innovation

    Just because something did not have an original intent to be used in a certain way does not mean it should not be used in a new way if that way is innovative and provides value to the world.

    Yes, but if using it in that way is stupid and ineffective, then that doesn't provide value. Maybe the illusion of value, which is even worse than nothing.

    link to this | view in thread ]

  68. icon
    aldestrawk (profile), 24 May 2013 @ 9:00pm

    Re: Re: Re:

    Wally has pointed out several issues with how Google operates. They are somewhat related but not closely enough for me to figure out what point he is trying to make.

    -The use of spiders is the first step to setting up a search index of the web. It is the bottom layer here. It is necessary, but any kind of page rank algorithm (relevancy?) takes this basic information and tweaks it in its own way. On its own, an index of the web does not determine page rank. This was Google's innovation back in 1998.

    -There is a basic issue with user privacy vs Google's business model of using search data as a basis to extract advertising dollars. I have not recently followed their data retention practices, so I accept his statement of 9 months limit or lack of it. However, what has this got to do with the story at hand here about TerraCom's security/privacy issue?

    -Someone else (the user in the industry?) thinks that Google is being lax in security because they list the URLs for which the robots.txt file is asking not to be followed. This is not any kind of security failing. I assume that Google's spider is not recursively following the hyperlink that URL represents or executing any code to produce a dynamic web page for that URL. That is the real intention of robots.txt. I think it is a minor issue that Google now lists the URL that begins a blocked branch. As the robots.txt file can be simply ignored, any real attempt to restrict access to the data on that page should require an authentication/authorization step.
    Google's security interest in labeling certain sites as dangerous is completely separate from any concern about indexing pages that the owner would rather have private. Such dangerous sites are identified, as best as possible, as containing malware or something like a cross-site scripting vulnerability.
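
    As a rough sketch of what an authentication step means in practice, the function below decides the HTTP status for a protected document based on an HTTP Basic Authorization header. The user name, password and function name are hypothetical, and a real deployment would use the web server's or framework's own authentication over HTTPS rather than hand-rolled code like this.

```python
import base64
import hmac

# Hypothetical credentials, for illustration only.
EXPECTED_USER = "caseworker"
EXPECTED_PASSWORD = "correct horse battery staple"


def status_for_request(authorization_header):
    """Return 200 if the Basic credentials match, otherwise 401."""
    prefix = "Basic "
    if not authorization_header or not authorization_header.startswith(prefix):
        return 401
    try:
        raw = base64.b64decode(authorization_header[len(prefix):], validate=True)
        decoded = raw.decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return 401
    user, sep, password = decoded.partition(":")
    if not sep:
        return 401
    # Constant-time comparisons avoid leaking information through timing.
    user_ok = hmac.compare_digest(user.encode(), EXPECTED_USER.encode())
    pass_ok = hmac.compare_digest(password.encode(), EXPECTED_PASSWORD.encode())
    return 200 if (user_ok and pass_ok) else 401


good = "Basic " + base64.b64encode(
    f"{EXPECTED_USER}:{EXPECTED_PASSWORD}".encode()
).decode("ascii")
print(status_for_request(good))  # 200
print(status_for_request(None))  # 401
```

    Unlike robots.txt, this check happens on the server, so a crawler, wget or a browser with no credentials gets the same 401 no matter how it found the URL.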

    In Wally's final sentence I don't see the connection between the technical ability of Google to view the content of email on the gmail domain with the security failings of TerraCom/Vcare. A series of statements with varying validity that do not have any obvious connection is, to me at least, the definition of rambling.

    Maybe I am being too critical. After all, this is just a forum where people, perhaps with limited time, just throw out thoughts to be consumed and either ridiculed or praised. I am loath to trot out my credentials, but I have worked on network protocols for thirty years and network/computer security for 6 years.

    link to this | view in thread ]

