Verizon Is Undermining Efforts To Archive Yahoo Groups...For No Coherent Reason

from the ill-communication dept

Verizon's often sad efforts to pivot from curmudgeonly old telco to sexy new Millennial advertising giant have not gone as the company had hoped. From the failure of its Go90 streaming service to its clumsy effort to turn AOL and Yahoo into a Facebook-killing ad empire, Verizon often can't get out of Verizon's way. The "consumer comes last" executive mindset of the government-pampered telecom monopoly is frequently reflected by its policies, like Verizon's decision to acquire Tumblr, ban one of the most compelling aspects of the service (adult content and art), then turn around and sell it at a massive loss.

When archivists tried to preserve much of the adult-themed art that Verizon was jettisoning, Verizon responded by banning archivist IP ranges for no coherent reason. Much like Facebook, Verizon positively adores looking at a controversial situation, then coming up with the worst possible policy and PR response. You know, like that time they hired a fake journalist to pretend the company wasn't trying to kill net neutrality.

Another case in point. Back in October, Verizon and Yahoo informed users of Yahoo Groups that the 20-year-old community would be shut down on December 14. Archivists set about trying to catalog and store the decades of conversations, images, and content on the platform. But Verizon being Verizon, those archivists now say the company is actively undermining their efforts, including banning the email addresses Archive Team was using to archive content, and actively blocking tools used for the same purpose:

"Yahoo banned all the email addresses that the Archive Team volunteers had been using to join Yahoo Groups in order to download data. Verizon has also made it impossible for the Archive Team to continue using semi-automated scripts to join Yahoo Groups – which means each group must be re-joined one by one, an impossible task (redo the work of the past 4 weeks over the next 10 days).

On top of that, something Yahoo did has killed the last third party tool that users and owners have been using to access their messages, photos and files (PGOffline). Note: not everyone who paid for the PGOffline license is being impacted by the problem, but the developer does not have a workaround."

Under Section 230, Verizon faces no liability for the content shared on the platform, and there's no valid reason for them to be fighting back against archival efforts. Yet here we are. Verizon didn't respond to several requests for comment, so it's hard to understand what the telco is thinking, if it's thinking at all. I spoke briefly to Archive Team co-founder Jason Scott and Cory Doctorow, both of whom were less than impressed by the company's tone deafness:

"What they are doing is burning 20 years of history and archives maintained by communities with a non-functioning system for backing them up," he said. "They made no real preparations for users to pull the information out because companies like Yahoo! were never designed to allow information to leave their walled gardens."

"This is 20 years of communities, discussion and artifacts from millions of groups, all representing learned information, legal and historical references, and naturally, the conversations of tens of millions of users," Scott said. "Some of it is likely worthless and some of it is likely precious. It is all being treated like trash."

It's one thing for Verizon to shutter the platform. It's another for Verizon to actively block harmless efforts to preserve 20 years of internet history ahead of the shutdown. But being a government-pampered monopoly in a largely noncompetitive market has left Verizon ill-prepared to actually listen to the communities it impacts (especially when there's no money to be made by doing so), a major reason Verizon's pivot from telco to new media ad darling hasn't quite gone according to plan.

Updated: After Verizon's behavior resulted in some unwanted media attention, the company has finally changed its stance. It now tells me it has extended the deadline for the Yahoo Groups shut down to Friday, January 31, 2020 at 11:59pm PST.

Filed Under: archives, blocks, digital history, groups, history, yahoo groups
Companies: archive team, internet archive, verizon, yahoo


Reader Comments


  • Who Cares, 10 Dec 2019 @ 9:58am

    The reason

    Do you know how expensive it is to transfer that much data out?

    What are they going to do at Verizon if the C*O types can't have new shoes made for themselves this month?

  • Anonymous Coward, 10 Dec 2019 @ 10:01am

    How is already-collected data affected?

    The linked story says "The Archive Team says they’re facing a loss of nearly 80 percent of the data they’ve collected so far". What? How can Verizon's actions cause them to lose data they've already collected? I can't imagine they'd be dumb enough to store it on a Yahoo service, which suggests the statement is simply wrong.

    • Anonymous Coward, 10 Dec 2019 @ 10:40am

      Re: How is already-collected data affected?

      It's not that 80% of the data they've collected is lost. Here's the actual quote:

      "we’ve lost the access to the vast majority of the groups we joined [because Verizon blocked access to our accounts]
      …. the effect is that some percentage…..of the signed up groups can no longer be fetched from …."

      They are working to get a final number, but the Archive Team estimates that it is an 80% loss of the Groups they and their volunteers spent the last month joining in preparation for archiving.

      So, the data they've managed to download, they obviously still have. What they've lost is access to 80% of the Groups that they had joined as part of their effort to download that data. Which, I suppose, one could interpret as losing access to 80% of the data that they had gained access to.

      The idea that what had been lost was 80% of the data already collected seems to stem from a misinterpretation somewhere along the line.

      • Anonymous Coward, 10 Dec 2019 @ 11:28am

        Re: Re: How is already-collected data affected?

        "Which, I suppose, one could interpret as losing access to 80% of the data that they had gained access to."

        Which still wouldn't count as "losing data". Thanks for the explanation. That means these 80% of groups aren't going to be completely archived, unless they find a copy elsewhere or work around the bans quickly enough.

        It's an open question whether they'll be able to grab the remaining 20%—despite the extended deadline, the update doesn't say they'll stop fighting archivists.

  • Anonymous Coward, 10 Dec 2019 @ 10:12am

    "Updated: After Verizon's behavior resulted in some unwanted media attention, the company has finally changed its stance. It now tells me it has extended the deadline for the Yahoo Groups shut down to Friday, January 31, 2020 at 11:59pm PST."

    That's nice, I guess. But as I read the rest of the article, an extension may not have been needed if they hadn't gone out and actively broken archiving. Even with the extension, if they don't back off on the other anti-archiving actions, it may still be impossible to save everything in time.

    • Anonymous Coward, 10 Dec 2019 @ 10:27am

      Re:

      These were mostly public mailing lists. Yahoo could copy them all onto a hard drive or two, mail it to the Internet Archive directly, and avoid all this trouble.

      • Federico, 10 Dec 2019 @ 1:13pm

        Re: Just mail a copy

        Yes, sort of. Apparently it was 8 PB in 2011. https://news.ycombinator.com/item?id=21274484

        More realistically they could sell the content at 1 $ to some competitor. Maybe a Sympa or Discourse commercial hosting provider with enough capital to shoulder the losses for a while in return for publicity? Like SmugMug with Flickr, they could then purge the most expensive groups (porn can probably be kept elsewhere) and keep 99,9 % of the real content.

        Or yes, they could just make a giant set of mbox files for the text content of all groups and be done with it in a short while. Then people could import them into their mail client (given Yahoo has said "don't worry, the archives remain in your email!"), mailman or whatever. Did nobody exercise their GDPR art. 20 right to data portability? It says "structured, commonly used and machine-readable format", can't think of anything but mbox. Yahoo is legally obliged to produce mbox files, as I read it.

        • Anonymous Coward, 10 Dec 2019 @ 1:44pm

          Re: Re: Just mail a copy

          "Apparently it was 8 PB in 2011."

          Wow. I underestimated. Is that mostly text? It would be more like 1000 hard drives, which is less practical.

          "More realistically they could sell the content at 1 $ to some competitor."

          They've already got a group of people clamoring for the data. archive.org will host it (in some form, anyway).

          "Did nobody exercise their GDPR art. 20 right to data portability? It says 'structured, commonly used and machine-readable format', can't think of anything but mbox. Yahoo is legally obliged to produce mbox files, as I read it."

          It's interesting, but they'd only have to give each person their own data, right? (Emails sent by them, and maybe emails sent to them.) Perhaps if a large enough number of Europeans requested it, Yahoo would decide it's easier to dump all of the data elsewhere and reply with a form-letter showing how to find it.

      • Annonymouse, 11 Dec 2019 @ 12:22pm

        Re: Re:

        Why copy?
        The hardware is already an antique cybernetically speaking.
        It would be cheaper and easier to just decommission the racks and "donate" it all for a nice tax write-off and not pay any disposal fees.

  • Anonymous Coward, 10 Dec 2019 @ 10:51am

    Reaction to Verizon's extension

    Here is the reaction of one of the people trying to archive these groups (TLDR: they're still not happy about it):

    The minute you make all our groups private on December 14th, you will make it completely impossible for some users to rescue their data. Because if it’s private, and the owner is gone, we won’t be able to copy and paste posts, or attempt to copy photos, etc. They will be cut off from us especially on cases where the moderator or owner have been locked out of their groups.
    […]
    If you want this to stop, Just drop the ban. Let the archivers save our groups. Stop blocking us at every turn, and give us until May 14, 2020 to accomplish it. And don’t give me that BS about Terms Of Service. Your Terms Of Service weren’t being applied to us in 2013, when Yahoo was violating them left and right and hurting people.
    You just plan to toss it all in the trash, so let us in to retrieve it. Then it will be off your hands, you won’t have to worry about it being costly to store it, and we’ll be out of your way, and it will be a win win for all of us.

  • Anonymous Coward, 10 Dec 2019 @ 11:10am

    I thought the 'reason' being used was that Verizon couldn't afford the cost of storing the back ups! Perhaps it needs to rip customers off further with an increase in fees. Or maybe plead poverty again so as to get another load of public money and not do what they're supposed to!

    • Norahc, 10 Dec 2019 @ 2:00pm

      Re:

      This is just Verizon being Verizon...they couldn't figure out how to make money off it so they decided nobody should have access to it.

  • Anonymous Coward, 10 Dec 2019 @ 11:12am

    Yahoo has always been bad for this

    Even before Verizon became involved, Yahpoo! were the ones who bought Geocities for some ridiculously-high sum, only to shut it down and destroy everything. This has been their modus operandi for years. An apt comparison is that Yahoo is "a leaky sewerage pipe" because everything it touches becomes watered-down excrement.

    Verizon is only making things worse - with obstructing archivists and censoring content being prime examples - but Yahoo was always bad.

    • Anonymous Coward, 10 Dec 2019 @ 11:33am

      Re: Yahoo has always been bad for this

      "This has been their modus operandi for years."

      That matches Archive Team's opinion:

      Yahoo! found the way to destroy the most massive amount of history in the shortest amount of time with absolutely no recourse.
      As of January 2009, Archive Team no longer considers Yahoo a dependable location for data.
      This is not based on their engineering, which has shown itself to be consistent and with few outages. Rather, it appears the company is in relative free-fall with regards to which projects they will maintain and what comes under any given knife for cost-cutting measures.
      When a company enters this sort of spiral with regard to one of their core businesses (hosting and providing of information services), and consistently gives little or no indication of their next move, it becomes incumbent upon the users of that service to either demand changes in policy, or find alternatives, even poor ones, and build those up.

  • Anonymous Coward, 10 Dec 2019 @ 11:16am

    Another step from another corporation in blackholing what used to be cool or just an older way of doing things on the net. They don't want anyone to know what the 90s and 00s were like. Probably also a measure of "we own it now, so no one else can have it".

  • Mononymous Tim, 10 Dec 2019 @ 12:37pm

    Maybe they're afraid they're going to hit their cap with all that data going out the pipe.

  • Mandie, 10 Dec 2019 @ 1:35pm

    Big deadline is still December 14

    The January 31 deadline is for people saving "their own" content - public access is still limited to December 14.

  • Anonymous Coward, 10 Dec 2019 @ 3:56pm

    I can't get out of my head how much hypocrisy is involved here: these dicks love to get into the bedrooms of the masses and sneak in porn on coffee breaks.

  • Anonymous Coward, 10 Dec 2019 @ 4:12pm

    Didn't they also undermine the effort to archive the Tumblr content that was being removed?

  • MD, 10 Dec 2019 @ 8:07pm

    Average Customer Being Misled

    Amongst the chatter about archivists, a key fact is being overlooked. Verizon is misleading the average user. Their "extension" is only for the customers who use Verizon's official download tool, which does not provide any photos and, in many cases, is missing files. The only way Verizon says you can download your photos is to go one by one. Except...after Dec 14, there will be no way to log in to download your photos and files one by one. That is what the third party tool PGOffline was doing (this is a tool used by customers, not the Archive Team). So anyone who listens to what Verizon says will find themselves locked out from their data on Dec 14.

    • Anonymous Coward, 17 Dec 2019 @ 11:08am

      Re: Average Customer Being Misled

      Our volunteer fire department just got "hosed" by this gotcha. We're reeling from the fact that all our files and photos are gone when we thought we had a month to transfer them.
