FCC Releases All Net Neutrality Comments As Giant XML Files For Data Analysis
from the open-records dept
While it had trouble keeping its site up during times of intense commenting, the FCC's IT team is now working to make all the submitted comments on its "open internet" net neutrality proposals available to download in a bunch of XML files:
"Because of the sheer number of comments and the great public interest in what they say, Chairman Wheeler has asked the FCC IT team to make the comments available to the public today in a series of six XML files, totaling over 1.4 GB of data – approximately two and half times the amount of plain-text data embodied in the Encyclopedia Britannica. The release of the comments as Open Data in this machine-readable format will allow researchers, journalists and others to analyze and create visualizations of the data so that the public and the FCC can discuss and learn from the comments we’ve received. Our hope is that these analyses will contribute to an even more informed and useful reply comment period, which ends on September 10. We will make available additional XML files covering reply comments after that date."
While the more cynical among you may see this as more of a statement on the rather weak capabilities of the current FCC's system for handling searching through the submitted comments, it's still nice to see at least a move towards openness and transparency in sharing this data for others to search through. As we've noted, we've been digging into some of the data on the comments, and hopefully this will make the process much easier.
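For anyone who wants to start digging into XML files of this size, a minimal sketch of a first pass might look like the following. It streams through a document and counts how often each element name appears, which is one way to learn a file's structure without loading the whole tree at once. The sample data and its element names are made up for illustration; they are not the FCC's actual schema.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(source):
    """Stream through an XML document and count how often each tag occurs."""
    counts = Counter()
    for _event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] += 1
        elem.clear()  # drop the element's children and text once it is counted
    return counts


# Tiny made-up sample; the element names are placeholders, not the FCC schema.
sample = (
    b"<comments>"
    b"<comment><text>Keep it open</text></comment>"
    b"<comment><text>No fast lanes</text></comment>"
    b"</comments>"
)
tag_counts = count_tags(io.BytesIO(sample))
print(tag_counts.most_common())
```

To try it on a downloaded file, pass the file path (or an open binary file object) to `count_tags` instead of the in-memory sample.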
Filed Under: comments, fcc, net neutrality, nprm, open data, open internet, xml
Reader Comments
Where's the...
Re: Where's the...
Re:
That's not accurate. The rules only apply to works produced *by government employees*. That is not the case here. Any copyrights remain with the creators of the content.
EULA?
I'm guessing it was probably something like "Grant the FCC a perpetual license to copy and display this comment." However, I don't know for sure. I figure you all would have picked up on it if there was a copyright assignment clause, though.
I hope none of the people analyzing this data work for places with legal departments who don't understand fair use. Especially since that's the only way that anyone reporting on this can directly quote any of the comments.
Another little aside is that while it might be legal to download these huge files to your personal PC, it's almost certainly illegal to give a copy of them to anyone else. They have to go use up FCC bandwidth by obtaining the files from an "authorized source".
Re: Re:
Re: Re:
Re:
Re:
>:P
A million monkeys typing on a million keyboards...
I wonder why they compare the size of XML marked up data to raw text? I suspect the submission data would drop up to half of its size if displayed as raw text. Also, boo to offering XML files for download without compression! They serve their web pages gzip-encoded (compressed), why not their XML files? Or at least pre-compressed versions... the 5th file zips from 100M down to 5.6M, and 7zip takes it down to 3.5M! Anyway, I guess I wouldn't expect the FCC to know anything about the internet by now.
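Both points above (markup overhead and compression) are easy to check on any XML file. Here is a rough sketch, assuming a small made-up sample rather than the real FCC files, that compares the size of the raw XML, the text with all markup stripped, and the gzip-compressed XML. The element names are placeholders, and the ratios on real data will differ.

```python
import gzip
import xml.etree.ElementTree as ET


def size_report(xml_bytes):
    """Return byte sizes for the raw XML, its text content only, and gzip output."""
    root = ET.fromstring(xml_bytes)
    # itertext() yields only the character data, skipping tags and attributes.
    text_only = "".join(root.itertext()).encode("utf-8")
    compressed = gzip.compress(xml_bytes)
    return {
        "xml": len(xml_bytes),
        "text_only": len(text_only),
        "gzip": len(compressed),
    }


# Made-up, highly repetitive sample; real comments will compress less well.
record = (
    b"<comment><name>Jane Doe</name>"
    b"<text>Net neutrality is vitally important to me.</text></comment>"
)
sample = b"<comments>" + record * 200 + b"</comments>"

report = size_report(sample)
for label, size in report.items():
    print(f"{label}: {size} bytes")
```

Swapping the sample for the bytes of a downloaded file gives the same three numbers for the real data.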
Also, I like this from their release:
While you're helping us to do our work, would you mind conforming to the standards that we have to adhere to?
Joking aside, I'm glad they're making this available to everyone. Even if we were to trust that they know exactly what they're doing, if they were doing all the work themselves they'd only analyse the submissions in whatever ways they can both imagine and implement in time. Giving this to the internet lets them benefit from novel ways of parsing the data that they might not have thought about, and it puts the onus on everyone else to try to justify whatever they claim to find in the data.
If the FCC were to make all of the findings itself without making the data public, anyone who disagreed with the result would simply say that the FCC was selectively parsing the results. Now, whenever anyone tries to make any claims from the data, everyone else will be able to verify those claims... and if someone tries to make a claim without saying how they came to that conclusion, then that will be worth about as much as 1.5 GB of uncompressed XML text.
Re: A million monkeys typing on a million keyboards...
This means separation by city and state. It also means a possibility of gross inaccuracies regarding the data. Geez, this could get bad.
Oops, my cynicism is showing itself again. I suppose I could take the data on face value. Though, it'll be difficult to determine a margin of error without the IP address.
Don't take this as a fact, because I've not counted the responses in the actual files yet, but cursory scans seem to show a majority favoring classification as common carrier.
In addition to this, the commentary also seems to be sparse, as though people simply voted without leaving comment.
Once I do this, I'll sit back and wait for others to post the results so I can determine who's lying and who's honest.
The FCC did good by releasing these files.
Re: IP addy
All the comments in favor came from 8.8.8.8.
Obama is slipping
Re: Obama is slipping
My guess
Then the FCC will totally ignore the will of the people and implement "fast" (and by default, slow) lanes anyway.
Downloaded it...
service providers (ISPs) treat all data equally. As an Internet user, net neutrality is vitally important to me. ..." text. So someone's campaign appeared to work!
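Spotting campaign text like this can be roughed out in a few lines. The sketch below, which uses made-up sample strings rather than real submissions, normalizes whitespace and letter case and then counts how many comments collapse to the same text. It only catches near-verbatim copies, so form letters that people edited would need fuzzier matching.

```python
import re
from collections import Counter


def normalize(text):
    """Collapse runs of whitespace, trim the ends, and lowercase the text."""
    return re.sub(r"\s+", " ", text).strip().lower()


def count_repeats(comments):
    """Count how many comments share the same normalized text."""
    return Counter(normalize(comment) for comment in comments)


# Made-up examples standing in for comment bodies pulled from the XML.
sample_comments = [
    "Net neutrality is vitally important to me.",
    "net neutrality is  vitally important to me. ",
    "Please reclassify ISPs as common carriers.",
]

repeats = count_repeats(sample_comments)
top_text, top_count = repeats.most_common(1)[0]
print(top_count, top_text)
```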
Re: Downloaded it...
I just came from trying to make a comment on the FCC comment site. Despite being made for lawyers and law firms rather than regular people, the site was still unable to take my comment: first it said it "could not add the text to the file" and then, after I turned the comment into a PDF and submitted it through their Expert submission, it said "disk quota is full". No matter how big this XML file is, it doesn't represent half the comments people want to share about net neutrality.
Emailed comments too?