Will Digital Archiving Difficulties Wipe Out Important Elements Of Our History?
from the it's-a-challenge dept
Over the years, we've had quite a few posts about the risks of data extinction. That is, as more and more of our important data goes digital, there's a bigger risk that it could disappear. At the very least, it's easy for digitally stored data to become corrupted. Even if there are backups, it's possible that multiple copies could become corrupted. A bigger concern, though, is the applications necessary to read the data. Even if you can store the data perfectly forever, without the right applications, it's meaningless. Matt Sullivan writes in with yet another article on the topic, this time from Popular Mechanics, that suggests we could be facing a "digital ice age" as plenty of data from this era of history is lost to bad archiving capabilities.
Of course, there are some people working on solutions. A few years ago, we wrote about Dan Bricklin's idea that we need "social infrastructure software" that is designed to last for many years to deal with exactly this issue. Still, that only works if such software exists and people use it. The Popular Mechanics article notes that the National Archives is working on a big system to deal with just this issue -- though, when we last wrote about the system, it sounded full of potential problems, and the latest details are not that reassuring. Basically, they're spending over $300 million to have Lockheed Martin build a system that will translate more than 4500 different document types into flexible formats, like XML. However, it seems quite likely that important data or metadata will get lost in the process. Others are suggesting that such a plan is dangerous, and that they'd be much better off focusing on emulation techniques -- but again, that seems to get awfully cumbersome awfully fast, and that doesn't even touch on the copyright issues associated with such a project.
In the meantime, some are arguing that the entire problem of data extinction is overblown -- saying that important data gets updated as systems change, and there will always be some way to go back and get other data if necessary.
Reader Comments
Bosses
"...there will always be some way..."?
I tell ya, the Babylonians were on to something with those clay tablets.
easy way
Like make everything PDF and then leave detailed documentation on how Acrobat works.
Re: easy way
PDF
It's only a matter of desire.
A machine could be built to read any format. Only the desire to spend the necessary funds is needed.
And there's always eBay...
another potential problem
www.thatpoliticalblog.com
The answer is CDs
Much ado about NOTHING
For starters, CD-ROM is barely accepted as an archival format by archivists. Second, the same plans that go into three-9 business continuity are often the same kinds of plans that are put into place for archiving. Rotate formats, archival hard copies, redundant digital copies stored far apart, etc, etc.
But really, this all boils down to doing a good job. The same person that would do a bad job archiving their digital documents would probably do an equally bad job archiving their "rlspc" documents. Either way, documents are just not safe in their hands.
And how many of us have ever experienced irretrievable digital documents because the application no longer existed? Data corruption I can see for not being able to open a particular file, but you did your job badly if the information in that file was lost because you did not back it up properly.
Another license monopoly
It is only a matter of time until we move to a subscription service for software and then we can revert to older versions as part of our monthly payment. The same probably holds true for storing your data... I would pay a nominal fee to guarantee my data is backed up frequently. The problem is still how to back up all your data since the Microsoft tools to search and copy all data on a computer are still useless in my experience. It probably will not be long until Microsoft tells me that I don't own the copyright to my data because my certificates are not valid or some other BS they are trying to force down our throats to get vendors to pay them to be their protectors. It is like hiring the wolf to watch over the sheep.
Re: Another license monopoly
In case you were not aware, PDF has become an ISO standard. There is even an archival-specific version of PDF (PDF/A-1, based on PDF 1.4, ISO 19005-1) that specifically disables particular features that just MIGHT not work in future applications.
And if you felt that Adobe had you grabbing ankles, perhaps you should run a search for other PDF utilities.
Re: Re: Another license monopoly
look at previous civilizations
the moore's-law-style advance of storage technology *should* mean that we can cheaply store multiple copies of everything... and yet artificial software death, closed formats, copyrights, DRM, and the like pretty much guarantee that our digital history will be lost to the next century, and possibly even to the next generation.
even if you could maintain all that was written, it is impossible to archive the semantics of what was written. you cannot archive the context of a work, its true meaning. look at the constitution: even though it was written in english, the language used is vastly different from what we use today... and it is interpreted differently than it was when it was written... and the document is merely two centuries old. compare that to the bible, which is far older. look at how it was interpreted by the romans, then by the puritans, and compare that to the way it is interpreted today.
Too much ado about Adobe
And I gotta agree with Rico @ 7. An exponential growth in data only compounds the problem. One aspect we should be focusing on is what data isn't important enough to keep. We can't just keep it all forever.
My solution: RAID 0+1's for all!
It could happen
www.tilana.com
It could happen...
*corrected link www.tilana.com
Re: Too much ado about Adobe
SSDD
Thank god for Firefox spell check.
Someone mentioned it above. The stuff that's actually important will survive. A lot of the little stuff will die. This is not a worse situation than a thousand years ago. The important stuff always survives.