Tales from the history of Canadian technology

Tuesday, March 20, 2007

A two-bit archive?

The CBC recently posted an article concerning the stability of digital archives, and in particular, difficulties that the New Brunswick provincial archives were having with digital storage formats and media:

"One of the problems is that [digital is] so susceptible, so vulnerable to damage," Noel said. "I've had audio tape come into the archives, for example, that had been submerged in water in floods and the tape was so swollen it went off the reel, and yet we were able to recover that. We were able to take that off and dry it out and play it back.

"If a CD had one-tenth of one per cent of the damage on one of those reels, it wouldn't play, period. The whole thing would be corrupted."


This is an oversimplification. Analog and digital storage have different characteristics, and each fails in its own way. As the Slashdot crowd pointed out, the recovered tapes could not be perfect copies of the original, simply because of the nature of analog storage. Each new generation -- particularly one derived from a soaking, swollen tape -- will introduce new variations and noise. A digital copy, by contrast, can be verified as bit-for-bit identical to its source, and error-correcting codes can do quite a bit to guarantee that the data survives a partial media failure.
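To make the error-correction point concrete, here is a minimal sketch of a Hamming(7,4) code in Python. It is a toy, not what any archive or disc format actually uses (CDs rely on a much stronger cross-interleaved Reed-Solomon scheme), and the function names are my own. It stores four data bits in seven and can repair any single flipped bit.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword (parity bits at positions 1, 2, 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers codeword positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the damaged bit, or 0 if none.
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


if __name__ == "__main__":
    original = [1, 0, 1, 1]
    stored = hamming74_encode(original)
    damaged = list(stored)
    damaged[4] ^= 1  # simulate a scratch: flip one stored bit
    print(hamming74_decode(damaged) == original)
```

The cost is overhead: seven bits are stored for every four bits of data, and two flipped bits in the same codeword would defeat this particular code. Stronger codes trade more redundancy for tolerance of longer bursts of damage.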

However, the greater problem, not addressed in the CBC article, is what to do when the stored data becomes inaccessible because the data-reader has fallen into obsolescence. Certain word processors are famous for their inability to open documents created with earlier versions of the software. Magnify this over decades, even centuries, rather than years, and consider whether an essay written in 1995 and saved on floppy disk will still be readable in 20 years, compared with a book printed 500 years ago. What can be done with an unmarked punched card from the first half of the 20th century, other than give it a new life as a bookmark? A famous example of this sort of bit-rot is the 1986 Domesday laserdisc project, which was rendered useless when the technology became obsolete; there is a particular irony in the fact that the original 1086 Domesday Book is now online.

On an individual level, it is not difficult to make a go of it every few years and convert every important electronic document to a currently readable format. Some formats, like email, are relatively simple and unlikely to change. Others, like MP3, are well documented. But on a grand scale, say that of a province or a nation, digital archival work must be a sheer nightmare of logistics. I'm not an archivist or curator, but one solution I have heard is to print every digital artifact to paper for permanent storage.
