How Do We Protect Our Growing Base of Information for All Time?
Is it ever OK to let information die?
In the past, a group’s stories, legends, history, and religious beliefs were passed from generation to generation by word of mouth or documented in drawings or written languages of the day.
That’s because from the earliest times, the human race has been prodded by an unwritten mandate to pass the rapidly growing base of knowledge and information from one generation to the next.
Ancient Egyptians, for example, expanded upon the writing systems of previous cultures and developed the hieroglyphic writing system. It evolved over many centuries and was still in use as late as about 400 AD. By then something more practical – the Greek alphabet – had come along and been embraced by the intelligentsia.
Within 100 years hieroglyphic symbols were no longer decipherable. The messages in those beautifully painted and carved symbols were locked away, seemingly forever, until the Rosetta Stone was uncovered in 1799. Fortunately, the text on that stone was written in two languages, Egyptian and Greek, which allowed scholars to translate the hieroglyphic portion and gave them the key to other hieroglyphic texts.
That was a close call – a language and a valuable historical accounting of the world that was almost lost forever. Could that happen again? Could a writing or a data storage system become not only obsolete, but completely undecipherable over the course of five to ten generations? After all, new and better technology is always on the horizon, like the Greek alphabet and cloud storage.
If you think that can never happen, think again. It’s happening all the time.
We’re losing languages left and right. By some estimates, as many as 90% of the world’s languages could be gone by 2100. And along with them go valuable insights into past cultures and communities, all critical pieces for understanding the human race as it exists today.
If we look at the bigger picture, any circa 1820 book that has survived fire, flood and careless destruction is readable today. Sure, life was simpler in 1820, when books were the primary repository of information, but we still have the ability to read what those books contain.
But think about all that’s happened since then, beginning with phonograph records, developed later that same century to capture and store sound. Now imagine an archeologist 200 years in the future unearthing a floppy disk, cassette tape, or video on a CD. Will any of that data be recoverable?
Even today, you’d be hard-pressed to find a computer with a drive for a floppy disk. In fact, it’s doubtful that any of today’s reader devices will still exist even 50 years from now, let alone in 200 years.
One of the ideas behind cloud computing was to free users from the ever-changing nature of storage devices by using remote storage in server farms far, far away. Is our data safe and recoverable in the cloud for all eternity? Keep in mind, data centers require ongoing power and ongoing attention. What if something goes wrong? Even redundant systems aren’t foolproof.
And regardless of the technology, should we be concerned that the vast majority of “humanity’s data” lies in the hands of private corporations? Companies like Google, Amazon, Yahoo, Apple, IBM, and Microsoft have staked their future on the value of the information archived in their data centers.
However, if one of these corporations ceases to exist, what’s the protocol for dealing with all the information residing on their servers? Thinking long-term, and knowing that only a very small percentage of companies survive long enough to celebrate their 50th anniversary, let alone their 100th or 200th, how much of this information will still exist 200 years from now?
Is it the government’s job to preserve information and keep it available to the public? In fact, the National Archives and Records Administration has the remarkable mission to preserve certain materials that are “important to the workings of Government, have long-term research worth, or provide information of value to citizens.” NARA is doing a great job, no doubt. But who decides which classic book, audio recording, or blueprint is included? Who’s responsible for humankind’s mandate to protect and pass on the terabytes of information that don’t make NARA’s cut?
Maybe that’s the bigger question we should consider … not whether any information will ever be irretrievable, but whether all information should be stored for all time. And if so, who gets to decide?
What we need is a long-term strategy for data preservation, and by long term I mean 1,000 years or more. And at the very least let’s make sure NARA is saving instructions and documentation for every data storage device as it comes along, including the floppy drive and digital tape player.