While doing a literature review about 6 months ago, I read about research that Gordon Bell is doing at Microsoft. His team is working on MyLifeBits, a project that aims to digitize and store *all* the information a person interacts with in daily life. It’s supposed to be a fulfillment of Vannevar Bush’s vision from 1945! (That’s a pic of him from Wikipedia.)

One of Bell’s papers mentioned the challenge of not just digitizing and storing the information, but ensuring that the data could be readable in the future. As Bell states:

“The most serious impediment to a lasting archive is the evolution of media, platforms, formats, and the applications that create them. Unique, proprietary, and constantly evolving data formats such as Acrobat-4, MPEG-4, Oracle 8, Quicken 2001, Real G2, and Word 2000 suggest or even guarantee obsolescence.”

After reading that 2001 paper, I was fairly certain I could find files on my hard drive from 2001 or earlier that I can no longer open because of format and application-version compatibility issues. And that was only 5 years ago. What about in 50 years?

Vint Cerf (of Google) also mentioned the problem of orphaned data as a result of filetype/application evolution when he spoke at the University of Toronto in late 2006.

Open formats to the rescue?
Jonathan uses a great example to make the case for ODF (a case I know Bob Sutor and many others have been making for quite some time). The plug-in that Jonathan mentions sounds like a great idea, and maybe a reason for this Microsoft-funded project to shut down :-)

Open formats can minimize the likelihood of orphaned data. But your application of choice still needs to implement the open format before you can open that 50-year-old file 50 years from now. If your application of choice is open source, and just one other technically inclined person needs to open a file of the same type, you should be in luck, since either of you can add support for it.
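
To make that last point a bit more concrete, here is a minimal sketch of what "one technically inclined person" might write. It relies on the fact that an ODF text document is a ZIP package whose body lives in `content.xml`, so the Python standard library is enough to get the paragraphs out. The function names and the tiny in-memory sample file are my own assumptions for illustration, not part of any ODF tool, and the sample is far from a complete, valid document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces defined by the OpenDocument specification.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def extract_odt_paragraphs(source):
    """Return the plain text of each <text:p> paragraph in an ODF text file.

    `source` may be a file path or a file-like object. Headings, tables of
    contents, and styling are ignored in this sketch.
    """
    with zipfile.ZipFile(source) as package:
        content = package.read("content.xml")
    root = ET.fromstring(content)
    paragraph_tag = f"{{{TEXT_NS}}}p"
    # itertext() also picks up text nested inside spans within a paragraph.
    return ["".join(p.itertext()) for p in root.iter(paragraph_tag)]


def build_sample_odt():
    """Build a tiny, hypothetical ODF-like package in memory for the demo."""
    content_xml = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'xmlns:text="{TEXT_NS}">'
        "<office:body><office:text>"
        "<text:p>Hello from 2007.</text:p>"
        "<text:p>Still <text:span>readable</text:span>.</text:p>"
        "</office:text></office:body>"
        "</office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # Real ODF packages store the mimetype entry first and uncompressed.
        package.writestr(
            "mimetype",
            "application/vnd.oasis.opendocument.text",
            compress_type=zipfile.ZIP_STORED,
        )
        package.writestr("content.xml", content_xml)
    return io.BytesIO(buffer.getvalue())


if __name__ == "__main__":
    for paragraph in extract_odt_paragraphs(build_sample_odt()):
        print(paragraph)
```

Because the container (ZIP) and the markup (XML) are both openly documented, nothing here depends on the application that originally wrote the file, which is the whole argument for open formats in a nutshell.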