Monday, January 26, 2009

Smoke and Mirrors...

The web is a linked world. That's what it was designed to be, and it works.

But, sometimes that can be a problem.

Basically, all information on the web is ephemeral. If you look at a page today, you pull it from the server. If you look at the same page tomorrow, you pull it from the server again. The information that was here today can be gone, like a puff of smoke, tomorrow.

For lots of things, this doesn't matter. Who cares what was on Digg's homepage last year?

But for CNN (for example), we have completely different expectations. If I want to look up which politicians were indicted on Jan 21, 2009 (I can dream, can't I?), but CNN pulled the page after a month, how do I look back into history?

Even worse is the Orwellian prospect that history has been changed. Has a name been dropped from the list, or has one been added? A publisher can't recall and change all his "dead tree editions" overnight, but a minute with a keyboard and everyone's history has changed. What, this isn't what was there yesterday? Prove it... (This is why Groklaw keeps an archive of all legal documents related to the SCO vs. IBM and related cases.)

Sorry, veered off into politics. I must watch myself.

Getting back to my last post, the linked nature of the web's information can be its weak point. If something happened to me, what would become of (for example) DeSmet C?

In a month or so, my provider would notice that I've stopped paying my bill, and cut off my service. The webserver would no longer be online, and the pages would vanish.

Could the Internet Archive carry on? To an extent -- it can't save everything, and has a size limit on what it does copy. The files aren't there, only the text.

There are links to the site and its content, but everyone assumes that the information will be there (effectively) forever. I'm guilty of it, too...

What needs to be done is to not just link, but mirror. Wget is your friend. Disk is cheap.
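For anyone who wants to try it, here is a minimal sketch of what "mirror, don't just link" could look like, wrapped in a little Python so it can be scripted. The function names, the default "mirror" directory, and the one-second wait are my own assumptions for illustration; the wget options themselves are standard ones, but check them against your installed version before trusting this with anything important.

```python
import subprocess


def build_mirror_command(url, dest_dir="mirror"):
    """Return the wget argument list for mirroring `url` into `dest_dir`.

    Hypothetical helper for illustration; it only builds the command.
    """
    return [
        "wget",
        "--mirror",           # recursion + timestamping, unlimited depth
        "--convert-links",    # rewrite links so the copy can be browsed offline
        "--page-requisites",  # also fetch images, stylesheets, etc.
        "--no-parent",        # never climb above the starting directory
        "--wait=1",           # assumed politeness delay between requests
        "--directory-prefix=" + dest_dir,
        url,
    ]


def mirror(url, dest_dir="mirror"):
    """Run wget to mirror `url`. Requires wget to be installed and on the PATH."""
    subprocess.run(build_mirror_command(url, dest_dir), check=True)
```

Calling something like mirror("http://example.com/some/site/", "backup") would then leave a browsable copy under backup/. Keeping the command builder separate from the part that actually runs wget makes it easy to print and inspect the command first.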

Has an interesting / worthy site gone away? It doesn't have to vanish forever.
