Ghosts On The Internet

14 Comments


  1. AP

    Great post. I often have a hard time finding a date stamp on a lot of blog posts, which is a really strange phenomenon to me. They should be a) present, and b) obvious.

    Thanks!

  2. John Faulds

    Interesting article. I’ve just updated my WP site based on the hatom example. For anyone else wanting to do the same, and using the Microformats recommended date/time format, you’ll want to do this:

    <abbr class="published" title="<?php the_time('Y-m-d\TH:i:sO') ?>">…
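For readers not on WordPress, here is a minimal Python sketch of the same idea: an `abbr` element whose `title` carries the timestamp in the pattern that the format string above produces. The function name `published_abbr` and the sample date are illustrative assumptions, not taken from the article.

```python
from datetime import datetime, timezone
from html import escape


def published_abbr(moment: datetime, label: str) -> str:
    """Build an hAtom-style <abbr> with a machine-readable title.

    `published_abbr` is a hypothetical helper name used for illustration.
    """
    if moment.tzinfo is None:
        # Without an offset, %z would render as an empty string.
        raise ValueError("timestamp needs a UTC offset")
    # Mirrors the PHP pattern Y-m-d\TH:i:sO, e.g. 2008-12-24T10:30:00+0000
    stamp = moment.strftime("%Y-%m-%dT%H:%M:%S%z")
    return f'<abbr class="published" title="{stamp}">{escape(label)}</abbr>'


# Sample date chosen arbitrarily for the example.
example = published_abbr(
    datetime(2008, 12, 24, 10, 30, 0, tzinfo=timezone.utc),
    "24 December 2008",
)
print(example)
# <abbr class="published" title="2008-12-24T10:30:00+0000">24 December 2008</abbr>
```

The visible label stays human-friendly, while the `title` is what a parser would read.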

  3. Douglas Greenshields

    An important subject, and one the use of microformats like hAtom will only barely scratch the surface of (though scratch we must!). It’s sobering to realise that next to everything we emanate to the web will be available, before very long, to everyone forever. If you’re really considering the future, and your extremely privileged position being alive at the time of the first few microseconds of the web, every form of “content” (I’m trying to use that word in the best sense possible) should really have some kind of timestamp against it, even if it’s of a more appropriate level of specificity (for example, when’s the last time you read a book that gave its publication date to the nearest second?).

    Search engines need to pull their weight too – it must be possible to search the web along the time axis. I regularly search technical issues and find I turn up mostly pages prior to 2004, which are mostly useless – true, I can often tell the age of the page by its lack of adherence to any kind of less-is-more principle – and it’s interesting to note that we will tend to mark the coming decades using web design idioms – but there’s a difference between the sum of human knowledge now and the sum of human knowledge forever to infinity and beyond, and search on the web needs to start explicitly recognising that!

  4. James Aylett

    There’s a good reason for Atom’s having atom:published as optional but atom:updated as required (which is what I assume Gavin means by the somewhat opaque phrase “encourage[s] the use of last modified”): the date an entry was updated is more useful to sort on than the date it was published.

    (From what I can remember, and a quick look through the archive and wiki, dates in Atom were a complex debate that went on for months. At one point there were proposals on the table for up to five different dates associated with an entry; in the end it was decided to keep Atom slim, and allow extensions to carry the weight of further requirements.)

    While we’re here, Dublin Core has exactly the right term to cover the date of the subject of an article (eg: ‘some time in 1891’ according to the Flickr page for the photo above). Coverage is “the spatial or temporal topic of the resource”, and can be expressed as a named period, date or date range. For machine readability, ISO 8601:1988(E) is a good choice (you could use one of its profiles, such as RFC 3339, or W3CDTF as mentioned at the start of the article; 8601:1988(E) has the advantage of supporting start-end date pairs and durations as well as single instants in time).
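As a rough illustration of the start-end pairs mentioned above, the sketch below parses an ISO 8601 interval written as two calendar dates separated by a slash. It assumes only the `YYYY-MM-DD/YYYY-MM-DD` form and does not handle durations or reduced-precision dates; `parse_interval` and `covers` are hypothetical names.

```python
from datetime import date
from typing import Tuple


def parse_interval(text: str) -> Tuple[date, date]:
    """Parse 'YYYY-MM-DD/YYYY-MM-DD' into a (start, end) pair of dates."""
    start_text, separator, end_text = text.partition("/")
    if not separator:
        raise ValueError("expected a start/end pair separated by '/'")
    start = date.fromisoformat(start_text)
    end = date.fromisoformat(end_text)
    if end < start:
        raise ValueError("interval ends before it starts")
    return start, end


def covers(interval: Tuple[date, date], day: date) -> bool:
    """Return True when `day` falls inside the interval (inclusive)."""
    start, end = interval
    return start <= day <= end


# "Some time in 1891" expressed as a whole-year interval.
coverage = parse_interval("1891-01-01/1891-12-31")
print(covers(coverage, date(1891, 6, 15)))  # True
print(covers(coverage, date(1892, 1, 1)))   # False
```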

  5. Ben

    Thank you for the wake-up call. This is something I have often neglected in my themes (and URLs, for that matter), mainly for cosmetic reasons and out of laziness in my theme building.

    However, this has changed as of today. At university it was emphasised that RDF is becoming more important every day, but since graduating I have neglected it. Turns out I should’ve paid more attention.

    Thanks again for reminding us that the content on the Internet lasts forever. And it appears that no one is archiving it correctly.

    What a crying shame.

  6. Andi Farr

    Wow, great article. Like several other commenters, I have been extremely slack in properly integrating date and time into my web content. I’ve also spent frustrating hours trying to ascertain the age of web documents on more than one occasion, so I should know better. Occasionally you do run into documents which are either extremely important or completely irrelevant, depending on their age – with no way of telling how old they are, it’s sometimes impossible to tell which it is!

    Anyway, this has been added as item 0 on my site revamp to-do list. Much obliged for the extremely insightful look at different ways of handling this crucial information, and for all the further reading that you’ve linked to.

    Take care, Andi

  7. Roman Bercot

    I’m a bit disappointed to learn that there’s not a standalone microformat for date, but that the date pattern has to be part of a larger microformat. I think it would be useful to be able to arbitrarily mark things up with a machine-readable date.

    For instance, if I were writing a story that mentioned September 11th, it would seem handy to be able to put a span around it with a machine-readable date of 20010911. This is neither the publication nor modified date, but it is relevant to the content of the article.

    If anybody has ideas on this, I’d love to hear them.
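One way to picture the idea: the sketch below reads subject dates back out of markup, assuming a made-up class name `subject-date` on the span and a `title` in the compact `YYYYMMDD` form from the example above. Neither the class name nor the `SubjectDateCollector` helper belongs to any published microformat; both are assumptions for illustration.

```python
from datetime import date, datetime
from html.parser import HTMLParser


class SubjectDateCollector(HTMLParser):
    """Collect dates from elements carrying the hypothetical 'subject-date' class."""

    def __init__(self) -> None:
        super().__init__()
        self.dates = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Attribute values can be None when written without a value.
        classes = (attributes.get("class") or "").split()
        title = attributes.get("title")
        if "subject-date" in classes and title:
            # Compact ISO 8601 calendar date, e.g. 20010911
            self.dates.append(datetime.strptime(title, "%Y%m%d").date())


collector = SubjectDateCollector()
collector.feed(
    '<p>After <span class="subject-date" title="20010911">'
    "September 11th</span>, travel changed.</p>"
)
print(collector.dates)  # [datetime.date(2001, 9, 11)]
```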

  8. Per Wiklander

    As some of the previous commenters have said, it is frustrating to look for information, mostly technical documentation in my case, that might be relevant or not at all, depending on the date it was published. A clear time stamp would help here (preferably at the top of the page).

    What would be even better is if the content’s creator took the time to actually mark the content as outdated when new information becomes available. Quite often I read a long article on one technical subject or another that has a recent date, only to see in comment #312 that it is incorrect or no longer relevant.

    I’m almost thinking of creating something like oldpages.com (I made that URL up now) which would let people submit pages with a short comment describing why the content at that URL is outdated or actually dangerous. I guess a Firefox plugin could then use this information to display a warning message on the outdated page when visited. If the service actually tried to contact the content author it would be even better.

  9. Chris

    Great article, and indeed a microformat should exist for it, but I doubt it will be used much.

    In my opinion, site owners intentionally omit the date so as to make obsolete content more palatable to visitors referred by Google. Let’s face it, the web has a lot more old content than new.

  10. Shaun

    Thanks for using our photo as an example! I just wanted to clarify (and this just adds to your argument, I think): the photo was collected by the Douglas County History Research Center in 1992. It was scanned 08/30/2002. I’m not sure when it was first put online; sometime in early 2003, I suspect. The Flickr project is a fairly new one for us.

  11. blackdog

    Great article. Luckily enough (looking on the bright side), my output is so small up to now that it will be very easy to be more compliant.
    One correction: this is the last piece to add only chronologically speaking; it should have been the first one. But that’s probably implied in the sentence :)

  12. Kris

    Very good point about dates. But this goes for everything now. There is lots of good information on the net but the 5% of junk can be dangerous if it gets passed along as fact!

  13. Douglas

    Hmm. I think if you could update content so that it expires at a given date, that could prevent the spreading of certain information that may become inaccurate over time. Not sure how to go about creating such code though, and changing it manually sounds like quite a job.
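A minimal sketch of how such an expiry check might work, assuming each page stores an author-chosen expiry date alongside its content. The function name `staleness_notice` and the wording of the notice are hypothetical.

```python
from datetime import date
from typing import Optional


def staleness_notice(expires: date, today: Optional[date] = None) -> Optional[str]:
    """Return a warning once the expiry date has passed, otherwise None."""
    current = today if today is not None else date.today()
    if current <= expires:
        return None
    return (
        f"This page was marked to expire on {expires.isoformat()} "
        "and may be out of date."
    )


# Dates here are arbitrary examples.
print(staleness_notice(date(2009, 1, 1), today=date(2008, 12, 24)))  # None
print(staleness_notice(date(2009, 1, 1), today=date(2010, 3, 5)))
# This page was marked to expire on 2009-01-01 and may be out of date.
```

A template could call this on every page view, so nothing needs changing by hand once the expiry date is set.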
