Marking Up a Tag Cloud

25 Comments

  1. Arthus Erea

    I love the idea of using this semantic markup and I probably will use much of this tutorial in my next project. However, the only problem is that the font sizes aren’t as scalable as they could be. Tags are only 1 of 6 sizes, so it cannot scale easily. For instance, there may be 1 tag with 5,000 photos under it, and then another tag with 1 photo in it. Using the semantic method of class names, this might only render a small difference between the very differently weighted tags. I am trying to think of a way of doing both a scalable and semantically valid tag list.
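
    One way to approach the scaling problem raised above is to bucket counts on a logarithmic scale before choosing a class, so that 1 photo and 5,000 photos land at opposite ends of the range. The sketch below is only an illustration, not part of the article: the function name `weightLevel`, the `weight-N` class naming, and the requirement that every count is at least 1 are all assumptions.

```javascript
// Hypothetical helper: map a raw tag count onto one of `levels` weight buckets
// using a logarithmic scale. Assumes minCount >= 1 (log of 0 is undefined).
function weightLevel(count, minCount, maxCount, levels = 6) {
  if (maxCount <= minCount) return 1; // every tag has the same weight
  const clamped = Math.min(Math.max(count, minCount), maxCount);
  // Position of this count between min and max on a log scale, from 0 to 1.
  const position =
    (Math.log(clamped) - Math.log(minCount)) /
    (Math.log(maxCount) - Math.log(minCount));
  // Scale to 1..levels; the largest count is capped at the top level.
  return Math.min(levels, Math.floor(position * levels) + 1);
}

// Hypothetical class naming, e.g. "weight-4".
function weightClass(count, minCount, maxCount, levels = 6) {
  return `weight-${weightLevel(count, minCount, maxCount, levels)}`;
}
```

    With six levels, counts of 1, 100 and 5,000 map to levels 1, 4 and 6. Supporting more steps would only mean passing a larger `levels` value and defining the matching classes.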

  2. brothercake

    A couple of thoughts – the information you’re providing in the additional text is very verbose; a reader would have to listen to “... photos are tagged with …” over and over again. You could just as well make it a single number in brackets after the tag: “austion (344)”

    But, why make different information available to screenreaders than is available to sighted users. If it’s interesting information, why not make it available to all users? But if it’s to provide equivalence, mightn’t it be better to provide the same information as the class, like “ultra popular”?

  3. Kevin

    Great example. Thanks for the descriptions and semantics. I’ve always struggled with deciding how to most accurately add semantics to my web pages. Now if only those double spaces from Flickr appeared in the final example…

  4. Jens Nedal

    Don’t they have people at any of the social bookmarking sites that actually understand CSS? Horrible code truly.
    All this tagging display business screams for a list with classes, just like displayed here. Thanks for putting up a good example!

  5. wrtlprnft

    Another idea for the numbers: What’s wrong with putting them into title attributes? That would make the numbers accessible to visual renderers as well, but not clutter the display.

  6. Ed Eliot

    wrtlprnft – Screenreaders are inconsistent in reading the value of the title attribute. With default configuration I’m pretty sure some don’t. I think Brothercake has a point about the verbosity of the text. I think I’d go for text equivalent to the class name values in brackets after the tags.

  7. Mark Norman Francis

    “Tags are only 1 of 6 sizes, so it cannot scale easily.” Arthus – that’s true in my example. Many tag clouds only have certain pre-set levels. Technorati’s is something of an exception, which is why you can find quite so many nested EMs. However, there’s nothing to stop people defining more steps and therefore more classes.

    “the information you’re providing in the additional text is very verbose” Brothercake – that’s very true, and for a cloud with a lot of links that would be a valid concern.

    “But, why make different information available to screenreaders than is available to sighted users.” Brothercake – I do believe that information should be available, and not just using CSS. My point here was not to show the ultimate tag cloud; rather, just that existing tag clouds could be made better and more semantic without altering how they are displayed to the user with CSS. Personally, I’m not that big a fan of tag clouds. But I am a fan of semantic markup. :)

  8. Ben Ward

    Interesting stuff, certainly; there is without a doubt some utterly grotesque mark-up out there.

    A couple of points:

    Firstly, your aside about using rel=“tag” on links isn’t right. rel-tag has a more specific purpose than just any link to a tag page:

    By adding rel=“tag” to a hyperlink, a page indicates that the destination of that hyperlink is an author-designated “tag” (or keyword/subject) for the current page [emphasis mine]

    Therefore the list of tags following a blog post should be marked up with rel=“tag”, as they link to the page that aggregates all content for that tag. But a tag cloud is a summary of those tags; you’re not ‘tagging’ the current page, so those links shouldn’t have rel=“tag”. They’re just regular hyperlinks.

    As for tag cloud mark-up, I agree completely with the analysis of existing mark-up. Technorati tried to be semantic but lost the plot a bit when they reached such deep nesting. That said, I’m not sure they’re completely wrong for using EM.

    I think there are two distinct lines of thought for this. The first is that HTML does not have sufficient means to describe this cloud representation of tags and that therefore classes should be used on top of generic mark-up. The second is that HTML does not have sufficient means so we should use the closest matching mark-up we can find.

    The nearest-match for tag clouds is EM and STRONG. In whatever context, the use of text size in a tag cloud is to emphasise one tag over another. The problem is that that only provides three levels of emphasis (none, EM, STRONG). For the project I’m working on at the moment I’m taking the ‘nearest match mark-up’ approach and have contrived a fourth level with EM nested in STRONG.

    Is nearest-match the way to go? I’m not sure. There are times when using nearest-match mark-up risks devaluing the semantics of the elements used (not dissimilar to TABLE being devalued for all its misuse in layout; could browser makers have done something amazingly cool with data tables had they only been used properly?). My feeling is that in the case of EM and STRONG in a tag cloud the devaluation is nil and so for me — in a situation where four size levels is sufficient in my cloud — I’d rather take some HTML semantics.
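
    The four-level "nearest match" scheme described in this comment (none, EM, STRONG, EM nested in STRONG) could be generated along these lines. This is a rough sketch rather than the commenter's actual code: the function name `emphasise` and the 0 to 3 level numbering are assumptions.

```javascript
// Hypothetical helper: wrap an HTML fragment in nearest-match emphasis
// elements. Levels: 0 = none, 1 = EM, 2 = STRONG, 3 = EM nested in STRONG.
function emphasise(html, level) {
  switch (level) {
    case 0:
      return html;
    case 1:
      return `<em>${html}</em>`;
    case 2:
      return `<strong>${html}</strong>`;
    case 3:
      return `<strong><em>${html}</em></strong>`;
    default:
      throw new RangeError(`level must be 0-3, got ${level}`);
  }
}
```

    Because the emphasis lives in the elements themselves, the weighting survives when no stylesheet is applied, which is the trade-off being weighed here against class names.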

  9. Scott Reynen

    I like Technorati’s use of HTML tags rather than class names with essentially the same meaning, but I agree it’s a bit verbose to be using all those ems. So I’m mixing ems and strongs in my own tag cloud markup, e.g.:

    http://typewriting.org/tag/
    http://typewriting.org/link/tag/

    Edit: Looks like we lost the tags - Scott shoot me an email and I'll put them back in. Drew.

  10. Bryce

    It had never dawned on me that tag clouds needed to be anything more than just links. You make a perfect point with the accessibility issue. Brilliant approach!

  11. Mike Stenhouse

    I mark mine up with an ol and bracketed counts – it was the best way I could think of to get all the information in one place. I did try ‘a’, ‘a em’, ‘a strong’ and ‘a strong em’ but I figure that the spoken-out-loud distinction is just too subtle and obtuse to be useful. I’m open to persuasion on that one though.

    Incidentally, it’s worth remembering that a ‘tag cloud’ is actually a weighted list of tags. Weighted lists are a great example of information design: managing to convey lots of information in a very compact way. They are useful in contexts outside of tags too…
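
    For illustration, an ordered list with bracketed counts like the one described in this comment could be produced as follows. It is only a sketch: the `{ name, count }` data shape, the `urlFor` callback and the function names are assumptions, not the commenter's markup.

```javascript
// Escape characters that are unsafe in HTML text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical renderer: `tags` is an array of { name, count } objects and
// `urlFor` is an assumed callback that returns the link target for a tag name.
function renderWeightedList(tags, urlFor) {
  const items = tags.map(
    ({ name, count }) =>
      `  <li><a href="${escapeHtml(urlFor(name))}">${escapeHtml(name)}</a> (${count})</li>`
  );
  return ["<ol>", ...items, "</ol>"].join("\n");
}
```

    Calling `renderWeightedList([{ name: "austin", count: 344 }], (n) => "/tags/" + n)` yields a one-item list reading "austin (344)", so the count is available to every user rather than only to screen readers.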

  12. Rob Ellis

    I absolutely love this post, it’s great.
    I too feel that there should be a little more control in the size of the text, perhaps a JavaScript solution that pulls the number of posts from an inner span and dynamically assigns a weight based on some thresholds, min size, max size etc.
    The offset text is great for SEO.

    Again well done, perhaps when I have a bit more time I will put some code together.

    Rob Ellis
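
    The JavaScript approach suggested here might look roughly like the sketch below. It reads the count from the inner span, then interpolates linearly between a minimum and maximum size instead of using fixed thresholds. The percentage sizes, the function names, and the assumption that each list item holds a span whose text starts with the count (for example "46 photos are tagged with") are all hypothetical.

```javascript
// Pull the first integer out of text such as "46 photos are tagged with".
function parseCount(text) {
  const match = /\d[\d,]*/.exec(text);
  return match ? Number(match[0].replace(/,/g, "")) : 0;
}

// Linearly interpolate a font size (in percent) between minSize and maxSize.
function sizeForCount(count, minCount, maxCount, minSize = 80, maxSize = 200) {
  if (maxCount <= minCount) return minSize;
  const clamped = Math.min(Math.max(count, minCount), maxCount);
  const t = (clamped - minCount) / (maxCount - minCount);
  return Math.round(minSize + t * (maxSize - minSize));
}

// Browser-only part: `root` is assumed to be the list element of the cloud,
// with one <span> per <li> holding the count text.
function applySizes(root) {
  const items = Array.from(root.querySelectorAll("li"));
  const counts = items.map((li) => {
    const span = li.querySelector("span");
    return span ? parseCount(span.textContent) : 0;
  });
  const min = Math.min(...counts);
  const max = Math.max(...counts);
  items.forEach((li, i) => {
    li.style.fontSize = `${sizeForCount(counts[i], min, max)}%`;
  });
}
```

    In a browser this would be run once the list exists, for instance `applySizes(document.querySelector(".tag-cloud"))`, where the `.tag-cloud` selector is again an assumption about the page.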

  13. John Allsopp

    I did a bit of work on this issue for a possible microformat a couple of months back.

    http://microformatique.com/?page_id=34

    It’s a reasonably complex but, I think, interesting issue. It demonstrates how superficial a lot of HTML-based development is – little if any real attention is paid to the underlying semantics (with a couple of honorable exceptions, as noted).

    john

  14. /T

    Another really serious accessibility issue with your tag cloud is that Flickr collapses multiple words into one. thereflectionofthehairofchristianheilmann is 41 letters and will not fit onto one line of a standard braille display with 40 modules.

  15. Andy Hawkes

    I do like the idea of the semantic markup used, but I can’t help but think that the ordering of the list items should be by tag weight rather than alphabetical.

    The entire concept of your method is to make a tag cloud work for users who cannot see the font size variance whilst applying a solid semantic structure, yet you have semantically ordered the data by a totally different measure for those users than you have for those who see the tag cloud as intended.

    I get that the content provides the context in terms of the “46 images tagged with” text, but it’s the use of the ordered list that is creating that dull nagging feeling in the back of my head (or maybe it’s down to the fact that we had our office Christmas party last night…) as the ordering is clearly different for the two audiences.

    Other than that, it’s a very simple approach.

  16. Chris Messina

    When John Allsopp was working on his tagcloud effort, I started documenting some of the syntaxes you had done (though didn’t get very far). I did, however, come up with my own proposals.

    Essentially it occurred to me that we’re working with two constraints (among others): (1) to be semantically meaningful in the source code such that it’s machine parseable but not human-offensive and (2) to reflect the relative tag weight in the visual, “graph” display (after all, a tag “cloud” is really just a visualization of data, like a pie chart).

    In order to do this without being verbose in markup, while maintaining semantics, and while also allowing for default renderings to make sense in graph sense (think about a tagcloud on a cell phone browser — i.e. without styles applied), I realized that all this nonsense that we’re really looking at an ordered list and that presentation is not necessary is completely false. Indeed it is the rendering of the data that makes tagclouds useful — as a summary of the relationships between multiple items in a set. Additionally, to presume that tagclouds only refer to popularity (as opposed to prevalence or frequency) is also a terrible mistake: piecharts don’t only refer to how much apple strudel one can eat!

    Anyway, I chose a path that would be both semantically accurate, somewhat condensed and universally accessible. It can be improved, surely, but I think the combination of em, strong, big and small tags can be rather flexible at expressing this graphed data.

    In any case, I’d love to see folks revisit their proposals given the constraints I’ve discussed above.

  17. Nicolas Hoizey

    I did also think a lot about these accessible tag clouds a while ago, in French for those who can read it: http://www.gasteroprod.com/comment-faire-un-tag-cloud-nuage-de-tags-ou-d-etiquettes-accessible.html

    Like Ben Ward, I also chose to use EM and STRONG elements, but without nesting them. I feel it to be as improper as nesting EM with another EM.

    As noted by Andy Hawkes, I am really not satisfied with an alphabetically ordered list, especially expressed as an accessibility improvement, because it is popularity that should be used as the ordering criterion. Having not yet figured out a simple JavaScript that could take a popularity ordered list and produce an alphabetically ordered one only for “visual” user agents, I still propose both lists on my page dedicated to tags: http://www.gasteroprod.com/tags/

    I think there is a need (and I would be really happy to contribute) for a microformat for tagclouds, starting with John Allsopp’s work (also available here http://microformats.org/wiki/tagcloud-brainstorming) and all your comments!
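
    The reordering step mentioned in this comment, taking a popularity-ordered list and producing an alphabetical one for visual display, comes down to a sort on the tag name. Below is a minimal sketch assuming the tags are already available as `{ name, count }` objects. That data shape and the function name are assumptions, and reading the tags from the page or writing them back is left out.

```javascript
// Return a new array sorted alphabetically by tag name, ignoring case.
// The popularity-ordered input array is left untouched.
function alphabetical(tags) {
  return [...tags].sort((a, b) =>
    a.name.localeCompare(b.name, undefined, { sensitivity: "base" })
  );
}
```

    Keeping the source order by popularity and reordering only in script would mean user agents without JavaScript still receive the popularity-ordered list.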

  18. ampz

    I think it is better to use a title attribute on the “a” tag instead of the off-left technique. Your code will look cleaner, and the information (currently in the off-left span) will be available to more than just screen readers.

  19. Kriss

    I’d suggest using “semantic css” to hide the information irrelevant to visual browsers, but leave it accessible to aural browsers:

    @media screen {
    .tag-cloud SPAN {display:none;}
    }

  20. Ben Palmer

    Great discussion on how to properly present tag clouds. To be honest the post and comments have got me thinking about what, semantically, is the best result.

    I feel that nested em’s may still be a viable solution but look very messy.

    Would Googlebot also look at something nested in 3 em’s with more focus than something within one? We’ll never know…

  21. John Smith

    The guy who wrote the article is right in the first part. But the conclusion and his recommendations are – from my point of view – wrong.

    First:

    - From my point of view, three different levels are enough for tag clouds.
    - For three levels, the weightings normal, emphasised and strong are perfect from the semantic perspective.
    - Using CSS classes to mark the importance is bad, even if their names are semantic, because without CSS all levels are the same.

    So ask yourself: do you really need more than 3 levels for something boring like a tag cloud? If yes, you have to deal with classes or ugly nested tag combinations to get something which is as close as possible to semantic stuff.

    In all other situations you might prefer a truly semantic solution based on an ordered list and the three levels normal, emphasised and strong.

    Have a nice day.

    John

  22. Mark Stewart

    Font sizing in ems (relative) requires a nominal static base font size. This tames browser markup and keeps your cloud inside its expected page display space. There may be browser idiosyncrasy involved in body vs. cloud tag application of the static font size. I always stick the base font size in the HTML body attributes string.

    A good idea already mentioned in the comments is to label your size styles with numbers, even the numerical values that size your type in, say, ems. This might help also if your cloud moves from a presentational hack to a local web real-time page access display.

    Now, if your base font size is 16px then 0.5em through 1.5em (or whatever increment you choose) can represent a 1-10 scale of cloud tag appreciation. Also, you can now shape your cloud size (line-height, etc) with greater accuracy.
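
    As a worked example of the 1-10 scale described here, the levels can be spread evenly across 0.5em to 1.5em. The function name `emForLevel`, the even spacing and the rounding to two decimals are assumptions made for this sketch.

```javascript
// Map a level (1..steps) evenly onto the range minEm..maxEm,
// rounded to two decimal places.
function emForLevel(level, steps = 10, minEm = 0.5, maxEm = 1.5) {
  if (!Number.isInteger(steps) || steps < 2) {
    throw new RangeError("steps must be an integer of at least 2");
  }
  if (!Number.isInteger(level) || level < 1 || level > steps) {
    throw new RangeError(`level must be an integer from 1 to ${steps}`);
  }
  const em = minEm + ((level - 1) * (maxEm - minEm)) / (steps - 1);
  return Number(em.toFixed(2));
}
```

    With a 16px base font size, level 1 (0.5em) renders at 8px and level 10 (1.5em) at 24px, which keeps the overall size of the cloud predictable.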
