Incite A Riot

36 Comments

  1. Mark Stickley

    I’m all for that – good call!

    What about simultaneously campaigning for the reinstatement of the ‘dialog’ element, though (given that a ‘dialogue’ element is a lost cause)? That way you could have semantically meaningful conversations AND assistive tech wouldn’t read out a line number for each speaker.

  2. Yep, I think you’re right. I don’t understand why the traditional definition of cite (contains a citation or a reference to other sources), which makes no reference to its presentation, isn’t used.

    Anyway, backwards compatibility should be the norm.

  3. First!

    I mean…

    +1!

    OK, let me try to put something more meaningful here.
    I like the ol/li/cite/q example. It’s semantics put to good use, and that should be more than enough for marking up a dialog and doing crazy styling stuff.

    Just one minor detail:
    In the dl/dt/dd example, the colons after the speaker name are “floating” in the void (well, the dl, to be fair). They are outside both dt and dd. That leaves us with another +1 for the ol/li/cite/q example, as the colon, at least, is not “floating” in the void.
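
    For reference, the two patterns being compared might look roughly like this. It is only a sketch, not the article’s exact markup: Alice’s line is the one quoted elsewhere in this thread, while Bob and his reply are hypothetical placeholders.

    ```html
    <!-- dl/dt/dd: each colon sits directly inside the dl, outside both dt and dd -->
    <dl>
      <dt>Alice</dt>: <dd>I think Eve is watching.</dd>
      <dt>Bob</dt>: <dd>(hypothetical reply)</dd>
    </dl>

    <!-- ol/li/cite/q: each colon is contained by its li -->
    <ol>
      <li><cite>Alice</cite>: <q>I think Eve is watching.</q></li>
      <li><cite>Bob</cite>: <q>(hypothetical reply)</q></li>
    </ol>
    ```

    In the second version every piece of text, colon included, belongs to a list item, so nothing is left as a stray child of the list itself.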

  4. Ben Matewe

    I agree civil disobedience is the way forward. Proposing changes for theoretical reasons alone is not practical. As long as what you are doing is practical and considers users over the other parties then it’s good for me. I always try and remember who is going to be served by the markup I am writing.

  5. Yoichiro Akiyama

    I totally agree with you Jeremy.

    The most basic and important thing is “what is the purpose of writing citations?”

    When we research something, it is very important to note the sources it was based on, so that anyone can track back the sources and verify the result (or process) of the research.

    Most of them might be “title of works”, but we should be aware that some of them are not. For example, in fields like history, folklore, trials, journalism… sometimes the primary sources are individuals’ statements. In that case, “title of works” are going to be secondary or tertiary sources.

    Since citations are for tracking the roots of evidence, I don’t think it is necessary to exclude those “primary sources” (people’s bare words). It could be an extremely important point for professionals in those fields.

  6. Grant

    Excellent Jeremy. At first I thought this article might be some crazy rant of yours, but it makes perfect sense all the way, and I’m glad someone has said it. Keep up the good work on challenging the forced boundaries.

  7. Jylan Wynne

    This just goes to show that the W3C does make pretty stupid mistakes sometimes. I’m amazed that they’re actually advising people that the <b> tag is ok to use…

  8. Andy Ford

    “writing W3C specifications and smoking crack are not mutually exclusive activities” … still chuckling over that but now I’m not sure how to mark it up!

    Thanks for your semantic activism.

    @hixie – Leave CITE alone!

  9. DN

    YES! Precisely.

    Seeing recommendations for dialogue that include DLs gives me the DTs. And some of the things in the HTML5 spec are similar bouts of needless vagueness that bleed into outright, imposed doublethink.

  10. Stephen Hay

    Wonderful article which shows the dark side of HTML5 at this point in time (and Jeremy has restricted himself to `cite`; there’s much more weirdness than that in HTML5).

    People have been citing persons (e.g. what she said) as well as works (e.g. what was said in her book) for ages outside of the web domain. Removing the citing of persons is ridiculous. The way HTML5 is trying to change the meaning of `cite` and other pre-web semantics is ignorant and dangerous.

  11. bruce

    I decided a while ago that, whatever the spec says here, I’ll continue using cite for names, just as I have under HTML 4.

    Breaking backwards-compatibility here, in an unenforceable way (because no validator can tell whether you’re citing a name or a work) seems to me one of the unnecessary restrictions that HTML5 has done a good job of avoiding elsewhere.

  12. trovster

    <can-o-worms>What about using <address> to mark up a physical address? Makes perfect sense to me and IMO follows the same reasoning you make about <cite> in the article.

  13. Matt Newman

    I found the following to be very contentious:

    The entire justification for the change boils down to this line of reasoning:

    1. Given that: titles of works are often italicised and
    2. given that: people’s names are not often italicised and
    3. given that: most browsers italicise the contents of the cite element,
    4. therefore: the cite element should not be used to mark up people’s names.

    Isn’t that just your opinion of what the reasoning was?

    If that is cited somewhere as being the reasoning then you COULD use that as a strong base for ripping the proposed usage of <cite> to shreds, possibly along with the argument presented by Yoichiro Akiyama (#c003493).

    But I have to ask – are you not missing the point? If you’ve got nothing to link to (i.e. you’re ‘citing’ a person) then what you’re actually doing is quoting them and as the spec says: <q cite=“http://www.whatwg.org/specs/web-apps/current-work/multipage/text-level-semantics.html#the-cite-element”>A citation is not a quote (for which the q element is appropriate).</q>

    Why do you need that markup around a name – what does it manage to convey that the simple flow of the document doesn’t?

    I’m all for revolution where it’s for good reason and its goal is valid but I think something a little twisted has happened here.

    I’m all for healthy debate but you seem to have jumped straight past argument and rebuttal right onto revolution.

  14. Steerpike

    Aside from the enjoyable Irish fire and humour I thought the most interesting part of this article was that it brought to my attention this part of the Priority of Constituencies that I hadn’t taken notice of before:

    In case of conflict, consider users over authors over implementors over specifiers over theoretical purity.

    The full suggestion from the spec is:

    consider users over authors over implementors over specifiers over theoretical purity.

    This seems like something that is unlikely to work in reality. Is it really the expectation that the browser makers are seriously going to prioritise users and authors over their own implementations of the actual technical specifications? This strikes me as incredibly unlikely and naive given browser and web history. I see there are a few people who work for browser makers lurking in the comments (hey Bruce); can anyone expand a bit on what the practical implications of this part of the spec are for the browser makers and how likely they would be to abide by it (or even how on earth they could abide by it if they wanted to)?

  15. John Faulds

    People have been citing persons (e.g. what she said) as well as works (e.g. what was said in her book) for ages

    But this is almost always a case of someone referring to the words of someone else to help make a point relating to their own message; a citation, or attribution, is always contributing to a point or argument being made.

    I don’t think a conversation, or a dialogue, is the same type of communication, and for that reason the ol li cite q example doesn’t sit well with me. I think dialogue needs its own element, but if we’re going to luck out in that area, then I guess we have to take the next best option, and if it turns out that the majority of authors go for it, then I guess I’ll follow suit as there probably won’t be any other option to consider.

  16. Chris Emerson

    I don’t think in the very first example you are defining ‘Alice’ as ‘I think Eve is watching’ – but you are defining what Alice SAYS as that. Think about it that way and you still have a definition, and it makes sense.

    Also,

    “In some cases, the b element might be appropriate for names

    I believe the colloquial response to this is a combination of the letters W, T and F, followed by a question mark.”

    Why not? It is a typographic convention in scripts for example to have the actor/character name in a bold typeface, and use of the b tag is for exactly that – typographic convention.

    “So we can disobey the specification without fear of invalidating our documents.”

    This is wrong – validation isn’t just about using an automated tool to ‘validate’ the document – it still has to follow the spec. Just because an online tool says a document is valid, doesn’t necessarily make it so – it is just 1 part of checking whether a document is valid.

  17. Matt Newman

    “I see little point in retrospectively making that an invalid use.”

    Is that what would happen?

    Don’t you actually validate against the doctype that you specify?

    So imho any element’s usage could be redefined without retrospectively breaking how it used to be, because it is only validated against the doctype specified with the document itself.

  18. Rich Clark

    Jeremy,

    Brilliant article, I think almost everyone I’ve spoken to about this is all for keeping cite as is.

    Now, I don’t know why, but as I was reading this I kept thinking about you reading this aloud. This made me think you should record it Jackanory style – “A tale of HTML5 woe…”.

    In fact maybe 24ways articles could all be served as audio? What do you say, Drew?

    Ta

    Rich

  19. bruce

    Zcorpan, while Dan might wish he hadn’t written that, he did write it and so there are lots of pages with names in `cite` elements.

    I see little point in retrospectively making that an invalid use.

  20. Richard

    Great, so as a user and author I am officially superior to a browser. That makes me feel good, although only if I ignore the nagging “how good is Internet Explorer” in the back of my mind.

    The backwards compatibilitiness (sorry) of any standard must take precedence over individual parts of the standard, therefore cite must still be valid for the name of the person being cited. No need for any civil disobedience after all.

  21. Gingerskhan

    Semantic markup is intended for computers to better “understand” our content so that they can do something useful with it. At the most basic level, this would involve a browser or assistive technology presenting the content to us in an appropriate way.

    Rebelling against the specifications at the presentation level may have little consequence.

    However, as we move further into the semantic web where “intelligent” programs are trying to infer meaning from our content it would be useful if authors adhered to the specifications, but as anyone who has written programs for the semantic web will know, authors often deviate from the specifications.

    The semantic application has to perform extra checks on the content to ensure it fits with what is expected. The actual markup used in many cases is ignored.

    So if you really want to incite a riot, why even bother with semantic markup?

  22. Erik Vorhes

    Thank you for articulating what I’ve been thinking, Jeremy!

    I had been agitating for a redefinition of cite since May or so, on the WHATWG mailing list, and it has frequently felt like talking to a brick wall. I’ve effectively given up trying to change the spec.

  23. zcorpan

    Bruce, I think you misread the quote. Dan Connolly added cite to HTML2 with a definition very similar to the one in HTML5. Dan did not change the definition in HTML4 — the (previous) HTML WG did.

  24. Florent V.

    The WHAT-WG changed a few definitions and now you can’t micromanage insignificant semantics the way you wanted to. This is really sad.

    Seriously:

    - The idea that there should be a defined semantic construct for everything in HTML is flawed.

    - DIALOG was removed because of that. If HTML5 defines DIALOG, then it is entitled to define roughly 50 more semantic constructs inherited from existing print or online publications. BIBLIOGRAPHY, COMMENTS, WHATEVER. The choice of the WHAT-WG is to not offer specific semantics for every relatively popular use case out there, as that would make the spec more bloated (and most authors would just ignore those constructs and never actually use them). I think that’s a sound decision.

    - Obviously the spec still has some strange constructs. The transformation of DT and DD into a generic means to associate text content to an element (text to text, text to image) is a strange move. If such a generic construct was needed, creating a new element for it would have been better, in my opinion.

    - I won’t cry over the restriction in meaning of CITE. I was already using it for publications only, not people. One difference between people and publications is that you will want to always tag publication names with CITE, while you won’t want to always highlight a person’s name. Highlighting a person’s name is a more discretionary move, so resorting to neutral elements (SPAN) or an occasional STRONG element is a good move in my opinion.

    - I consider your use of an OL list to be some kind of abuse of that element. Likewise for the use of UL/LI for marking up comments in this very page. We should stop putting OL and UL everywhere because it’s “more semantic” (ha!). We should think about the benefits and inconveniences of the (loose) semantics we use when doing that. One funny thing about that trend of over-using lists is that it’s mostly done by English-speaking writers and coders, while the French ones don’t seem to do it (or not as much). I think it may come down to a few articles in English by influential HTML-CSS bloggers.

    I’ll respond to your riot incitation with another: stop obsessing over semantic micromanagement, people! Also, test in screen readers, that helps avoid problematic semantic constructs (though screen readers may be wrong, too).

  25. Chris

    I’m not sure your use of <cite> is semantically-correct. A citation refers to a source, not an author. Sometimes a source contains the author’s name, such as in academic referencing (e.g. Cormack (1994, pp.32-33)).

    But most of the time it will refer to the title of the document.

    Using <cite> to encode who spoke the text inside a <blockquote> doesn’t seem semantically correct. Subtly, it would be correct to put the source as <cite>Spoken conversation, Albert Wotshisname</cite>, but not <cite>Albert Wotshisname</cite>.

    What do you think?

  26. Erik Vorhes

    Chris, you’re assuming that the only people worth citing are authors. If you’re dealing with reported speech, you need to cite the person, as there’s no other object to attribute those words to.

    The point of the WHAT-WG redefinition is to change <cite> into an element for styling purposes only. (I.e., “Book titles are italicized. <cite> by default italicizes text. Therefore, <cite> should be only for titles.”) If you want to use HTML to apply that styling, then, congratulations! HTML5 gives you two elements (<cite> and <i>) whose only purpose is to say “this is italicized text.”

    Because it’s possible to “cite” people in passing as well as in dialog—the list goes on—it would be ideal for the HTML5 specification of <cite> to accommodate the use-cases Jeremy describes. It makes the element more useful: It’s not just a styling hook but something that provides semantic value.

  27. Shelley

    I would suggest filing bugs in the HTML WG bugzilla database, and then following through on the bugs with change proposals if the suggestions in the bugs are not followed.

  28. Matt Newman

    I find myself back here again!

    @Erik et al who think that the whatwg are on the wrong side of this:

    Once again I would say that NOWHERE does the WHATWG suggest its reasoning is based on styling, that was purely this article author’s opinion!

    The usage you seem to be promoting as ‘more semantically correct’ is just using <cite> as a way of quoting someone – are the 2 HTML elements we currently have for that purpose insufficient?

    Whilst a common English usage of a citation can be to refer to a person, the scholarly usage is specifically about sources. If you’re referring to something a person said in conversation, what problem do you have with considering that a quote rather than a citation?

  29. Erik Vorhes

    @Matt Newman,

    Have you read the email archives of the WHAT-WG mailing list? If not, there are over six months’ worth of evidence showing that the WHAT-WG’s definition of <cite> has everything to do with styling. If you’re curious, the discussion seems to have started on June 3.

    The usage Jeremy, others, and I advocate is not to use <cite> as one would use <blockquote> or <q> but to attribute a quote to its source (be it a person or whatever strikes your fancy) in a semantically meaningful way. I was in academia long enough to understand the difference between a quotation and a citation, thank you very much.
