Recently, I stumbled across a quote from a Mozilla developer about the tension inherent in creating standards:
Implementations and specifications have to do a delicate dance together. You don’t want implementations to happen before the specification is finished, because people start depending on the details of implementations and that constrains the specification. However, you also don’t want the specification to be finished before there are implementations and author experience with those implementations, because you need the feedback. There is unavoidable tension here, but we just have to muddle on through.
Keep this quote in the back of your mind, and let me explain how HTML5 came to be.
This book is about HTML5, not previous versions of HTML, and not any version of XHTML. But to understand the history of HTML5 and the motivations behind it, you need to understand a few technical details first. Specifically, MIME types.
Every time your web browser requests a page, the web server sends “headers” before it sends the actual page markup. These headers are normally invisible, although there are web development tools that will make them visible if you’re interested. But the headers are important, because they tell your browser how to interpret the page markup that follows. The most important header is called Content-Type, and it looks like this:
Content-Type: text/html
“text/html” is called the “content type” or “MIME type” of the page. This header is the only thing that determines what a particular resource truly is, and therefore how it should be rendered. Images have their own MIME types (image/jpeg for JPEG images, image/png for PNG images, and so on). JavaScript files have their own MIME type. CSS stylesheets have their own MIME type. Everything has its own MIME type. The web runs on MIME types.
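If you'd like to watch the header arrive ahead of the markup, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular web server works: it starts a throwaway server on your own machine, and the names `PAGE`, `PageHandler`, and `fetch_page` are made up for this example. The handler sends the Content-Type header first and the page markup second, and the client reads them back in that order.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page markup, invented for this sketch.
PAGE = b"<!DOCTYPE html><title>Hello</title><p>Hello, world</p>"


class PageHandler(BaseHTTPRequestHandler):
    """Toy handler that serves the same HTML page for every GET request."""

    def do_GET(self):
        self.send_response(200)
        # The headers go out before the markup does.
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        # Only now is the actual page markup written.
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_page():
    """Start a local server, request one page, return (content type, body)."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        connection = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        content_type = response.getheader("Content-Type")
        body = response.read()
        connection.close()
        return content_type, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    content_type, body = fetch_page()
    print("Content-Type:", content_type)
    print(body.decode("ascii"))
```

The client never inspects the markup to decide what it received. It simply reads the Content-Type header that came with the response.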
Of course, reality is more complicated than that. The first generation of web servers (and I’m talking web servers from 1993) didn’t send the Content-Type header because it didn’t exist yet. (It wasn’t invented until 1994.) For compatibility reasons that date all the way back to 1993, some popular web browsers will ignore the Content-Type header under certain circumstances. (This is called “content sniffing.”) But as a general rule of thumb, everything you’ve ever looked at on the web — HTML pages, images, scripts, videos, PDFs, anything with a URL — has been served to you with a specific MIME type in the Content-Type header.
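To make content sniffing a little more concrete, here is a deliberately simplified sketch in Python. It is a toy heuristic of my own, not the algorithm any real browser uses: it only covers the case where the Content-Type header is missing entirely, and the function name `effective_type`, the handful of signatures it checks, and the `application/octet-stream` fallback are all assumptions made for the example.

```python
# Leading bytes that identify two common image formats.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def effective_type(content_type, body):
    """Return the MIME type to use for a response.

    Trusts the Content-Type header when there is one. Only when the header
    is missing does this toy version peek at the first bytes of the body.
    """
    if content_type:
        # Drop parameters such as "; charset=utf-8" and normalize the case.
        return content_type.split(";")[0].strip().lower()

    if body.startswith(PNG_SIGNATURE):
        return "image/png"
    if body.startswith(JPEG_SIGNATURE):
        return "image/jpeg"

    # Look for an HTML doctype or root tag near the start of the body.
    head = body[:512].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return "text/html"

    # Assumed fallback for "unknown bytes"; real browsers are more elaborate.
    return "application/octet-stream"


if __name__ == "__main__":
    print(effective_type("text/html; charset=utf-8", b""))
    print(effective_type(None, PNG_SIGNATURE + b"rest of the image"))
    print(effective_type(None, b"<!DOCTYPE html><p>a 1993-style response</p>"))
```

Even in this toy form, the order of the checks is the point: the header comes first, and guessing from the bytes is a fallback for the cases where the header can't be relied on.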