Extensible Hypertext Markup Language, or XHTML, is a markup language that has the same expressive possibilities as HTML, but a stricter syntax. Whereas HTML was an application of SGML, a very flexible markup language, XHTML is an application of XML, a more restrictive subset of SGML. XHTML 1.0 became a World Wide Web Consortium (W3C) Recommendation on January 26, 2000.
XHTML is the successor to, and the current version of, HTML. The need for a stricter version of HTML arose primarily because web content must now be delivered to many devices other than traditional computers (mobile devices, for example), which cannot spare the extra resources needed to support HTML's leniency. (The looser the syntax of a language, the harder it is to process.) A DTD defines the rules of XHTML, against which documents can be checked.
Most recent versions of popular web browsers render XHTML properly, and many older browsers will also render it, because XHTML is mostly a subset of HTML and most browsers do not require valid HTML. Similarly, almost all web browsers that are compatible with XHTML also render HTML properly. Some say this is slowing the switch from HTML to XHTML.
XHTML's true power is realized when it is used in conjunction with Cascading Style Sheets, which makes the separation of style and content an integral part of the web page's code, and when different XML applications (such as MathML or SVG) are mixed in a single document.
The changes from HTML to transitional XHTML are minor, and are mainly to increase conformance with XML. The most important change is the requirement that the document be well-formed: all tags must be closed and properly nested. Additionally, in XHTML, all element and attribute names must be written in lowercase. This is in direct contrast to established traditions which began around the time of HTML 2.0, when most people preferred uppercase tags. In XHTML, all attribute values, even numerical ones, must be quoted. (This is not mandatory in SGML, and hence in HTML, where quotes are not required if the value consists only of alphanumeric and certain allowed special characters.) All elements must also be closed, including empty elements such as br. This can be done by adding a closing slash to the start tag: <img … /> and <br />. Attribute minimization (e.g., <option selected>) is also prohibited; instead, use <option selected="selected">. More differences are detailed in the W3C XHTML specification.
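The syntax rules above can be tried out with any XML parser. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree as the parser; the helper name is_well_formed and the sample fragments are illustrative and do not come from the XHTML specification.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Pairs of HTML-style markup and the XHTML form described in the text.
samples = [
    '<p>one<br>two</p>',                      # empty element left open
    '<p>one<br />two</p>',                    # closed with a trailing slash
    '<option selected>A</option>',            # minimized attribute
    '<option selected="selected">A</option>', # attribute written out in full
    '<td colspan=2>x</td>',                   # unquoted attribute value
    '<td colspan="2">x</td>',                 # quoted attribute value
]

for fragment in samples:
    print(is_well_formed(fragment), fragment)
```

A generic XML parser checks well-formedness only. It accepts uppercase names such as <P>, for example, so the lowercase rule has to be checked against the XHTML DTD or by a separate test.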
Versions of XHTML
The original XHTML W3C Recommendation, XHTML 1.0, was simply a reformulation of HTML 4.01 in XML. There are three different 'flavours' of XHTML 1.0, each equal in scope to its respective HTML 4.01 version.
- XHTML 1.0 Strict corresponds to HTML 4.01 Strict and omits the elements and attributes that HTML 4.01 deprecated.
- XHTML 1.0 Transitional is designed for an easier transition from HTML, and allows some common elements and attributes not found in XHTML 1.0 Strict to be used, such as
- XHTML 1.0 Frameset allows the use of HTML framesets.
The most recent XHTML W3C Recommendation is XHTML 1.1: Module-based XHTML. Authors can import additional features (such as framesets) into their markup. This version also allows for ruby markup support, needed for East-Asian languages (especially CJK).
The W3C recommends this specification for all new web pages.
The XHTML 2.0 draft specification
Work on XHTML 2.0 is, as of 2004, still underway; in fact, the DTD has not even been authored yet. The XHTML 2.0 draft is controversial because it breaks backwards compatibility with all previous versions, and is therefore in effect a new markup language created to circumvent (X)HTML's limitations rather than being simply a new version.
The XHTML 2.0 draft introduces the following changes to the HTML family of markup languages:
- HTML forms will be replaced by XForms.
- HTML frames will be replaced by XFrames.
- The HTML Document Object Model will be replaced by XML Events, which uses the XML Document Object Model.
- A new list element, the <nl> element, will be included in order to specifically designate a list as a navigation list. This will be useful in creating nested menus, which are currently created by a wide variety of means.
- Any element will be able to contain a hyperlink, e.g.,
- Any element will be able to reference alternative media with the src attribute, e.g., <p src="lbridge.jpg" type="image/jpeg">London Bridge</p> will replace <img src="lbridge.jpg" alt="London Bridge" />.
- The <img src="" alt="" /> element has been removed in favor of <object type="MIME/ContentType" src="">Alt</object>.
- The heading elements (i.e. <h3>, etc.) will be deprecated in favour of the single element <h>. Levels of headings will instead be indicated by the nested <section> elements each with their own
- The presentational elements <tt>, still allowed in XHTML 1.x (even Strict), will be absent from XHTML 2.0. The only presentational elements remaining will be <sub> for superscript and subscript respectively.
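The change to headings in the list above means that a heading's level is no longer written in the tag name but follows from how deeply its <section> is nested. The following is a minimal sketch of that idea in Python; the sample markup, the function name heading_levels, and the use of xml.etree.ElementTree are assumptions made for illustration and are not taken from the draft.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML 2.0-style fragment (no namespace, for brevity).
DOC = """<body>
  <section>
    <h>Animals</h>
    <section>
      <h>Mammals</h>
      <section><h>Whales</h></section>
    </section>
    <section><h>Birds</h></section>
  </section>
</body>"""


def heading_levels(element, depth=0):
    """Yield (level, text) for each <h> in document order.

    The level is the number of <section> elements enclosing the heading.
    """
    for child in element:
        if child.tag == "h":
            yield depth, child.text
        elif child.tag == "section":
            yield from heading_levels(child, depth + 1)
        else:
            yield from heading_levels(child, depth)


levels = list(heading_levels(ET.fromstring(DOC)))
for level, text in levels:
    print("  " * (level - 1) + f"level {level}: {text}")
```

Under this reading, moving a <section> deeper or shallower changes the level of every heading inside it without any tag being renamed.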
Others in the XHTML family
- XHTML Basic: A special "light" version of XHTML for devices which cannot use the full XHTML set, primarily used on handhelds such as mobile phones. This is the intended replacement for WML and C-HTML.
- XHTML Mobile Profile: Based on XHTML Basic, this OMA (Open Mobile Alliance, www.oma.org) effort targets mobile phones specifically by adding mobile phone-specific elements to XHTML Basic.
Validating XHTML documents
An XHTML document that conforms to the XHTML specification is said to be a valid document. In a perfect world, all browsers would follow the web standards and valid documents would predictably render on every browser and platform. Although validating your XHTML does not ensure cross-browser compatibility, it is recommended. A document can be checked for validity with the W3C Markup Validation Service.
For a document to validate, it must contain a Document Type Declaration, or DOCTYPE. A DOCTYPE declares to the browser what Document Type Definition (DTD) the document conforms to. The DOCTYPE should be placed at the very beginning of an XHTML document. These are the most common XHTML DOCTYPEs:
- XHTML 1.0 Strict
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
- XHTML 1.0 Transitional
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
- XHTML 1.0 Frameset
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
- XHTML 1.1
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
A character encoding must also be specified at the beginning of an XHTML document. Once an XHTML document has a DOCTYPE and character encoding specified, it can be run through a validator (such as the W3C Markup Validation Service) to see if it meets the standard. Validation will locate and describe errors in XHTML markup.
When a page is validated using the W3C Markup Validation Service, the W3C returns a small icon that you may place on your document to show that it conforms to web standards. The W3C also offers validation for CSS.
Some of the most common errors in XHTML are:
- Not closing empty elements (elements without closing tags)
- Not closing non-empty elements
<p>This is a paragraph.<p>This is another paragraph.
<p>This is a paragraph.</p><p>This is another paragraph.</p>
- Improperly nesting elements (elements must be closed in reverse order)
<em><strong>This is some text.</em></strong>
<em><strong>This is some text.</strong></em>
- Not specifying alternate text for images (using the alt attribute, which helps make pages accessible to devices that do not load images and to screen readers for the blind)
<img src="/skins/common/images/poweredby_mediawiki_88x31.png" />
<img src="/skins/common/images/poweredby_mediawiki_88x31.png" alt="MediaWiki" />
- Putting text directly in the body of the document
<body>Welcome to my page.</body>
<body><p>Welcome to my page.</p></body>
- Nesting block-level elements within inline elements
- Not putting quotation marks around attribute values
- Using the ampersand outside of entities (use &amp; to display the ampersand character)
<title>Cars & Trucks</title>
<title>Cars &amp; Trucks</title>
- Using uppercase tag names and/or tag attributes
<BODY><P>The Best Page Ever</P></BODY>
<body><p>The Best Page Ever</p></body>
- Attribute minimization
This is not an exhaustive list, but gives a general sense of errors that XHTML coders often make.
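Several of the errors listed above can be detected mechanically. The following is a rough sketch in Python of such a check, assuming the standard library's xml.etree.ElementTree; the function name find_problems and its messages are hypothetical, it covers only a few of the listed errors, and it is not a substitute for validation against a DTD.

```python
import xml.etree.ElementTree as ET


def find_problems(markup: str) -> list:
    """Report a few common XHTML errors in a fragment (illustrative only)."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError as err:
        # Unclosed or badly nested elements, unquoted or minimized
        # attributes, and bare ampersands all end up here.
        return [f"parse error: {err}"]

    problems = []
    for element in root.iter():
        if element.tag != element.tag.lower():
            problems.append(f"uppercase element name: {element.tag}")
        for name in element.attrib:
            if name != name.lower():
                problems.append(f"uppercase attribute name: {name}")
        if element.tag.lower() == "img" and "alt" not in element.attrib:
            problems.append("img element without alt attribute")
    return problems


for sample in [
    '<em><strong>This is some text.</em></strong>',
    '<BODY><P>The Best Page Ever</P></BODY>',
    '<p><img src="logo.png" /></p>',
    '<body><p>Welcome to my page.</p></body>',
]:
    print(sample, "->", find_problems(sample))
```

The sketch assumes fragments without an XML namespace. Errors that depend on the content model, such as text placed directly in the body or block-level elements nested inside inline ones, require a validating parser.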
XHTML 1.x is backward compatible with HTML when served as text/html. However, this approach has associated problems, especially in Internet Explorer. For more information, please refer to HTML, HTTP, and MIME standards in the Internet Explorer article.