Webmaster by Chuck Musciano

Moving to a new standard -- HTML gets a makeover, part three

Interesting new attributes in HTML 4.0

June 1998

HTML 4.0 is the version we've all been waiting for. In the third of a three-part series, Chuck Musciano looks at a number of small but important enhancements in HTML 4.0. What are the new attributes, and which tags are being declared obsolete? (2,000 words)

We're wrapping up our tour through HTML 4.0 this month. In April we went through all the new table features, in May we followed up with all the new forms features, and this month we'll close out with a look at a number of small, but important, additions to HTML in the 4.0 standard.

I'll preface this column with a warning: Most of what we'll cover is not yet implemented by any browser. While you won't be able to test all of the things I show you this month, you will be able to begin planning how to use them when the next browser versions are shipped. With a little practice now, you can be the first on your server to use these new extensions later this year!

A fond farewell
We all have them: favorite little tags that we use over and over, even though we know there are better ways to accomplish the same task. Who hasn't centered a block of text with <center>? Sure, you know you should use the text-align property on a <p> or <div> tag. And isn't <font> so much quicker and easier than defining a style with the font property?

Like a watchful mother ensuring that her kids grow up right, the HTML 4.0 standard has labeled a number of tags and attributes "deprecated." Deprecated tags are on the way out; the deprecated state means that you can still use them for the time being, but you need to begin weaning yourself from them now, before they're officially declared obsolete.

Think it won't happen? Guess again. The HTML 4.0 standard officially declares the <listing>, <plaintext>, and <xmp> tags obsolete and no longer part of the standard. Their use might be allowed by a browser, but any document containing them cannot be labeled as HTML 4.0-compliant.

What's next? Here are the deprecated tags in HTML 4.0, doomed to be obsolete in (presumably) HTML 5.0, along with what you should use instead:

     Deprecated tag          Replacement
     <applet>                The <object> tag
     <basefont>, <font>      Appropriate stylesheet font properties
     <center>                The text-align stylesheet property
     <dir>, <menu>           The <ul> tag with appropriate styles applied
     <isindex>               Conventional forms
     <s>, <strike>, <u>      The text-decoration stylesheet property

Before you start getting all choked up over the loss of your favorite tags, keep in mind that almost all attributes dealing with visual presentation are deprecated as well. Attributes like align, background, bgcolor, height, size, and width are deprecated in most usages, replaced by appropriate stylesheet properties.
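
As a rough sketch of what that weaning looks like, here is the same centered heading written first with deprecated tags and attributes, then with stylesheet properties. The text, color, and size are arbitrary choices of mine, and the two versions will render only approximately alike:

     <!-- Deprecated: presentation baked into the markup -->
     <center><font color="navy" size="5">Welcome</font></center>

     <!-- Preferred: the same intent expressed as style properties -->
     <p style="text-align: center; color: navy; font-size: x-large">Welcome</p>

In a real document you would probably move those properties into a shared stylesheet rule rather than repeating them inline.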

Many authors would argue that these attributes and tags are convenient ways to accomplish common tasks, tasks that will be made more difficult by forcing the use of other tags and style sheets. On the other hand, consistent use of the right tags, attributes, and styles makes your documents easier to maintain and use. Like it or not, the gentle mothering of the HTML 4.0 standard will be keeping you on the straight and narrow, HTML-wise.

And fret not. While the standard will someday eliminate those tags and attributes, you can be sure it will be a long, long time before any browser abandons them. If the standard is like your mother, browsers are like back-alley hoods, trying to make it easy for you to break the rules. The question is, where does your conscience lead you?

Better descriptions
A picture may be worth a thousand words, but HTML 4.0 leaves nothing to chance. It provides a number of ways to better describe almost every element in your document.

The alt attribute allows you to attach a description to the <img> and <area> tags. Some browsers make use of this attribute, displaying the associated text when the mouse passes over the image or area. Nongraphical browsers, like Lynx, use the text in lieu of the image. The problem is that many authors forget to add alt attributes to their images or area definitions.

HTML 4.0 requires that every <img> and <area> tag have an alt attribute. Of course, we all know that browsers will not strictly enforce this rule, but you won't be able to declare your documents HTML 4.0-compliant if you leave alt out of the picture.

If you simply cannot fit your image and area descriptions within the alt attribute, you can now turn to the longdesc attribute. As its name implies, the value of the longdesc attribute is the URL of a document containing the long description of the object. This attribute may be most useful when your image is worth more than a thousand words.
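
Here is a minimal sketch of the two attributes working together. The file names are hypothetical; the point is only that alt carries a short inline description while longdesc points at a separate document:

     <img src="orgchart.gif"
          alt="Company organization chart"
          longdesc="orgchart-description.html">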

Another interesting new attribute is title. This attribute lets you associate a bit of descriptive text with almost any tag. It isn't clear how this text should be used, but browser manufacturers will come up with something. Internet Explorer displays an element's title when the mouse passes over, emulating the "tool tip" help used in many Microsoft tools. Even if the browser ignores the element titles, they make for a useful way to store additional information about any tag right in your document.
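
For instance, you might annotate a link like this. The URL and wording are made up for illustration, and how (or whether) the text appears is up to the browser:

     <a href="tables.html" title="April's column on the new table features">
     the new table features</a>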


International support
As the Web has grown to encompass the world, HTML has grown to include more features that support a broad range of languages and presentation styles. HTML 4.0 is distinctly more international in flavor, thanks in large part to the new lang and dir attributes. Both of these attributes can be applied to almost any HTML tag, allowing you to specify language-specific information for an entire document or just a single word.

The value of the lang attribute is a two-character ISO language code. This code tells the browser which language is being used for the text within the element. Presumably, the browser will alter the way it displays the text to match the common presentation used for that language.

You can further refine the value used for the lang attribute by adding a hyphen and a subcode, usually a country code, to identify a regional dialect. For example, using lang="en" tells the browser that you are using English, while lang="en-us" narrows that down to U.S. English.

In most cases, you should use the lang attribute on the <body> tag of your document to indicate the language used for the entire document. If you have a text element that is in a different language than the rest of your document, add the lang attribute to the element enclosing that text.
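
A small sketch of that pattern, with sample text of my own: the document as a whole is marked as U.S. English, and a single phrase is marked as French.

     <body lang="en-us">
     <p>As the French say, <span lang="fr">c'est la vie</span>.</p>
     </body>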

The dir attribute specifies the direction in which the text is rendered, and accepts values of either ltr (left-to-right) or rtl (right-to-left). The default is ltr.

This attribute helps those authors creating documents in right-to-left languages such as Hebrew or Arabic, ensuring that they will be rendered correctly by the browser. As with the lang attribute, you'll probably want to use the dir attribute with the <body> tag, overriding this document-wide setting for individual tags that might need to be rendered in the opposite direction.
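
Here is one way that might look, assuming a mostly left-to-right document with a single right-to-left paragraph. The Hebrew word (shalom) is written with numeric character references so the example stays plain ASCII:

     <body lang="en" dir="ltr">
     <p>This paragraph reads left to right.</p>
     <p lang="he" dir="rtl">&#1513;&#1500;&#1493;&#1501;</p>
     </body>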

Editing support
Traditionally, HTML has given little support for automated document production. Early versions of HTML included the <nextid> tag to create unique sequence numbers within a document, but few, if any, editing tools used the tag. Until HTML 4.0, no tags existed to support the process of document creation and editing.

Within HTML 4.0 you'll find two new tags: <ins> and <del>. They can be used in almost any context within a document to delimit a region of markup that has been either inserted into or deleted from that document. A conforming browser would use the tags to display the document in an appropriate manner. If, for example, a user wanted to see the end result of his or her edits, the inserted text would be displayed and the deleted text removed. If an editor wanted to see the changes made to a document, the deleted text might be displayed with a line struck through it.

Both of these tags accept two attributes: cite and datetime. The cite attribute supplies the URL of a document explaining why the associated edit was made, and might include other information like the author's name. The datetime attribute indicates the time at which the edit was made. Using these timestamps, a browser might even be able to re-create a version of a document as it existed at a specific point in time.
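
A hedged example of a single tracked change follows. The text, the changes.html document, and the timestamp are all invented; the datetime value uses the YYYY-MM-DDThh:mm:ss form with a time zone offset that the standard calls for:

     <p>The meeting will be held on
     <del cite="changes.html#edit12"
          datetime="1998-06-01T09:30:00-05:00">Tuesday</del>
     <ins cite="changes.html#edit12"
          datetime="1998-06-01T09:30:00-05:00">Thursday</ins>.</p>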

Link targets everywhere
From its earliest versions, HTML has supported what is sometimes called a fragment identifier: a label associated with the <a> tag that allows a URL to jump to a specific spot within a document. Thus, you might place

     <h2><a name="section1">Section 1</a></h2>

in your document, and jump to that spot from another document using

     <a href="document.html#section1">

This is a great idea, but is sometimes tedious to implement, given all the extra <a> tags it requires.

HTML 4.0 makes the creation of fragment identifiers much easier by adding the id attribute to just about every tag in the standard. This attribute defines a name, just like the name attribute, that can be referenced by a URL pointing to the document. To re-create our example in HTML 4.0, you would enter:

     <h2 id="section1">Section 1</h2>

With the ability to name any HTML element, automated tools can more easily extract sections of documents for later processing. For example, by naming the paragraph containing each document's abstract with id="abstract", you can quickly find all your abstracts. In addition, the names created with the id attribute can be used as style selectors, so that a rule applies to that one tag independent of any class-based styles. Thus, if you have two style rules like these

     .blue { color : blue }
     #bold { font-weight : bold }

you could create a bold, blue paragraph with

     <p class="blue" id="bold">

Since the value of the id attribute must be unique throughout the document, only one tag within the document can be given the #bold style.

More sensitive elements
With the development of JavaScript, HTML 3.2 added a few event handlers to certain tags. Netscape and Internet Explorer went further, making many tags sensitive to various input events like mouse motion and keypresses. HTML 4.0 standardizes the kinds of events that almost any element can respond to with a set of element event handlers.

The events recognized by HTML 4.0 are:

     Attribute       Event
     onclick         Any mouse button was clicked
     ondblclick      Any mouse button was double-clicked
     onkeydown       A key was pressed down
     onkeypress      A key was pressed and released
     onkeyup         A key was released
     onmousedown     A mouse button was pressed down
     onmousemove     The mouse moved within the element
     onmouseout      The mouse moved out of the element
     onmouseover     The mouse moved into the element
     onmouseup       A mouse button was released

Some of these attributes parallel older attributes that apply only to certain tags. For example, onmouseover and onmouseout are the mouse counterparts of onfocus and onblur, which fire when a link or form control gains or loses the input focus.

For all of these attributes, the value is a bit of executable script, usually JavaScript. You might invoke a JavaScript routine defined elsewhere in the document, or you might stick a single statement or two directly into the attribute value. Either way, these attributes give you a consistent way to associate dynamic activity with any element in your document.
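
To make both styles concrete, here is a small sketch. The announce function and the demo id are names I made up; the mouse handlers put a single statement directly in the attribute value, while onclick calls a routine defined elsewhere in the document:

     <script type="text/javascript">
     // Hypothetical helper, defined once and called from an attribute below
     function announce(element) {
       alert("You clicked the element with id " + element.id);
     }
     </script>

     <p id="demo"
        onmouseover="this.style.color = 'red'"
        onmouseout="this.style.color = ''"
        onclick="announce(this)">
     Move the mouse over this paragraph, then click it.
     </p>

Whether a given browser honors these handlers on a plain <p> tag depends on how completely it implements the standard.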

Moving ahead
As I noted when we started, you can't really test most of these new features, as they won't be supported until the next release of Netscape and Internet Explorer. Until that happens, you may want to start thinking about how you'll need to change your documents to conform to the new standard. More importantly, start thinking about using these new features to make your documents easier to use, more inviting, and easier to maintain.


About the author
Chuck Musciano has been running various Web sites, including the HTML Guru Home Page, since early 1994, serving up HTML tips and tricks to hundreds of thousands of visitors each month. He's been a beta tester and contributor to the NCSA httpd project and speaks regularly on the Internet, World Wide Web, and related topics. Chuck is currently CIO at the American Kennel Club. Reach Chuck at chuck.musciano@sunworld.com.


(c) Copyright Web Publishing Inc., an IDG Communications company


URL: http://www.sunworld.com/swol-06-1998/swol-06-webmaster.html