September 1999

Is XML changing the future of Web publishing?

How XML is becoming the new standard for building and managing content

Summary
XML appeared several years ago as an upcoming standard and partial replacement to HTML. Since then there's been a lot of talk, but no one has really specified how Webmasters can use it to their advantage on an everyday basis. This month, Allen walks you through the development of XML and its various Web implementations, and gives you a preview of what you can expect to see over the next year. (2,200 words)


  WEBMASTER  

By R. Allen Wyke
In 1997 the World Wide Web Consortium (W3C) proposed a new standard -- the XML specification. This specification, like its sister HTML, is derived from SGML: XML is a simplified subset of it, while HTML is an application of it. Unlike HTML, whose primary focus is formatting text and linking external objects, XML allows you to write your own markup tags that better define the data with which you work.

Over the last three months, XML has begun taking root, just as its inventors predicted. Financial institutions are now creating languages based on it that define the data they send back and forth; ecommerce vendors are using it for transactions; and reporting tools are using it to import data for ad hoc queries. You shouldn't count on XML to save the world -- but it can help you describe, consolidate, and validate your data, and basically make your job a whole lot easier. If you have to process Web logs so that your managers can determine their traffic numbers, peak hours, and the type of browsers used to access the site, XML may be your solution -- especially if these aren't the only numbers your managers have to analyze.

Let's say part of your job consists of developing a schema for all of your Web logs. At the same time, the marketing and sales team has developed a schema for items sold on your site, including advertisements. Now, on any given day you produce XML documents that report accesses to your site, the ads you've shown, and the widgets you've sold -- three different reports. But because you developed your schemas to overlap in relevant areas, all the data for that day can be read into an XML-enabled reporting tool and analyzed together. This also allows you to generate billing reports to send to your advertisers and suppliers. Each department can implement the system in its own way, yet all of you can benefit from each other's data.
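
As a rough sketch of what one of those reports might look like, here is a hypothetical daily access report. Every element name, attribute name, and number in it is invented for illustration and does not come from any real schema. The point is the adid attribute: if the marketing team's ad and sales schemas identify ads the same way, a reporting tool could line up the three reports on that value and on the date.

```xml
<?xml version="1.0"?>
<!-- Hypothetical daily access report; all names and figures are invented -->
<accessreport site="www.mysite.com" date="1999-09-01">
  <traffic>
    <hits>48210</hits>
    <!-- Hour of the day (0 to 23) with the most requests -->
    <peakhour>14</peakhour>
  </traffic>
  <browsers>
    <browser name="Netscape Navigator 4" hits="25102"/>
    <browser name="Internet Explorer 5" hits="19877"/>
    <browser name="Other" hits="3231"/>
  </browsers>
  <!-- The overlap: the ad and sales reports would use the same adid values -->
  <adviews>
    <ad adid="banner-0042" impressions="12044" clicks="187"/>
  </adviews>
</accessreport>
```

Keeping the shared identifiers in attributes with the same name across schemas is only one possible design; what matters is that the departments agree on them up front.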

Currently, XML is not a replacement for HTML. Instead, it is used for transferring data between applications, or from one company to another. As is the case with HTML, the information encoded in XML can be read by anyone, but XML has an added benefit -- it lets the author define the tags. The reader can still understand the document because of its schema, which is included or referenced within the XML document. Confused? Read on.

A simple XML example
Let's say your site publishes top stories and news, and, like the managers of most popular sites, you've resold your content to various portals. As you know, portal managers want to brand their content so it looks as if they developed it. This is where the use of XML can be of benefit to you: XML will allow your developers to create the content in an easily understandable form, then send it to you and other affiliate sites. Once you have it, you can apply any look and feel you desire. Take the following snippet as an example. Here is the XML document your content developers would send out:

<?xml version="1.0"?>
<!DOCTYPE content SYSTEM "content.dtd">
<content>
  <adcall type="image">
    <url>http://www.mysite.com/cgi-bin/adgen.cgi</url>
    <click>http://www.mysite.com/cgi-bin/adclk.cgi</click>
    <width type="pixels">468</width>
    <height type="pixels">60</height>
  </adcall>
  <article>
    <author>
      <name>R. Allen Wyke</name>
      <email type="work">allen.wyke@sunworld.com</email>
    </author>
    <body>
      Here is the article
    </body>
  </article>
</content>

It's pretty obvious at a glance what each section contains. These sections are defined in the schema (see below), which is written as a document type definition (DTD) using the declaration syntax that XML provides.

<!-- Top level: one ad call followed by one article -->
<!ELEMENT content (adcall, article)>

<!-- Ad call: where to fetch the ad, where a click goes, and its size -->
<!ELEMENT adcall (url, click, width, height)>
<!ATTLIST adcall type CDATA #REQUIRED>
<!ELEMENT url (#PCDATA)>
<!ELEMENT click (#PCDATA)>
<!ELEMENT width (#PCDATA)>
<!ATTLIST width type CDATA #REQUIRED>
<!ELEMENT height (#PCDATA)>
<!ATTLIST height type CDATA #REQUIRED>

<!-- Article: author details followed by the body text -->
<!ELEMENT article (author, body)>
<!ELEMENT author (name, email)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ATTLIST email type CDATA #REQUIRED>
<!ELEMENT body (#PCDATA)>

What precisely is happening here? XML lets you create a schema, which defines the tags used to describe your data. Once you have a schema, you can create an XML document that uses it. The document either contains the schema or references it, alongside the data you are encapsulating. This allows other readers to parse and validate the XML document, even though you've defined the tags yourself. To try this with the schema above, save it as content.dtd (the name the DOCTYPE line refers to), then copy the XML document above and save it in the same directory as content.xml. To view it, you can load content.xml in Internet Explorer 5.
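
The example above references its schema through a SYSTEM identifier. For comparison, here is a minimal sketch of the other option, where the document contains the schema itself in what the XML specification calls an internal DTD subset. To keep it short, it reuses only the author portion of the schema above and makes author the root element, which is a simplification for illustration rather than part of the original example.

```xml
<?xml version="1.0"?>
<!-- The declarations between [ and ] travel with the document itself -->
<!DOCTYPE author [
  <!ELEMENT author (name, email)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT email (#PCDATA)>
  <!ATTLIST email type CDATA #REQUIRED>
]>
<author>
  <name>R. Allen Wyke</name>
  <email type="work">allen.wyke@sunworld.com</email>
</author>
```

An embedded schema makes a single file self-describing, while a referenced one lets many documents share the same definitions, which is probably the better fit when content is syndicated to several sites.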

Think of it as your future. You develop a schema that defines your data, then you develop a skin (in XSL or CSS) that defines your look. Done. Suddenly all of your content is in an easily-validated XML document that references your schema, with a skin for the design. Need to redesign your whole Web site? Just change the skin. With a little practice and imagination, you'll be able to see XML's benefits in defining content, processing Web logs, and integrating with reporting tools used by other departments. I don't want to stop here, though; XML has popped up in several other areas as well.
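
To make the idea of a skin a little more concrete, here is a minimal sketch of an XSL stylesheet that could turn the content document shown earlier into an HTML page. It is written against the XSLT 1.0 namespace; Internet Explorer 5 implements an earlier working draft of XSL, so the namespace and some details would need adjusting to run there. The file name skin.xsl, the page title, and the byline class are assumptions made for this example.

```xml
<?xml version="1.0"?>
<!-- skin.xsl (hypothetical name): renders the content document as HTML -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Page shell: a portal's own branding would go here -->
  <xsl:template match="/content">
    <html>
      <head><title>Top stories</title></head>
      <body>
        <xsl:apply-templates select="adcall"/>
        <xsl:apply-templates select="article"/>
      </body>
    </html>
  </xsl:template>

  <!-- Banner ad: a linked image sized from the width and height elements -->
  <xsl:template match="adcall[@type='image']">
    <a href="{click}">
      <img src="{url}" width="{width}" height="{height}" alt="Advertisement"/>
    </a>
  </xsl:template>

  <!-- Article: a byline followed by the body text -->
  <xsl:template match="article">
    <p class="byline">
      By <a href="mailto:{author/email}"><xsl:value-of select="author/name"/></a>
    </p>
    <p><xsl:value-of select="normalize-space(body)"/></p>
  </xsl:template>
</xsl:stylesheet>
```

A portal that wanted a different look would swap in its own stylesheet and leave the content document untouched.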

The Resource Description Framework (RDF)
RDF, one of the first languages developed with XML, allows you to integrate various Internet metadata into a single formatted document. It provides sites with the ability to create site maps, rate content, search data, and supply other commonly needed information. RDF has even been submitted to the W3C for adoption as a standard.

RDF lets Webmasters do cool things: for instance, you can create a site map that provides one-click access to the areas you want users to access -- a beautiful thing! And because no one knows a site better than its Webmaster, there's no one better to create its map. Plus, this map can be downloaded by a browser and displayed as a list of bookmarks or links, so expect to see much more about RDF in the near future.
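
As a hedged illustration of the site map idea, the sketch below describes a home page and two of its sections in RDF's XML syntax. The rdf and dc (Dublin Core) namespaces are real; the sm namespace, its sections property, and the page URLs are hypothetical names invented for this example, and a real browser or tool would expect whatever vocabulary it was built to read.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:sm="http://www.mysite.com/schema/sitemap#">

  <!-- The home page and the ordered list of sections it leads to -->
  <rdf:Description rdf:about="http://www.mysite.com/">
    <dc:title>My Site</dc:title>
    <sm:sections>
      <rdf:Seq>
        <rdf:li rdf:resource="http://www.mysite.com/news/"/>
        <rdf:li rdf:resource="http://www.mysite.com/stories/"/>
      </rdf:Seq>
    </sm:sections>
  </rdf:Description>

  <!-- Each section gets a title a browser could show as a bookmark label -->
  <rdf:Description rdf:about="http://www.mysite.com/news/">
    <dc:title>News</dc:title>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.mysite.com/stories/">
    <dc:title>Top Stories</dc:title>
  </rdf:Description>
</rdf:RDF>
```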

The Extensible User Interface Language (XUL)
Netscape has regained prominence as the Internet technology leader and taken Web development well beyond the design, construction, and posting of pages. With its Communicator 5, which will hopefully be in beta by the end of the year (it's currently in prealpha milestone builds), you can apply skins to the browser. This allows Web developers to create browser chrome for their users and create skins that look and function in a manner that best fits their sites. Have a site for kids? Why not create a skin that contains no words, but is rather built around buttons that look like crayon drawings or stick figures? Want to design a skin that is optimized for your financial site, complete with custom links and buttons to help your users track their financial status? No problem; you can do that, too.

Because it's a work in progress, XUL is still being defined, and I can't go into more detail. I am, however, planning a full article on the topic in the near future that will outline how to use it on your site. Until then, you can visit the Mozilla project homepage, or go to the Mozilla Chrome Zone for more information on skins, demonstrations of XUL, and an outlook on what this XML implementation will mean. There are links to both sites in the Resources section below.

Resources to help you get started
In this section I'm going to try to narrow the topic to the core knowledge and applications you'll need to get up and running fast. As XML becomes more widely accepted, documentation, tools, and other applications are becoming generally available. There are already a host of articles written on the topic, and large companies often issue press releases stating their approach to using it. The Web is, of course, another good place to search for the documentation and tools you'll need for your planning and implementation.

I found XML to be a bit tricky at first, mostly because I had to mentally separate it from HTML. And it didn't help that the first thing I heard was something to the tune of, "XML is a new language that allows you to create your own 'languages,' and it is the future." Gee, thanks; I guess I should just drop everything immediately and begin using it.

As a result of the prevalence of this sort of rhetoric, I had a hard time trying to find resources that told me why I should learn XML. All the articles I read were primers to learning its markup, not real-world examples of how, and why, to use it. Fortunately, there are now some good resources available, both on the Web and in your local bookstore, for teaching the beginning XML developer how to develop schemas and write XML documents.

Here is a list of my favorite XML-related sites, and the reasons why they are useful:

In addition to Web resources, you should also check out your local bookstore. In today's market, there are tons of computer books out there, and you can browse for days. I've purchased many books (and dug into several others), and recommend the following:

Having learned HTML by hand, I have a tendency to steer clear of applications that generate my documents for me. With XML, however, such products are not a bad idea. Good XML schema development tools will not only write the syntax for you, but also give you a graphical representation of how the components interrelate. As your schemas get larger and larger, it's difficult to maintain a mental image of the overall picture, so a graphical representation makes it easier to maintain perspective. Think of it as looking at a database schema diagram versus a list of tables. The figure below shows a schema's graphical representation in Extensibility's XML Authority.


Graphical representation of XML schema

Hopefully this month's article has shown what it is about XML that might be of interest to a Webmaster. It was not meant to teach you XML, but rather to show, from the Webmaster's angle, how you might put it to work on your site.

About the author
R. Allen Wyke has developed intranet pages and Internet sites for several leading companies, and often speaks on the topic of HTML. His writing credentials include coauthoring Pure JavaScript, The Perl 5 Programmer's Reference, and The Official Netscape Navigator 4 Book, as well as contributing to The HTML 4 Programmer's Reference and HTML Publishing for the Internet, 2nd Edition.

Resources and Related Links
  Additional SunWorld resources  

(c) Copyright 1999 Web Publishing Inc., an IDG Communications company

URL: http://www.sunworld.com/swol-09-1999/swol-09-webmaster.html