Click on our Sponsors to help Support SunWorld

Tools to meld the Web & relational databases, Part 1

Web-aware database development tools promise to make client/server app deployment easier

By Bill Rosenblatt

January  1996
[Next story]
[Table of Contents]
Subscribe to SunWorld, it's free!

Currently the most important shortcoming of the Web as a client-server platform is the difficulty of interfacing with data in relational or other kinds of databases. Fortunately, tools are becoming available that ease communication between the Web's Common Gateway Interface (CGI) and relational data. This month we take a look at some of these tools. (2,000 words)

Mail this
article to
a friend

In last month's column, we considered the idea of the World Wide Web and its constituent language technologies as a potentially powerful client-server rapid development paradigm. The advantages to using the Web architecture are already substantial: It abstracts away most of the details of TCP/IP communication, gives you automatic, implicit cross-platform client portability, and provides a document metaphor for displaying data that is very easy to implement and natural for the user. Web application development has already advanced significantly beyond the need to use ASCII text editors to write HTML and perl code by hand, but we have some distance to go before the Web architecture eclipses Powersoft and its ilk for sheer client-server development muscle. Still, tools have been appearing at an amazingly rapid rate -- though not, perhaps, from where you might expect.

Currently the most important shortcoming of the Web as a client-server platform is the awkward way that users must access data managed in relational or other kinds of databases. The Web's Common Gateway Interface (CGI) is limited and needs augmenting. Fortunately, assistance in the form of middleware between CGI and relational data is becoming available. We'll take a look at the simpler of these tools this month.

The data bottleneck
Let's review the basic Web architecture so that we can see where the bottlenecks to application development lie. CGI lets you build Web pages with forms on them. Forms contain variables whose values can be set by the user through fields, radio buttons, check boxes, etc. When the user clicks on a button to execute the form, the CGI program on the server receives the values of the variables, acts on them in some way, and returns a new HTML page to the user. This seems like a reasonable start for an application that queries a relational database: the form could be used as a front end, in the same way as many client-server query tools, like Powersoft's InfoMaker or Platinum Technology's Forest & Trees. But that isn't enough.

Consider what a CGI program would have to do to process a database query:

  1. Read the query parameters from the user's form.
  2. Construct a SQL query and send it to the database.
  3. Retrieve the results of the query and format them into HTML.
  4. Send the resulting HTML page back to the client.

The basic Web architecture takes care of the first and last of these steps, but it doesn't help with the middle two. Without extra tools, you have to do all of that by hand within your CGI program.

The first question is, how do you query a database from a CGI program? CGI programs can be written in any language that can call external Unix programs using standard C-language calling conventions. Relational database vendors, meanwhile, give you routine libraries that you can link into your C code. Therefore, it would seem that you must write CGI programs in C to query your database. Yet the vast majority of CGI programs are written in perl, the powerful freeware interpreted language.


Perls of great price
Enter our first group of database tools for the Web: sybperl and oraperl. These are nothing more than perl implementations of the C routine libraries for Sybase and Oracle respectively. Sybperl was developed principally by Michael Peppler in Switzerland, while oraperl was developed by Kevin Stock in the UK. Like perl itself, they are freeware, publicly available under the GNU General Public License. Source for both is available in many software archives, follow the hyperlinks above for two of them.

Another technique for accessing a relational database from the Web is to turn your database programs into CGI programs directly, instead of involving perl as "middleware." If you already have an Oracle database with lots of application code written on the server as database stored procedures in PL/SQL, then you're in luck. You can turn your stored procedures into a CGI program with WOW (Web-Oracle-Web), a set of PL/SQL code that implements the CGI interface and encapsulates HTML formatting commands.

The next step in integrating relational databases with the Web is to read query results and format them into HTML. This ought to be easy to automate, especially if you are using the table formatting extensions in Netscape's HTML. Various tools do this type of thing and most of them have origins in the work of Arthur Secret at CERN in 1992. His Oracle gateway let a user enter a query directly in SQL, retrieve the results, and send them back as a plain-text HTML page.

Another basic way to query a database from an HTML page is to embed a query within an HTML page (using some extra HTML keyword or other syntax), process that query dynamically on the server, substitute the results for the query within the HTML page, and send them back. One program that does this for Oracle is Decoux, named for its author, Guy Decoux of France. The Decoux gateway, an Oracle PRO*C program, uses HTML pages on the server that have templates for SQL queries between the keywords <SQL> and </SQL>. The query templates can have value substitution markers (e.g., %s for a character string, as in the C language) and formatting codes that format the query results (such as <LI> for a list item).

The Decoux gateway reads values the user sends in through a form on a Web page, uses those values to substitute into the SQL query, and executes the query. Then it retrieves the results of the query and inserts them into the HTML page template with each row marked with the appropriate HTML formatting code (e.g., <LI>), and it sends the resulting complete HTML page back to the user. This works nicely, but it has somewhat limited ability to format query results.

Other tools are available that extend the Decoux gateway concept by, for example, including more formatting options or automating generation of HTML templates from database table definitions. We'll examine some of these tools in next month's column. These tools help take the drudgery out of implementing simple data queries, but they still aren't enough. One major remaining shortcoming is that the user has no easy way of interacting directly with the data. The paradigm, fill out a form, click a button, get results on another HTML page, offers no easy way to see inputs and outputs on the same page; nor is there an easy way to have the client act on data received from the database server.

In other words, there is no way to write code that runs on the client upon receipt of data from the server, as there is in many client-server development tools. This capability is not likely to be available until one of two things happen. The first possibility is that a Web client scripting language, like Netscape's LiveScript/JavaScript (see last month's column), becomes capable of more sophisticated integration with HTML pages. The second, and more likely, scenario is that an HTML browser with a full-featured API appears; an API would enable arbitrary code to be written that accesses and manipulates HTML pages. Netscape is rumored to be working on such a browser API.

Yet another use for hypertext
Even now, the Web's built-in hypertext capabilities give it database interaction capabilities that go beyond the now-standard forms-oriented querying capability. A few WWW/database tools have been developed that take advantage of hypertext. The pioneer in this area was the UMass Information Navigator, another product of academia -- this time the computer science department at the University of Massachusetts at Amherst (which happens to be my grad-school alma mater). Rick Hudson, who created the Navigator a couple of years ago, came up with a clever and very useful isomorphism between relational joins and hypertext links. If you use the Information Navigator to query a relational database, it will retrieve and display the results -- as the tools discussed above do. But then it will also find other tables in the database that have data fields in common with your query results, and it will give you the opportunity to jump to those other tables by clicking on the relevant pieces of data. It finds these other tables and performs the appropriate joins on the fly.

See Hudson's UMass graduate school database for a working example. You can use a form to retrieve a list of faculty members (among other things) that match your query terms. Then you click on the one that you're particularly interested in, and you get a display that shows information about that faculty member, followed by a list of hyperlinks to information on graduate students, courses taught, department, research publications, and so on. If, for example, you click on Faculty Publication Citations, you get a list of publications whose authors include the professor you selected.

This is a very clever use of hypertext, especially because it makes use of the logical structure of an existing relational database to perform on-the-fly joins that make sense. But there could be aspects of that logical structure that subvert the utility of this automatic mechanism. Essentially, the Navigator's design assumes that the relational database design (table definitions) is the same as the database's conceptual data model. This is often not the case, for two reasons: first, the table definitions could be denormalized for reasons of query efficiency, creating lots of common field (column) names across tables that aren't interesting from the user's point of view. Second, relational designs typically contain many auxiliary lookup tables to which applications join implicitly; a user should not have to link to such tables explicitly. As a simple example, a travel itinerary database might contain three-letter codes for airports (SFO for San Francisco, EWR for Newark, etc.). Query results that display, say, names, dates, and destinations should implicitly give the full city names and not require the user to click on abbreviations.

The solution to these problems is that tools like the UMass Information Navigator should work from a data model instead of actual relational table definitions. The hypertext scheme is particularly well suited to data modeling schemes such as entity-relationship and object-oriented models, in which the relationships between data tables are made explicit. In fact, commercial client-server development tool vendors would do well to incorporate this type of hypertext capability into their tools -- it's yet another exciting, unforeseen use of Web technology. It's also not likely to happen in the near future. Why? We'll see in next month's column. But in the meantime, here's a clue, think about where all of the Web/database development tools that we discussed in this column came from -- or, more particularly, where they didn't come from.

Click on our Sponsors to help Support SunWorld


What did you think of this article?
-Very worth reading
-Worth reading
-Not worth reading
-Too long
-Just right
-Too short
-Too technical
-Just right
-Not technical enough

[Table of Contents]
Subscribe to SunWorld, it's free!
[Next story]
Sun's Site

[(c) Copyright  Web Publishing Inc., and IDG Communication company]

If you have technical problems with this magazine, contact

Last modified: