Is a hybrid database in your future?
The best of object-oriented databases and relational databases are merging into object-relational databases. Where this new model fits into your corporate environment will make you rethink your RDBMS plans.
The combination of the Web and pressing business needs is finally bringing object-oriented databases to the fore in business. Theoretically object-oriented databases are a better fit for the kind of complex queries that modern businesses need. But many companies are hedging their object-oriented bets by relying on hybrid "object-relational" systems. (2,600 words)
If you answer "yes" to any one of these questions, then almost certainly there is an object-oriented database management system (OODBMS) in your future. In fact, the question likely isn't whether we'll be using object-oriented databases in the future, but rather to what degree they will be object-oriented. Even the relational database companies, such as Oracle and Sybase, are adopting object technology in proportions ranging from lip service to recasting their products into hybrid "object-relational" databases. That isn't good enough for the object purists, who believe the advantages of completely object-oriented databases will make them the dominant database architecture.
Why object-oriented? Benefit #1: Flexibility with complex data types
"The advantages [of object-oriented databases] boil down to the fact that information is getting inherently more complex," says Mark Rawlins, vice president of marketing for UniSQL Inc., a pioneer in object-relational database systems. "The sorts of things people want from their information assets [today] are much more complex business models. One of the classic ones is `show me the moving average over 13 weeks of sales of our most profitable five products.'"
That's the sort of simple question the vice president of sales might ask, but Rawlins says getting that kind of data out of a relational database is difficult. "In order to find those five, you have to find the cost of all the products and their prices, subtract the two to get the profit and then find the sales of the five most profitable," he points out. "Then you get that for this week and compute the rolling average so you get a trend. That's fairly heavy lifting and fairly sophisticated SQL programming when dealing with a relational database."
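To make the steps Rawlins describes concrete, here is a minimal sketch in Python rather than SQL. The data layout (a dict of prices and costs, a list of weekly sales tuples) and every function name are hypothetical, and "most profitable" is assumed to mean the highest per-unit margin, since the article does not say how profit is measured.

```python
from collections import defaultdict


def top_profitable(products, n=5):
    """Return the n product names with the highest per-unit profit.

    products maps a product name to a (price, cost) pair (assumed layout).
    """
    return sorted(
        products,
        key=lambda name: products[name][0] - products[name][1],
        reverse=True,
    )[:n]


def weekly_totals(sales, chosen):
    """Total units sold per week for the chosen products.

    sales is a list of (week, product, units) tuples (assumed layout).
    Weeks that appear in the data but have no sales of a chosen
    product are reported as 0.
    """
    totals = defaultdict(int)
    for week, product, units in sales:
        if product in chosen:
            totals[week] += units
    weeks = sorted({week for week, _, _ in sales})
    return [totals[week] for week in weeks]


def moving_average(values, window=13):
    """Trailing moving average: one result per complete window."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]
```

Chaining the three, as in moving_average(weekly_totals(sales, set(top_profitable(products)))), yields the 13-week trend line. Even in this simplified form the question takes three distinct passes over the data, which is the "heavy lifting" Rawlins is pointing to.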
According to object technology's backers, this sort of exercise exposes the key limitation of the relational model: it struggles when the data and queries get very complex. The usual way of thinking of a relational database is as a table with data stored in rows and columns. As long as you're dealing with one table, a modern relational database management system (RDBMS) can be an extremely powerful tool for getting fast views of the relationships between the elements in the table.
The same things that make an RDBMS so efficient at dealing with a single table, however, make it harder to combine information from different tables -- a join in RDBMS-speak. You can do joins on any RDBMS worthy of the name, but you pay a performance penalty, and if your query involves a lot of joins pulling data from many different tables, then you may have serious trouble.
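As an illustration of what a multi-table query looks like, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The three tables (customers, orders, order_items) and their contents are made up for the example. Answering one simple question, how many units each customer has ordered, already takes two joins.

```python
import sqlite3

# In-memory database with three hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id)
    );
    CREATE TABLE order_items (
        order_id INTEGER REFERENCES orders(id),
        product TEXT,
        quantity INTEGER
    );
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (10, 1), (11, 2);
    INSERT INTO order_items VALUES
        (10, 'widget', 3), (10, 'gear', 1), (11, 'widget', 2);
    """
)

# Each JOIN matches rows across tables at query time.
rows = conn.execute(
    """
    SELECT c.name, SUM(i.quantity)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    JOIN order_items AS i ON i.order_id = o.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

for name, units in rows:
    print(name, units)
```

The matching between tables is redone each time the query runs, which is the cost the object vendors say grows as more tables are involved.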
The other difficulty with the relational model is that in order to store data in a relational database, it has to be broken down into atoms of information, with each piece fitted into the appropriate place in the table. Again, when the relationships are simple, this isn't a problem. If they are complex, the process has been likened to taking a car apart to store it in a garage.
An OODBMS's advantages in complex queries come from two things. First, unlike an RDBMS row or column, an object is not just a piece of data. It includes enough code to "know" what it is and how it relates to other things. In objects, relationships are included with the object, not extracted from a series of tables every time they are needed.
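A minimal sketch of the idea, in plain Python objects rather than any vendor's database: each object holds direct references to the objects it relates to, so following a relationship is a pointer traversal rather than a table lookup. The Process class, its depends_on field, and the all_dependencies helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """A manufacturing step that carries its own relationships."""

    name: str
    depends_on: list = field(default_factory=list)  # direct object references


def all_dependencies(process):
    """Return the names of every upstream process, depth-first.

    The walk follows object references directly; nothing is looked up
    in a separate table.
    """
    seen = []

    def visit(current):
        for dep in current.depends_on:
            if dep.name not in seen:
                seen.append(dep.name)
                visit(dep)

    visit(process)
    return seen


# Hypothetical example data.
casting = Process("casting")
machining = Process("machining", [casting])
assembly = Process("assembly", [machining, casting])
painting = Process("painting", [assembly])

print(all_dependencies(painting))
```

An object database adds persistence, transactions, and sharing on top of this picture; the sketch only shows why a declared relationship can be cheaper to follow than one reconstructed by matching keys.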
"In an object database, you can simply declare a relationship, which can be a direct traversal [of the database]," says Andrew Wade, vice president of corporate development for Objectivity Inc., a Mountain View, CA maker of object-oriented database management systems. "If you're getting $20 out of a checking account that may not make much difference, but if you trying to keep track of all the dependencies between various processes [in a factory] and the production history, you get a lot of advantages."
Benefit #2: Increased performance and flexibility
The second benefit is that in an OODBMS, the relationships among instances of objects are not built up for each query as they are with data in an RDBMS. They are, in effect, calculated when the instance is entered into the database. Since queries greatly outnumber commits in most DBMS applications, this makes queries involving joins much faster.
"An object-oriented database is built around a collection of objects," says Brian Edwards, director of corporate marketing, at Gemstone Systems Inc., a Beaverton, OR.-based maker of object-oriented database systems. "In many senses it's almost like a network. All the joins you would have are already precomputed. When you do a commit the joins are automatically set up instead of having to create the joins at run time."
If that last statement sounds vaguely familiar, it's because it's similar to the way the predecessors of relational databases, hierarchical databases, worked. In a hierarchical database the relationships between data elements were defined when the database was designed. Hierarchical databases had the reputation for being very fast, but quite inflexible.
Unlike hierarchical databases, an OODBMS is, or can be, quite flexible. That's because objects can be defined or redefined on the fly. (In a hierarchical database they were essentially fixed when the database was designed.) Objects are members of classes, and they inherit the properties of those classes. Further, you can easily create new subclasses out of old ones by adding to the definition. This can speed up developing applications or specialized reports enormously.
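Inheritance of this kind can be sketched in a few lines of Python. The Customer and PreferredCustomer classes below are hypothetical; the point is only that the subclass adds one attribute and overrides one method, and everything else is inherited from the existing class unchanged.

```python
class Customer:
    """Base class: every customer has a name and a discount rate."""

    def __init__(self, name):
        self.name = name

    def discount_rate(self):
        return 0.0

    def price_for(self, list_price):
        # Written once here; subclasses reuse it as-is.
        return round(list_price * (1 - self.discount_rate()), 2)


class PreferredCustomer(Customer):
    """New subclass created by adding to the old definition."""

    def __init__(self, name, rate):
        super().__init__(name)
        self.rate = rate

    def discount_rate(self):
        return self.rate


regular = Customer("Acme")
preferred = PreferredCustomer("Globex", 0.10)
print(regular.price_for(100.0), preferred.price_for(100.0))
```

No existing code that works with Customer objects has to change when PreferredCustomer is introduced, which is where the development-time savings are supposed to come from.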
Of course, any database that is more flexible -- that allows the user to be more efficient -- is also more productive. By eliminating unnecessary code, you eliminate overhead that can cause system slowdowns. This is not to say that using an OODBMS will automatically speed up your database operations, but it will potentially do so in some object-based applications.
Benefit #3: Logical distribution
The third big advantage of OODBMS is that they are inherently distributable. "An object is a logical unit to distribute," says Objectivity's Wade. "If you're structured around tables you might have one table with all your customers. How do you split that up in the distribution? But if this group is being processed with parameters as an object it can be moved from one machine to another."
Using the Object Request Broker (ORB) mechanism, objects and applications scattered over the network can find each other seamlessly. By contrast, a relational database has a natural size that is determined by the size of the tables. It is fairly easy to distribute the tables in an RDBMS, but clumsy to break up the tables themselves. Objects, even complex objects, are usually much smaller than the equivalent tables.
The wild card in this scenario is the Internet, and particularly the World Wide Web. In the past year, the proliferation of Web and Internet awareness has stirred things up in the database world, often to the point where the major players are changing their strategies from month to month.
In one sense, the Internet has been a huge boon to the OODBMS vendors. In another, it left them scrambling just like everyone else. For example, until about 18 months ago it was hard to find people in the OODBMS community who took Java seriously as an object programming language. Their efforts were focused on C++ and Smalltalk.
Benefit #4: Built for the Web
Java is emerging as the big winner in the stampede to the Web. Java is object-oriented by nature and specifically designed to produce distributed applications. As a result, object-oriented database vendors are adopting Java at a speed approaching a mild panic. Objectivity, for example, recently announced that a version of its Objectivity database written in Java is due to ship in June.
Generally speaking, everything on the Web is an object. URLs are objects. Images are objects. Documents are objects. When you want to store an object, the best way to store it is in its native mode -- as an object.
Were you to try to store an image in a relational database, it would be necessary to write conversion code. When you store the image, you convert it to something that you can keep in rows and columns. When you want to view it, you need to convert it back. It is far more efficient to store an image as an image in an object-oriented database. To do otherwise is to invite system overhead, potential problems due to using additional conversion software, and less efficient use of your disk storage.
How the hybrid database works
The complicating factor in the move to objects is getting programmers' heads cranked around to the object way of thinking. This is notoriously difficult. The common view in the object community is that it takes special training and about six months to a year of application before the average programmer finally understands objects.
Again, this isn't anything new. Two decades ago, database programmers trained in schemas and hierarchies had to learn how to deal with tables, relational calculus, and the third normal form. But the fact that history is repeating itself doesn't make the learning curve any shallower.
The logical response is to try to have it both ways. Enter the object-relational model, which tries to blend the object-oriented and relational databases. Usually this means trying to graft an object-oriented front end onto a relational database.
Typically, an object-relational database stores objects by automatically decomposing them into table entries. What the user sees looks like an object, but underneath it uses the features of the relational model. This allows for objects on the front end, where they are most visible to the users, while keeping the features of an RDB on the back end.
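The decomposition described above can be sketched as follows, using Python's built-in sqlite3 module as a stand-in relational back end. This is a hand-written illustration, not how any particular object-relational product works: the Order and OrderLine classes, the two tables, and the create_schema, save_order, and load_order helpers are all hypothetical. A real system would generate this mapping automatically.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class OrderLine:
    product: str
    quantity: int


@dataclass
class Order:
    order_id: int
    customer: str
    lines: list = field(default_factory=list)


def create_schema(conn):
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
        CREATE TABLE order_lines (
            order_id INTEGER REFERENCES orders(id),
            position INTEGER,
            product TEXT,
            quantity INTEGER
        );
        """
    )


def save_order(conn, order):
    """Disassemble one object into rows of two tables."""
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT INTO orders (id, customer) VALUES (?, ?)",
            (order.order_id, order.customer),
        )
        conn.executemany(
            "INSERT INTO order_lines (order_id, position, product, quantity) "
            "VALUES (?, ?, ?, ?)",
            [
                (order.order_id, pos, line.product, line.quantity)
                for pos, line in enumerate(order.lines)
            ],
        )


def load_order(conn, order_id):
    """Reassemble the object from its rows; None if the order is unknown."""
    row = conn.execute(
        "SELECT id, customer FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    if row is None:
        return None
    lines = [
        OrderLine(product, quantity)
        for product, quantity in conn.execute(
            "SELECT product, quantity FROM order_lines "
            "WHERE order_id = ? ORDER BY position",
            (order_id,),
        )
    ]
    return Order(row[0], row[1], lines)
```

The caller only ever sees Order objects, while the storage underneath stays purely relational. The price is the extra disassembly and reassembly step on every save and load.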
"Let's say you enter a new customer or take a new order and save the data," says Ram Todatry, president of Tangible Vision, a Downer's Grove, IL maker of Enterprise resource planning (ERP) software. "With an object-relational database you have to go through one extra step. You put the object through a disassembly process and save it into rows of several relational tables. When you go looking for the information, we grab those rows from the different tables."
Currently there are about a half-dozen object-relational vendors selling either complete DBMSs or object-oriented front ends that work with major RDBMS products such as those from Oracle, Sybase, and Informix. In addition, Oracle, Informix, and the other big RDBMS vendors are adding more object-type functionality to their products. For example, Informix and Oracle announced they will support the Common Object Request Broker Architecture (CORBA) and widespread use of Java in future releases of their products, and both of them have been arguing in public over which company is ahead in the race for objects and Web-enabling their products. Further, the new version of SQL, SQL3, which is still in development, includes a number of object features.
Whether the object-relational approach is a good idea or not depends on who you talk to. The relational people generally feel this approach combines the best of both worlds. The object people generally see it as crippling the power of the object model.
Tangible Vision chose to build its product on an object-relational database rather than a pure OODBMS. "We still feel that object-oriented databases aren't there yet for a couple of reasons," Todatry says. "Some of the relational databases we are dealing with are huge. We're talking gigabytes of data if not terabytes. We're not sure object databases can handle that yet. The other aspect is that there are about a dozen object database vendors today, and standards are just beginning to emerge. Hopefully it will settle down to where we can write a single program to run on multiple object database [management systems]."
Gemstone's Edwards doesn't agree, pointing out that the object-relational model loses some of the major advantages of object orientation, notably modularity and fast development time. Further, he says, there are some holes in the efforts. For example, he points out that SQL3 data classes are not true objects because not all the data types have all the features of objects. "In SQL3 you get polymorphism on some types. Other types don't have polymorphism, but they have inheritance. This means the developer has a very complex system to try to build," he says.
History repeating itself: The argument against OODBMS vs. RDBMS
None of this is to say object-oriented databases aren't still controversial. Not everyone is sold on them, and most of the detractors point to the same defects. According to the critics, OODBMSs aren't as robust as relational databases, use more system resources, are slower, and can't handle large databases as effectively as an RDBMS.
If you were in the database industry 20 years ago, all of this might sound hauntingly familiar. These are almost exactly the arguments that were used against relational databases when they first appeared. And as with object-oriented databases today, there was some truth in the claims.
The original relational databases were significantly slower than the hierarchical databases that preceded them. In those days they were not as capable at handling very large databases as the hierarchical databases. Also, the relational databases definitely did require more system resources than the hierarchical databases that dominated in the mid-1970s.
None of the hierarchical database arguments, no matter how appropriate, stopped the vast majority of users from switching from hierarchical to relational databases. As vendors improved their RDBMS offerings, the differences between relational and hierarchical shrank in every area except system resources. That was because the growth of computing power and the drop in memory prices meant we didn't have to tune RDBMSs to match the hierarchical databases' resource parsimony. Additionally, more and more organizations adopted the relational model for larger and larger products until eventually hierarchical databases were shoved into niches.
We can see many of the same processes starting with object-oriented databases today. An OODBMS, circa 1997, is a lot more powerful, more capable, and can handle bigger databases than an OODBMS of even two or three years ago. The performance gap between object-oriented and relational is narrowing considerably.
In the case of relational databases, the big selling points were ease of queries, particularly ad-hoc queries, and ease of programming, especially for queries and reports. In many ways, the relational database was a better fit for the emerging model of doing business where information needs were constantly shifting and impossible to define in advance. Users wanted to pull more precise information out of their mass of data.
For object-oriented databases, the big selling point is also a much better fit with the emerging model of doing business -- a model that employs multiple complex data types spread over distributed, decentralized systems to serve users who need even more precise, complicated kinds of information from the data.
"You get a lot of capability with relational technology," says Jim Althoff, vice president of software products for Aspect Development, Inc., Mountain View, CA., which developed an object-relational database system for its parts tracking software. "You can define entities and relationships and buy relational search engines from Oracle, Informix, etc. that do a good job on large amounts of data. You have standard query languages like SQL to do arbitrary queries."
"But what you don't normally get as part of a standard relational database product," he adds, "is the ability to easily set up a large, detailed class schema -- not establishing just entities, but classes and subclasses with search attributes or properties at each level and have them automatically inherited by the classes and subclasses below them. And you can't customize the behavior of objects by sending messages to the different classes."
Ironically, no one is predicting the complete demise of relational databases. Even the hard-core object vendors, such as Gemstone and Objectivity, see a role for relational databases. They admit that for some kinds of applications they are superior, just as relational databases never completely replaced hierarchical databases.
About the author
Rick Cook is a regular columnist in our sister publication, NetscapeWorld. He divides his time between writing about the Web, computers and high technology, and novels. Reach Rick at firstname.lastname@example.org.