Subtleware: C++ persistence

Subtleware condenses C++ object vapors into relational data.

By Bill Rosenblatt

Most programmers agree that C++ is currently the serious language of choice for the client side of client/server software development. In the dominant development paradigm, programmers add SQL to their C++ code to query the relational-database server. The interaction between C++ and SQL leaves a lot to be desired, but fortunately there are a few products that improve the fit. One of them is Subtleware for C++ from Subtle Software (Billerica, MA).

C++ is incompatible with SQL in many different ways. Since C++ is an object-capable language, there is an "impedance mismatch" between its object-at-a-time, pointer-chasing object model, and the set-at-a-time relational model of SQL. There's also the SQL mentality of limiting the programmer's computational power in order to support database features like referential integrity and fast associative retrieval, versus the C++ mentality of unlimited programming power.

If we look specifically at C++, there's a wide gulf between it and client/server in general: the notion of "persistence." Persistence is the ability of data objects to exist beyond the program that created them and to remain accessible to other programs long after the creator program ceases. Think about this for a moment: Data persistence is what databases are all about, but there's no reason why it should be the exclusive province of databases. Programming languages can be persistent, too. It's just that C++ isn't.

Here today, here tomorrow
While I'm not certain of the origins of the term persistence, I believe it dates back to the early 1980s, even though plenty of programming languages had persistence built in long before the term was coined. LISP and APL are two examples. Smalltalk is a persistent object-oriented language, as are object-oriented LISP dialects like Common Lisp Object System (CLOS). These languages provide persistence by means of a workspace, which is an environment containing functions, variables, and other program objects on which a programmer works. The workspace is preserved on disk, usually as a file, which programmers can call up at any time to pick up where they left off. By contrast, conventional languages like C++ have runtime environments that disappear without a trace when a program exits, leaving no way of recovering the program's in-memory data.

Persistence seems to be an obviously desirable feature, but providing it in a language leads to major complications. It's ultimately a design choice about the size and complexity of the language's runtime environment. For example, a language cannot support persistence if it's meant to be independent of any particular operating environment and therefore doesn't make any assumptions about underlying operating system and storage features. C, of course, is such a system-independent language, and so is its successor C++.

There are a lot of good reasons for adding persistence to hugely popular languages like C and C++. The question is, how? The simplest way is to write data you want to preserve to a disk file and then read them back in again the next time the program runs. You can do this in any language with basic I/O instructions. But even this simple technique is problematic. It involves developing code that translates ad hoc data from an internal format into that used by the external file and back again. All that is time-consuming and prone to error.
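To make the cost of this do-it-yourself approach concrete, here is a minimal sketch of the save-and-reload technique. The Employee record, the function names, and the one-field-per-line file layout are all invented for illustration; they are not taken from any product discussed here.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical application record (an assumption, not from the article).
struct Employee {
    int id;
    std::string name;
    double salary;
};

// Translate the in-memory object into an ad hoc external format:
// one field per line of a text file.
bool save_employee(const Employee& e, const std::string& path) {
    assert(!path.empty());
    std::ofstream out(path);
    if (!out) return false;
    out << e.id << '\n' << e.name << '\n' << e.salary << '\n';
    return static_cast<bool>(out);
}

// Translate the external format back into an in-memory object.
bool load_employee(Employee& e, const std::string& path) {
    std::ifstream in(path);
    if (!in) return false;
    if (!(in >> e.id)) return false;
    in.ignore();  // skip the newline that follows the id
    if (!std::getline(in, e.name)) return false;
    return static_cast<bool>(in >> e.salary);
}
```

Even in this tiny case, the two functions must agree on field order and delimiters, and every new data member means editing both. That hand-maintained translation layer is where the errors creep in.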

Another fairly simple way to implement persistence is to add some minimal capability to the language itself. Pascal, for example, provides the FILE OF datatype declaration, which lets you read and write to a file in units of a logical datatype, such as an integer or a record (analogous to a C struct). This eliminates the external-internal translation step, but it requires the programmer to explicitly read and write data objects to and from a sequential file and does not allow random access to the data.

Pushy yet classy
Object-oriented languages provide inheritance, which leads to a more convenient way to add persistence. You can build (or buy) an object class that encapsulates the grungework of disk-storage management needed to make an object persistent. Then you can use this persistence class by inheriting it into the definitions of your own application classes. Persistence classes work in two general ways: Either they make all objects of a given class inherently persistent when created (the "class-based" persistence method), or they include a method that, when called, makes an individual object persistent (the "instance-based" method).
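The difference between the two methods is easier to see in code. The following is only a sketch: the Persistent base class, the Store type (an in-memory map standing in for disk storage), and the Account and AuditEntry classes are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Stand-in for disk storage: key -> serialized object (an assumption).
using Store = std::map<std::string, std::string>;

// Hypothetical persistence class that application classes inherit.
class Persistent {
public:
    virtual ~Persistent() = default;
    // Instance-based method: an object becomes persistent only when called.
    void persist(Store& store) const {
        assert(!key().empty());
        store[key()] = serialize();
    }
protected:
    virtual std::string key() const = 0;
    virtual std::string serialize() const = 0;
};

// Instance-based: the programmer decides which Account objects to persist.
class Account : public Persistent {
public:
    Account(std::string owner, int balance)
        : owner_(std::move(owner)), balance_(balance) {}
protected:
    std::string key() const override { return "Account:" + owner_; }
    std::string serialize() const override { return std::to_string(balance_); }
private:
    std::string owner_;
    int balance_;
};

// Class-based: the only way to create an AuditEntry also stores it,
// so every object of the class is persistent from the moment it exists.
class AuditEntry : public Persistent {
public:
    static AuditEntry create(Store& store, int id, std::string text) {
        AuditEntry entry(id, std::move(text));
        entry.persist(store);
        return entry;
    }
protected:
    std::string key() const override { return "AuditEntry:" + std::to_string(id_); }
    std::string serialize() const override { return text_; }
private:
    AuditEntry(int id, std::string text) : id_(id), text_(std::move(text)) {}
    int id_;
    std::string text_;
};
```

An Account stays purely in memory until someone calls persist on it, while an AuditEntry cannot exist without having been stored.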

Persistence classes help make life easy for C++ applications programmers. The hard part is building the implementation of the persistence class -- the storage management necessary to make it work underneath that nice clean abstract interface. Tool vendors have taken various approaches to this problem.

At one extreme, you can slide a full-blown C++-based object-oriented database-management system (OODBMS) under your application program. Client/server OODBMSs like Object Design's ObjectStore (Burlington, MA) serve this purpose admirably. But if you are developing a small, non-mission-critical application that merely requires persistence for a few objects, it's probably not worth investing in a full OODBMS if you haven't already.

Another approach is to build a small object storage-management component that provides sufficient functionality, even though it lacks the industrial-strength features of a high-end database. Two available tools that fit this niche are Poet from Poet Software (Santa Clara, CA) and Raima Object Manager from Raima Corp. (Issaquah, WA). We'll examine these products in more detail next month.

However, what if your organization -- like so many others these days -- has a relational database? Why not use your RDBMS as the storage manager for persistent C++ objects? Subtle Software's Subtleware for C++ makes this third approach possible. Subtleware supports C++ object persistence in the popular Unix databases from Oracle and Sybase, and in the Watcom Windows-based databases that Powersoft is shipping with its PowerBuilder application development tool. There is also a version of Subtleware for ODBC, the database compatibility standard from Microsoft. The Subtleware toolset runs on MS-DOS/Windows, SunOS, and HP-UX platforms.

Passive persistence
Subtleware implements persistence in a passive way. It maintains two versions of a data object: the normal, in-memory object and a copy stored in the relational database. It makes the programmer explicitly call Subtleware built-in functions to keep the two versions in sync. These functions, including write, get, and destroy, are methods of the persistence class sBridge that application classes must inherit. You call the write method to make an object persistent (Subtleware uses the instance-based method), and you must call write again when the object has changed.
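The consequence of this passive design can be sketched in a few lines. Note the assumptions: only the method names write, get, and destroy come from Subtleware; the BridgeSketch class, its signatures, and the FakeDb map standing in for the relational database are invented here and should not be read as the real sBridge interface.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Stand-in for the relational database: key -> stored row (an assumption).
using FakeDb = std::map<int, std::string>;

// Hypothetical base class loosely modeled on the description of sBridge.
// Method names follow the article; the signatures are invented.
class BridgeSketch {
public:
    virtual ~BridgeSketch() = default;
    // Copy the in-memory version into the database.
    void write(FakeDb& db) const {
        assert(key() > 0);  // this sketch uses positive keys only
        db[key()] = to_row();
    }
    // Copy the stored version back into memory; false if no row exists.
    bool get(const FakeDb& db) {
        auto it = db.find(key());
        if (it == db.end()) return false;
        from_row(it->second);
        return true;
    }
    // Remove the stored version; the in-memory object is untouched.
    void destroy(FakeDb& db) const { db.erase(key()); }
protected:
    virtual int key() const = 0;
    virtual std::string to_row() const = 0;
    virtual void from_row(const std::string& row) = 0;
};

class Employee : public BridgeSketch {
public:
    Employee(int id, std::string n) : name(std::move(n)), id_(id) {}
    std::string name;
protected:
    int key() const override { return id_; }
    std::string to_row() const override { return name; }
    void from_row(const std::string& row) override { name = row; }
private:
    int id_;
};
```

If you change an Employee's name after the first write, the database keeps the old value until write is called again. Keeping the two versions in sync is entirely the programmer's job.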

A much cleaner way than this passive method would be to do away with the write method entirely, by automatically tracking objects' changes and letting the programmer make an object persist when it is created (for example, by adding another constructor function with an extra argument to sBridge). Perhaps in a future release, Subtle Software?
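One way such automatic tracking could work is with a per-object "dirty" flag. This is purely a speculative sketch of the idea, not a description of anything Subtle Software has announced; Session, TrackedEmployee, enroll, and commit are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

class TrackedEmployee;

// Hypothetical session: remembers enrolled objects and stores the changed ones.
class Session {
public:
    void enroll(TrackedEmployee* obj) {
        assert(obj != nullptr);
        objects_.push_back(obj);
    }
    int commit();                     // returns the number of rows written
    std::map<int, std::string> rows;  // stand-in for a database table
private:
    std::vector<TrackedEmployee*> objects_;
};

class TrackedEmployee {
public:
    // The extra constructor argument makes the object persistent at creation.
    TrackedEmployee(Session& session, int id, std::string name)
        : id_(id), name_(std::move(name)) {
        session.enroll(this);
    }
    // Enrolled objects must not move, and must outlive any later commit().
    TrackedEmployee(const TrackedEmployee&) = delete;
    TrackedEmployee& operator=(const TrackedEmployee&) = delete;

    // Every mutator records that the stored copy is now stale.
    void set_name(std::string name) {
        name_ = std::move(name);
        dirty_ = true;
    }
private:
    friend class Session;
    int id_;
    std::string name_;
    bool dirty_ = true;  // a new object has never been stored
};

// Write only the objects that changed since the last commit.
inline int Session::commit() {
    int written = 0;
    for (TrackedEmployee* obj : objects_) {
        if (!obj->dirty_) continue;
        rows[obj->id_] = obj->name_;
        obj->dirty_ = false;
        ++written;
    }
    return written;
}
```

The price of this convenience is that all changes must go through member functions like set_name, so the class can notice them.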

Apart from the need to write objects explicitly to the database every time they change, using Subtleware is a straightforward experience. To make objects in your program persist, you first need to modify your class definitions so they inherit sBridge. You also need to add a data member to each class that serves as a key, which lets you retrieve objects of that class from the database during future runs of your program. A few other minor changes to your source code also are necessary.

The next step involves SGEN, the Subtleware code generator. SGEN preprocesses C++ code and produces SQL code for generating tables that correspond to your application classes and manipulating the data. SGEN code represents Subtleware's approach to the impedance mismatch between objects and relations mentioned above. As with most objects-on-relations products, SGEN maps object classes to relational tables so individual objects map to rows. This presents no problem at all, as long as a C++ class definition contains data members of simple, built-in datatypes.

But as the C++ class definitions get more complex, the classes-to-tables mapping starts to break down. If a class definition contains a data member that is an "object type" (an object within another object), Subtleware creates a table definition whose columns include those of the embedded data member. For example, if you have an "employee" class definition that contains a data member called "title" that is itself another object type, perhaps containing the number, name, salary range, and so on of that title, then Subtleware will create a table for that class with all of the employee columns plus the title columns. This is like doing an implicit join on the two tables, which is the most efficient possible scheme, but it destroys the logical separation of the classes. Still, preserving the logical structure isn't all that important, because the underlying RDBMS is merely acting as a low-level storage manager.
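The flattening of an embedded object into its owner's table can be illustrated with a toy generator. This is not SGEN's actual output or algorithm; the Member structure, the function names, the SQL column types, and the convention of prefixing embedded columns with the member name are all assumptions made for the example. It relies on C++17 for the self-referential vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one data member of a class.
struct Member {
    std::string name;
    std::string sql_type;          // used when the member has a simple type
    std::vector<Member> embedded;  // non-empty when the member is an object
};

// Append the column(s) for one member. An embedded object contributes its
// own columns directly, so it gets no table of its own.
void add_columns(const Member& m, const std::string& prefix,
                 std::vector<std::string>& columns) {
    assert(!m.name.empty());
    if (m.embedded.empty()) {
        columns.push_back(prefix + m.name + " " + m.sql_type);
        return;
    }
    for (const Member& inner : m.embedded) {
        add_columns(inner, prefix + m.name + "_", columns);
    }
}

// Build a CREATE TABLE statement: one table per class, one row per object.
std::string create_table(const std::string& class_name,
                         const std::vector<Member>& members) {
    std::vector<std::string> columns;
    for (const Member& m : members) {
        add_columns(m, "", columns);
    }
    std::string sql = "CREATE TABLE " + class_name + " (";
    for (std::size_t i = 0; i < columns.size(); ++i) {
        if (i > 0) sql += ", ";
        sql += columns[i];
    }
    return sql + ")";
}
```

Given an employee class with id and name members plus an embedded title holding a number and a name, this produces a single employee table whose columns include title_number and title_name.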

As the class definitions in a C++ program get more complex, Subtleware requires extra programming to integrate them with the underlying relational database. It's typical to have objects pointing to other objects rather than objects embedded within other objects. Subtleware has to create a persistent pointer to another relational table to keep track of that relationship. It provides persistent pointer classes, but programmers must use them explicitly in their code; SGEN can't automatically translate regular C++ pointers to persistent pointers.
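The reason a regular pointer can't simply be stored is that a memory address means nothing in the next run of the program. A persistent pointer holds a database key instead. Here is a minimal sketch of the concept; DepartmentRef and the map-based table are hypothetical and are not Subtleware's persistent pointer classes.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Department {
    int id;
    std::string name;
};

// Stand-in for the department table, keyed by primary key (an assumption).
using DepartmentTable = std::map<int, Department>;

// Hypothetical persistent pointer: it stores a key rather than an address,
// so the reference still makes sense in a later run of the program.
class DepartmentRef {
public:
    explicit DepartmentRef(int key = 0) : key_(key) { assert(key >= 0); }
    int key() const { return key_; }
    // Look the target up in the table; nullptr if the row no longer exists.
    const Department* resolve(const DepartmentTable& table) const {
        auto it = table.find(key_);
        return it == table.end() ? nullptr : &it->second;
    }
private:
    int key_;
};

struct Employee {
    int id;
    std::string name;
    DepartmentRef dept;  // would be stored as a foreign-key column
};
```

The programmer has to declare dept as a DepartmentRef rather than a Department*, which is exactly the kind of explicit source change described above.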

Subtleware also includes collection classes, which are closely analogous to those provided with C++ class libraries like those from Rogue Wave Software (Corvallis, OR) and other vendors, but which also include support for persistence. Subtleware needs to provide these classes since it can't take Rogue Wave's classes (for example) and automatically make them persistent.

The final step in using Subtleware involves its other major built-in class, sDBMgr, which contains methods for database session control and transaction management. For most programs, you simply need to put a few calls to sDBMgr methods like connect, transaction, and disconnect in your main program to log on to the database server, begin and end transactions, etc.
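The shape of that session bracketing might look like the following mock. To be clear, sDBMgr's real interface is not reproduced here: SessionSketch, its begin-less buffered writes, and the commit and rollback methods are invented stand-ins that only illustrate connecting, grouping writes into a transaction, and disconnecting.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical session manager (not the real sDBMgr interface).
class SessionSketch {
public:
    // "Log on" to a named server; an empty name is rejected.
    bool connect(const std::string& server) {
        if (server.empty()) return false;
        connected_ = true;
        return true;
    }
    // Drop any uncommitted work and close the session.
    void disconnect() {
        pending_.clear();
        connected_ = false;
    }
    bool connected() const { return connected_; }

    // Writes are buffered until commit() and discarded by rollback().
    bool write(int key, const std::string& row) {
        if (!connected_) return false;
        pending_[key] = row;
        return true;
    }
    void commit() {
        assert(connected_);
        for (const auto& entry : pending_) {
            table_[entry.first] = entry.second;
        }
        pending_.clear();
    }
    void rollback() { pending_.clear(); }

    bool stored(int key) const { return table_.count(key) != 0; }
private:
    bool connected_ = false;
    std::map<int, std::string> pending_;  // the current transaction
    std::map<int, std::string> table_;    // stand-in for committed data
};
```

A main program would call connect once at startup, wrap each unit of work in write calls followed by commit (or rollback on failure), and call disconnect before exiting.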

Fits in, subtly
Subtleware can be a very useful way to leverage C++'s features and popularity when building small, nondata-intensive applications or when adding persistence to existing programs in a client/server environment. Although it also contains features for querying relational tables and populating C++ classes with the results, it's not really designed for working with existing relational data.

Subtleware occupies a unique niche in the market by adding persistence to C++ in a way that avoids programmer headaches. Its goal is to provide persistence as invisibly -- or as subtly -- as possible. Fundamental differences between the systems-programming roots of C++ and the core-data repository pedigree of relational databases may make this goal elusive, but it will be interesting to see how far they get.

About the author
Contributing editor Bill Rosenblatt is director of Publishing Systems in the New York City office of the Times Mirror Co. Reach him at bill.rosenblatt@sunworld.com.

Bill Rosenblatt is the author of Learning the Korn Shell and a coauthor of Learning Gnu Emacs and Learning the Bash Shell.


URL: http://www.sunworld.com/asm-03-1995/asm-03-client.html.
Last updated: 1 March 1995.