Product Focus: Database Technologies *

Investigating a new database management system? Consider interoperability and standards.

by / January 31, 1997 0
There are so many trends, subtrends and new features in the database world, that it's difficult to get a handle on who's doing what, how they're doing it and why. Gone are the simple, halcyon days of mainframe dominance and IT supremacy. Client/server and open systems mean more choices, more flexible systems -- and more levels of complexity. For anyone investigating or choosing a database management system (DBMS), the new computing world means database decisions based solely on the database software or its vendor may fall short. Interoperability, conformance to network and database standards, supported hardware platforms, operating systems, network operating systems, Web support, third-party tools and a host of other issues must be considered when evaluating any of the current database players.

Even the type of database must be considered. Most of the big players are relational, but object databases are beginning to stake out a share of the market -- although even the definition of "object databases" is open to differing interpretations.

Most of the major relational database makers have added, or are adding, object extensions to their products, so they can handle user-defined and binary datatypes such as pictures, video, faxes, etc. However, this doesn't mean that the database itself is object-oriented. True object-oriented database management systems (ODBMS) handle various object types and are themselves object oriented. Some of the ODBMSes available today are Versant, Servio Corp.'s Gemstone, Object Design's Object Store, Ontos' ONTOS DB, Hewlett-Packard's Open ODB and UniSQL's UniSQL/X. For simplicity's sake, this article does not examine any of the object-oriented database offerings.

SQL
To understand how far the database debate has progressed, it is important to understand something of recent database history, SQL in particular.

SQL (Structured Query Language) began life in the 1970s as an IBM English-like interface to databases. It was based on the data modeling work of E.F. Codd and was intended to be a descriptive, nonprocedural language. In short, this meant SQL was used to describe what information was wanted, and the SQL server or engine would decide how best to extract that information and present it in a standard format. In contrast, procedural languages -- such as C, BASIC or Pascal -- require the programmer to instruct the computer on each step needed to accomplish a task.

In the years since its original development, three ISO SQL standards have been issued. According to Orfali, Harkey and Edwards (see "more information" sidebar), the first standard was issued in 1986 and was revised in 1989. Now called SQL-89, it was an intersection of the SQL implementations of that time, which made it easy for existing products to conform. Because of the low threshold, compliance with this standard meant little.

SQL-92 extended the SQL-89 standard and implemented three stages of conformance -- entry, intermediate and full. Most of the big database vendors have implemented some of the SQL-92 specifications, but none have implemen-ted the entire standard. Martin Rennhackkamp (see "more information" sidebar), in an introduction to a recent series of database comparison articles, noted at least two major areas in which many of the big databases have failed to implement the SQL-92 standard.

Although full adoption of the SQL-92 standard is not yet in sight, an even newer standard -- SQL3 -- is already being worked over. The SQL3 standard weighs in at a daunting 1,000 pages and is broken down into seven parts, each of which is being considered separately. Expected adoption of this standard remains years away, yet many of the issues it addresses are already being tackled by database vendors. Intense vendor competition combined with the no-holds-barred rush of developments in client/server and Internet technology is forcing database vendors to move with the times or be left behind.

As a result, proprietary extensions to SQL have been introduced by almost every vendor. Although these extensions are helping to push database performance to new heights and are extending the reach of traditional databases, each extension also makes interoperability more difficult in heterogeneous environments. It is therefore vital for managers who are investigating database options to take into account what else already exists on their networks and what it will take to get existing information sources to play happily with any newcomers.

MIDDLEWARE
These interoperability difficulties have spawned an entirely new layer of software -- and in some cases entire industries -- to produce the middleware needed to tie disparate databases together. A multitude of middleware choices exist. Some middleware works by serving as a glue to keep two or more relational DBMSes talking. This type of middleware generally works by translating one SQL dialect into another, so vendor-specific versions are needed for each database on the network. Other solutions work by establishing standards for which database vendors provide interfaces. Instead of translating one version of SQL into another, each database learns the common language. Fortunately, several standards have emerged which most of the major vendors have adopted.

While both of these approaches are helpful, they often represent a lowest common denominator. When this happens, some of the advantages which come with the most advanced, nonstandard features of a particular database may be lost.

Another approach to the interoperability issue is to control access to a multitude of databases from one master program which hides the translation and conversion details needed to get data from existing sources. Information Builders Inc.'s EDA/SQL is a gateway program that converses with more than 50 databases -- relational and nonrelational. Enterworks Inc. has extended this concept with its Virtual DB product.

It is important that IT managers and planners also examine all the middleware issues whenever investigating a particular database solution. All of the major database vendors will claim interoperability, and in general the claims are true -- for certain subsets of SQL commands. Because of the limits of middleware, each claim needs to be examined against the existing operating environment and future development plans.

PROCEDURAL LANGUAGES
As mentioned earlier, SQL began life as a descriptive language. But, due at least in part to its success, ways have been developed to use SQL in procedural contexts. Two major approaches exist:

Database-specific procedural languages which include SQL commands.
Extensions to existing procedural languages, such as C, which allow SQL statements to be used inside the procedural language's code. This is generally called "embedded SQL."
Most major database vendors provide both their own language and the ability to make SQL calls from other procedural languages. However, as you might guess, implementations are not necessarily compatible.

STORED PROCEDURES
Another common extension, often based on the database's procedural language, is stored procedures. These are essentially modules of precompiled SQL and procedural language code which reside on the server and may be called by a client. Stored procedures reduce network traffic (clients can make one call to the stored procedure instead of having to send all of the commands over the network) and they speed up execution (code is precompiled so it doesn't need to be recompiled every time it's run). These advantages are particularly important when developing online transaction processing (OLTP) applications such as airline reservations systems, bank teller applications or other realtime order-entry systems that require extremely fast execution.

In a similar vein, database vendors have almost uniformly implemented "triggers." These are similar to stored procedures, except that they are event-driven, rather than being called from application code. For example, an application designer may want to require that whenever a new order is entered, a corresponding due-date field is automatically filled in. On databases which support triggers, the due-date field could be updated automatically by a trigger that fires each time a new order is entered. Unfortunately, trigger implementation can vary greatly from one vendor to another, so it is important to understand how it is done and, more importantly, what your application will require.

NEW ROLES
Even more sweeping than the addition of stored procedures or triggers is the tendency to press databases into entirely new roles, either by building on top of existing functionality or offering add-on packages that extend the role of the DBMS. For example, most leading databases have been optimized for OLTP work, but vendors are now pressing them into use for online analytical processing (OLAP) which requires fast access to summarized data.

Relational databases store information in tables consisting of related rows and columns. This data model is fast and efficient for managing information such as mailing lists, account records and purchases where the format of the data and its relationships are known. However, the relational model is too slow for OLAP because an online analyst may want to look at the data from many different angles.

To accommodate this different purpose, OLAP databases often create many different indexes to speed ad hoc queries. In fact, an entirely new data model, called a "multidimensional" database -- which stores information in a way that simplifies access from many different dimensions or indexes -- is gaining popularity.

Not to be outdone, most of the leading relational database vendors are adding OLAP capabilities to their offerings. Some are doing this by putting new layers of abstraction onto an existing relational database -- essentially creating a multidimensional view of underlying relational data. Others are offering SQL databases optimized for OLAP work. In the future, others may offer multidimensional databases side by side with existing relational databases.

DATABASE VENDORS
According to Orfali, Harkey and Edwards, more than 200 companies offer SQL products on a wide range of computers. The total SQL market stood at $5.9 billion in 1995 and is growing at an annual rate of 30 percent, they said. Perhaps even more telling is their estimate that "less than 10 percent of the world's data is stored in relational databases," leaving a rather large potential for market growth.

Some industry watchers feel that the next several years will see a thinning of database vendors, with just a handful of large ones surviving. Although the trend toward consolidation seems clear, it is not a forgone conclusion who will make it through the thinning.

Following is an overview of some of the top companies:

ORACLE
In the late 1970s, Oracle was the first company to offer a commercial SQL-based product. It is currently the market leader with 1994 revenues for its database product topping more than $1.1 billion (total company revenues exceed $4 billion). Its database operates on more than 90 platforms (including 60 flavors of UNIX), and its latest release, Oracle 7.3, announced in early 1996, offers a wide range of add-ons and extensions to SQL. Oracle is also offering a Universal Server, effectively extending the concept of a database to all types of computer-storable information.

According to Rennhackkamp, "The aim of the Oracle Universal Server is to combine the functionality of the world's best-selling client/server relational DBMS with complete Web, text management, messaging and multimedia information servers. The multimedia services include fully integrated relational, spatial, text, audio and video information to any user with a standard Web browser accessing a Web-enabled database."

SYBASE
Although well-respected technologically for many years, Sybase had problems with its System 10 release. System 11, released in late 1995 and produced under the controls of an ISO 9000 certification process, seems to be reversing the customer relations and PR damage caused by System 10.

Sybase provides powerful stored procedures and triggers and has its own SQL procedural language, Transact-SQL, which is based on the SQL-92 standard and includes Sybase extensions.
databases, each extension also makes interoperability more difficult in heterogeneous environments. It is therefore vital for managers who are investigating database options to take into account what else already exists on their networks and what it will take to get existing information sources to play happily with any newcomers.

MIDDLEWARE
These interoperability difficulties have spawned an entirely new layer of software -- and in some cases entire industries -- to produce the middleware needed to tie disparate databases together. A multitude of middleware choices exist. Some middleware works by serving as a glue to keep two or more relational DBMSes talking. This type of middleware generally works by translating one SQL dialect into another, so vendor-specific versions are needed for each database on the network. Other solutions work by establishing standards for which database vendors provide interfaces. Instead of translating one version of SQL into another, each database learns the common language. Fortunately, several standards have emerged which most of the major vendors have adopted.

While both of these approaches are helpful, they often represent a lowest common denominator. When this happens, some of the advantages which come with the most advanced, nonstandard features of a particular database may be lost.

Another approach to the interoperability issue is to control access to a multitude of databases from one master program which hides the translation and conversion details needed to get data from existing sources. Information Builders Inc.'s EDA/SQL is a gateway program that converses with more than 50 databases -- relational and nonrelational. Enterworks Inc. has extended this concept with its Virtual DB product.

It is important that IT managers and planners also examine all the middleware issues whenever investigating a particular database solution. All of the major database vendors will claim interoperability, and in general the claims are true -- for certain subsets of SQL commands. Because of the limits of middleware, each claim needs to be examined against the existing operating environment and future development plans.

PROCEDURAL LANGUAGES
As mentioned earlier, SQL began life as a descriptive language. But, due at least in part to its success, ways have been developed to use SQL in procedural contexts. Two major approaches exist:

Database-specific procedural languages which include SQL commands.
Extensions to existing procedural languages, such as C, which allow SQL statements to be used inside the procedural language's code. This is generally called "embedded SQL."
Most major database vendors provide both their own language and the ability to make SQL calls from other procedural languages. However, as you might guess, implementations are not necessarily compatible.

STORED PROCEDURES
Another common extension, often based on the database's procedural language, is stored procedures. These are essentially modules of precompiled SQL and procedural language code which reside on the server and may be called by a client. Stored procedures reduce network traffic (clients can make one call to the stored procedure instead of having to send all of the commands over the network) and they speed up execution (code is precompiled so it doesn't need to be recompiled every time it's run). These advantages are particularly important when developing online transaction processing (OLTP) applications such as airline reservations systems, bank teller applications or other realtime order-entry systems that require extremely fast execution.

In a similar vein, database vendors have almost uniformly implemented "triggers." These are similar to stored procedures, except that they are event-driven, rather than being called from application code. For example, an application designer may want to require that whenever a new order is entered, a corresponding due-date field is automatically filled in. On databases which support triggers, the due-date field could be updated automatically by a trigger that fires each time a new order is entered. Unfortunately, trigger implementation can vary greatly from one vendor to another, so it is important to understand how it is done and, more importantly, what your application will require.

NEW ROLES
Even more sweeping than the addition of stored procedures or triggers is the tendency to press databases into entirely new roles, either by building on top of existing functionality or offering add-on packages that extend the role of the DBMS. For example, most leading databases have been optimized for OLTP work, but vendors are now pressing them into use for online analytical processing (OLAP) which requires fast access to summarized data.

Relational databases store information in tables consisting of related rows and columns. This data model is fast and efficient for managing information such as mailing lists, account records and purchases where the format of the data and its relationships are known. However, the relational model is too slow for OLAP because an online analyst may want to look at the data from many different angles.

To accommodate this different purpose, OLAP databases often create many different indexes to speed ad hoc queries. In fact, an entirely new data model, called a "multidimensional" database -- which stores information in a way that simplifies access from many different dimensions or indexes -- is gaining popularity.

Not to be outdone, most of the leading relational database vendors are adding OLAP capabilities to their offerings. Some are doing this by putting new layers of abstraction onto an existing relational database -- essentially creating a multidimensional view of underlying relational data. Others are offering SQL databases optimized for OLAP work. In the future, others may offer multidimensional databases side by side with existing relational databases.

DATABASE VENDORS
According to Orfali, Harkey and Edwards, more than 200 companies offer SQL products on a wide range of computers. The total SQL market stood at $5.9 billion in 1995 and is growing at an annual rate of 30 percent, they said. Perhaps even more telling is their estimate that "less than 10 percent of the world's data is stored in relational databases," leaving a rather large potential for market growth.

Some industry watchers feel that the next several years will see a thinning of database vendors, with just a handful of large ones surviving. Although the trend toward consolidation seems clear, it is not a forgone conclusion who will make it through the thinning.

Following is an overview of some of the top companies:

ORACLE
In the late 1970s, Oracle was the first company to offer a commercial SQL-based product. It is currently the market leader with 1994 revenues for its database product topping more than $1.1 billion (total company revenues exceed $4 billion). Its database operates on more than 90 platforms (including 60 flavors of UNIX), and its latest release, Oracle 7.3, announced in early 1996, offers a wide range of add-ons and extensions to SQL. Oracle is also offering a Universal Server, effectively extending the concept of a database to all types of computer-storable information.

According to Rennhackkamp, "The aim of the Oracle Universal Server is to combine the functionality of the world's best-selling client/server relational DBMS with complete Web, text management, messaging and multimedia information servers. The multimedia services include fully integrated relational, spatial, text, audio and video information to any user with a standard Web browser accessing a Web-enabled database."

SYBASE
Although well-respected technologically for many years, Sybase had problems with its System 10 release. System 11, released in late 1995 and produced under the controls of an ISO 9000 certification process, seems to be reversing the customer relations and PR damage caused by System 10.

Sybase provides powerful stored procedures and triggers and has its own SQL procedural language, Transact-SQL, which is based on the SQL-92 standard and includes Sybase extensions.

Sybase also has a good record for connectivity. According to Rennhackkamp, "Sybase has always been known for its extensive connectivity. The Sybase Enterprise Connect family of products provides access to a large set of databases through a heterogeneous connectivity middleware layer called the Sybase Open Client/Open Server architecture. All of the Sybase products communicate through the Open Client/Open Server API, which results in a very open and 'connectable' architecture."

System 11 also expanded on Sybase's move into the data warehousing market, and its acquisition of the Watcom database (which Sybase acquired when it bought PowerSoft) gives it an "in" to the low-end market.

IBM
IBM's DB2 began as a mainframe application, but is now available on everything from mainframes to the desktop. It, too, supports triggers and stored procedures, but its stored procedures may be coded in "any language that creates a DLL (Dynamic Link Library) and supports an SQL precompiler." This means that developers can create stored procedures using the tools and languages they are accustomed to rather than learning a propriety DBMS language.

IBM is also investing heavily in interoperability through its Distributed Relational Database Architecture (DRDA), which is both a model for distributing data across a network and a set of protocols for accessing and using that data. A long list of other database vendors have already signed on to DRDA, and IBM's goal is to provide an interoperability standard for fully distributed heterogeneous relational database environments.

INFORMIX
Informix is the fastest growing database company in the UNIX market, with annual database license sales growing at 50 percent. Its flagship database engine, the OnLine Dynamic Server, is multi-threaded and allows queries to be processed in parallel.

It has its own procedural language and also supplies preprocessors for embedded SQL. Triggers are supported, as are stored procedures which use Informix's procedural language, and can make system calls to alert administrators to database events.

Informix's relatively recent merger with Illustra Information Technologies gives it access to technology from the company that pioneered object extensions on top of normal SQL database engines. Informix is combining Illustra's technology with their own to produce a universal server capable of serving complex data types including text, image, spacial and sound.

MICROSOFT
Microsoft arrived late to the group of big names in high-end databases, having already established a foothold in the lower-end market with Access and Visual FoxPro. Microsoft SQL Server was originally a Sybase product which was available from Microsoft for OS/2 and Windows NT. In 1994 Microsoft broke with Sybase and began to develop SQL Server on its own. Microsoft has been adding features to make it competitive with the established high-end databases and keeping the price low to make it attractive.

Stored procedures can be defined using either Sybase's procedural language, Transact-SQL, or as functions in external dynamic link libraries. This gives database administrators and users a great deal of flexibility.

Microsoft SQL Server is the only database discussed which is not offered on a variety of platforms -- it only runs on Windows NT. However, limiting the SQL Server to one platform means it can take advantage of underlying operating system services to enhance performance. Microsoft has also provided fast interfaces, allowing other applications to access SQL Server-based data easily.

SCRATCHING THE SURFACE
This article has not touched on database support for the Internet. Yet virtually all the databases mentioned incorporate Internet support, either by providing access through user-supplied scripts which can make SQL calls, or through database services that format data and return Web-ready data.

Nor has it mentioned replication, which is an important element in moving mission-critical applications to the client/server environment.

All of the major database vendors are frantically adding features and making speed and efficiency modifications. Orfali, Harkey and Edwards even commented that the major databases are moving toward becoming "portable operating systems." Whether or not the major databases evolve to the point of being their own operating systems, they will continue to evolve. Users and database administrators will no doubt benefit from the multitude of new features and enhanced performance. However, they will need to evaluate not only the databases, but how well they play with the operating system, the network, and other system and application software.

*


--------------------------------------------------------------------------------

Database Future
During the past five years, the DBMS world has changed considerably. By 2000, the "Big Five" DBMS vendors will be IBM, Informix, Microsoft, Oracle and Sybase. The next tier will include Computer Associates International (because of Ingres) and Progress Software.

Also, most IS departments will be unable to mandate a single DBMS for their enterprises. All DBMS strategies will include more than one vendor due to legacy systems, new applications, and changes in enterprises' DBMS strategies. Consequently, the average IS department will have five or more DBMS engines to support.

Additional turmoil in the database market will occur through further consolidations. One major consolidation will be the merger of the object DBMS market and the RDBMS market during the next five years. The biggest winners from the market turmoil will be the largest vendors -- IBM, Microsoft and Oracle -- because of their credibility and perceived longevity.

Source: Gartner Group

*


--------------------------------------------------------------------------------

More Information
The following sources were particularly useful in putting together this article, and should be helpful in obtaining a more complete picture of the subject.

The Essential Client/Server Survival Guide, Second Edition, by Orfali, Harkey and Edwards. Wiley Computer Publishing, John Wiley & Sons Inc., 1996.
DBMS Magazine. In particular, the November 1996 edition has an in-depth comparison of the major database players. The comparison, by Martin Rennhackkamp, is available online at .
DBMS World home page: . This site includes hundreds of pointers to information on a wide variety of databases, including links to the home pages of all the databases described in this story.
The Computer Desktop Encyclopedia, by Alan Freedman, published by The Computer Language Co. A demo with examples may be downloaded at .
*





[ February Table of Contents]
David Aden
David Aden DAden@webworldtech.com is a writer from Washington, D.C.