Virtual Database Gathers Scattered Data

"Virtual DB" may be a solution or state and local government agencies which need to access scattered data and present it in a unified way.

by David Aden / January 31, 1997
Many pleasant words are used to describe information infrastructures -- "heterogeneous environment," "mixed platforms," or "multiple architectures." But to be honest, when information is needed from several offices or departments, most infrastructures are best described as a "mess."

Legacy systems here, midrange boxes there and PCs everywhere. Even putting all the hardware on the same network doesn't get the information out and into a unified view. Some of the information is available through distributed file systems; some is obtained from homegrown scripts; then there's the information that's collected by e-mailing Betty in bookkeeping and asking her to "please print the monthly report -- you know the one you do every month at this time -- and e-mail it back to me."

Client/server technology was supposed to handle the problem, but it created another system. The whole structure should be redesigned from the bottom up or top down, but who has the time or money to step back and do it? Even worse, technology is changing so fast that planning often seems outdated before it's approved, let alone put into action.

This may be overstating the problem, but it's not uncommon to find information infrastructures with many data sources, each of which contains a piece of the puzzle needed to put together a complete picture. The problem for IT is to get at the data, translate it into a common format and present it in a coordinated, timely manner. Datawarehousing has been one solution, but datawarehouses also have disadvantages.

According to Kevin Strange of the GartnerGroup, datawarehouses require tremendous upfront planning, cannot be expected to show results for seven to 12 months and may cost upwards of $10 million (see Government Technology, Emerging Technology Handbook, June 1996). What's more, new data sources are coming online continuously, whether it's the new accounting package in the billing department or another agency's Web site. Adding a data source to a datawarehouse can take time.

Because datawarehouses work by taking "snapshots" of existing data and replicating it into a central repository, their internal structures are often fixed. They need to be designed in great detail before implementation because redesigning and re-populating a datawarehouse after it's been filled with several years of data can be very time-consuming and expensive.

Enterworks Inc. is taking a new approach with their Virtual DB product. Instead of replicating existing data into a datawarehouse, Virtual DB builds a "virtual database," using existing sources of information as building blocks.
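
The difference between replicating data and reading it in place can be illustrated with a minimal Python sketch. It is only an illustration of the general idea, not of how Virtual DB is built; the `billing_source` contents and the `virtual_lookup` helper are invented for the example.

```python
import copy

# A stand-in for one live operational source (contents are invented).
billing_source = {"A-100": {"balance": 250.0}}

# Warehouse approach: replicate a snapshot into a central repository.
warehouse = copy.deepcopy(billing_source)


def virtual_lookup(account_id: str) -> dict:
    """Virtual approach: hold no copy; read the source when asked."""
    return billing_source[account_id]


# The operational system changes after the snapshot was taken.
billing_source["A-100"]["balance"] = 0.0

print(warehouse["A-100"]["balance"])       # value as of the snapshot
print(virtual_lookup("A-100")["balance"])  # value in the source right now
```

The replicated copy keeps whatever it held when the snapshot was taken, while the virtual lookup always reflects the source as it currently stands.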

"It [Virtual DB] started in the intelligence community for the National Security Agency, whose information infrastructure is not unlike everyone else's out there," said Bob Lewis, president of Enterworks. "They had a whole variety of different systems and they wanted to unite that information. We built Virtual DB with them in mind, in response to their initial requirement, but it was funded by Telos."

The first step in building a virtual database is to tell Virtual DB where to find data sources. It goes out across the Net and maps the existing systems, gathering information on the tables, fields and permissions that exist in other running databases such as Oracle, Sybase and Informix, or in text-based files. It summarizes the types of information in these disparate sources in a "meta-catalogue," which is simply a map of what data exists and how to get at it. For example, the meta-catalogue might show that the city clerk's Sybase database contains fields for first and last name, address and political party registration. It would also show that the property tax office's Ingres database has fields for first and last name, plot number, year purchased, assessed value and outstanding tax balance. When the database administrator looks at the meta-catalogue after this data-gathering phase, he sees a map of "what's out there." This by itself can be very helpful, but the real power lies in the ability to combine information from any of these sources into new views.
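
To make the idea concrete, here is a rough Python sketch of what such a map might look like for the two example offices. The article does not describe Virtual DB's internal format, so the `SourceEntry` and `MetaCatalogue` classes, their methods and the field names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SourceEntry:
    """One data source as it might appear in a meta-catalogue (hypothetical layout)."""
    name: str
    dbms: str
    fields: list[str]


@dataclass
class MetaCatalogue:
    """A map of what data exists and where; it stores no actual records."""
    sources: dict[str, SourceEntry] = field(default_factory=dict)

    def register(self, entry: SourceEntry) -> None:
        self.sources[entry.name] = entry

    def sources_with(self, field_name: str) -> list[str]:
        """Names of every source that exposes the given field."""
        return sorted(n for n, s in self.sources.items() if field_name in s.fields)

    def shared_fields(self, a: str, b: str) -> list[str]:
        """Fields two sources have in common: candidates for linking them in a new view."""
        return sorted(set(self.sources[a].fields) & set(self.sources[b].fields))


catalogue = MetaCatalogue()
catalogue.register(SourceEntry(
    "city_clerk", "Sybase",
    ["first_name", "last_name", "address", "party_registration"]))
catalogue.register(SourceEntry(
    "property_tax", "Ingres",
    ["first_name", "last_name", "plot_number", "year_purchased",
     "assessed_value", "tax_balance"]))

print(catalogue.sources_with("last_name"))
print(catalogue.shared_fields("city_clerk", "property_tax"))
```

The catalogue holds only descriptions of the sources, which is why adding a source in this sketch is a single `register` call rather than a data load.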

"You can create new objects. If you have some information from the billing system in IMS [Information Management System -- IBM's mainframe database management system] and you want to marry it with information in an Oracle database, Virtual DB allows you to create this new view of the information," said Lewis. "You can then build applications such as a Web-based application that interacts with those new objects. The user goes into a Web application and asks for a status on such-and-such an account. Virtual DB will break it down, get the information from the mainframe and the Oracle database and present it."

Although Virtual DB is fairly new, it has made inroads into the commercial and federal markets. For example, Northrop Grumman is using it as part of a contract to deliver technical information electronically on the B2 stealth bomber. The Air Force chose Web browsers as the universal method for accessing the data, but behind the scenes Virtual DB is used as part of the solution to get at the scattered data.

"The first phase is to deliver engineering parts list and 3-D and 2-D drawings," said Denys Mueller, a project manager for Northrop Grumman. "So if I am a maintenance guy, I can go out and bring up that part of the drawing and figure out how I am going to fix it. The second phase of the contact involves going after all types of data."

The available B2 data will eventually include any data associated with the B2, including design drawings, parts lists, maintenance processes and repair procedures. Once the information is available electronically, Northrop Grumman will work with the Air Force to use the data to reduce program costs and improve the development, maintenance and service processes.

"We'll be doing some of the business process improvements, looking at how changes are processed, how procurement is done," said Mueller. "We'll see how we can apply [the data] in those areas and see how the structure can be redone as needed."

With a traditional datawarehouse, "iterative" design would have been difficult or impossible. With a virtual datawarehouse, business processes can be streamlined to take advantage of the data structures, which in turn can be adjusted to more closely mirror the streamlined business processes. Although Mueller knows that presenting a unified view of such a large amount of data is not an easy proposition no matter what tools are used, the virtual database model gives the best chance for success and flexibility.

In fact, a study panel convened by Northrop Grumman and the Air Force estimated that implementing the virtual database will save at least $536 million over the life of the B2 contract. What's more, if the virtual database works as planned, the Air Force hopes to combine that information with data on the B1, allowing it to reap even more savings by identifying common parts across both programs, thereby benefiting from greater economies of scale.

For most IT shops, the problem is more than just how to get at the data that's out there; they also have to worry about getting at the new data that will be there tomorrow. In both commercial and government settings, changing organizational patterns are accelerating the development of new data structures, which only adds to the problem.

"In most organizations, both governmental and commercial, you have a vertical orientation, so you built applications -- accounting, billing, inventory management systems -- that serve that specific function or organization," noted Lewis. "But a big trend now is that organizations are flattening and pushing responsibility down. There is also a trend toward cross-functional teams. Part of what people have been doing [to accommodate these changes] is datawarehousing. They're bringing in data from a variety of sources and replicating it. With Virtual DB we give you the ability in a much faster time frame to build a virtual warehouse."

Whether or not a datawarehouse is planned, the use of a virtual database may have many applications in law enforcement, human services and other agencies which need to access scattered data and would like to present it in a unified way. It may also prove to be a powerful and relatively easy way to build "one-stop shopping" interfaces for state and local governments. Instead of asking the IT staffs of many individual agencies to spend months of planning and development time working out how to put their information into a common format, developers may be able to use a virtual database to pull the information from those sources, translate it as needed and combine it into a form that can be presented through a single interface. Development time and effort could shift from back-end data compatibility issues to development that really counts -- providing citizens with easy-to-use, complete interfaces for all the business they have with state and local governments.


David Aden
David Aden is a writer from Washington, D.C.