Since its inception in 1992, the AHCA has added Medicaid, the State Employees' Group Health Insurance Plan, health purchasing, hospital and medical licensing and a host of other health-related activities to its roster of new functions. During that time, AHCA's Division of Management Information Services (MIS) has seen its computing infrastructure grow from three file servers and 300 PCs running Novell's NetWare operating system to more than 30 servers and 2000 PCs networked on Windows NT.
At the same time, MIS has taken over responsibility for a host of mainframe and mini-computer databases. Realizing that the limitations of the older database applications would not fit with the highly integrated, rapidly evolving business of healthcare at AHCA, Edwin Lott, director of MIS, launched a strategic plan to migrate mission-critical data over to a relational database management system.
Using Informix 7.1 as its new enterprise standard for databases, the agency has quickly seen several significant improvements -- most notably database performance -- thanks to Informix's ability to run queries in parallel. "The result," said Lott, "is that we can execute in less than three hours a job that used to take 70 hours to perform."
DATA ANALYSIS DEMANDS
Nobody is surprised that government databases are growing and will continue to grow. What is surprising is the increasing demand to analyze data that once sat inside mainframes and other computers.
Getting to the information and extracting intelligence from it has always been a problem. Traditional queries run on mainframes cost time and money, said Lott, so analysts could only justify executing the most important requests for database processing. As a result, complex database queries were laborious to perform and their impact on decision support was sometimes debatable. But as state and local government has changed in recent years, so have requirements for data transactions and decision support. Whether it's claims processing in
healthcare, identifying suspects in law enforcement or analyzing trends in environmental protection, government agencies now require greater access
to data and superior database performance.
Lott's division has to deliver on a key objective if AHCA is to fulfill its mandate. "It's our responsibility to deliver public information," he said. "To do that, we need the ability to access databases anywhere on our network and provide quick output."
Flexibility is another thing government agencies increasingly want. The demand for new applications, involving more complex information management, is rising. Databases running on COBOL are too inflexible to serve these new applications, said Raymond Vigil, deputy director for technology and emergency support in the New Mexico Department of Public Safety (NMDPS). Vigil's agency needs to develop new applications that are effective in identifying criminals and fighting crime on a statewide basis. "To do that we need databases that can change on the fly and that can allow us to share information more readily," he said.
Fortunately for MIS, relational database systems have matured to the point where they can now run mainframe-sized, multi-gigabyte databases on distributed platforms, such as client/server. These systems give MIS the performance and flexibility they need, while giving users the freedom to use friendly graphical interfaces to perform their work. Relational databases -- which can take two or more files and generate a new file from the records that meet the matching criteria -- have gained great popularity for accommodating a user's ad hoc request for selected data.
Since gaining acceptance in the 1980s, relational database management systems have grown to include a core set of functions, which all the major vendors -- Oracle, IBM, Informix and Sybase -- support.
According to Judy Davis, an independent database consultant based in Lexington, Mass., today's relational database functions include using client/server as the mainstream system architecture and structured query language (SQL) to process data. Relational databases also support Microsoft's Open Data Base Connectivity (ODBC) specification. Davis described ODBC as a standard way to get clients and servers to talk to each other. "ODBC is a way of writing applications that are database independent, so you can talk with different databases," she said. "ODBC has been real important in terms of relational database development."
A fourth function that has become standard with relational databases is support for multiprocessing servers. Multiprocessing comes in several flavors, most notably symmetrical multiprocessing and massively parallel processing (Government Technology, December 1995). According to Davis, the major RDBMS vendors have built multithreaded database systems that divide tasks and process them in parallel across more than one CPU, thereby speeding up operations.
PARALLEL EXECUTION
Multithreaded relational databases have opened the door to a whole new level of performance and functionality by allowing systems to perform parallelism: the parallel execution of database operations. In traditional database architectures, every SQL query is handled serially in a single process that involved parsing the query, scanning for the data on disks, joining the data and then aggregating it to provide the answer. Not surprisingly, processing a SQL query serially on a massive mainframe database took time -- lots of time. "If a single query took 10 hours to do," said Davis, "you might not even ask the question."
Parallel execution takes the same SQL statement and creates subqueries out of it. If the system has 10 disks of data, then the database will send out 10 subqueries that can scan all the disks at the same time. Instead of one query taking 10 hours to process, the elapsed time might be 10 minutes. As a result, state agencies can ask questions of its database or give it transactions to process that the agency couldn't or wouldn't ask before because of the time involved.
NEW MEXICO
That kind of performance is just what New Mexico's Department of Public Safety needs if it's going to run an up-to-date law enforcement agency. NMDPS is building a statewide criminal justice information system from the ground up. The system will run seven major enforcement applications, ranging from computerized fingerprinting and criminal histories to incident-based reporting and digitized mugshots.
The system will pull in data sources from around the state -- over a high speed network -- into a distributed database, which will be available to law enforcement agencies throughout the state and to officers in the field. Providing the horsepower for the system will be 12 multiprocessing Alpha servers from Digital that will be strategically located in different sections of the state. Right now, four of the servers have been installed and are running Oracle 7 as the relational database.
Vigil said that NMDPS looked at the future several years ago and came to the conclusion that the state needed a robust information infrastructure if it was to maintain and enhance public safety. Automated fingerprint identification and digitized mugshots are just some of the more exotic things that the New Mexico state and local police need if they are going to effectively identify criminal suspects quickly and accurately. "You need horsepower to handle those requirements," commented Vigil. You also need the ability to share data and process queries rapidly. By combining Digital's 64-bit Alpha processors with Oracle's relational database, Vigil is confident that he has the information architecture for his state's future.
DATA REPLICATION
To help customers like NMDPS improve how data and databases are shared, database vendors have developed some new features. One problem resulting from the surge in networking is that organizations have only a single master copy of a database, but many users at many sites. Typically, when workers in branch offices across a city, county or state need access to a master database, issues arise concerning performance and network traffic. As the number of users accessing the database rises, the time it takes to run a query can lengthen considerably.
To offset this problem, database vendors have developed data replication. Considered by Davis to be still under development rather than a mature function, data replication enables copies of a database to be disbursed to different servers on a network. Any changes made to the data by users are automatically replicated, so everyone is constantly working with the same data, not different versions of it.
Data replication serves other purposes. It can create a usable or "hot" version of the database in case of failure at the primary site where the master copy is stored. And because there are now two separate servers available with the same data, agencies can separate production-oriented, transaction processing from decision-support applications. However, data replication has two drawbacks, according to Davis.
First, to carry out replication requires a network that is always up and running, otherwise the changes made to the data can't be replicated automatically. Second, data replication is currently restricted to the vendor's database. In other words, Sybase or another vendor can only replicate its own database; it can't perform the same function with a different database. "The issue is that no vendor has a full solution that covers the whole spectrum," said Davis. Further along in development is the use of gateways, which allow users to access different databases transparently. The gateways translate database commands so, for example, a user on an Oracle database can use Oracle's Open Gateway Technology to send an SQL query to an Informix system located elsewhere on the network. For agencies with a real mixed bag of databases, the key is to find a gateway that supports the broadest range of relational databases.
COMPLEX DATA
Today's databases are excellent at processing a transaction or query relating to simple alphanumeric data types -- character and numerical strings such as decimals, dates, times, social security numbers and so on. But when it comes to blocks of text, images and video, databases haven't a clue as to what they are. "The relational database can't do anything with it," said Davis. "It can't extract any intelligence because there are no access methods."
Until recently, relational database vendors have seen little need to support complex data types. But with the boom in the Internet, especially the graphically-oriented Web, needs are beginning to change. The same has happened with state and local government. Digitized, graphical images are being added to government databases at a growing rate. Property assessment boards are putting video clips of houses into their databases of assessment information. Police are storing digital images of mugshots and motor vehicle departments are storing digital photos of drivers in mass storage.
To search or process a query that might look for images by color or shape, an agency needs an object relational database. Several firms have produced object-oriented databases that allow customers to develop modules for handling these new kinds of data types. "It's all very complex stuff," commented Davis.
She added that the mainstream RDBMS vendors are starting to build modules that can be added to their existing database products. "The big issue here is that object database vendors say you need to build databases that handle objects from the ground up," said Davis. "The relational vendors, with their huge installed base, want to build these modules on top so their customers don't have to start their databases from scratch. The question is, who will be the most successful?"
For state and local governments with traditional database needs, the good news is that today's latest generation of relational database systems are capable of delivering significant performance improvements, flexibility and reliability compared to legacy mainframe databases. All these benefits are not cheap, however. To achieve maximum efficiency from its databases, AHCA has spent considerable time and expense to migrate, centralize and standardize legacy data to the new relational environment.
"But all of our dollars are well-spent in terms of productivity," said Lott. "We're so information-intensive today that we need good database tools, otherwise we would still be suffering."
For more information, contact Edwin Lott, 904/922-3651, or Raymond Vigil, 505/827-9119.
*
--------------------------------------------------------------------------------
Terminology
Database: 1) A set of interrelated files that is created and managed by a database management system. 2) Any electronically stored collection of data.
Database Management System: Software that controls the organization, storage, retrieval, security and integrity of data in a database. It accepts requests for data from the application program and instructs the operating system to transfer the appropriate data.
Relational Database: A method for organizing files in a database that prohibits linking files together. In non-relational systems, records in one file point to the locations of records in another, such as customers to orders, and vendors to purchases. These links are set up ahead of time in order to provide very fast daily processing. In a relational database, relationships between files are created by comparing data, such as account numbers and names. A relational system can take any two or more files and generate a new file from the records that meet the matching criteria.
Ref: The Computer Glossary, Fifth Edition by Alan Freedman, Amacom.