Data Warehouse Provides Coordinated View

How to add function and ease of use to a data warehouse.

by / May 31, 1997
About three months ago, most of the supermarket chains in the Boston area dispensed with clip-out discount coupons. Now, advertising circulars carry only announcements of the week's specials and "clipless" coupons -- shoppers get the sales price when they present their store discount cards to be scanned at checkout.

The chains promoted the change as a time saver for customers, but the biggest beneficiary is the chain itself. When the cards are presented -- whether or not customers purchase any sale items -- all the information about the purchase is captured. For chains with well-organized, funded and forward-looking IT departments, the information ends up in a massive data warehouse that feeds departmental data marts. Those are used to analyze the effectiveness of that week's circular, handle inventory and stocking predictions and measure the effectiveness of in-store promotions, among many other things.

Before the introduction of the card, transaction data could only be used for inventory and stocking purposes and analyzing very general sales and marketing trends. With the customer-specific data provided by the cards, sales data became more useful.

In government, the allure is similar, but the problem more complicated. DMV computers, for example, may only be able to "talk amongst themselves." Assessor's Office computers speak only "Assessor-ese," etc.

Furthermore, it isn't just a matter of getting systems to talk to one another. Legal and other issues restrict which systems are allowed to talk and what they are allowed to say -- personal tax return information, for example, can't end up in databases accessible to politicians, and police investigation records must stay within the police department.

Despite these problems and restrictions, governments at all levels need access to coordinated views of data. Data warehouses can provide that coordinated view, but more than that, they can help governments "grow their business" by helping identify and provide services to a wider range of constituents. But to make this work, state information managers need to deal with the threshold issues of standards and data definitions.

Massachusetts started its data warehouse initiative in January 1994 with the intention of providing a central financial information repository for internal use by state employees.

"It does contain a tremendous amount of data from the accounting system as well as the payroll cost system, and also much of the state budget information," said Joe Marinilli, information warehouse manager, Executive Office of Administration and Finance. "These are all individual mainframe systems we're pulling the data from. In the near future, human resource information will also be available."

A key factor in making the system work was putting the data warehouse definitions themselves online. "In the early days, we were keeping information about what was in the data warehouse and which source system they came from in hard copy, and it was getting into volumes," said Marinilli. "So we created -- in the warehouse itself, in table format -- two tables which are data definition tables. One contains information on all the fields in the data warehouse and which table they belong to. The second one lists which tables can be found in the data warehouse, what can be found in them and which mainframe they came from."

These data definitions are easy to query and available for read-only access to users. Because they are in standard table format and are more manageable in size than the data tables in the data warehouse -- 2,000 plus records compared to 15 million in some data tables -- the data definitions are usually used in beginning training classes. The state has also included "value added" information with the data definitions.

"Something we've done that's a bit different in the commonwealth," said Marinilli, "is to include algorithms for terms that are found in the accounting system. A user can look at these algorithms to see how the elements are generated. This kind of information satisfies users as well as accounting people."

Standardizing definitions internally within the data warehouse is one aspect of handling data definition. Another is conforming to external standards. For example, the Massachusetts data warehouse supports departments which report to the federal government and must conform to federal standards. Other departments use it to reconcile federal reimbursements.

Overall, Marinilli noted that storing the data definitions in the data warehouse has been helpful to both system administrators and users. It helps administrators in the data cleanup phase of populating the data warehouse and in ensuring consistency. It has benefited users by giving them easy access to a map of available data.

North Carolina's Health Services Information System (HSIS) is a software package developed and maintained by the state. It is used in 67 of the 87 local health departments. The other 20 departments have purchased their own private systems, all of which are required to meet the state reporting standards.

Information from each department is captured by the state and used for statistic reporting and analysis. This information is also linked with other records, such as birth records, Medicaid files and social services files, which enable the state to do more complicated analysis. Having such a data warehousing system in place provided a way to track immunizations throughout the state.

"We are supposed to be collecting information on everyone who provides immunizations in North Carolina," said Delton Atkinson, director of the State Center for Health Statistics in the Department of Environment, Health and National Resources. "Our state has adopted as one of its strategies a policy that it will pay for all vaccines in the public and private sector. Doctors get free vaccines, and for that, they have to report who they give them to."

Capturing all this information began by establishing a standard format for reporting. "What we did was develop a data exchange document," said Atkinson. "We used the Centers for Disease Control HL7 standards, and are giving that to every provider. We say, 'here's the HL7 standards, here's what we need, here's how the information is defined and here's how the data needs to come to us.' We couldn't get into their computer system, and didn't want to develop another computer system and say they had to use ours, so we decided to develop the data exchange document and see if their system could be modified to provide that."

The HL7 standard was a logical pick because it is used in a national immunization program.

"We decided that if we have to transmit the information across states, then we'd apply the same thing to us," said Atkinson.

The state started with the public-sector providers, since most were on the division's software, then moved to Medicaid providers and then to the big HMOs.

"Working with the small [general practitioner] in the small office is the hardest," said Atkinson. "The small practice is difficult and we haven't completed a lot of them at this point. They will be labor intensive and time intensive -- we've got a lot of rural areas in North Carolina. In some cases we're not going to get people to install a computer, and we may have to have them fill out a form and fax it to us -- but we recognized this was going to be a long process when we started."

Having a working data warehouse allowed the state to add a new set of data without starting from scratch. But the data warehouse itself has given rise to other efficiencies. For example, Medicaid billing is done centrally. Local health department staff have access to statewide Medicaid eligibility files, which they use to verify Medicaid numbers online. Staff members enter treatment information and the system creates a billing record, which is then transmitted to EDS Federal. The check goes from EDS Federal to the local health department, and a payment record is sent electronically to the state, which the system posts automatically.

"We haven't measured how much staff time this has saved," said Atkinson, "but prior to this, the billing was done from each local health department, and ... not all local departments did good billing. Now, they don't have to do anything but enter the data into the system at the visit."

North Carolina is also working to link state medical laboratories. When completed, the labs will be notified electronically to expect a specimen. Once the test results are available, lab staff will enter the information, which will post automatically to the patient's record in the local department.

At each level of automation, the state needed to establish or agree on common data definitions.

"We've been running this system (HSIS) for about 10 years," said Atkinson. "When we first started -- you talk about a war! All the public health systems had different systems. To get our health departments to agree was a horrendous battle. Some were only done by dictating. Where standards don't exist, it is not an easy task to do this kind of thing."

North Carolina is now working on a client/server version of HSIS using Sybase products. The individual counties were so impressed with a demonstration of the system several months ago that they voted to support it financially.

As important as standards are to any discussion of data format and data exchange, there is another, higher level of standardization which is often overlooked.

"Database is not about data, but about the quality of the data model," said Toby Younis, director of value systems engineering for Sybase. "How well the data model mirrors the business -- that's where the standards should be. If you were an accountant, you wouldn't work for a company that didn't have a set of policies and standards. If you were an architect, you wouldn't start building a building without a blueprint. The same is true of any business or government agency. To operate without a model that represents your business is courting disaster."

Younis sees the database as the place to implement the business model -- which always consists of business objects, transactions and rules (business policies or procedures). Business objects such as customers, accounts or property become objects defined by standard data definitions. The transactions between these objects are defined with SQL statements and routines. The business rules or policies are defined in stored procedures (a set of SQL statements that are stored in the database and can be called as a unit by client programs) or triggers (a database mechanism that initiates a user-defined action when an event occurs).

"There are at least three advantages to putting the model into the database," said Younis. "First, everyone gets the same view of the business. Second, when the business changes, the model is changed in only one place. Third, the business model is where the business is done -- in the business infrastructure. Since business now operates at the speed of light, the more accurate your business model, the more likely you gain and sustain a competitive advantage. If your business model is in three-ring binders on a shelf, the less likely you will succeed."

State and local governments will be faced with data definition and standards issues when implementing data warehouses or data marts. There are at least three solutions:

Establish the standards for use within the data warehouse itself.
Establish standards for data providers when possible.
Conform to standards set by higher levels of government.
Data warehouse design must take all of these into account to achieve minimal functionality. But to become truly effective, the business policies and practices written into the agency's business model must also be implemented as standards in any data warehouse or data mart.


David Aden
David Aden is a writer from Washington, D.C.