Legacy Nation

The public sector is practically in a league of its own when it comes to data integration.

By David Raths / July 9, 2007
A decade ago, complying with federal Health Insurance Portability and Accountability Act (HIPAA) regulations meant a huge headache for South Dakota's Bureau of Information and Telecommunications.

That's because meeting HIPAA standards requires the mainframe and minicomputer systems of social services, human services and health departments to interact.

To induce this intermingling, South Dakota would have had to put COBOL programmers to work writing programs to extract data from one system and deliver it to users in another department, said state CIO Otto Doll. "The problem with that approach," he said, "is that when you make a change to any application, you also have to go back and change all those interfaces you've written."

Dissatisfied with the state's legacy system silos, Doll has gradually moved South Dakota toward a service-oriented architecture (SOA). Using translation middleware called ICAN from SeeBeyond (which was bought by Sun Microsystems in 2005), South Dakota has started building composite applications using adapters that can pull data from separate sources.
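As a rough illustration of the adapter idea, the Python sketch below wraps two differently shaped sources behind one interface so that a composite lookup does not need to know either system's native format. It is not ICAN's actual API; every class name, field and record layout here is a hypothetical stand-in.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Person:
    """Common shape that the composite application works with."""
    person_id: str
    name: str
    birth_date: Optional[str] = None
    deceased: Optional[bool] = None


class SourceAdapter(Protocol):
    def fetch(self, person_id: str) -> dict: ...


class VitalRecordsAdapter:
    """Hypothetical adapter over a fixed-width legacy record store."""

    def __init__(self, records: dict[str, str]):
        self._records = records

    def fetch(self, person_id: str) -> dict:
        line = self._records[person_id]
        # Assumed layout: columns 0-9 birth date, column 10 death flag (Y/N).
        return {"birth_date": line[0:10], "deceased": line[10] == "Y"}


class EnrollmentAdapter:
    """Hypothetical adapter over a newer system that returns dicts."""

    def __init__(self, rows: dict[str, dict]):
        self._rows = rows

    def fetch(self, person_id: str) -> dict:
        return {"name": self._rows[person_id]["full_name"]}


def composite_lookup(person_id: str, adapters: list[SourceAdapter]) -> Person:
    """Merge whatever each adapter knows into one common record."""
    merged: dict = {}
    for adapter in adapters:
        merged.update(adapter.fetch(person_id))
    return Person(person_id=person_id, **merged)


# Made-up sample data for the two sources.
vital = VitalRecordsAdapter({"P1": "1950-03-14N"})
enroll = EnrollmentAdapter({"P1": {"full_name": "Pat Example"}})
person = composite_lookup("P1", [vital, enroll])
```

If the legacy record layout changes, only `VitalRecordsAdapter.fetch` has to change; the composite lookup and the other adapter are untouched, which is the maintenance advantage Doll describes over hand-written point-to-point interfaces.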

For the past few years, SOA has made HIPAA compliance much easier, Doll said. For instance, human services employees enrolling people in Medicaid can now access birth and death records from the Department of Health.

Doll said South Dakota got serious about the enterprise application integration (EAI) approach in 2000. "We saw a need to extend the longevity of these systems," Doll explained. "We don't have the resources to refresh our systems every couple of years. Nevertheless, we've been able to move forward with more complex interactions and access to different applications whether they are old or new."

Options have Pluses, Minuses
South Dakota's IT leaders aren't the only ones feeling pressure to link up data from older systems. Increasingly, government services are intertwined both within and across agencies. Yet CIOs find themselves inheriting a hodgepodge of servers, mainframes and minicomputers that are difficult to integrate and don't lend themselves to business intelligence-type queries.

That may explain why the worldwide application integration and middleware market grew at a 7.1 percent clip between 2004 and 2005 to total $8.5 billion, according to Gartner.

The challenge of legacy data integration is most acute in the public sector, because unlike their private-sector counterparts, government agencies typically can't afford to regularly replace systems. Faced with regulations or business needs that require integration, CIOs have several ways to respond depending on their goals, timelines, budgets and risk aversion. Options include:

  • having in-house programmers write special-purpose programs to extract certain data. The limited scope may save money, but this labor-intensive approach requires programmers with the right skills. Also, programs written in COBOL and Assembler are aging, and the people who know how to work them are now retiring. Some CIOs running these applications report that no one on staff knows how to code them;
  • using ETL (extract, transform and load) technology to create external data warehouses or data marts that can then be queried with business intelligence tools;
  • turning to middleware vendors who develop interfaces to mediate between systems. This EAI approach, which is also referred to as enterprise information integration, can help keep data consistent across platforms. Though expensive and complex to implement, application interfaces can also be valuable for any transactional systems that require users to both read and write to the legacy system. Most middleware vendors and their customers are adopting the SOA concept and creating Web services that use XML-based open standards to enable communication between existing applications; and
  • transferring applications from mainframes or minicomputers to newer systems.

Extracting Value
Legacy systems can usually do what they were designed to do; the challenge is opening them up to constituents via the Web or allowing staff members to do analysis. "Internal staff may ask questions such as, 'How many taxpayers in our jurisdiction paid more than $20,000 in taxes last year?'" said Grant Brodie, president and chief architect of Arbutus Software, a Canada-based company specializing in legacy data access solutions. "With a traditional legacy system, there's no way to answer such what-if type questions."

Traditionally a city government would have a programmer write a COBOL program to extract that information. Very often, however, once personnel get answers to their initial questions, they ask follow-up inquiries. "Then another program would have to be written to answer that," Brodie explained. "COBOL is not very flexible."

Programmed special-purpose solutions are low-cost, but they are best suited for nonrecurring needs rather than ongoing ones.

To allow for more robust analysis, many organizations use ETL technology to create large data warehouses that can combine data from multiple internal sources. One benefit is that it balances the processing load by off-loading the data analysis to a separate server. Another is that once the ETL process is established, you can conduct as many different types of queries as needed.

Yet despite their advantages, data warehouses usually involve high front-end and maintenance costs, and the data soon grows out of sync with what's happening on the mainframe, so they're not well suited for real-time analysis.
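The extract-transform-load flow described above can be sketched in a few lines of Python using the standard library's sqlite3 module as a stand-in warehouse. The fixed-width record layout, table name and sample figures are all invented for illustration; a real legacy export would have its own format.

```python
import sqlite3

# Hypothetical fixed-width export from a legacy tax system:
# columns 0-5 taxpayer ID, 6-9 tax year, 10-18 tax paid in cents (zero-padded).
LEGACY_EXPORT = [
    "T000012006002500000",
    "T000022006000480000",
    "T000032006001999999",
    "T000042006003100050",
]


def extract(lines):
    """Slice each fixed-width line into raw string fields."""
    for line in lines:
        yield line[0:6], line[6:10], line[10:19]


def transform(rows):
    """Convert raw strings into typed values."""
    for taxpayer_id, year, cents in rows:
        yield taxpayer_id, int(year), int(cents)


def load(conn, rows):
    """Write the cleaned rows into a queryable table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tax_paid "
        "(taxpayer_id TEXT, tax_year INTEGER, amount_cents INTEGER)"
    )
    conn.executemany("INSERT INTO tax_paid VALUES (?, ?, ?)", rows)
    conn.commit()


def count_paid_more_than(conn, year, dollars):
    """Ad hoc question: how many taxpayers paid more than `dollars` in `year`?"""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM tax_paid WHERE tax_year = ? AND amount_cents > ?",
        (year, dollars * 100),
    ).fetchone()
    return count


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(LEGACY_EXPORT)))
big_payers = count_paid_more_than(conn, 2006, 20_000)
```

Once the load step has run, a follow-up question is only a change of parameters or a new SQL statement rather than a new extraction program. The trade-off noted above still applies: the table reflects the source only as of the last load.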

The U.S. Food and Drug Administration (FDA) turned to ETL technology and a data warehouse to let human resources staff members integrate data from disparate systems and analyze trends using Business Objects' business intelligence software.

The FDA, which is a U.S. Department of Health and Human Services (HHS) agency, kept records on its approximately 9,000 civilian employees in a system that was custom-developed for the HHS, while data on uniformed personnel in its U.S. Public Health Service Commissioned Corps resided in a different system. Time and attendance records were in payroll software, and records on contractors were in another system.

"Pulling all that together and getting reports out was a frustrating, cumbersome process," remembered Ray Russo, director of the Office of Business Enterprise Solutions in the FDA's Office of the Chief Information Officer.

Russo's office created a data mart that pulls together data from all those sources. "This allows us to create historical views, as well as a snapshot for each pay period," Russo explained. "Now we can ask questions such as which employees are going to be eligible for retirement during certain time horizons. We can study attrition rates over certain time frames. There's no personnel-type question that we can't answer."

If the FDA's IT analysts still have concerns about legacy systems, Russo said, those concerns have nothing to do with combining data and doing sophisticated analysis.

Otto Doll said South Dakota is moving away from the data warehouse approach to more just-in-time information. "We'd rather have more tightly integrated applications where we're dealing with the real thing."

Doll cites the federal government's sex offender databases as a model. The federal government could have created a huge data warehouse, pulled down information from all 50 states and refreshed it regularly, Doll said. "Instead they chose to ping all 50 state databases whenever there's a request."
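The "ping every source at request time" pattern Doll describes can be sketched as a fan-out query. In the Python below, in-memory lists stand in for the remote state systems, and `query_state` is a hypothetical placeholder for what would be a live network call; none of the names or records come from the actual federal system.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for remote state registries (made-up data).
STATE_REGISTRIES = {
    "SD": [{"name": "J. Doe", "state": "SD"}],
    "IL": [{"name": "J. Doe", "state": "IL"}, {"name": "R. Roe", "state": "IL"}],
    "MN": [],
}


def query_state(state, name):
    """Placeholder for a live call to one state's system."""
    return [r for r in STATE_REGISTRIES[state] if r["name"] == name]


def federated_search(name, states):
    """Ask every source at request time and merge the answers."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        # map() preserves the order of `states`, so results are deterministic.
        per_state = pool.map(lambda s: query_state(s, name), states)
        return [record for records in per_state for record in records]


states = sorted(STATE_REGISTRIES)
matches = federated_search("J. Doe", states)

# A change at the source is visible on the very next request, with no reload.
STATE_REGISTRIES["MN"].append({"name": "J. Doe", "state": "MN"})
matches_after_update = federated_search("J. Doe", states)
```

The design choice is freshness over speed: nothing is copied, so nothing goes stale, but every request depends on all of the sources being reachable when it is made.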

If creating a data warehouse is time-consuming and expensive, rewriting an application or porting it to a newer platform can be even more troublesome. If you have unlimited resources, Arbutus' Brodie said, rewriting the application is a viable option. Yet most legacy systems still work well. "Replacing it for that reason," he added, "would be like killing a fly with a sledgehammer."

Eliminating the legacy system is a more drastic approach for CIOs to contemplate, admitted Federico Zoufaly, executive vice president of business development for ArtinSoft, a Costa Rica-based company that automates the translation of legacy code into more modern languages, such as Java. "It depends on your long-term strategy," he said. "If you're looking to gain flexibility, you may want to move to a newer architecture and off-load your mainframe applications gradually."

But Brodie said switching from a COBOL program to something like an Oracle database is a daunting task. "It's horribly expensive and problematic," he said. "There's no automatic button to push to do the conversion, so you're counting on people, and there's always the opportunity for errors to creep in. You don't want to do that unless you absolutely have to."

Integrating Justice Data in Illinois
Most people probably assume that if a burglary is committed in a metropolitan area, and the next day a similar burglary happens in the adjacent town, the police investigating the first burglary are aware of the second one. But more likely they aren't, said Kirk Lonbom, assistant deputy director of the Information and Technology Command of the Illinois State Police (ISP) in Springfield.

"Each jurisdiction has its own computer system and its own incident reporting software," he said, "and they don't talk to each other."

To break down those barriers, the ISP is creating the Illinois Citizen and Law Enforcement Analysis and Reporting System (ICLEAR), a common data warehouse and standardized police incident reporting system that the state's 400 local police agencies and 40,000 officers can use to share information.

Based on a system used by the Chicago Police Department, ICLEAR is designed to help officers identify trends and allocate resources to drive down crime.

The ISP has its own legacy system issues to deal with, but Lonbom's problem is bigger than getting the 40 ISP applications to work together. "The real legacy data is not under our roof, but under 400 local agency roofs," he said. "Our challenge is to allow those local agencies to continue to use their systems but share information by using a common data format: the Global Justice XML Data Model. Criminal justice systems are really coming together with specifications for data exchanges."
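To make the common-format idea concrete, the Python sketch below maps two agencies' differently shaped incident records into one shared XML layout. The element names (`Incident`, `IncidentID`, `OffenseType`, `IncidentDate`) and both local record shapes are invented for illustration and are not taken from the Global Justice XML Data Model, which defines its own vocabulary.

```python
import xml.etree.ElementTree as ET

# Two agencies recording the same kind of event in different local shapes
# (both hypothetical).
AGENCY_A = {"case_no": "A-1001", "offense": "BURGLARY", "occurred": "2007-06-01"}
AGENCY_B = {"id": "B-77", "type_code": "BURG", "date": "06/02/2007"}

# Agency B's local offense codes, translated to the shared vocabulary.
OFFENSE_CODES_B = {"BURG": "BURGLARY"}


def from_agency_a(rec):
    """Agency A already uses ISO dates and spelled-out offenses."""
    return {
        "IncidentID": rec["case_no"],
        "OffenseType": rec["offense"],
        "IncidentDate": rec["occurred"],
    }


def from_agency_b(rec):
    """Agency B needs its code expanded and its MM/DD/YYYY date normalized."""
    month, day, year = rec["date"].split("/")
    return {
        "IncidentID": rec["id"],
        "OffenseType": OFFENSE_CODES_B[rec["type_code"]],
        "IncidentDate": f"{year}-{month}-{day}",
    }


def to_common_xml(fields):
    """Serialize the shared fields as one XML element per field."""
    root = ET.Element("Incident")
    for tag, value in fields.items():
        ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")


xml_a = to_common_xml(from_agency_a(AGENCY_A))
xml_b = to_common_xml(from_agency_b(AGENCY_B))
```

Each agency keeps its own system and supplies only a mapping function, so an investigator comparing `xml_a` and `xml_b` sees two burglaries a day apart in the same terms.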

This year the ISP is working with iWay Software, a subsidiary of Information Builders Inc., to create specifications for the transfer of data between the ISP and the Chicago Police Department.

With funding from the Department of Homeland Security, the ISP will roll out an incident reporting system pilot this year. The ISP plans to provide the system free to jurisdictions in the state, with the hope of eventual statewide adoption. "The pilot projects we're starting with are small," Lonbom said, "but their potential is not."

600 Billion Lines of Legacy
Yet most IT leaders eager to replace their legacy systems admit that the process is expensive and will take years to accomplish. In a 2006 white paper, EDS estimated that mainframes execute 75 percent of all business logic at the enterprise level, and that there are approximately 600 billion lines of legacy code still in use.

While CIOs are waiting to replace their legacy systems, many realize that they have a responsibility to pursue greater flexibility through integration strategies because they are held to a higher standard of service than ever before. For instance, it's no longer deemed acceptable for a social services caseworker to scribble down information about a caller, access several systems and then call the person back.

Instead, that caseworker is expected to get a snapshot of the caller from an application that instantly queries all state systems. And whatever integration strategy is pursued, the CIO's job is to deliver that connectivity.

South Dakota's Doll is regularly reminded of the impact of EAI on his agency's clients. "When we meet to discuss something complex they need to accomplish," he said, "it's much less likely that the technology is going to get in the way."


David Raths is a Philadelphia-based writer focused on information technology. 
