Forms Clean Up Their Image

Improved scanners and recognition software increase the accuracy and speed of forms processing.

November 30, 1995


PROBLEM/SITUATION: Extracting data from paper forms, such as tax returns and licenses, is a labor-intensive operation.
SOLUTION: Optical forms processing.
JURISDICTION: Maine Department of Inland Fisheries and Wildlife, New Mexico.
VENDORS: Wheb Systems Inc., Harvey Spencer Associates, Microsoft.
CONTACT: Danny Morris, Maine Dept. of Inland Fisheries and Wildlife, 207/287-5241.


By Tod Newcombe
Contributing Editor

In government, forms processing has always meant one of two things. Either an agency hired a small army of human operators to key in data from paper forms, or the forms -- and the information they contained -- were simply filed away, to become a "public record" that nobody could reach or use.

Now faced with business-like pressures to cut operating costs and make better use of the information collected, state and local governments are beginning to deploy forms processing technology as a means to extract data from forms faster, more effectively and at lower cost.

What makes forms processing a viable alternative to manual data entry is the steady improvement in software that can accurately read machine-typed or hand-printed letters and numbers on a sheet of paper. "Recognition technology has improved substantially to a stage where you can partially automate data entry," said Harvey Spencer, president of Harvey Spencer Associates, a consulting firm.

Spencer is quick to emphasize "partial" automation, because recognition technology -- such as optical character recognition -- while much improved, is not perfect. As a result, data entry operators will not disappear from the scene. However, they are more likely to be editing data captured by the forms processing system, rather than entering it themselves, according to Bill Reh, vice president of sales for Wheb Systems Inc., a software firm that specializes in forms processing.

Reh said the market for forms processing is growing rapidly (to $1 billion by 1998), in part because document imaging has become much more acceptable as an office automation tool. The widespread use of the Windows graphical interface has also made it easier for forms processing to expand.

Forms processing is a subset of imaging, workflow and document management. But unlike imaging, which retains the document image for storage and retrieval, forms processing retains only the data on the document image. In order for that to happen successfully, forms processing must pass through several phases.

First, the paper form must be well-designed before it can be processed. Three design choices can significantly boost the accuracy of the final result: using clearly delineated boxes to improve the legibility of characters, specifying date fields as MM/DD/YY to ensure consistent responses, and printing the form in non-reproducible colors of ink so that only the data appears after scanning.

Second, the form must go through the scanning and imaging process. A good system will have a scanner that can handle the volume of forms processed on a daily basis. The imaging software should be able to de-skew (straighten the image), register and clean up a form image before character recognition takes place. The ability to process multiple formats and to distinguish between different forms is also a plus.

Third, the data on the form must be extracted using recognition software. According to Reh, a good forms processing system will have several recognition engines, if necessary, to do the job. Optical character recognition can handle the many different typefaces of machine-printed characters. Intelligent character recognition software reads handprinted characters, and optical mark recognition is designed to handle check boxes on a form.
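The division of labor among the three engines can be sketched in a few lines of Python. This is only an illustration, not a description of any vendor's product: the field names and the FieldKind labels are hypothetical, and the sketch assumes the form definition already records what kind of marking each field holds.

```python
from enum import Enum


class FieldKind(Enum):
    MACHINE_PRINT = "machine_print"   # typed or printed by a machine
    HAND_PRINT = "hand_print"         # printed by hand in boxes
    CHECK_BOX = "check_box"           # a mark inside a box


# Which recognition engine reads which kind of field, as described in the text.
ENGINE_FOR_KIND = {
    FieldKind.MACHINE_PRINT: "OCR",
    FieldKind.HAND_PRINT: "ICR",
    FieldKind.CHECK_BOX: "OMR",
}


def route_fields(form_fields):
    """Group field names by the engine that should read them."""
    routed = {}
    for name, kind in form_fields.items():
        routed.setdefault(ENGINE_FOR_KIND[kind], []).append(name)
    return routed


# Hypothetical layout for a license form.
license_form = {
    "agent_number": FieldKind.MACHINE_PRINT,
    "holder_name": FieldKind.HAND_PRINT,
    "resident": FieldKind.CHECK_BOX,
    "date_of_birth": FieldKind.HAND_PRINT,
}

print(route_fields(license_form))
```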

Fourth, the data from the form must be verified. Because forms processing requires 100 percent accuracy, human intervention is unavoidable at this stage. However, forms processing systems do have features to simplify and speed up this verification and editing process. In some instances, a system can use a lookup table for employee names (in the case of processing time sheets) or Social Security numbers and addresses (in the case of processing licenses) to verify data it has recognized.
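A lookup-table check of this kind might look like the following Python sketch. It assumes the table is a simple list of known names, and it uses the standard library's difflib to propose a near match for the operator. Both choices are assumptions made for illustration, since the article does not say how any particular system matches values.

```python
import difflib


def check_field(recognized, known_values, cutoff=0.8):
    """Compare a recognized value with a lookup table.

    Returns (value, status), where status is:
      "accepted"  - exact match in the table
      "suggested" - a close match was found; an operator should confirm it
      "review"    - nothing similar; an operator must key or correct it
    """
    value = recognized.strip().upper()
    if value in known_values:
        return value, "accepted"
    close = difflib.get_close_matches(value, known_values, n=1, cutoff=cutoff)
    if close:
        return close[0], "suggested"
    return value, "review"


# Hypothetical lookup table of employee names for time sheets.
employees = ["DOE, JANE", "SMITH, JOHN"]

print(check_field("doe, jane", employees))
print(check_field("SM1TH, JOHN", employees))   # digit 1 misread for letter I
print(check_field("ZZZZ", employees))
```

A suggested value still goes to an operator in this sketch, because a near match in the table is not proof that the form says the same thing.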

Another speedy feature is a technique called ribbon editing, where hard-to-recognize characters are strung along in a linear fashion, allowing an operator to read and edit the data in a more fluid manner. A more widely practiced form of verification is to have the system display the image of the original form on one side of the computer screen while the system's best guess at a questionable character is displayed on the other side. The operator can accept or override the character choice.
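Ribbon editing can be pictured with a short Python sketch. It assumes the recognition engine reports a confidence score between 0 and 1 for each character, which the article does not state, and the threshold, field names and sample data are hypothetical.

```python
def build_ribbon(fields, threshold=0.9):
    """Collect every low-confidence character into one linear strip.

    `fields` maps a field name to a list of (character, confidence) pairs.
    Each ribbon entry remembers where the character came from.
    """
    ribbon = []
    for field_name, chars in fields.items():
        for position, (char, confidence) in enumerate(chars):
            if confidence < threshold:
                ribbon.append((field_name, position, char))
    return ribbon


def apply_corrections(fields, ribbon, corrections):
    """Write the operator's characters back to their original positions."""
    text = {name: [char for char, _ in chars] for name, chars in fields.items()}
    for (field_name, position, _), fixed in zip(ribbon, corrections):
        text[field_name][position] = fixed
    return {name: "".join(chars) for name, chars in text.items()}


# Hypothetical recognition output for two fields on one form.
fields = {
    "last_name": [("S", 0.99), ("M", 0.98), ("1", 0.40), ("T", 0.97), ("H", 0.95)],
    "zip": [("0", 0.99), ("4", 0.99), ("3", 0.50), ("3", 0.99), ("0", 0.99)],
}

ribbon = build_ribbon(fields)
print(ribbon)                                        # two doubtful characters
print(apply_corrections(fields, ribbon, ["I", "3"]))  # operator's answers
```

The operator sees only the two doubtful characters side by side, rather than paging through both fields in full.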

Accuracy is, of course, the bottom line requirement if a forms processing system is going to succeed. Reh said that he's seen acceptance rates -- characters that are recognized correctly -- as high as 96 percent for handprinted characters and 99 percent for machine-printed characters.

However, recognition rates can fall once substitutions are factored in -- characters the system believes it has read correctly but has actually gotten wrong. Yet even with substitution problems, forms processing systems can still deliver. If a system correctly recognizes just 50 percent of the characters, that still means you've halved the number of keystrokes needed to enter data, according to Spencer.
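Spencer's arithmetic is easy to check. The Python sketch below assumes, for simplicity, one keystroke per character that the system fails to recognize, and it ignores the extra effort of finding and fixing substitutions. The batch size of 1,000 characters is a hypothetical figure.

```python
def keystrokes_needed(total_chars, recognized_fraction):
    """Characters an operator must still key after recognition."""
    if not 0.0 <= recognized_fraction <= 1.0:
        raise ValueError("recognized_fraction must be between 0 and 1")
    return round(total_chars * (1.0 - recognized_fraction))


# A hypothetical batch of 1,000 characters at three recognition rates.
for rate in (0.50, 0.96, 0.99):
    print(rate, keystrokes_needed(1000, rate))
```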

As recognition improves, labor costs decline. In medium- to high-volume forms processing applications, where the system's overall accuracy is 95 percent or more, productivity gains of 50 percent to 70 percent are not uncommon. At this level, labor savings can lead to system paybacks within 12 to 18 months.
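A rough payback estimate follows from the same reasoning. The Python sketch below assumes that a productivity gain translates directly into an equal cut in data entry labor costs, which is a simplification, and every dollar figure in it is hypothetical rather than drawn from any agency mentioned here.

```python
def payback_months(system_cost, monthly_labor_cost, productivity_gain):
    """Months until cumulative labor savings equal the system's cost."""
    if not 0.0 < productivity_gain <= 1.0:
        raise ValueError("productivity_gain must be greater than 0 and at most 1")
    monthly_savings = monthly_labor_cost * productivity_gain
    return system_cost / monthly_savings


# Hypothetical figures: a $240,000 system, $32,000 a month in data entry
# labor, and a 50 percent productivity gain.
print(payback_months(240_000, 32_000, 0.5))   # 15.0 months
```

With these made-up inputs the result lands inside the 12- to 18-month range cited above. A smaller operation with the same system cost would take proportionally longer.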

In government, forms processing is expected to have a big impact in the area of licensing and tax processing. Already, Wheb has a tax processing system installed in New Mexico that is expected to cut labor costs in half. In Maine, the Department of Inland Fisheries and Wildlife recently installed a $230,000 forms processing system from Wheb that's expected to significantly improve data gathering and generate some revenue as well.

The department has begun to overhaul and simplify the way it handles the 40 different types of hunting and fishing licenses it sells each year. Agents have always submitted either typed or handprinted license applications to the department. While sales information was entered into an accounting system, information about who bought the license remained on the form, which was filed away.

One major change will be the use of electronic point-of-sale technology to capture fees and data for licenses. Danny Morris, information services manager for the department, figured that the state's biggest licensing agents -- L.L. Bean and others -- will handle roughly 80 percent of the 500,000 licenses issued annually with this new format. The remaining 20 percent will still be issued on paper by hundreds of mom and pop stores located in small towns and villages. It's these paper licenses that will be processed with forms processing technology.

In June of this year, the department installed Wheb's Intelligent Forms Processing System, which includes the processing software, a Ricoh scanner, a Pentium file server and three editing stations. For the next five months, the department tested the system, with changes made primarily to the form to improve recognition rates.

According to Morris, the department expects to see several benefits once the system goes into full production with the 1996 licenses. First, labor costs for existing data entry -- mainly from the monthly reports submitted by agents -- will be significantly reduced. Second, more information on the license holders will be gathered, including some federally mandated demographic information on hunters of migratory birds.

Third, the department expects to generate revenue from the information. "We'll be able to remarket the data and sell mailing lists, which we're allowed to do by the state Legislature," said Morris. "It's all part of the push for more revenue enhancement."