Improved scanners and recognition software increase the accuracy and speed of forms processing.



PROBLEM/SITUATION: Extracting data from paper forms, such as tax returns and licenses, is a labor-intensive operation.

SOLUTION: Optical forms processing.

JURISDICTION: Maine Department of Inland Fisheries and Wildlife, New Mexico.

VENDORS: Wheb Systems Inc., Harvey Spencer Associates, Microsoft.

CONTACT: Danny Morris, Maine Dept. of Inland Fisheries and Wildlife, 207/287-5241.


By Tod Newcombe

Contributing Editor

In government, forms processing has always meant one of two things. Either an agency hired a small army of human operators to key in data from paper forms, or the forms -- and the information they contained -- were simply filed away, to become a "public record" that nobody could reach or use.

Now faced with business-like pressures to cut operating costs and make better use of the information collected, state and local governments are beginning to deploy forms processing technology as a means to extract data from forms faster, more effectively and at lower cost.

What makes forms processing a viable alternative to manual data entry is the steady improvement in software that can accurately read machine-typed or hand-printed letters and numbers on a sheet of paper. "Recognition technology has improved substantially to a stage where you can partially automate data entry," said Harvey Spencer, president of Harvey Spencer Associates, a consulting firm.

Spencer is quick to emphasize "partial" automation, because recognition technology -- such as optical character recognition -- while much improved, is not perfect. As a result, data entry operators will not disappear from the scene. However, they are more likely to be editing data captured by the forms processing system, rather than entering it themselves, according to Bill Reh, vice president of sales for Wheb Systems Inc., a software firm that specializes in forms processing.

Reh said the market for forms processing is growing rapidly ($1 billion by 1998), in part because document imaging has become much more acceptable as an office automation tool. The widespread use of the Windows graphical interface has also made it easier for forms processing to expand.


Forms processing is a subset of imaging, workflow and document management. But unlike imaging, which retains the document image for storage and retrieval, forms processing retains only the data on the document image. For that to happen successfully, a form must pass through several phases.

First, the paper form must be well designed before it can be processed. Three design choices can significantly improve the accuracy of the final result: using clearly delineated boxes to boost the legibility of characters, specifying date fields as MM/DD/YY to ensure consistent responses, and printing the form in non-reproducible colors of ink so that only the data appears after scanning.
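The payoff of a fixed date layout shows up downstream, where software can check a captured value mechanically. The following is a minimal Python sketch, not any vendor's method. The function name `parse_form_date` is hypothetical, and it assumes the recognition step hands back the field as a plain text string.

```python
from datetime import date, datetime
from typing import Optional


def parse_form_date(text: str) -> Optional[date]:
    """Return the date captured from an MM/DD/YY field, or None if it does not fit.

    Hypothetical helper: assumes the field arrives as plain text.
    """
    try:
        # %m/%d/%y matches month/day/two-digit year, the layout the form asks for.
        return datetime.strptime(text.strip(), "%m/%d/%y").date()
    except ValueError:
        # Anything else (wrong order, spelled-out month, stray characters)
        # is rejected so it can be sent to an operator for review.
        return None
```

A value such as "07/04/96" parses cleanly, while "July 4, 1996" or "13/01/96" is rejected and can be flagged for a human to check.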

Second, the form must go through the scanning and imaging process. A good system will have a scanner that can handle the volume of forms processed on a daily basis. The imaging software should be able to de-skew (straighten the image), register and clean up a form image before character recognition takes place. The ability to process multiple formats and to distinguish between different forms is also a plus.
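To give a feel for what de-skewing involves, here is a small Python sketch of the underlying geometry. It rests on an assumption not stated in the article: that the form carries two registration marks printed at the same height, and that their pixel positions in the scanned image are already known. The function names are hypothetical, and a real imaging package would rotate the entire image rather than individual points.

```python
import math


def skew_angle_degrees(left_mark, right_mark):
    """Angle of the line through two marks that should sit on one horizontal line."""
    dx = right_mark[0] - left_mark[0]
    dy = right_mark[1] - left_mark[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point(point, angle_degrees, origin=(0.0, 0.0)):
    """Rotate a point about an origin by the given angle (standard 2-D rotation)."""
    theta = math.radians(angle_degrees)
    x = point[0] - origin[0]
    y = point[1] - origin[1]
    return (
        origin[0] + x * math.cos(theta) - y * math.sin(theta),
        origin[1] + x * math.sin(theta) + y * math.cos(theta),
    )


# Hypothetical scan: the right-hand mark sits 14 pixels lower than the left one.
left, right = (100.0, 200.0), (900.0, 214.0)
angle = skew_angle_degrees(left, right)

# Rotating by the opposite angle brings the right mark level with the left one.
corrected_right = rotate_point(right, -angle, origin=left)
```

Once the angle is known, the same correction applies to every field location on the page, which is why straightening comes before character recognition.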

Third, the data on the form must be extracted using recognition software. According to Reh, a good forms processing system will have several recognition engines, if necessary, to do the job. Optical character recognition can handle the many different typefaces of machine-printed characters. Intelligent character recognition software reads hand-printed characters, and optical mark recognition is designed to handle check boxes on a form.
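The division of labor among the three engines can be pictured as a simple lookup. The Python sketch below is illustrative only: the field kinds, field names and function names are hypothetical, and the engines appear as labels rather than real recognition software.

```python
# Which kind of recognition suits which kind of field, as described above.
ENGINE_FOR_FIELD = {
    "machine_print": "OCR",  # optical character recognition
    "hand_print": "ICR",     # intelligent character recognition
    "check_box": "OMR",      # optical mark recognition
}


def choose_engine(field_kind: str) -> str:
    """Return the engine label for a field kind, or raise if none applies."""
    try:
        return ENGINE_FOR_FIELD[field_kind]
    except KeyError:
        raise ValueError(
            f"no recognition engine for field kind {field_kind!r}"
        ) from None


def plan_recognition(fields: dict) -> dict:
    """Map each field name on a form to the engine that should read it."""
    return {name: choose_engine(kind) for name, kind in fields.items()}


# Hypothetical license form with one field of each kind.
license_form = {
    "license_number": "machine_print",
    "applicant_name": "hand_print",
    "resident": "check_box",
}
plan = plan_recognition(license_form)
```

In this sketch a form definition lists each field once, and the system decides per field which engine to call, so a single form can mix typed, hand-printed and check-box data.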

Fourth, the data from the form must be verified. Because forms processing requires 100 percent accuracy, human intervention is unavoidable at this stage. However, forms processing systems