For 50 years, the United Nations has been custodian of the world's treaties. Now it has embarked on an ambitious project to convert every treaty to electronic form, making the documents available to Internet users and others without risk to the originals.
Ron Van Note, information systems consultant at U.N. headquarters, is overseeing the project. "After much reflection and many discussions, we decided to go with imaging and optical storage as the best way to convert each page of every document. OCR (optical character recognition) input was briefly considered as an alternative methodology to imaging, but was quickly dismissed, primarily based on the need for complete document protection," said Van Note. "OCR, at best, is considered to be only 90-95 percent accurate. It would cost millions to ensure the necessary 100 percent accuracy due to the multiple languages used throughout the treaties."
Why Choose the Optical Option?
A treaty never changes. It can be modified by adding to it, but the original is never touched. The permanence of optical media is ideal for this type of archiving. Combined with document imaging, optical storage can turn large volumes of material -- including important pictures or drawings -- into secure, accessible and permanent files. Local and state agencies also receive documents submitted by external sources. These documents are delivered as hard copy and may not lend themselves to traditional storage methods. Data entry is costly and subject to human error. Character recognition software is not perfect, and many documents cannot be converted by this technology. The document imaging and optical storage combination may hold the answer.
Imagine the many storage systems in place in any one municipality, or how documents must be cross-indexed with other agencies. The capacity of an optical jukebox can take the sting out of organizing documents from a variety of storage systems and media.
"Going in, we knew this was a big job and that there were bound to be some hurdles," said Steve Lehrer, product manager at Liberty Information Management Systems (Liberty IMS), the Costa Mesa, Calif.-based software integrator chosen to implement and oversee the treaty conversion project. "The first priority was the protection of these extremely important world documents. The treaties had to be secure and the integrity of each page assured during the document input process. Afterward, when residing in electronic storage and being made available to outside users, document protection and security had to be irrefutable."
Liberty IMS did hit stumbling blocks during the conversion project. For example, when migrating to electronic storage, the United Nations first attempted to scan microfiche copies of each treaty, but the image quality was too poor to guarantee 100 percent accuracy, so it resorted to scanning the original paper pages.
In another example, Van Note said that under the organization's manual library system, the documents were stored by volume in rows of floor-to-ceiling bookcases. Volumes were of varying sizes and covered any number of years and treaties. There was no cumulative cross-volume index to aid in searching the documents. This made locating a specific treaty cumbersome and time-consuming, frustrating employees and document requesters. "It took a lot of work to find a treaty, locate specific information within the document and then make a copy for the requesting party," said Van Note.
The U.N. library's indexing problem was uncovered during the scanning process. Liberty IMS and its partners had to do a lot of new indexing -- each of the more than half a million pages was indexed as it was scanned. Now, each volume is indexed by treaty numbers and addenda; each page is indexed by volume number, treaty number and language used. Placing the treaties online will speed up the search process and give outside users direct access to the documents.
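The two-level index described here can be pictured as a pair of small lookup tables. What follows is a minimal sketch in Python, not Liberty IMS's actual software: the names `PageRecord` and `TreatyIndex`, the `image_id` field and all sample values are hypothetical, chosen only to mirror the index fields the article names (volume number, treaty number, language and addenda).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PageRecord:
    """One scanned page plus the index fields named in the article."""
    image_id: str           # hypothetical identifier of the stored page image
    volume: int
    treaty: int
    language: str
    addendum: bool = False  # True if the page belongs to an addendum


class TreatyIndex:
    """Hypothetical in-memory index built while pages are scanned."""

    def __init__(self):
        self._by_treaty = defaultdict(list)  # treaty number -> pages in scan order
        self._by_volume = defaultdict(set)   # volume number -> treaty numbers

    def add(self, page):
        """Index a page at scan time."""
        self._by_treaty[page.treaty].append(page)
        self._by_volume[page.volume].add(page.treaty)

    def treaties_in_volume(self, volume):
        """Volume-level lookup: which treaties does this volume contain?"""
        return sorted(self._by_volume.get(volume, ()))

    def pages(self, treaty, language=None):
        """Page-level lookup, optionally restricted to one language."""
        found = self._by_treaty.get(treaty, [])
        if language is None:
            return list(found)
        return [p for p in found if p.language == language]


# Made-up sample data, for illustration only.
index = TreatyIndex()
index.add(PageRecord("img-0001", volume=1, treaty=10, language="English"))
index.add(PageRecord("img-0002", volume=1, treaty=10, language="French"))
index.add(PageRecord("img-0003", volume=1, treaty=11, language="English"))
index.add(PageRecord("img-0004", volume=2, treaty=10, language="English", addendum=True))

french_pages = index.pages(10, language="French")
```

Because a treaty can grow by addenda stored in later volumes, the sketch keys pages by treaty number and not by volume, so a single lookup gathers every page of a treaty wherever it is shelved.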
Another important aspect of paper-to-electronic document conversion is the effect of multiple language formats. The United Nations requires treaties to be written in English, French and the languages of the participants. As a result, there are often four or more languages contained within a treaty, but there seem to be no rules about format or the placement of the multiple languages within each document. Lehrer said one page may have the left column in English and the right one in French, while the following page may have this reversed. The next page could have the top of the page in one language and the bottom in another. "This posed a special problem to our software developers," he said.
To deal with the problem, Liberty IMS software engineers created an algorithm to automatically sort and separate each language within a treaty; the software then re-compiles each page as it is scanned in.
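The article does not describe how the Liberty IMS algorithm works, so the following is only an illustrative sketch of the general idea of separating a mixed-language page into per-language streams. It assumes each page has already been split into regions and that each region's language is known; the `Region` class, its fields and the sample data are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A block of one page (a column, or a top/bottom half) in one language."""
    page: int       # page number within the treaty
    order: int      # reading order of the block on its page
    language: str   # assumed to be known already; detection is out of scope
    image_id: str   # hypothetical identifier of the cropped image


def separate_by_language(regions):
    """Group regions into one stream per language, kept in page/reading order."""
    streams = defaultdict(list)
    for region in sorted(regions, key=lambda r: (r.page, r.order)):
        streams[region.language].append(region)
    return dict(streams)


# Page 1 has English on the left and French on the right; page 2 reverses them.
sample = [
    Region(page=2, order=1, language="English", image_id="p2-right"),
    Region(page=1, order=0, language="English", image_id="p1-left"),
    Region(page=1, order=1, language="French", image_id="p1-right"),
    Region(page=2, order=0, language="French", image_id="p2-left"),
]
streams = separate_by_language(sample)
english_ids = [r.image_id for r in streams["English"]]
french_ids = [r.image_id for r in streams["French"]]
```

Whatever layout a given page uses, each language comes out as a single continuous sequence, which is the property a reader searching for one language version of a treaty would need.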
Overcoming the Obstacles
The use of multiple languages, the lack of a comprehensive volume-to-volume index, and the need to absolutely guarantee the security of every document were the major challenges faced by Liberty IMS and its partners. These obstacles were quickly overcome by modifying the standard Liberty software to handle the language separation task and to provide new indexing during scanning. All U.N. treaties were 100 percent protected and secured using a combination of imaging technology and optical jukeboxes equipped with Write Once Read Many (WORM) media.
"Our Network Information Management software is the main package involved in this very important venture," said Lehrer, "and we're responsible for coordinating the overall implementation of the document imaging and optical storage applications task. We are providing the necessary software packages, hardware components and document conversion services to place all of the 600,000 current pages of treaties -- and new pages as fresh treaties are added -- into electronic form. The pages will be kept online for ease of access and are to be fully integrated into the U.N. Treaty Department's existing LAN."
Electronically storing those 600,000 pages perfectly, accurately and securely, in a minimal-size storage environment and with desktop accessibility, is a model for the future. The online storage and almost immediate availability of the world's treaties are now a reality at the United Nations. The system is fully operational, making the documents accessible throughout the U.N. Treaty section.
Phase two is now in progress. It will provide treaty document access throughout the entire organization and outside its walls. Liberty IMS is developing an Internet server to place the treaties on the World Wide Web.
Ron Levine is a freelance writer based in Carpinteria, Calif., specializing in networking, storage devices and emerging technology.