Finding Connections: Data Mining and the War on Drugs

A Florida police department develops a new automated investigative network to fight drug trafficking.

by David Aden / January 31, 1998
A kid's game, sometimes called "Concentration," is a test of a player's memory. With a set of cards dealt face-down on a table, players take turns picking up two cards, hoping to find they are of the same value. If they don't match, the cards are returned to the table face-down. The challenge is to remember the location and value of the cards you or the other players pick up so in subsequent turns you're able to match them up. The player with the most pairs wins.

Without proper investigation tools, police and law enforcement agencies may find themselves unwillingly playing their own game of "Concentration" as they wade through mountains of information available online. This is especially true when investigating possible narcotics offenders. By its very nature, drug trafficking involves some organization -- transportation, distribution and sales "channels" -- all of which try to keep themselves hidden. Investigating such organizations may require correlating apparently unrelated information and finding submerged links between people and organizations. The growing amount of online data could make this work easier if access to it were automated.

St. Petersburg
Florida's St. Petersburg Police Dept. had more online data than could be known or effectively used by detectives in the course of doing routine investigations.

"We had 10 years of data in a fairly sophisticated database, and we could extract fields and look up data," said Leonard Leedy, a vice and narcotics detective with the St. Petersburg Police Dept. "We have also been sharing data with the Pinellas County Sheriff's office for 4 to 5 years. What we've been unable to do is extract information from the narratives of our reports."

Under the department's system, officers enter basic information about an incident, such as the names of involved parties and the time of the incident, into an online form. What officers do not type themselves are the narrative descriptions of the incident. Instead, they dictate the narratives, which are later transcribed and electronically associated with the basic data the officers entered. This method saves officers time and has proven a much more cost-efficient division of work. Although the narrative data has technically always been available online, the existing software couldn't search the narratives, let alone do sophisticated data mining to help identify links.

Development of a system to access narrative data began a couple of years ago when representatives of the federal Counterdrug Technology Assessment Center (CTAC) approached the department. CTAC falls under the Office of National Drug Control Policy and is the "central counterdrug enforcement research and development organization of the U.S. government."

"CTAC ... said they were interested in a project involving sharing of data and technology," said Leedy. "They sat 30 narcotics detectives down and asked them, 'If you wanted to have a computer that could do anything, what would it be?'"

Following the initial meetings, the University of Tennessee got involved to do the application development. Rather than suggesting theoretical solutions from afar, the university took the time to really find out what was needed.

"The university sent people down and they rode along with us to get a feeling for what life was really like," commented Leedy. "They came down and listened to us and the applications were developed based on the voiced needs of the working detectives"

This development style is well known to CTAC's chief scientist and director, Albert E. Brandenstein. "I did a lot of command and control development with ARPA [now DARPA, the Defense Advanced Research Projects Agency] and other places," said Brandenstein. "Those kinds of projects are measured by hands-on success from the very beginning. You can't go away and develop for three years -- it's a matter of how the users like data presented to them."

What came out of this design phase was a plan for several applications, collectively known as the West Florida Counterdrug Investigative Network (WFCIN). According to Brandenstein, the system included the first use of an Asynchronous Transfer Mode (ATM) network for state or local law enforcement.

The first application to be implemented gives officers the ability to data mine through information contained in the online narrative reports.

This is done using a Web interface that talks through a backend process to the existing database system. Both the interface and the backend were developed by the university. For example, if a detective receives a report of an incident in which a drug dealer used a specific type of gun, he can enter the gun type and get back a standard HTML page with links to narrative reports in which that type of gun is mentioned.
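The gun-type lookup described above can be pictured with a small keyword index. The following is only a minimal sketch, not the university's software: the report numbers, narrative text, `/reports/` URL path and all function names are hypothetical, and it assumes a search simply returns reports whose narratives contain every word typed.

```python
import re
from collections import defaultdict
from html import escape

# Hypothetical sample data: report number -> transcribed narrative text.
NARRATIVES = {
    "98-0001": "Suspect dropped a 9mm pistol while fleeing the parking lot.",
    "98-0002": "Officers recovered a shotgun from the trunk of the vehicle.",
    "98-0003": "Witness saw a 9mm pistol on the passenger seat.",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(narratives):
    """Map each word to the set of report numbers whose narrative contains it."""
    index = defaultdict(set)
    for report_id, text in narratives.items():
        for word in tokenize(text):
            index[word].add(report_id)
    return index

def search(index, query):
    """Return sorted report numbers whose narratives contain every query word."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)

def render_results(query, report_ids):
    """Build a plain HTML page of links to the matching reports (assumed URL scheme)."""
    items = "\n".join(
        f'<li><a href="/reports/{escape(r)}">{escape(r)}</a></li>'
        for r in report_ids
    )
    return (
        f"<html><body><h1>Results for {escape(query)}</h1>\n"
        f"<ul>\n{items}\n</ul></body></html>"
    )

INDEX = build_index(NARRATIVES)
```

With this sample data, `search(INDEX, "9mm pistol")` returns the two reports that mention that weapon, and `render_results` turns the list into a page of links a detective could follow.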

The application runs on a Sun workstation, chosen at least in part because it leaves a lot of freedom to scale the application: downward for other agencies with less online data and upward as a department's data store grows. It has also proven very fast in doing searches. Currently, the application only accesses St. Petersburg Police Dept. databases, but future phases will extend the search to other jurisdictions so data can be shared between agencies.

The second application to be implemented stores images -- surveillance photographs, evidence photographs or scanned newspaper articles.

"It stores the images and links them to the case," said Leedy. "The University of Tennessee wrote a very neat package we use to scan our photographs using standard software -- we have an HP 4C scanner and digital cameras and use the shareware Paintshop Pro (by JASC Software) to store the images on the hard drive. Using the WFCIN application, you go into the image import program, which grabs the image. The application automatically creates the thumbnails, scales them and stores them linked to the case.

"Say you have 100 images on a case, and one is a photograph of a gun under a bed. The user can go into the comments section and type in 'gun under bed.' The application links the comments to the photo and stores that information. When you are done, the comments become part of the data mining source file; so now, if you type in 'gun under bed,' it will not only search the narratives, but also the photos associated with it," said Leedy.

Searches return links to both the text-based narratives and the images. The network's capacity for carrying images extends to real-time audio and video for teleconferencing -- or for sharing video monitoring tapes with law enforcement officers in other jurisdictions -- although these applications have yet to be implemented.
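One way to picture how a typed comment makes a photograph searchable is to treat the comment as just another piece of text attached to the case. This is a rough sketch under that assumption only; the `Document` record, the case and file names, and the matching rule (every query word must appear) are hypothetical and are not taken from the WFCIN software.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Document:
    case_id: str
    kind: str   # "narrative" or "image"
    ref: str    # report number or image file name (hypothetical)
    text: str   # narrative text, or the comment typed for an image

def words(text):
    """Return the set of lowercase alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def search(documents, query):
    """Return documents whose text contains every word of the query."""
    wanted = words(query)
    if not wanted:
        return []
    return [d for d in documents if wanted <= words(d.text)]

# Hypothetical case file: one narrative and two commented photographs.
DOCS = [
    Document("C-100", "narrative", "report-1",
             "Detectives found a gun under the bed in the rear bedroom."),
    Document("C-100", "image", "photo-017.jpg", "gun under bed"),
    Document("C-100", "image", "photo-018.jpg", "front door of residence"),
]

hits = search(DOCS, "gun under bed")
```

Here `hits` holds both the narrative and `photo-017.jpg`, because narratives and image comments sit in the same searchable collection.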

Another WFCIN application, which is still about 3 to 6 months from general use, will provide "link analysis." Link analysis is a tool that graphically displays connections or "links" between individuals, groups and organizations.

"This has been done in the Dept. of Defense (DOD) for years," noted Brandenstein, "but our goal was to scale the applications to the users and to make the technology affordable. Link analysis is commonly done on Crays in the DOD -- now it is being done on Sun workstations."

The link analysis application starts with a circle in the center -- this could represent an individual or an organization -- and then draws lines to circles representing other groups or individuals to whom the central person is connected. Those circles, in turn, will have lines drawn to their connections. This kind of functionality can help locate associations that otherwise would have gone unnoticed and that may be key to developing a case or directing an investigation.
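The circle-and-line picture described above is, in data terms, a graph walked outward from a central node. The sketch below is an illustration only, assuming links are stored as simple pairs of names; the names, the `max_depth` parameter and the functions are hypothetical, and the real application draws its charts graphically rather than returning lists.

```python
from collections import defaultdict, deque

def build_graph(links):
    """Turn (a, b) pairs into an undirected adjacency map."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def link_rings(graph, center, max_depth=2):
    """Walk outward breadth-first from the center.

    Returns {depth: sorted names}, where depth 1 holds direct connections,
    depth 2 holds their connections, and so on up to max_depth.
    """
    depth_of = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if depth_of[node] == max_depth:
            continue  # do not expand beyond the outermost ring
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in depth_of:
                depth_of[neighbor] = depth_of[node] + 1
                queue.append(neighbor)
    rings = defaultdict(list)
    for name, depth in depth_of.items():
        rings[depth].append(name)
    return {d: sorted(names) for d, names in sorted(rings.items())}

# Hypothetical links between individuals and an organization.
LINKS = [
    ("Person A", "Person B"),
    ("Person A", "Company X"),
    ("Person B", "Person C"),
    ("Company X", "Person C"),
    ("Person C", "Person D"),
]
GRAPH = build_graph(LINKS)
```

Calling `link_rings(GRAPH, "Person A")` places Person B and Company X in the first ring and Person C in the second. Person C is reachable through two different first-ring contacts, which is the kind of association that might otherwise go unnoticed.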

The Complete Package
CTAC's part in the WFCIN project is to help develop these applications and then provide a means for exporting and customizing them for other jurisdictions.

"This year, Congress has appropriated a technology transfer pilot program to take some of this technology and widen the audience," said Brandenstein. "We were also directed to come up with area experts to help decide on where this goes and how it is to work."

To accomplish this, CTAC planned a meeting early this year to bring together experts from around the country to review the technology and help advise on the direction for future research.

The data mining applications being developed and used by St. Petersburg Police Dept. are still in the beginning stages, but they show the promise of things to come. A complete, integrated package that allows investigators to search across jurisdictions for common characteristics and build link analysis charts to help identify the key culprits and their associates will help bring crime investigation and prosecution into the 21st century. Such tools should help take police out of the business of playing "Concentration" with online data and put them more solidly in the business of directing and completing investigations.


David Aden
David Aden is a writer from Washington, D.C.