Arlington County Cans Spam

IT staff wages war on electronic junk mail.

October 13, 2003

Arlington County, Va.'s spam problem wasn't obvious, at least at first.

"No one was complaining," said David Jordan, chief information security officer for Arlington County.

After seeing several news reports on the swelling tide of electronic junk mail, however, Jordan decided to explore what harm spam was inflicting on county operations.

"Your security officer wants more spam," declared an electronic newsletter he distributed last fall to 3,500 county employees. Users obliged with a flood of forwarded e-mails hawking everything from mortgages to Russian brides to child pornography. The volume of junk on the network reached artery-clogging proportions.

"About 20 to 25 percent of all message traffic we had been receiving was spam," Jordan said, adding that the percentage isn't as important as the waste it represents. "It doesn't really matter whether it's 2 percent of the message traffic or 30 percent. It's 2 percent we don't need."

So along with protecting Arlington County's computers from viruses and barring traffic from countries that harbor terrorists, Jordan added spam prevention to his to-do list.

He wasn't alone. Corporate and government officials are realizing spam is more than an inconvenience, said Sara Radicati, president and CEO of the Radicati Group in Palo Alto, Calif.

As part of a survey, her research firm recently asked respondents to list major plans for their messaging systems.

"The top priority that came back was reducing spam," she said. "This is the first year that's happened."

Excess Infrastructure
The problem is that spam forces organizations to expand network capacity, Radicati said.

"You're spending more all over on your infrastructure, in terms of more servers, more networking, all kinds of things, to support traffic that's trivial and unnecessary," she said.

Along with inflating overhead costs, spam chews up employee time, said Chris Miller, a group product manager at Symantec Corp. Extra time is spent deleting unwanted messages and retrieving legitimate mail accidentally deleted along with the trash. Some content is offensive, Miller added, and users exposed to pornography at work might sue employers for failing to block that mail.

"You're providing the infrastructure," he said. "You have to maintain it and make sure it's clean."

For these reasons, Jordan and two county network engineers set out to eradicate spam from the network. Arlington County uses Symantec tools for network security, and beta tests the company's new products. To combat spam, Jordan deployed tools from Symantec's AntiVirus for SMTP Gateways suite, which combine spam blocking and virus protection.

Software in this suite scans incoming mail for unwanted content. The goal is to stop spam at the mail server before it invades other servers and desktop systems, Miller explained. Along with anti-virus software, the package provides four layers of spam filtering, using techniques such as subject line filtering, heuristics, black lists and white lists.

Arlington County first tried subject line filtering. Jordan and the engineers used the software to build a database of keywords and phrases that, when found in a subject line, usually indicate junk mail. Since compiling the initial list, they have continued to scrutinize incoming mail to stay ahead of new tricks devised to get around filters.

"We have about 50 different spellings of 'Viagra' filtered," Jordan said. "We have probably close to 4,000 keywords we now use in subject lines."

Subject line filtering requires some labor and commitment, but it's effective against junk messages, he said, noting that once the county put it in place, staff started to filter thousands of spam messages each day.
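
For readers who want to see the idea in concrete terms, subject line filtering of the kind Jordan describes can be sketched in a few lines of Python. This is only an illustration, not Symantec's implementation: the term list, the `normalize` helper and the `is_spam_subject` function are hypothetical names, and the handful of keywords stands in for the roughly 4,000 the county maintains.

```python
import re

# Hypothetical keyword list; includes a few spelling variants of one term,
# the way the county's list carries many spellings of "Viagra".
BLOCKED_SUBJECT_TERMS = {"viagra", "v1agra", "vi@gra", "mortgage"}

def normalize(subject):
    """Lowercase the subject and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", subject.lower()).strip()

def is_spam_subject(subject, terms=BLOCKED_SUBJECT_TERMS):
    """Return True if any blocked term appears anywhere in the subject line."""
    text = normalize(subject)
    return any(term in text for term in terms)
```

Calling `is_spam_subject("Cheap V1AGRA today")` flags the message, while `is_spam_subject("Board meeting agenda")` lets it pass. The sketch also shows why the approach takes labor: every new misspelling has to be added to the list by hand.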

Seeing Spam
Heuristics was not part of Symantec's suite when Arlington County started its war on spam, Jordan said, but when the vendor introduced that filter, he added it to his arsenal. The software in this layer recognizes spam by analyzing mail content. Rather than looking only for certain keywords and symbols, it also considers factors such as frequency and patterns of use.

"It's learned what thousands of spam messages look like and what thousands of good messages look like," Miller said, explaining that the filter employs statistical analysis to differentiate between messages when presented with new content.

"Heuristics is nice because it doesn't take any real maintenance for us," Jordan said, because no one needs to build and maintain a database. "We turn it on, and we measure the results. It has five levels of scrutiny you can set, depending on what your threshold of pain is relative to the potential for false positives."

Strict scrutiny blocks more spam, but may snag legitimate mail that has spam characteristics -- especially electronic newsletters. Looser scrutiny leads to fewer false positives, but also lets more junk through. Arlington County sets its heuristics filter at a moderate level -- two or three on a scale of one to five. "That lets a little bit of spam in, but that's a little bit we can live with," Jordan said.
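
The trade-off between scrutiny level and false positives can be illustrated with a small statistical filter. The sketch below is an assumption-laden stand-in, not the vendor's algorithm: it uses a simple word-frequency (naive Bayes style) score, and the `THRESHOLDS` table, including the choice that level 5 is the strictest, is invented for illustration.

```python
import math
from collections import Counter

def train(messages):
    """Count how often each word appears in a set of example messages."""
    counts = Counter()
    for text in messages:
        counts.update(text.lower().split())
    return counts

def spam_score(text, spam_counts, ham_counts):
    """Return a score between 0 and 1; higher means more spam-like."""
    spam_total = sum(spam_counts.values())
    ham_total = sum(ham_counts.values())
    vocab = len(set(spam_counts) | set(ham_counts))
    log_odds = 0.0
    for tok in text.lower().split():
        # Add-one smoothing so unseen words do not zero out the score.
        p_spam = (spam_counts[tok] + 1) / (spam_total + vocab)
        p_ham = (ham_counts[tok] + 1) / (ham_total + vocab)
        log_odds += math.log(p_spam / p_ham)
    # Clamp to keep math.exp within range on very long messages.
    log_odds = max(min(log_odds, 50.0), -50.0)
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical mapping of scrutiny level to score cutoff.
# Assumed here: level 1 is loosest, level 5 is strictest.
THRESHOLDS = {1: 0.99, 2: 0.95, 3: 0.90, 4: 0.75, 5: 0.60}

def is_spam(text, spam_counts, ham_counts, level=3):
    """Flag a message when its score reaches the cutoff for the chosen level."""
    return spam_score(text, spam_counts, ham_counts) >= THRESHOLDS[level]
```

Trained on a few junk and legitimate examples, the same borderline message can pass at level 1 and be blocked at level 5, which is the dial Jordan describes: a lower cutoff catches more junk but also more newsletters.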

A third layer provides filtering via real-time black lists. About 125 third-party services maintain lists of e-mail addresses or domains that are known spam sources, Miller said. Most charge little or no money for a subscription. Once a user signs up on one or more services, Symantec's software blocks messages from senders on those lists.

Arlington County uses one of those services now and might try another to see which gives the best performance, Jordan said.

The fourth filtering layer, the white list, stores the e-mail addresses of people who do business with the county on a regular basis. It makes sure their messages always get through, even if some element of the message raises a red flag in another filtering layer.
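
How the lists interact with the other layers can be shown with a short sketch. It assumes, as the description above suggests, that the white list is consulted first so trusted senders always get through; the `classify` function, its parameters and the sample addresses are hypothetical, and the order of the remaining checks is a guess rather than documented product behavior.

```python
def classify(sender, subject, whitelist, blacklist, subject_terms):
    """Return "deliver" or "block" for one incoming message."""
    sender = sender.lower()
    domain = sender.rsplit("@", 1)[-1]

    # White list wins: regular business contacts are never filtered.
    if sender in whitelist:
        return "deliver"

    # Black list: known spam sources, listed by address or by domain.
    if sender in blacklist or domain in blacklist:
        return "block"

    # Subject line keywords as a further layer.
    if any(term in subject.lower() for term in subject_terms):
        return "block"

    return "deliver"
```

With `whitelist={"vendor@example.com"}` and `subject_terms={"mortgage"}`, a message from that vendor titled "Mortgage paperwork" is delivered even though the subject would otherwise trip the keyword filter.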

Fewer False Positives
Maintaining a white list and fine-tuning the heuristic screening level are two ways the county makes sure the anti-spam filter works without causing unnecessary grief. The potential for false positives is the biggest drawback to a spam-filtering program, Jordan said. No one wants to set off a barrage of complaints from citizens, department heads or county board members whose messages aren't getting through.

That makes education an important part of any anti-spam program. Users need to understand that false positives will sometimes occur, and that administrators are working to keep them to a minimum, Jordan said. They also must learn how to avoid problems in their e-mails.

"You can't address an e-mail anymore that says, 'Hey you, what's happening?'" Jordan said, because that sounds like a spam subject line and could get caught in a filter. Employees need to take a more formal approach to business communications, he said. "If you do that, your chances of being filtered are negligible."

Implementing anti-spam software effectively won't be painless, Jordan cautioned. It requires buy-in at the top, mass communications at the bottom and commitment in the middle, where people understand introducing new technology can be disruptive.

Symantec and other vendors could help the education process by providing automated reporting tools in their anti-spam software, Jordan said. Reports that clearly show how well the software performs encourage government executives to continue funding these technology initiatives.

Jordan said he creates such reports, but must construct them from data drawn from activity logs.

Though the county's campaign against spam has delivered benefits, Jordan said he hasn't calculated the value of each message blocked. "We think of it in larger terms," he said. "When we're handling so many billion bytes of data through the network, and 20 percent of it is trash, that's significant. That's like running your water at night when you go to sleep. It's a waste of resources. In local government, we don't have resources to waste."
Merrill Douglas, Contributing Writer