October 13, 2003 By Merrill Douglas
"It's learned what thousands of spam messages look like and what thousands of good messages look like," Miller said, explaining that the filter employs statistical analysis to differentiate between messages when presented with new content.
"Heuristics is nice because it doesn't take any real maintenance for us," Jordan said, because no one needs to build and maintain a database. "We turn it on, and we measure the results. It has five levels of scrutiny you can set, depending on what your threshold of pain is relative to the potential for false positives."
Strict scrutiny blocks more spam, but may snag legitimate mail that has spam characteristics -- especially electronic newsletters. Looser scrutiny leads to fewer false positives, but also lets more junk through. Arlington County sets its heuristics filter at a moderate level -- two or three on a scale of one to five. "That lets a little bit of spam in, but that's a little bit we can live with," Jordan said.
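To make the trade-off concrete, here is a minimal sketch of a rule-based scorer with five scrutiny levels. It is not Symantec's implementation: the rules, the point values and the threshold table are all hypothetical, and the assumption that level five is the strictest setting is ours, since the article does not say which end of the scale is which.

```python
# Hypothetical rules: each one adds points when a message looks spam-like.
RULES = [
    (lambda m: m["subject"].isupper(), 2),              # subject in all caps
    (lambda m: "free" in m["body"].lower(), 1),         # mentions "free"
    (lambda m: "unsubscribe" in m["body"].lower(), 1),  # bulk-mail footer
    (lambda m: m["body"].count("!") >= 3, 1),           # many exclamation marks
]

# Hypothetical mapping of scrutiny level to the score that triggers a block.
# Assumption: level 1 is the loosest setting and level 5 the strictest.
THRESHOLDS = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}


def heuristic_score(message):
    """Add up the points of every rule the message triggers."""
    return sum(points for rule, points in RULES if rule(message))


def is_blocked(message, level):
    """Block the message when its score reaches the level's threshold."""
    return heuristic_score(message) >= THRESHOLDS[level]


# A legitimate newsletter that happens to share two traits with spam.
newsletter = {
    "subject": "County Parks Newsletter",
    "body": "Free concerts this month. Click unsubscribe to opt out.",
}

for level in (3, 4):
    print(level, is_blocked(newsletter, level))
```

In this sketch the newsletter passes at level three but is caught at level four, which is the kind of false positive a stricter setting invites.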
A third layer provides filtering via real-time black lists. About 125 third-party services maintain lists of e-mail addresses or domains that are known spam sources, Miller said. Most charge little or no money for a subscription. Once a user signs up for one or more services, Symantec's software blocks messages from senders on those lists.
Arlington County uses one of those services now and might try another to see which gives the best performance, Jordan said.
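The black-list check can be pictured as a simple lookup. The sketch below assumes each subscribed list is available locally as a set of addresses and domains; the list name and its entries are invented for illustration, and a real service would supply and update the data itself.

```python
def sender_domain(address):
    """Return the lowercase domain part of an e-mail address."""
    return address.rsplit("@", 1)[-1].lower()


def is_blacklisted(address, blacklists):
    """True if the address or its domain appears on any subscribed list."""
    address = address.lower()
    domain = sender_domain(address)
    return any(
        address in entries or domain in entries
        for entries in blacklists.values()
    )


# Hypothetical list contents, keyed by a made-up service name.
blacklists = {
    "example-list-a": {"bulkmail.example", "offers@promo.example"},
}

print(is_blacklisted("Deals@BulkMail.example", blacklists))
print(is_blacklisted("clerk@county.example", blacklists))
```

Keying the lists by service name would make it straightforward to subscribe to a second service and compare how many messages each one catches.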
The fourth filtering layer, the white list, stores the e-mail addresses of people who do business with the county on a regular basis. It makes sure their messages always get through, even if some element of the message raises a red flag in another filtering layer.
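One way to picture how the white list overrides the other layers is the sketch below. The function name, the layer names and the addresses are hypothetical, and the order in which the real product consults its layers is an assumption.

```python
def classify(sender, layer_verdicts, white_list):
    """Return 'deliver' or 'block' for a single message.

    layer_verdicts maps a layer name to True when that layer flagged
    the message as spam.
    """
    # Assumption: the white list is consulted first and always wins.
    if sender.lower() in white_list:
        return "deliver"
    return "block" if any(layer_verdicts.values()) else "deliver"


# Hypothetical white list of regular business contacts.
white_list = {"vendor@supplier.example"}

# The heuristics layer flagged this message, but the sender is trusted.
verdicts = {"heuristics": True, "black_list": False}

print(classify("Vendor@Supplier.example", verdicts, white_list))
print(classify("stranger@unknown.example", verdicts, white_list))
```

Here the trusted sender's message is delivered despite the red flag, while the same verdicts from an unknown sender result in a block.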
Fewer False Positives
Maintaining a white list and fine-tuning the heuristic screening level are two ways the county makes sure the anti-spam filter works without causing unnecessary grief. The potential for false positives is the biggest drawback to a spam-filtering program, Jordan said. No one wants to set off a barrage of complaints from citizens, department heads or county board members whose messages aren't getting through.
That makes education an important part of any anti-spam program. Users need to understand that false positives will sometimes occur, and that administrators are working to keep them to a minimum, Jordan said. They also must learn how to avoid problems in their e-mails.
"You can't address an e-mail anymore that says, 'Hey you, what's happening?'" Jordan said, because that sounds like a spam subject line and could get caught in a filter. Employees need to take a more formal approach to business communications, he said. "If you do that, your chances of being filtered are negligible."
Implementing anti-spam software effectively won't be painless, Jordan cautioned. It requires buy-in at the top, mass communications at the bottom and commitment in the middle, where people understand that introducing new technology can be disruptive.
Symantec and other vendors could help the education process by providing automated reporting tools in their anti-spam software, Jordan said. Reports that clearly show how well the software performs encourage government executives to continue funding these technology initiatives.
Jordan said he creates such reports, but must construct them from data drawn from activity logs.
Though the county's campaign against spam has delivered benefits, Jordan said he hasn't calculated the value of each message blocked. "We think of it in larger terms," he said. "When we're handling so many billion bytes of data through the network, and 20 percent of it is trash, that's significant. That's like running your water at night when you go to sleep. It's a waste of resources. In local government, we don't have resources to waste."