Tuning Spam Detection

Optimizing spam detection requires finding the right balance between false negatives (spam messages incorrectly classified as legitimate mail) and false positives (legitimate messages incorrectly classified as spam). While most customers achieve an acceptable level of spam filtering from the default PureMessage configuration, ongoing tuning can improve the spam catch rate and minimize false positives.

PureMessage includes a large set of anti-spam rules used to identify message characteristics that indicate spam. Each rule has a positive or negative weight. When the policy script executes the "Spam probability" test, the set of anti-spam rules is tested against the message. Each rule that the message triggers adds its weight to the message's score; the combined weights form the total score, which is converted to a percentage.
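
The scoring described above can be sketched in Python. This is only an illustration, not PureMessage's implementation: the rule names and weights are hypothetical, and the logistic conversion from score to percentage is an assumption chosen for the example, since the actual conversion is not described here.

```python
import math

# Hypothetical rules: name -> weight. Positive weights suggest spam,
# negative weights suggest legitimate mail.
RULE_WEIGHTS = {
    "SUBJECT_ALL_CAPS": 1.5,
    "CONTAINS_UNSUBSCRIBE_LINK": 0.5,
    "KNOWN_MAILING_LIST_HEADER": -2.0,
}


def total_score(triggered_rules):
    """Sum the weights of the rules a message triggered."""
    return sum(RULE_WEIGHTS[name] for name in triggered_rules)


def spam_probability(score):
    """Map a total score to a percentage.

    The logistic curve is an assumption for this sketch; it is not
    taken from PureMessage.
    """
    return 100.0 / (1.0 + math.exp(-score))
```

In this sketch a message that triggers no rules scores 0 and maps to 50%; rules with negative weights pull the percentage down, and rules with positive weights push it up.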

Within the PureMessage policy, actions are associated with a message's spam probability. These "thresholds" perform different actions based on the likelihood that the message is spam. For example, the default policy copies messages with a probability of 50% or greater to the quarantine; at the 20% threshold, the policy adds a custom header. Multiple thresholds and associated actions can be defined within your policy.
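
As a rough illustration of how thresholds map a probability to an action, the following Python sketch uses the two default thresholds mentioned above. The action labels, the fall-through "deliver" action, and the choice to apply only the highest matching threshold are assumptions of the sketch, not documented policy behavior.

```python
# Thresholds from the default policy described above, highest first.
# The action names are hypothetical labels for this sketch.
THRESHOLDS = [
    (50.0, "quarantine_copy"),
    (20.0, "add_header"),
]


def action_for(probability):
    """Return the action of the highest threshold the probability meets."""
    for minimum, action in THRESHOLDS:
        if probability >= minimum:
            return action
    # Assumption: messages below every threshold are delivered unchanged.
    return "deliver"
```

For example, `action_for(35.0)` selects the 20% threshold's action, while `action_for(50.0)` selects the 50% threshold's action because the comparison is "or greater".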

Tuning PureMessage can involve changing anti-spam rule weights, adding custom anti-spam rules, modifying policy rule thresholds, and populating lists for use by the policy (for example, whitelists and blacklists). These processes are described below.

Whitelisting

"Whitelists" are lists of senders and/or domains that are known to be legitimate sources of email. By default, email from whitelisted hosts and senders is delivered without being scanned for spam. See "Lists" in the Policy Configuration section of the Administrator's Reference for information about configuring whitelisted hosts and senders.

Blacklisting

"Blacklists" are lists of senders and/or domains that are known to be sources of illegitimate email. By default, email from blacklisted hosts and senders is quarantined without being scanned for spam. See "Lists" in the Policy Configuration section of the Administrator's Reference for information about configuring blacklisted hosts and senders.
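
The default handling of whitelisted and blacklisted mail, where both bypass the spam scan, might be sketched as follows. The list entries are placeholders, and the order of the checks (whitelist before blacklist) is an assumption of this sketch; consult the policy itself for the actual order.

```python
# Hypothetical list contents using reserved example domains.
WHITELIST = {"partner.example.com", "alice@example.org"}
BLACKLIST = {"spam.example.net"}


def matches(sender, entries):
    """True if the full address or its domain appears in the list."""
    address = sender.lower()
    domain = address.rsplit("@", 1)[-1]
    return address in entries or domain in entries


def route(sender):
    """Decide how to handle a message before any spam scan runs."""
    if matches(sender, WHITELIST):
        return "deliver"      # whitelisted: delivered without a spam scan
    if matches(sender, BLACKLIST):
        return "quarantine"   # blacklisted: quarantined without a spam scan
    return "scan"             # on neither list: run the spam probability test
```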

Altering Anti-Spam Rules

PureMessage includes a large set of anti-spam rules designed to detect spam characteristics.

  • Changing Rule Weights and Probabilities: Only the weight and probability of default anti-spam rules can be changed. Rules can be viewed on the Anti-Spam Rules page of the PureMessage Manager, or using the pmx-spam command-line program. Existing rules can be disabled, or the weights can be altered. See "Adding and Configuring Rules" in the Policy Configuration section of the Administrator's Reference for more information.
  • Creating Custom ("Site") Anti-Spam Rules: Custom rules that use regular expressions to test for spam characteristics can be added on the Anti-Spam Rules page of the PureMessage Manager, or by editing the site-specific anti-spam rule files. See re.rules for instructions on editing rules at the command line.
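
To show the idea behind regular-expression rules with weights, here is a minimal Python sketch. It does not use the re.rules file syntax, which is documented separately; the rule names, patterns, and weights are hypothetical.

```python
import re

# Hypothetical site rules: (name, compiled pattern, weight).
# A negative weight marks a characteristic of legitimate mail.
SITE_RULES = [
    ("SITE_FREE_MONEY", re.compile(r"free\s+money", re.IGNORECASE), 2.0),
    ("SITE_INTERNAL_TICKET", re.compile(r"\[TICKET-\d+\]"), -1.5),
]


def score_body(text):
    """Return (triggered rule names, summed weight) for a message body."""
    hits = [(name, weight) for name, pattern, weight in SITE_RULES
            if pattern.search(text)]
    names = [name for name, _ in hits]
    return names, sum(weight for _, weight in hits)
```

A body that matches both patterns triggers both rules, and the negative weight partly offsets the positive one, which is how a site rule can reduce false positives for known-good traffic.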

Adjusting Spam Thresholds

"Thresholds" apply actions to messages based on their spam probability. By default, PureMessage is optimized to detect spam at a 50% threshold. Quarantining or marking messages below this threshold significantly increases the number of false positives. Administrators should begin tuning filters with a threshold of 50% to 60%, which best ensures that legitimate messages reach recipients and that spam messages are quarantined.
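
One way to reason about a threshold change is to count the errors it would produce on a set of messages whose true classification is known. The Python sketch below assumes such a hand-labeled sample exists; the sample values shown are made up for illustration and are not measured PureMessage results.

```python
def count_errors(samples, threshold):
    """Count false positives and false negatives at a threshold.

    samples: list of (spam_probability, is_actually_spam) pairs.
    A message is treated as spam when its probability >= threshold.
    """
    false_positives = sum(1 for p, is_spam in samples
                          if p >= threshold and not is_spam)
    false_negatives = sum(1 for p, is_spam in samples
                          if p < threshold and is_spam)
    return false_positives, false_negatives


# Made-up sample for illustration only; not measured data.
SAMPLE = [(92.0, True), (61.0, True), (55.0, True), (41.0, False),
          (33.0, True), (27.0, False), (8.0, False)]
```

On this toy sample, lowering the threshold from 50 to 25 trades the one false negative for two false positives, which mirrors the trade-off described above.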