Using the Anti-Spam Engine

The code examples in the following pages demonstrate how to use the Anti-Spam Engine API. These examples are extracted from the sample.c program located in the eg/ directory.

Suggested Best Practices

Sophos recommends the following for ensuring high catch rates when using and integrating the anti-spam engine.

  1. Keep up to date with anti-spam engines: This can be accomplished by downloading from the "2.x-latest" area of the Sophos update page. Alternatively, if you point to specific version directories for testing purposes, try to test and publish the new engine as soon as possible once Sophos has announced it (ideally within a day or two).
  2. Keep up to date with anti-spam data: Create a mechanism to regularly check the Sophos site for updates, preferably once a minute. Each data update has an associated checksum that can be compared against the checksum of the last downloaded package. This method improves efficiency because the data is downloaded only if the checksums do not match. Even though updates are not actually published every minute, minimizing latency as much as possible will improve catch rates. Apply updates as soon as they become available. Monitor data versions to ensure that they are updating on a regular schedule. Data older than 1-2 hours can indicate an updating issue.
  3. Leave all default rules/checks enabled including network checks: Sophos recommends leaving all default rules and checks enabled in the anti-spam engine, including network checks such as DNSBL lookups and reverse DNS checks (that is, the core.local-tests-only attribute should not be set to true). Network checks contribute significantly to the anti-spam catch rate. If you perform specific DNSBL checks outside of the anti-spam engine and want to avoid duplicating these queries in the engine, Sophos recommends disabling only those specific DNSBL rules in the engine, so that the other network checks still occur.
  4. Ensure accurate spam source information is available to the engine: A number of checks, including DNSBL-type rules and reverse-DNS rules, rely on accurate information about the spammer's sending/connecting IP. Specify any IPs/relays that the engine should regard as trusted/internal using the plugin.net.trusted-relays attribute, which allows the engine to use the first untrusted relay as the spamming IP. You may also want to exclude internal hosts from network checks by specifying them in the plugin.net.internal-hosts attribute (10.0.0.0/8, 127.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 are automatically excluded). If applicable, specify DNS servers using the plugin.net.dns-servers attribute. Also, if the Anti-Spam SDK does not get email directly from a spamming host, a front-line MTA must add a Received header. The format of the Received header should resemble the standard ones (such as those added by Sendmail or Postfix), so that the Anti-Spam SDK can parse it to determine the first untrusted relay.
  5. Use a 50% threshold: The Sophos-recommended threshold is 50% (or 0.5) for categorizing a message as spam. Some customers choose a higher threshold (for example, 90%) for more aggressive disposition, such as discarding, but this results in a higher potential for false positives.
  6. Add X-header(s) to messages for troubleshooting: Sophos recommends that customers add the following information to messages processed by the anti-spam engine to assist with any troubleshooting or investigation of miscategorized messages:
    • The version of the anti-spam engine in effect when the message was processed, which is reported by the core.version attribute.
    • The version of the anti-spam data in effect when the message was processed, which is reported by the core.data-version attribute.
    • The rules that fired when the message was processed, which are reported by the PMX_EV_FOUND_FEATURE event code. Ideally, include both parameters.
    Note
    Adding these headers is believed to have a negligible impact on performance in most situations relative to the spam engine processing. If this is not the case, you can add these headers only to messages that were not determined to be spam. This still allows analysis of missed spam, although the information will not be available when analyzing false positives.
  7. Provide full header information with samples submitted to Sophos: Message samples should contain the full source including the spam X-header(s) added during processing and the full received chain.
  8. Report any missed spam or false positives to SophosLabs: Samples should include the full, intact headers including the anti-spam X-header(s) and the full received chain.
  9. Set up a spam trap: Sophos encourages customers to set up spam traps where mail from unused and old addresses or domains can be routed directly to Sophos. Traps should not receive any legitimate mail, only spam. Setting up a spam trap will help give Sophos visibility into any campaigns that are targeting your particular domains.
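
The following is a minimal sketch of how an integration might implement some of the practices above: the checksum comparison and data-age monitoring from item 2, the 50% threshold from item 5, and the troubleshooting X-header from item 6. It uses only standard C. Every name in it (update_needed, data_is_stale, is_spam, format_diag_header, and the X-Antispam-Info header name) is hypothetical and is not part of the Anti-Spam Engine API. The sketch assumes that your integration has already obtained the checksums, the spam probability, the core.version and core.data-version values, and the list of fired rules. Treating a probability exactly equal to the threshold as spam is also an assumption.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* All names below are hypothetical helpers for illustration only;
 * none of them is part of the Anti-Spam Engine API. */

#define SPAM_THRESHOLD     0.5            /* recommended 50% threshold     */
#define MAX_DATA_AGE_SECS  (2 * 60 * 60)  /* flag data older than 2 hours  */

/* Return 1 if a new data package should be downloaded: either nothing has
 * been downloaded yet (last_checksum is NULL) or the published checksum
 * differs from the recorded one. published_checksum must not be NULL. */
int update_needed(const char *last_checksum, const char *published_checksum)
{
    if (last_checksum == NULL)
        return 1;
    return strcmp(last_checksum, published_checksum) != 0;
}

/* Return 1 if the data was last updated more than max_age_secs ago. */
int data_is_stale(time_t last_update, time_t now, double max_age_secs)
{
    return difftime(now, last_update) > max_age_secs;
}

/* Return 1 if a spam probability in the range 0.0-1.0 reaches the
 * threshold. Counting an exact match as spam is an assumption. */
int is_spam(double probability)
{
    return probability >= SPAM_THRESHOLD;
}

/* Write a single troubleshooting header line into buf. The header name
 * and field layout are illustrative. Returns 0 on success, or -1 if
 * formatting failed or the line did not fit in the buffer. */
int format_diag_header(char *buf, size_t size,
                       const char *engine_version,
                       const char *data_version,
                       const char *rules)
{
    int n = snprintf(buf, size,
                     "X-Antispam-Info: engine=%s; data=%s; rules=%s",
                     engine_version, data_version, rules);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

In a polling loop, an integration could call update_needed once a minute with the newly published checksum, and raise an alert when data_is_stale reports that no update has been applied within the chosen window. The header line produced by format_diag_header would be added to each processed message before delivery.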