pmx-qdigest - Generate digests of quarantined email


   pmx-qdigest                          # send configured digests
   pmx-qdigest --quiet                  # good for scheduled jobs and scripts
   pmx-qdigest --verbose                # be verbose
   pmx-qdigest --dry-run                # test digest generation
   pmx-qdigest --min 1000 --max 2000    # specify which messages to include
   pmx-qdigest --earliest "YYYY-MM-DD hh:mm:ss" --latest "YYYY-MM-DD hh:mm:ss"
       # specify messages by quarantine time and date
   pmx-qdigest --digest=DIGEST          # specify which digest to send
   pmx-qdigest --addr=ADDRESS           # specify which address(es) to include
   pmx-qdigest --scan-only              # scan the quarantine, don't send mail
   pmx-qdigest --send-only              # send mail, don't scan the quarantine
   pmx-qdigest --dump                   # dump the state database
   pmx-qdigest --index                  # include the most recent messages
   pmx-qdigest --no-index               # skip recent messages
   pmx-qdigest --help                   # help on options


The pmx-qdigest program generates 'digests' of quarantined messages. Digests contain a summary listing of messages that have been quarantined by PureMessage. These listings are sent to the users to whom the messages were originally addressed. The pmx-qdigest program places messages in the queue; messages in the queue are delivered by the pmx-queue program.

Users can release their quarantined messages by replying to digests; the pmx-qdigest-approve program handles the release requests.

There are two digest generation modes: local and centralized. In local mode, pmx-qdigest only scans messages in the filesystem-based quarantine that is local to the machine running the pmx-qdigest program. In this mode of operation, the digest state information (such as digest ID and the time pmx-qdigest runs) is also stored on the local machine. Multiple PureMessage servers can be configured to generate local digests.

In centralized mode, the pmx-qdigest program generates digests based on messages stored in the centralized, DBMS-based message quarantine. Digest state is stored in the central database as well. Only one machine on the network should be configured to run centralized digests.

The digest mode is set by the centralized configuration parameter in the etc/pmx-qdigest.conf file. The parameter can be true (centralized mode) or false (local mode). The default is false.

Depending on the mode of operation, the pmx-qdigest program accepts different options for specifying message selection criteria. See Options for more information.


Options

There are several options for debugging digests or generating one-time digests:

--quiet
Suppresses all non-error output. This is useful in scheduled jobs and scripts.

--verbose
Normally, pmx-qdigest prints one line per digest it sends:

   Sending digest 'spam' for <>: 10 messages
   Sending digest 'spam' for <>: 1 message

With one --verbose option:

   pmx-qdigest: scanning messages starting from 1150
   pmx-qdigest: scanned 942 messages.
   Sending digest 'spam' for <>: 10 messages
   Sending digest 'spam' for <>: 1 message

If you repeat the --verbose option, the output becomes more detailed still:

   pmx-qdigest: scanning messages starting from 1150
   pmx-qdigest: message 1150
   pmx-qdigest: including message 1150 in digest for
   pmx-qdigest: message 1151
   pmx-qdigest: including message 1151 in digest for
   pmx-qdigest: including message 1151 in digest for
   pmx-qdigest: message 1152
   pmx-qdigest: including message 1152 in digest for
   pmx-qdigest: scanned 942 messages.
   Sending digest 'spam' for <>: 10 messages
   Sending digest 'spam' for <>: 1 message

--dry-run
Implies --verbose. Does not actually send any email or write digest files into var/digest/pending.

--min ID
--max ID
Specify those messages (by message ID) that should be selected from the message store. These options are available only in local mode of operation. By default this is:
   --min 'last_digest_id'+1

--earliest ``YYYY-MM-DD hh:mm:ss''
--latest ``YYYY-MM-DD hh:mm:ss''
Specify those messages (by quarantine time) that should be selected from the message store. These options are available only in centralized mode of operation. By default, all the messages quarantined after the last message scanned in the previous invocation of the pmx-qdigest program are scanned.

--digest DIGEST
Specifies the digest to generate. Normally, all digests defined in etc/pmx-qdigest.conf are generated.

--addr ADDRESS
Specifies the address or addresses for which to generate digests. Repeat the option to add multiple addresses. The syntax is the same as in the etc/digest-users configuration file.

If --addr is specified without --digest, then all configured digests for which the addresses are members are generated.

If --addr and --digest are both specified, then every specified digest is generated for every matching user, regardless of whether the users are subscribed to the digests.

For example, this generates all of the digests configured for ADDRESS:

   pmx-qdigest --addr ADDRESS

This generates a spam digest for all subscribers:

   pmx-qdigest --digest spam

This generates a spam digest for Mary's address (shown here as ADDRESS), even if Mary is not subscribed to the spam digest:

   pmx-qdigest --addr ADDRESS --digest spam
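
The interaction between --addr and --digest described above can be sketched in Python. This is only an illustration of the selection rules, not the program's implementation: the function name select_digests, the members mapping, and the example.com addresses are all hypothetical, and wildcard matching from etc/digest-users is not modelled.

```python
def select_digests(members, addrs=None, digests=None):
    """Return the set of (address, digest) pairs that would be generated.

    members maps each configured digest name to the set of subscribed
    addresses (a simplified, hypothetical stand-in for etc/digest-users).
    """
    addrs = set(addrs or [])
    digests = set(digests or [])
    if addrs and digests:
        # Both given: every named digest for every named address,
        # whether or not the address is subscribed.
        return {(a, d) for a in addrs for d in digests}
    if addrs:
        # Only --addr: the configured digests the address is a member of.
        return {(a, d) for d, subs in members.items() for a in addrs if a in subs}
    if digests:
        # Only --digest: all subscribers of the named digests.
        return {(a, d) for d in digests for a in members.get(d, set())}
    # Neither: all configured digests for all of their members.
    return {(a, d) for d, subs in members.items() for a in subs}


# Hypothetical subscription data for illustration only.
members = {
    "spam": {"bob@example.com"},
    "offensive": {"mary@example.com"},
}
print(sorted(select_digests(members, addrs=["mary@example.com"], digests=["spam"])))
```

In the last call, the address is paired with the spam digest even though it is subscribed only to the offensive digest, which mirrors the third example above.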

--scan-only
--send-only
By default, pmx-qdigest scans the quarantine for any messages that have not been scanned, accumulating digests for users. Then it sends mail based on the data. These options let you restrict pmx-qdigest to one of these tasks.

--dump
Print the contents of the state database in a human-readable format. The state database tracks what pmx-qdigest has already done by user and by digest.

Output is unsorted, because the database quickly becomes too large to sort efficiently. To sort or filter the results, pipe them through a tool such as sort or grep:

   pmx-qdigest --dump | sort
   pmx-qdigest --dump | grep spam

The output is in four columns: user, digest, last_digest_id, and the time the user's last digest email was sent.

Output might look like this:

    @ spam 1149
    @ offensive 1149 spam 1154 1056771578 offensive 1154

The special @ user means ``any''. It is used to track the last id seen when not run with --addr or --min, and is the default for the --min option.

While scanning the quarantine, a message is added to a user's digest cache if the message's ID is higher than that user's last_digest_id. In the example database above, messages for a user whose last_digest_id is 1154 would be skipped until the ID is 1155 or higher.
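
The comparison can be sketched as follows. This is a minimal illustration, assuming the state database can be viewed as a mapping from (user, digest) to last_digest_id; the function name, the dictionary layout, and the address user@example.com are hypothetical.

```python
def should_include(message_id, user, digest, state):
    """Return True if the message is newer than the user's last digested ID.

    state maps (user, digest) to last_digest_id. A user with no entry is
    treated here as having a last_digest_id of 0 (an assumption).
    """
    last_id = state.get((user, digest), 0)
    return message_id > last_id


# Hypothetical state: the spam digest for this user last covered ID 1154.
state = {("user@example.com", "spam"): 1154}

for message_id in (1150, 1154, 1155):
    print(message_id, should_include(message_id, "user@example.com", "spam", state))
```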

--index
By default, the digester does not ensure that the messages most recently added to the quarantine are included in the digest. Setting --index ensures that messages quarantined since pmx-qindex was last run are included in the digest.

--no-index
By default, the digester does not ensure that the messages most recently added to the quarantine are included in the digest. This option is no longer required, but is retained so that scripts using it do not break.

--help
Prints out usage information.

Configuration File

The pmx-qdigest program reads its configuration from etc/pmx-qdigest.conf. A simple configuration file might look like this:

 approve_addr = pmx-auto-approve
 date_format = us
 approve_tmpl = approve-failure.tmpl
 consolidate = none
 centralized = false
     expire = 5d
         template = digest-spam.tmpl
         # Messages quarantined for (only) the given reason(s) are included.
         members = digest-users

The approve_addr setting determines which email address is used as the ``Reply-To'' for generated digests. When users reply to their digests, the script listening at that address releases the requested messages.

The sample configuration file above specifies a single digest type (spam). Site administrators can specify additional digest types, but should avoid unnecessary entries: each configured digest type can consume resources during digest runs whether or not any quarantined messages correspond to it. For example, if the site is not quarantining offensive messages, remove the offensive digest type entry from the configuration file. This matters mainly for digests operating in centralized mode.

The date_format setting specifies the format of the date used to expand the %%SINCE%% template variable. The default template uses this variable in the Subject line of the digest. The default is %b %d %H:%M. Other values include: us (mm-dd-yyyy, equivalent to %m-%d-%Y), uk (dd-mm-yyyy, equivalent to %d-%m-%Y), iso-8601 (yyyy-mm-dd, equivalent to %Y-%m-%d), or any conversion specifiers defined by the ANSI C standard (C89). Consult your system's strftime() manpage for further details.

For example:

 date_format = "%A, %B %d, %Y"

generates a date in which the weekday and month names follow the pmx user's locale (e.g. mardi, mars 30, 2004 in a fr_FR locale).
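
The mapping from named formats to strftime() conversion specifiers can be sketched in Python, whose datetime.strftime accepts the same C89 specifiers. The function name expand_since and the table name are hypothetical; only the named values and their equivalents come from the description above.

```python
from datetime import datetime

# Named date_format values and their strftime equivalents, as listed above.
NAMED_FORMATS = {
    "us": "%m-%d-%Y",
    "uk": "%d-%m-%Y",
    "iso-8601": "%Y-%m-%d",
}
DEFAULT_FORMAT = "%b %d %H:%M"


def expand_since(when, date_format=None):
    """Format a datetime the way %%SINCE%% is described (illustrative only)."""
    if date_format is None:
        fmt = DEFAULT_FORMAT
    else:
        # A named value is translated; anything else is used as a
        # strftime format string directly.
        fmt = NAMED_FORMATS.get(date_format, date_format)
    return when.strftime(fmt)


when = datetime(2004, 3, 30, 14, 5)
for name in ("us", "uk", "iso-8601"):
    print(name, expand_since(when, name))
```

Formats that use %A, %B, or %b depend on the locale in effect, so their output varies between systems.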

The approve_tmpl setting specifies the template used to generate messages sent to end users when quarantine message approval requests fail.

The consolidate parameter can be set to merged or none (default). With consolidate set to none, quarantine digests are generated for each email account that has messages addressed to it in the quarantine. However, some users have multiple email accounts, and prefer that quarantined messages for all accounts be included in one quarantine digest. To enable this type of consolidation, set this option to merged.

Address mapping for consolidation is configured in the notifications file. Create an entry that maps all the addresses that you want consolidated into a single digest on the left side, and the address of the digest recipient on the right. Use spaces to separate multiple addresses. Use a colon to separate the addresses that you wish to consolidate from the digest recipient address. For example: jane_doe@*

The digest recipient specified on the right side must also be included (either explicitly or as part of a wildcard match) in the Quarantine Digest Users list for the consolidated digest to be sent.

The centralized setting controls the mode of digest operation. The default is ``false''. Set it to ``true'' to enable centralized digests (see above for information on centralized and local quarantine operation modes).

Each subsection in the digest section defines a new type of digest. By default, only the spam digest is defined; it sends digests to those users who have mail quarantined for the specified reasons.

Digests are generated for the members of a digest. By default, this list is defined in etc/digest-users. Each line of the file corresponds to an email address. Wildcard characters are supported. To suppress mail to a particular address, precede the address with an exclamation mark (!). To mail digests to all users in a given domain, use the following format:


Note: Only users with messages in the quarantine receive digests.

Digest routing can be modified by changing the etc/notifications configuration file. Rather than always sending the digest to the member, pmx-qdigest first transforms addresses using the notifications map. This makes it possible to handle users who have several aliases.

Each time a digest is sent, a digest file is written to the var/digest/pending directory. The file's name is the digest's MD5 checksum, and the file contains the recipient of the digest and the message IDs that were included in the email. These files are used by the pmx-qdigest-approve script to release email to digest recipients.
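
As a rough illustration of checksum-based file naming, the sketch below derives a file name from the MD5 of a digest's bytes. Exactly which bytes pmx-qdigest hashes is not stated here, so hashing the whole message is an assumption, and the function name is hypothetical.

```python
import hashlib


def pending_file_name(digest_message: bytes) -> str:
    """Return a 32-character hexadecimal MD5 checksum to use as a file name.

    Assumption: the checksum is taken over the complete digest message.
    """
    return hashlib.md5(digest_message).hexdigest()


print(pending_file_name(b"abc"))
```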


This version of pmx-qdigest is faster and uses less memory than previous versions. Rather than accumulating the digest information in memory, digest cache files are accumulated in the var/digest/cache directory. Each address found while scanning the quarantine gets its own cache file. After pmx-qdigest has finished scanning, it processes the cache files, turning each into a digest email and storing it in the queue.

Per-user Digest ID

The last_digest_id counter is now tracked per address and per digest. This allows intelligent processing of requests. For example:

   pmx-qdigest --addr ADDRESS

generates the configured digests of which ADDRESS is a member, and updates the digest counters for ADDRESS so that the next scheduled digest omits the messages already digested.

Digest Templates

Each digest is constructed using a template email message. This section describes the template format.

Here is a simple template:

 From: %%ADMIN_ADDR%%
 Subject: Quarantined spam messages %%SINCE%%
 MIME-Version: 1.0
 Date: %%SENT_DATE%%
 Content-Type: text/plain; charset=UTF-8
 Content-Transfer-Encoding: 8bit

 The following messages are spam. Delete the lines you don't want
 %{
 H:Id Time Score From Subject
 @[[[[[ @<<<< @]]]] @<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
 V:id time reason from subjbody
 %}

There are several interesting features:

Template Variables
%%ADMIN_ADDR%% and %%SINCE%% are template variables that are expanded when the template is parsed. The following template variables are available:

%%ADMIN_ADDR%%
The administrator address, set in etc/pmx.conf.

%%SINCE%%
The first time pmx-qdigest runs for this user, expands to the current date. For subsequent runs, this expands to the date of the last digest sent to the user. The format is specified by date_format in the configuration file.

%%SENT_DATE%%
The current date formatted in RFC-822 format for display in an email header.

%%REPLY_TO%%
Only available in the template's From header. This is intended to support mail clients that do not respect the Reply-To header.

Here is an example of how to set the From header to the dynamically-generated Reply-To header:

 From: %%REPLY_TO%%

A valid RFC 822 MIME boundary. Can be used to make multipart messages which are slightly less ugly than copying an actual MIME boundary.

For example:

 From: %%ADMIN_ADDR%%
 MIME-Version: 1.0
 Content-Type: multipart/alternative;
 Content-Type: text/plain; charset=UTF-8
 Content-Transfer-Encoding: 8bit
 plain-text part
 Content-Type: text/html; charset=UTF-8
 Content-Transfer-Encoding: 8bit
 <html>html-text part</html>

Currently, this expands to a fixed string, rather than being dynamically generated.

To support various data encodings, template variables also support lists of formatters. These are colon-separated words following the name of the template variable. For example, to html-encode the %%ADMIN_ADDR%% variable:


The following formatters are available:

HTML-encode the data.

Format the data in ALL UPPERCASE.

Format the data in all lowercase.

The data is transformed to ASCII. Any non-ASCII characters are replaced with a ``?'' character.

The email address portion of the data.

The email name portion of the data.

The data is treated as a mail header, and is encoded into UTF-8. This formatter does not need to be used with the from, subject, and subjbody digest fields as they are already UTF-8 encoded.

For a list of all built-in templates and formatters, see PureMessage::Template.
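
The chaining of colon-separated formatters can be sketched in Python. Only the html formatter name is taken from this page; the names upper, lower, and ascii are hypothetical placeholders for the uppercase, lowercase, and ASCII formatters described above, and the real implementations live in PureMessage::Template.

```python
import html

FORMATTERS = {
    "html": html.escape,  # formatter name used elsewhere on this page
    "upper": str.upper,   # hypothetical name
    "lower": str.lower,   # hypothetical name
    # Hypothetical name: replace every non-ASCII character with "?".
    "ascii": lambda s: s.encode("ascii", "replace").decode("ascii"),
}


def apply_formatters(spec, fields):
    """Expand a spec such as "from:html" against a mapping of field values.

    The first colon-separated word names the field; each following word
    names a formatter, applied left to right.
    """
    name, *formatter_names = spec.split(":")
    value = fields[name]
    for formatter_name in formatter_names:
        value = FORMATTERS[formatter_name](value)
    return value


fields = {"from": "Tom & Jerry <tj@example.com>", "subject": "café"}
print(apply_formatters("from:html", fields))
print(apply_formatters("subject:ascii:upper", fields))
```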

Digest Blocks
Digest blocks expand to the actual data in the quarantine, and are delimited by %{ and %}.

 %{
 line 1
 line 2
 line 3
 %}

Each line in the Digest Block is parsed separately. Blank lines are skipped. Other lines are parsed according to the following rules:

  1. Pre-Header

    This line is prefixed with P:, then emitted verbatim just before the digest table.


  2. Header

    The header line is emitted before the data. It is prefixed with H:, and it should have the same number of fields as the format specifies.


     H:Id Time Score From Subject

  3. Footer

    The footer line is emitted after all the data. It is prefixed with F:, and it should have the same number of fields as the format specifies.


     F:Id Time Score From Subject

  4. Separator

    The separator is emitted between each row in the table. It is prefixed with S: and emitted verbatim.



  5. Variables

    The name of the field variables to output, in order. Prefixed with V:, these are names of data fields that are substituted with the data for each row.


     V:id time reason from subjbody

    See Digest Fields for the list of known fields.

  6. Format

    The format line. A sequence of space-separated fields, some of which are expanded to contain the data.

    Any field that does not start with @ is emitted verbatim; otherwise the rest of the characters in the field describe the type of field.

    The different types of fields are:

    @<<<
    A left-justified fixed-width field. The width is the number of < plus one.

    @>>>
    A right-justified fixed-width field. The width is the number of > plus one.

    @|||
    A centre-justified fixed-width field. The width is the number of | plus one.

    @[[[
    A left-justified expanding field. The minimum width is the number of [ plus one.

    @]]]
    A right-justified expanding field. The minimum width is the number of ] plus one.

    @III
    A centre-justified expanding field. The minimum width is the number of I plus one.


     @[[[[ @]]]]] @[[[[[ @<<<<<<< @<<<<<<<
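
The width rules for format fields can be sketched in Python. This is an illustration of the rules listed above, not the actual parser: the function names are hypothetical, and truncating over-long values in fixed-width fields is an assumption.

```python
# Fill character -> (justification, expanding?), per the field types above.
FIELD_TYPES = {
    "<": ("left", False),
    ">": ("right", False),
    "|": ("centre", False),
    "[": ("left", True),
    "]": ("right", True),
    "I": ("centre", True),
}


def parse_format(line):
    """Split a format line into literal strings and (justify, width, expanding)."""
    parsed = []
    for token in line.split():
        fill = token[1:]
        is_field = (
            token.startswith("@")
            and fill != ""
            and fill[0] in FIELD_TYPES
            and set(fill) == {fill[0]}
        )
        if is_field:
            justify, expanding = FIELD_TYPES[fill[0]]
            # Width is the number of fill characters plus one for the "@".
            parsed.append((justify, len(fill) + 1, expanding))
        else:
            parsed.append(token)  # emitted verbatim
    return parsed


def render_field(spec, value):
    """Pad a value to the field width; expanding fields may grow past it."""
    justify, width, expanding = spec
    if not expanding:
        value = value[:width]  # assumption: fixed-width fields truncate
    if justify == "left":
        return value.ljust(width)
    if justify == "right":
        return value.rjust(width)
    return value.center(width)


print(parse_format("@<<< @>>>> | @[["))
print(repr(render_field(("right", 5, False), "42")))
```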
Digest Fields
The following digest fields are available for use in the digest template:

id
The quarantine id of the message. This is extracted from replies by the auto-approve script.

release_href
A URL which, when clicked on, requests the release of the message. Should only be used in an HTML mail because it does not work in text clients.

time
The time the message was received.

The date the message was received.

reason
The reason the message was quarantined. If the reason is ``spam'', the spam probability is displayed instead.

from
The message's From header. This field is always UTF-8 encoded.

The envelope sender of the message.

The envelope recipient to whom the message was addressed.

subject
The subject of the message. This field is always UTF-8 encoded.

subjbody
The subject, plus a few lines of the body if the subject is short. This field is always UTF-8 encoded.

The size of the message in ``human readable'' format; i.e. with a suffix of ``M'' for megabytes and ``K'' for kilobytes.

In addition, you can apply all the same formatters to the digest fields as for Template Variables. The most useful of these is html, for escaping data in HTML sections.

For example:

 V:release_href reason:html time:html from:html subjbody:html


See Also

the pmx-qdigest-approve manpage, the pmx-qdigest-expire manpage, the pmx-queue manpage


Copyright (C) 2000-2008 Sophos Group. All rights reserved. Sophos and PureMessage are trademarks of Sophos Plc and Sophos Group.