Mail 101 -- How Mail to UW Addresses is Filtered

This is an overview of the filtering that is applied when someone sends a message to any email address on campus.

userid@uwaterloo.ca

If the incoming message is addressed to userid@uwaterloo.ca, it is handled by the mailservices cluster. (Such mail used to be handled by a machine called ego, which was barely keeping up with traffic even when there were no major virus attacks in the wild.)

Regardless of any configuration put in place by the system administrator of the mail server where the recipient's mail will end up, the cluster decides whether to accept or reject the incoming message based on:

whether the sending machine is on a known and respected blacklist (likely undesirable mail) and the recipient has not elected to receive all mail regardless of blacklisting by third parties (ego and most large mail servers on campus have done this for well over a year)
whether the message itself contains an attachment which is on Microsoft's list of dangerous attachments or certain attachment names known to be used by viruses (ego has done this for several months)
whether the message or any of its attachments contain a known virus signature (this is new behaviour as of February 2004)
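The three checks above can be sketched as a single accept/reject decision. This is only an illustrative sketch, not the cluster's actual implementation: the names `IncomingMessage`, `decide`, and `DANGEROUS_EXTENSIONS`, the sample extension list, and the order of the checks are all assumptions made for the example, and the virus check is reduced to a flag that a real scanner would supply.

```python
from dataclasses import dataclass, field

# Hypothetical sample only; the real list of dangerous attachment types differs.
DANGEROUS_EXTENSIONS = {".exe", ".pif", ".scr", ".vbs"}


@dataclass
class IncomingMessage:
    sender_ip: str
    recipient: str
    attachment_names: list = field(default_factory=list)
    has_virus_signature: bool = False  # assumed to be set by a virus scanner


def decide(msg, blacklisted_ips, opted_out_recipients):
    """Return (accepted, reason) for one incoming message."""
    # Check 1: blacklisted sender, unless the recipient chose to receive all mail.
    if msg.sender_ip in blacklisted_ips and msg.recipient not in opted_out_recipients:
        return False, "sending machine is blacklisted"
    # Check 2: attachment names with a potentially dangerous extension.
    for name in msg.attachment_names:
        if any(name.lower().endswith(ext) for ext in DANGEROUS_EXTENSIONS):
            return False, "suspicious file (executable code) was found in the message"
    # Check 3: known virus signature. There is no opt-out for checks 2 and 3.
    if msg.has_virus_signature:
        return False, "virus detected"
    return True, "accepted"
```

Note that only the blacklist check consults the recipient's preference; in this sketch, as in the policy described here, the attachment and virus checks apply to everyone.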

Rejected Messages

Any rejection at this point returns to the sender a line of text explaining the reason for the rejection. In the case of anti-spam blacklists, it also includes the URL for a web page where the sender can read about UW's use of third party anti-spam blacklists and find a link that allows them to send mail for several days. Senders do make use of this link from time to time.

The same web page contains an explanation for on-campus mail recipients and a link to a web page where the user can choose to receive all incoming messages regardless of whether the sending machine has been blacklisted.

Senders and recipients may not opt out of filtering for potentially dangerous executable content or virus scanning. If legitimate mail is being rejected with an error message indicating either "A suspicious file (executable code) was found in the message" or "Virus detected by ClamAV", there is a problem with one or more message attachments. To avoid such problems, senders of rejected messages may wish to use means other than email to provide the data being sent or use a (different) encoding mechanism to send their attachments.

userid@mailhost.uwaterloo.ca

The same filtering and accept/reject decision is performed for all mail being handled by mailservices on behalf of a server (currently all mail to admmail, ist and watserv1 as well as the workstations used by members of the mail filtering project team).

For systems using xhiered versions of sendmail, the blacklisting and filtering of executable content have been options which the system administrator may choose to enable or disable.

Messages addressed to any other recipient (that is, neither userid@uwaterloo.ca nor a user on a machine whose mail is being handled by mailservices) will be rejected for anti-spam blacklists, executable content or viruses only if the system administrator has allowed those checks to be enabled. In xhiered versions of sendmail, anti-spam blacklists identical to those used by mailservices and filtering for executable content are enabled by default. In some areas, anti-virus filtering has been enabled as well.

SpamAssassin

SpamAssassin processing is not performed during message receipt, but instead as part of delivery to a message's final destination. Mail may be filtered by a system administrator or by the recipient based on the results of SpamAssassin tagging a message as likely spam or non-spam.

SpamAssassin is something an individual or a system administrator can choose to use or not use. If an individual does not want their mail to be tagged by SpamAssassin even though the system administrator has configured the mail server to use it, that individual can set up a .forward file to ensure none of their incoming messages are processed through procmail.

SpamAssassin is not invoked during the SMTP conversation because at that point it cannot know which mailbox(es) the message will eventually be delivered to, which means it cannot honour an individual's preferences, including whitelists and blacklists. The only time SpamAssassin runs is when a message is being delivered to its final destination. No mail is automatically rejected or thrown away by SpamAssassin itself, though some people may choose to systematically destroy all incoming messages tagged as spam.
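As a rough illustration of acting on SpamAssassin's tagging at delivery time, the sketch below looks for the X-Spam-Flag header that SpamAssassin adds by default to messages it considers spam. The function names `is_tagged_spam` and `choose_folder` and the folder names are hypothetical; on campus this kind of filtering is typically done with procmail rather than Python, so treat this as a sketch of the idea, not a description of any deployed setup.

```python
from email import message_from_string


def is_tagged_spam(raw_message: str) -> bool:
    """Return True if the message carries SpamAssassin's 'X-Spam-Flag: YES' header."""
    msg = message_from_string(raw_message)
    return msg.get("X-Spam-Flag", "").strip().upper() == "YES"


def choose_folder(raw_message: str) -> str:
    """Pick a destination folder (hypothetical names) instead of deleting outright."""
    return "likely-spam" if is_tagged_spam(raw_message) else "inbox"
```

Filing tagged mail into a separate folder rather than destroying it leaves room to recover the occasional legitimate message that was tagged by mistake.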

Because of the significant CPU and memory load SpamAssassin puts on a system, the mailservices cluster is running an instance of spamd (the workhorse that does the analysis and tagging of mail messages) which may be used by any machine on campus. Since SpamAssassin can't reach out over the network and read users' files on their mail servers, it uses a locally stored database of user preferences when run on the mailservices cluster.

Because IST doesn't have infinite resources, the latest version of SpamAssassin as available through xhier is very stripped down. Because IST staff have heard repeated complaints about the significant burden placed on machines running SpamAssassin, the latest version available through xhier by default takes that burden from the faculty, department or workgroup mail server and places it where there are resources available. There is nothing preventing system administrators from using the latest available version of SpamAssassin, xhiered or otherwise, directly on mail servers in their care using whatever configuration best suits the needs of their user community.