Difference between revisions of "Email spam"


Revision as of 23:18, 9 April 2008


Blocking and filtering spam in electronic mail

Blocking on account of the address of the computer delivering mail

Every user can adjust their block level via the Do It Yourself website. When an email arrives at the science mailserver, the server immediately checks whether the mail address of the recipient exists and whether the IP number of the computer that tries to deliver the mail appears on a list included in the recipient's block level. If the address does not exist, or the IP number is on such a list, the mail is not accepted. If the mail would only be blocked by a higher block level, it is accepted, but passed on with an added mail header line: "X-Would-Be-Blocked-By:".
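Checking an IP number against a list is conventionally done with a DNS query: the IPv4 octets are reversed, the zone name of the list is appended, and the resulting name is looked up. The sketch below assumes this convention applies to the lists used here. The function names `dnsbl_query_name` and `is_listed` and the zone `dnsbl.example.org` are hypothetical, chosen for illustration; this is not the mailserver's actual implementation.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name that is conventionally queried for an IPv4 address."""
    # IPv4Address raises ValueError for anything that is not a valid IPv4 address.
    addr = ipaddress.IPv4Address(ip)
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the query name resolves, i.e. the address is on the list.

    Simplification: every resolution failure (including temporary DNS
    errors) is treated as "not listed".
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


# Hypothetical zone name; no network access is needed to build the name.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Only `is_listed` touches the network; the name construction can be checked offline.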

There are four block levels: 'none', 'light', 'medium' and 'heavy', where 'medium' is the standard and recommended value.

  • none: no blocking takes place.
  • light: blocking by using the following lists:
  1. whitelist.sci.kun.nl: C&CZ's own white list: machines that are or have been listed on blacklists, but from which many of our users want to receive mail. The list currently contains (luckily) few entries: a list server of Surfnet and the spamprovider of the UMCN.
  2. blacklist.sci.kun.nl: C&CZ's own black list: machines that are not (yet) a problem for others, but are a problem for C&CZ. The list currently contains (luckily) few entries, mainly machines that have bombarded C&CZ mailservers with spam and/or viruses in the past.
  • medium: on top of the 'light' lists, the following are added:
    • bl.spamcop.net: A database of known and/or reported spammers per server IP address.
    • sbl.spamhaus.org: A database of verified spam sources, spam gangs and spam support services, controlled by the Spamhaus Project team.

When one has a lower block level than the standard 'medium' (that is, 'light' or 'none'), mail that would be blocked by the 'medium' block level is passed through with the warning sign "X-Would-Be-Blocked-By: Medium".

  • heavy: on top of the 'medium' lists, the following are added:
  1. dnsbl.sorbs.net: SORBS: a database of several kinds of spam sources.
  2. xbl.spamhaus.org: Spamhaus Exploits Block List: illegal exploits, incl. open proxies (HTTP, socks, AnalogX, wingate, etc.), worms/viruses with built-in spam engines and other types of 'trojan horse' abuse.

Mail that would be blocked by the 'heavy' block level is passed through if one has the 'medium' block level, but with a warning sign: "X-Would-Be-Blocked-By: Heavy".
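The cumulative block levels can be sketched as a small decision function. This is a minimal illustration, not the server's code. It assumes that an entry on the white list overrides every other list, that the header value is the capitalised name of the lowest level that would have blocked the mail, and that the set of lists an IP number appears on is already known. The names `ADDED_LISTS`, `lowest_blocking_level` and `decide` are hypothetical.

```python
LEVELS = ["none", "light", "medium", "heavy"]

# Lists added at each level, as named in the text above.
ADDED_LISTS = {
    "none": set(),
    "light": {"blacklist.sci.kun.nl"},
    "medium": {"bl.spamcop.net", "sbl.spamhaus.org"},
    "heavy": {"dnsbl.sorbs.net", "xbl.spamhaus.org"},
}
WHITELIST = "whitelist.sci.kun.nl"


def lowest_blocking_level(listed_on):
    """Return the lowest level whose lists contain the sender, or None."""
    if WHITELIST in listed_on:
        return None  # assumption: the white list overrides all other lists
    for level in LEVELS:
        if ADDED_LISTS[level] & listed_on:
            return level
    return None


def decide(user_level, listed_on):
    """Return (action, header) for a recipient with the given block level."""
    blocking = lowest_blocking_level(listed_on)
    if blocking is None:
        return ("accept", None)
    if LEVELS.index(blocking) <= LEVELS.index(user_level):
        return ("reject", None)
    # A higher level would have blocked it: accept, but add the warning header.
    return ("accept", f"X-Would-Be-Blocked-By: {blocking.capitalize()}")


print(decide("light", {"bl.spamcop.net"}))
print(decide("medium", {"bl.spamcop.net"}))
```

Under these assumptions, the first call accepts the mail with a warning header naming the 'medium' level, and the second rejects it.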

New logins automatically get the 'medium' block level. This carries a small risk that wanted mail will be blocked, but for most users it clearly blocks more spam than the 'light' block level. Mail that would be blocked by a heavier block level is let through, but with a warning sign (X-Would-Be-Blocked-By:). From the mail that arrives with these warning signs, users can see for themselves how much wanted mail a heavier block level would block ("false positives"). The warning sign can also be used to sort incoming mail into folders, e.g. with Sieve.
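The sorting idea can be illustrated outside Sieve as well. The sketch below uses Python's standard `email` module to read the warning header and pick a folder; the folder names and the function `folder_for` are hypothetical, and a real setup would more likely express the same rule in Sieve on the mailserver.

```python
from email import message_from_string


def folder_for(raw_message: str) -> str:
    """Choose a folder based on the X-Would-Be-Blocked-By header, if present."""
    msg = message_from_string(raw_message)
    level = msg.get("X-Would-Be-Blocked-By")  # header lookup is case-insensitive
    if level is None:
        return "INBOX"
    # Hypothetical folder layout: one subfolder per block level.
    return f"WouldBeBlocked/{level.strip()}"


sample = (
    "From: someone@example.org\n"
    "Subject: hello\n"
    "X-Would-Be-Blocked-By: Medium\n"
    "\n"
    "Message body.\n"
)
print(folder_for(sample))
```

Counting the wanted messages that end up in such a folder gives a rough idea of the false positives a heavier block level would cause.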

When one automatically forwards mail from other addresses to the science address, the blocking of spam-sending computers has to be done by the mailserver that originally receives the spam. Our mailserver does not see the spammer; it only has a connection with the forwarding mailserver.

Filtering based on the content of the mail

Despite the blocking described above, one will probably still receive a lot of spam. In that case the only remaining option is to filter on the content of the mail. Doing that on the mailserver instead of on one's own computer of course uses more server capacity than the spam blocking described above.

All C&CZ mailservers in ru.nl use MIMEDefang to filter the content of mail for spam and viruses. MIMEDefang in turn uses SpamAssassin with a central Bayes filter.

Users can adjust their SpamAssassin settings via the Do It Yourself website.

If one has chosen to have mail that SpamAssassin recognizes as spam delivered anyway, the final attachment contains a summary of the reasons why the mail was recognized as spam. Many different things count: words with a commercial or sexual meaning, properties of addresses and header lines, formatting, use of capitals, etc. In addition, C&CZ maintains a statistical (Bayes) list of words that appear in normal mail and in spam. Each of these contributes to the total 'spamscore' of the mail. If the score is more than 5.0, the mail is normally considered to be spam.

For users of the new mailserver post.science.ru.nl, mail that is marked as spam is automatically moved to the Spam IMAP folder on the mailserver. Using the Do It Yourself website it is possible to maintain a 'whitelist' of sender addresses and domains. Mail from these addresses and mail domains is never automatically moved to the Spam folder, even if SpamAssassin tags it as spam. With Sieve, mail can be processed in other ways, just as users of the older servers could with a .procmailrc.
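The scoring and whitelist rule can be sketched as follows. This is an illustration under stated assumptions, not the actual SpamAssassin or mailserver logic: the rule names and their scores are invented, whitelist entries are assumed to be stored in lower case, and `total_score`, `is_whitelisted` and `target_folder` are hypothetical helper names.

```python
SPAM_THRESHOLD = 5.0  # "more than 5.0" is normally considered spam


def total_score(rule_hits):
    """Sum the scores of all rules that matched the message."""
    return sum(rule_hits.values())


def is_whitelisted(sender, whitelist):
    """True if the sender address or its domain is on the user's whitelist."""
    sender = sender.lower()
    domain = sender.rpartition("@")[2]
    return sender in whitelist or domain in whitelist


def target_folder(score, sender, whitelist):
    """Spam folder only for non-whitelisted mail scoring above the threshold."""
    if score > SPAM_THRESHOLD and not is_whitelisted(sender, whitelist):
        return "Spam"
    return "INBOX"


# Invented rule names and scores, for illustration only.
hits = {"COMMERCIAL_WORDS": 2.5, "ALL_CAPS_SUBJECT": 1.5, "BAYES_HIGH": 3.0}
score = total_score(hits)
print(score, target_folder(score, "offers@shop.example", set()))
print(score, target_folder(score, "Colleague@Example.org", {"example.org"}))
```

In the second call the message keeps its spam score but stays out of the Spam folder because the sender's domain is whitelisted.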