Using SPAM scores

Introduction

The GatorLink Spam Detection System works by calculating the probability of an email being spam. Emails are given scores based on a set of rules. Each rule that matches a message adds to or subtracts from the cumulative score given to that message.

Because opinions on what counts as a desired email vary from person to person, the spam scoring system doesn't do anything beyond calculating the score and reporting that score to you. It is your responsibility to decide what action, if any, should be taken based on that score. Below we explain how you can take advantage of this system and give some examples.

X-Spam Headers

When an email is scored, headers with the results of that scoring are added to the message. Headers describe the subject of an email, who it is from and to, where the email came from, when it was sent and when it arrived, and lots of other things. If you view the headers of a scored email you will see one or both of the following headers:

X-Spam-Status

This header contains a few things. The most interesting are the calculated score in numerical form and the list of tests that matched for this email. While this header is educational and useful for understanding the system and why decisions were made, it is not very useful for automated filtering.
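If you are curious about what is inside this header, the sketch below pulls out the score and the list of matched tests. This document does not specify the exact layout of X-Spam-Status, so the layout here is an assumption: a SpamAssassin-style value with a "score=" (or "hits=") field and a comma-separated "tests=" field. The function name parse_spam_status and the rule names RULE_A and RULE_B are made up for illustration.

```python
import re
from typing import List, Optional, Tuple


def parse_spam_status(value: str) -> Tuple[Optional[float], List[str]]:
    """Extract (score, matched tests) from an X-Spam-Status value.

    Assumes a SpamAssassin-style layout; the real GatorLink header
    may differ, in which case this returns (None, []).
    """
    # Long test lists are often folded across lines after a comma,
    # so join them back together before searching.
    compact = re.sub(r",\s+", ",", value)

    score_match = re.search(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)", compact)
    score = float(score_match.group(1)) if score_match else None

    tests_match = re.search(r"\btests=(\S*)", compact)
    tests = [t for t in tests_match.group(1).split(",") if t] if tests_match else []

    return score, tests


# Hypothetical header value, for illustration only.
sample = "No, score=4.7 required=5.0 tests=RULE_A,\n\tRULE_B autolearn=no"
print(parse_spam_status(sample))
```

This is only for reading headers by hand or in a script. For filter rules in a mail client, X-Spam-Level is the easier header to work with.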

X-Spam-Level

If the email's spam score is 1 or more, this header will be added. This header shows the score as a series of asterisks, a bit like a bar on a bar graph. For example, if the calculated score is 4.7, this header will look like: X-Spam-Level: **** where the number of asterisks is the score rounded down to a whole number. What this header lacks in detail, it makes up for in utility when it comes to writing filter rules in your mail client.
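The relationship between the score and the asterisks can be sketched in a few lines of Python. This is an illustration of the rule described above, not the code the scoring system itself runs; the function name spam_level_header is made up for this example.

```python
import math
from typing import Optional


def spam_level_header(score: float) -> Optional[str]:
    """Return the X-Spam-Level line for a score, or None if the score is below 1."""
    if score < 1:
        # The header is only added for scores of 1 or more.
        return None
    # One asterisk per whole point, rounding the score down.
    return "X-Spam-Level: " + "*" * math.floor(score)


print(spam_level_header(4.7))  # X-Spam-Level: ****
print(spam_level_header(0.4))  # None
```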

Creating a Rule

Now the fun part: creating a rule in your mail client to help you manage spam. There are lots of ways to do this and each mail client does it slightly differently. In this section I will give generic directions that describe what you should be thinking when you create a rule.

The first step in creating a rule to process emails that are probably spam is to decide what your tolerance levels are. Ask yourself the following questions:

1. How important is your email to you?
2. What type of email do you receive?
3. How much trust do you place in the system and the people who maintain it?
1. How important is your email to you?

No spam scoring system is perfect. While they are frequently right, they are sometimes wrong. An email from your professor is not something you want to lose, nor do you want to lose an email receipt from an online purchase.

2. What type of email do you receive?

Some people only want email that is from other individuals, as opposed to mass mailings. Other people like mass mailings such as joke-of-the-day emails. If you don't want forwarded jokes, you can be stricter with your spam rules. If you enjoy joke-of-the-day emails or online newsletters, you'll need to be less strict with your spam rules.

3. How much trust do you place in the system and the people who maintain it?

We will be updating the system over time. Just because an email sent today got one score doesn't mean it would get the same score if sent six months from now. We will make a best effort to keep scoring consistent with people's expectations, but spam scoring isn't a hard science.

Those questions boil down to: How inconvenient is it for me when an email is classified incorrectly?

The next step is to decide what threshold you want to trigger your rule. A conservative threshold of 7 will let a lot of spam through, but it carries a low risk of legitimate email being classified as spam. I suggest you take a look at the headers of the messages in your inbox and find the highest-scored email that you consider legitimate. Then add one or two to that score and use the result as your threshold. For example, in my mailbox, the highest-scored email that I want has a score of 4.1. Because I'm a conservative person, I'd use a threshold of 6 until I get more confident with the system.
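As a rough sketch of that rule of thumb, the Python below takes the scores of messages you consider legitimate and suggests a threshold. The function name suggest_threshold and the sample scores are made up for illustration, and rounding the highest score down before adding the margin is my assumption, chosen so that a highest score of 4.1 with a margin of 2 gives the threshold of 6 used in the example above.

```python
import math
from typing import Iterable


def suggest_threshold(legit_scores: Iterable[float], margin: int = 2) -> int:
    """Suggest a whole-number threshold above the highest legitimate score.

    legit_scores: scores read from messages you want to keep.
    margin: how many points of headroom to add (the text suggests 1 or 2).
    """
    highest = max(legit_scores)  # raises ValueError if no scores are given
    return math.floor(highest) + margin


# Hypothetical scores from wanted messages; the highest is 4.1.
print(suggest_threshold([0.3, 2.5, 4.1]))            # 6
print(suggest_threshold([0.3, 2.5, 4.1], margin=1))  # 5
```

A smaller margin catches more spam but raises the chance that a wanted message is caught too, so start with the larger margin and lower it once you trust the results.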

Now that you have a threshold that you're comfortable with, you need to put the rule in your mail client. Each mail client is different, and in this section you're on your own for the details. Consult your mail client's help if you get stuck. Look through your mail client's preferences or menus for Message Filters, Rules, Rules Wizard or something similar.

Once you've found the rule editor interface, you need to create a rule that uses a custom header so you can type in the header name X-Spam-Level. After you've typed X-Spam-Level in the header name field, enter as many asterisks in the header value field as your chosen threshold. For the example of a threshold of 7, I would type seven asterisks: ******* . Your mail client may allow you to adjust how the value is matched. If it does, I suggest you pick the "contains" or "substring" match type. If you pick the "exact match" or "is" match type, you will only filter messages whose header has exactly that many asterisks, and higher-scoring messages will slip past the rule. Using the "contains" match method will match scores at your threshold or above. To see why, consider a score of 9: its header has nine consecutive asterisks, and a pattern of seven consecutive asterisks matches seven of those nine.
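The difference between the two match types can be shown with a small Python sketch that reads the X-Spam-Level header from a raw message using the standard email module. Your mail client does this for you; the function names and the sample message here are made up purely to illustrate the matching logic.

```python
from email import message_from_string


def spam_level(raw_message: str) -> str:
    """Return the asterisks from X-Spam-Level, or '' if the header is absent."""
    msg = message_from_string(raw_message)
    return (msg.get("X-Spam-Level") or "").strip()


def matches_contains(raw_message: str, threshold: int) -> bool:
    """'contains' match: true for scores at the threshold or above."""
    return "*" * threshold in spam_level(raw_message)


def matches_exact(raw_message: str, threshold: int) -> bool:
    """'is' match: true only when the header has exactly `threshold` asterisks."""
    return spam_level(raw_message) == "*" * threshold


# Hypothetical message that scored 9 (nine asterisks).
sample = (
    "From: someone@example.com\n"
    "X-Spam-Level: *********\n"
    "Subject: hello\n"
    "\n"
    "body text\n"
)

print(matches_contains(sample, 7))  # True
print(matches_exact(sample, 7))     # False
```

Messages that scored below 1 have no X-Spam-Level header at all, so neither match type will ever trigger on them.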

Finally, you need to configure what action should be taken when this rule matches. If you're using IMAP, you may want to move matched messages to a junk folder. Just remember to check that folder every so often and delete the buildup of messages, or else you'll run out of space in your mailbox. You could also change the color of the message as it appears in the message list.

Final Thoughts

There are other ways to deal with spam, and you can get more complicated. You may want to set up more than one rule so that messages with a very high score, say 12, get deleted immediately and scores between 7 and 12 are flagged somehow. More complicated filtering is beyond the scope of this document.
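As a minimal sketch of the two-rule idea, the Python below decides on an action from the X-Spam-Level value using the same "contains" matching described earlier. The function name choose_action and the action names are made up for illustration, and the thresholds of 7 and 12 are just the example values from the paragraph above. In a mail client you would get the same effect with two rules, with the higher-threshold rule checked first.

```python
def choose_action(level_value: str, flag_at: int = 7, delete_at: int = 12) -> str:
    """Pick an action from the asterisks in an X-Spam-Level header value."""
    stars = level_value.strip()
    # Check the higher threshold first; a message with twelve asterisks
    # also contains seven, so the order of the rules matters.
    if "*" * delete_at in stars:
        return "delete"
    if "*" * flag_at in stars:
        return "flag"
    return "keep"


print(choose_action("*" * 13))  # delete
print(choose_action("*" * 9))   # flag
print(choose_action("*" * 4))   # keep
```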

You may wish to ignore the spam scoring the GatorLink SMTP servers do and use a more personalized solution such as the trainable filter in the freely available Mozilla Thunderbird, or you can use both.

Be careful when using these new tools. We have worked very hard to set up a safe and robust system for scoring email. If you take advantage of this system, it is your responsibility to use it carefully. We will not be held responsible for lost email because you failed to understand how to use the tool. If you start automatically deleting messages, odds are they will be deleted before the next backup window. There is nothing we can do to recover a lost message.