Large Scale Exim Installs
On a large, growing network, you will eventually find that a single Exim server just can’t do everything you want it to. Some people start cutting back on what their server does, for example by turning off virus scanning or spam scanning. I urge you not to do this; it’s not good for the internet. This document shows how multiple Exim servers can be combined into a large, scalable email system capable of handling a very large volume of mail.
This is NOT a HOWTO, nor is it really a tutorial. I do not claim this is the solution; it is my solution. Parts of this document are based on ideas gained from a recent Exim course given by Philip Hazel at Cambridge University.
First, a diagram showing a network of MTAs.
You might be wondering why I have separated the incoming and outgoing mail servers. My main reason is that in most setups you will have quite different policies for incoming and outgoing mail. While both could be handled on the same server, it makes sense to me to separate them.
Incoming Mail
The mail relay’s job is to accept mail, scan it, and relay it on to the correct mail server. It does not deliver mail, nor does it queue it. Why these servers do not queue mail is explained below.
I personally do not agree with accepting an email and then scanning it. I think this is bad for your users, for remote users, and for the internet as a whole, but I’m not going to go into that here. I use Exiscan to scan emails for viruses and spam before a message is even accepted. Visit Exiscan’s website for more information. Exiscan allows you to specify multiple spamd and virus scanning hosts in your Exim configuration, so you can distribute the load of scanning incoming messages across multiple hosts.
Using layer 4 switching and DNS round robin, the frontline mail relays scale horizontally. This is an important point: if your incoming mail load increases, you simply add more virus/spam scanning hosts and more mail relays alongside the existing ones.
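
As a minimal sketch of what spreading scans across several hosts looks like, the Python below rotates through a list of scanning hosts and skips any that have been marked as down. It is only an illustration of the idea and not how Exim or Exiscan implement it; the class name `ScannerPool` and the host names are hypothetical.

```python
class ScannerPool:
    """Round-robin over scanning hosts, skipping hosts marked as down.

    Illustrative only: ScannerPool and the host names are assumptions,
    not part of Exim or Exiscan.
    """

    def __init__(self, hosts):
        if not hosts:
            raise ValueError("at least one scanning host is required")
        self.hosts = list(hosts)
        self.down = set()
        self._next = 0  # index of the host to try first on the next pick

    def add_host(self, host):
        # Scaling sideways: a new host simply joins the rotation.
        self.hosts.append(host)

    def mark_down(self, host):
        self.down.add(host)

    def mark_up(self, host):
        self.down.discard(host)

    def pick(self):
        # Try each host at most once, starting at the rotation pointer.
        for _ in range(len(self.hosts)):
            host = self.hosts[self._next]
            self._next = (self._next + 1) % len(self.hosts)
            if host not in self.down:
                return host
        raise RuntimeError("no scanning host available")
```

With a pool of `scan1` and `scan2`, successive calls to `pick()` alternate between the two. Calling `add_host("scan3")` brings a third host into the rotation without touching the others, which is the sideways scaling described above.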
When Philip Hazel designed Exim, he worked on the basis that most mail on the modern internet is delivered immediately. With this in mind, Exim was not designed with a complex queuing system. Because of this, fallback mail servers are a great way to handle messages that cannot be delivered immediately.
Fallbacks
Fallbacks are mail servers whose purpose in life is to handle message queues for mail which cannot, for whatever reason, be delivered immediately. Using Exim, it is possible to stack the fallbacks vertically, so that Fallback #1 handles mail for the first 6 hours and then hands it to Fallback #2, which handles it for the next 16 hours before handing it on, and so on. Using this method, queued mail is handled much more efficiently. Exim’s queue runners won’t be too busy with mail that has been in the queue for several days to handle the relatively new mail.
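
The age-based stacking can be sketched as a small lookup from message age to fallback tier. This is a Python illustration of the logic only, not Exim configuration. The 6 and 16 hour windows come from the example above, while the tier names and the existence of a final catch-all tier are assumptions.

```python
from datetime import timedelta

# (tier name, how long that tier holds a message). The windows follow the
# example in the text; the names are hypothetical.
TIERS = [
    ("fallback1", timedelta(hours=6)),
    ("fallback2", timedelta(hours=16)),
]
FINAL_TIER = "fallback3"  # assumed catch-all for anything older


def tier_for_age(age, tiers=TIERS, final=FINAL_TIER):
    """Return the fallback tier responsible for a message of the given age."""
    if age < timedelta(0):
        raise ValueError("message age cannot be negative")
    limit = timedelta(0)
    for name, window in tiers:
        limit += window  # windows are consecutive, so limits accumulate
        if age < limit:
            return name
    return final
```

A message that has been queued for one hour belongs to `fallback1`, one queued for six hours has just moved to `fallback2`, and anything 22 hours or older falls through to the final tier. Each tier's queue therefore only ever contains mail of a similar age.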
Tuning
Obviously, when handling large mail loads, performance is critical.
DNS
Mail delivery requires a LOT of DNS lookups. Lookups are done to check DNS blocklists (RBLs), and of course when accepting mail, lookups are done to validate the sending server and to work out where the mail should go. Because of this, a fast, local caching DNS server is strongly recommended. This server should have a large amount of memory, and all mail servers should have a fast connection to it.
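
To show why a cache helps when the same names are looked up over and over, here is a minimal Python sketch of a caching wrapper around a lookup function. It is not a DNS server and not something Exim provides. The class name `CachingResolver` is hypothetical, and the single fixed TTL is a simplifying assumption, since a real caching resolver honours the TTL of each record.

```python
import time


class CachingResolver:
    """Cache lookup results for a fixed time (illustrative sketch only)."""

    def __init__(self, lookup, ttl=300.0, clock=time.monotonic):
        self._lookup = lookup  # function: name -> answer (the slow path)
        self._ttl = ttl        # assumed fixed lifetime, in seconds
        self._clock = clock    # injectable so the cache can be tested
        self._cache = {}       # name -> (expiry time, answer)
        self.misses = 0        # how often the slow path was taken

    def resolve(self, name):
        now = self._clock()
        entry = self._cache.get(name)
        if entry is not None and entry[0] > now:
            return entry[1]    # still fresh: answered from memory
        self.misses += 1
        answer = self._lookup(name)
        self._cache[name] = (now + self._ttl, answer)
        return answer
```

If a thousand incoming messages all trigger a lookup for the same name within the TTL, only the first one reaches the slow path and the rest are answered from memory. That is the saving a local caching DNS server gives every mail server that points at it.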
Hints Database
Exim keeps a hints database so that information can be shared between multiple Exim processes. Because this database is accessed frequently, you get a performance gain by storing it on ramfs. Setting this up is outside the scope of this document.
SpamAssassin
By default, SpamAssassin keeps a database containing information about the messages it has seen. Unless you intend to train your filters, this is not required. Also, keep in mind that the more rules you have, the more memory and CPU power you need.