Building a Large-Capacity Web-Based Email System
Wangbo
In recent years, free web-based email systems have become very popular. A handful of well-known free email sites are now the choice of most users, so launching a site that offers nothing but free email is no longer as attractive as it once was. Instead, web-based email has become one of the basic services that commercial sites provide to their registered members.
An email system consists of a server side and a client side. In a web-based email system, the email client runs on the web server, so the system has to implement a web interface that acts as the mail client. And because such a system must serve a very large number of users, it also places specific requirements on the email server.
Operating system and user databases
A large-capacity email system places heavy demands on both the operating system and the database, so choosing them well is the most basic problem.
Web and email services demand high stability and performance, so UNIX is generally used as the server operating system: Hotmail, for example, uses FreeBSD and Solaris, and the Chinese site 163 also uses the BSD family. However, the standard UNIX email system is not suited to a service of this size. On some UNIX systems, such as the current version of Linux, the user ID is only 16 bits wide, which caps the number of users at 64K. Even where the system supports 32-bit user IDs, a single server should not, for performance reasons, serve more than about 100,000 users.
To scale to more users, several servers usually provide the service at the same time. Standard UNIX accounts could still serve as email users, but for reasons of security, performance, and manageability, email users are generally kept separate from UNIX system users.
User data is usually kept in a database that supports network access. Common choices are LDAP, a standard relational database, or a user database implemented by the email system itself. LDAP is the standard for directory services and should therefore be the best choice; its common open source implementation is OpenLDAP. A standard database is easy to work with and extensible, and the one most commonly used on the Internet is MySQL. Other implementations are also possible.
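As a rough illustration of keeping email users outside the UNIX account system, the following Python sketch stores them in a database table. It is only a sketch under stated assumptions: SQLite from the standard library stands in for a networked database such as MySQL, and the table name mail_users, its columns, and the function names are hypothetical, not taken from any particular mail server.

```python
import hashlib
import hmac
import os
import sqlite3


def open_user_db(path=":memory:"):
    """Open the user database and create the (hypothetical) mail_users table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mail_users ("
        " username TEXT PRIMARY KEY,"
        " salt BLOB NOT NULL,"
        " pw_hash BLOB NOT NULL,"
        " maildir TEXT NOT NULL)"
    )
    return conn


def _hash_password(password, salt):
    # Salted PBKDF2 hash, so plain-text passwords are never stored.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def add_user(conn, username, password, maildir):
    """Register an email user that has no UNIX account or UID."""
    salt = os.urandom(16)
    conn.execute(
        "INSERT INTO mail_users (username, salt, pw_hash, maildir)"
        " VALUES (?, ?, ?, ?)",
        (username, salt, _hash_password(password, salt), maildir),
    )
    conn.commit()


def authenticate(conn, username, password):
    """Return the user's mail directory on success, or None on failure."""
    row = conn.execute(
        "SELECT salt, pw_hash, maildir FROM mail_users WHERE username = ?",
        (username,),
    ).fetchone()
    if row is None:
        return None
    salt, pw_hash, maildir = row
    if hmac.compare_digest(pw_hash, _hash_password(password, salt)):
        return maildir
    return None
```

Because the user name is an ordinary table key rather than a system UID, the 16-bit user ID limit does not apply, and every mail server can query the same database over the network.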
Storing messages
For a large-capacity email system, the most critical technical problem is mail storage. How efficiently messages are stored will determine whether the system succeeds.
With so many users, how their mail is saved becomes a very important issue. Traditional UNIX keeps all users' mail in a single directory, which greatly degrades file system performance when the number of users is large. The cost of opening files can be reduced only by using a multi-level directory tree that limits the number of files in each directory, or by no longer storing messages as plain files and using some encapsulated format instead. Storing messages entirely in a database is a poor fit, however: mail operations are mostly file operations and message sizes vary widely, so this approach wastes a great deal of performance and storage space.
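One possible way to build such a multi-level directory tree is to derive the intermediate directory names from a hash of the user name. The sketch below is an illustration only: the function name mailbox_path, the two-level layout, and the choice of SHA-256 are assumptions, not something the systems mentioned here are known to use.

```python
import hashlib
import os


def mailbox_path(root, username, levels=2):
    """Map a user name to a mail directory under a hashed directory tree.

    Each level uses two hex digits of the hash, so every intermediate
    directory has at most 256 entries.
    """
    digest = hashlib.sha256(username.encode("utf-8")).hexdigest()
    parts = [digest[2 * i:2 * i + 2] for i in range(levels)]
    return os.path.join(root, *parts, username)
```

With two levels there are 256 x 256 = 65,536 leaf directories, so one million users would average roughly 15 mail directories per leaf instead of one million entries in a single directory.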
Because there are many users and their mail must be accessible from several servers at once, messages must be kept on a server or server cluster with large storage capacity, and the storage must be shared over Fibre Channel or the Network File System (NFS) so that every server sees the same storage path for each user's mail. Fibre Channel is a very expensive solution, so NFS is more common, either on a dedicated NFS server such as a NetApp appliance or on a PC UNIX server with RAID.
When sharing storage over NFS, one issue deserves particular attention: NFS lacks a file locking mechanism. The traditional mailbox storage format keeps all of a user's messages in a single file, so every mail operation must lock that file to prevent conflicting access, which makes the format unsuitable for NFS. To solve this problem, qmail introduced the Maildir format, in which each message is saved as a separate file in the user's personal mail directory, so no locking is needed. For this reason, the common free mail services generally store user mail in Maildir format.
If you do not intend to use a shared file system and instead want each server to access only the user mail stored on its own disks, the email server and client must be customized so that they can find a user's home server from the user name and hand the request to that server. The drawbacks of this approach are that it requires extensive changes, that it makes the system structure complex, and that partitioning servers by user works against load sharing. The advantage is that no server accesses another's storage over the network, so any message storage format can be used, including the powerful Cyrus system, to store messages and provide service.
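The lock-free delivery that Maildir allows can be sketched in Python as follows. A Maildir holds three subdirectories, tmp, new, and cur; a message is written under a unique name in tmp and then moved into new, so a reader never sees a partially written file. This is a simplified sketch, not qmail's actual code: the function names are hypothetical, and the unique-name scheme (time, process ID, counter, host name) is a reduced version of what real delivery agents use.

```python
import os
import socket
import time

_delivery_counter = 0  # distinguishes deliveries made by the same process


def ensure_maildir(maildir):
    """Create the tmp, new and cur subdirectories if they are missing."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)


def deliver(maildir, message_bytes):
    """Store one message as its own file, without taking any lock."""
    global _delivery_counter
    ensure_maildir(maildir)
    _delivery_counter += 1
    # Simplification: characters that are unsafe in file names are replaced.
    host = socket.gethostname().replace("/", "_").replace(":", "_")
    unique = "%d.P%dQ%d.%s" % (time.time(), os.getpid(), _delivery_counter, host)
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)
    # Mode "xb" fails if the name already exists instead of overwriting it.
    with open(tmp_path, "xb") as f:
        f.write(message_bytes)
        f.flush()
        os.fsync(f.fileno())
    # The message becomes visible to readers only once it is complete.
    os.rename(tmp_path, new_path)
    return new_path
```

Because each delivery creates a file that no other process writes to, several servers can deliver to the same mail directory at once without coordinating through a lock.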
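Finding a user's home server from the user name could be done in several ways. The sketch below shows one simple possibility, hashing the name onto a fixed server list; the host names and the function home_server are hypothetical, and a real deployment might instead record each user's server in the user database.

```python
import hashlib

# Hypothetical server list, for illustration only.
SERVERS = ["mail1.example.com", "mail2.example.com", "mail3.example.com"]


def home_server(username, servers=SERVERS):
    """Pick the server that holds this user's mail from the user name alone."""
    # Lower-case the name so "Alice" and "alice" map to the same server.
    digest = hashlib.sha256(username.lower().encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

A plain modulo mapping like this reassigns many users whenever the server list changes, which is one reason a per-user lookup table may be preferable in practice. It also shows the load-sharing weakness described above: a busy user always lands on the same server.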
Mail Server Software
The choice of email server software ultimately affects the performance of the system. Building your own email server is likely to cost more than it gains, and there are currently two ready-made options: Sendmail and qmail.
Standard email software such as Sendmail does provide some ways to support users who are not UNIX system users, aliases among them, but these are not enough to implement an email system of this kind. Supporting such users requires email server software of your own. Since the existing packages are quite mature and are open source, the usual practice is to modify one of them, such as Sendmail or qmail, so that it supports these email users. Rewriting email server software from scratch is not advisable in terms of maturity and stability.