The mail system is an important component of Linux network applications. A complete mail system runs on the underlying operating system (Linux) and consists of three parts: the Mail Transport Agent (MTA), the Mail Delivery Agent (MDA), and the Mail User Agent (MUA).
Postfix is an excellent MTA, well known for its efficiency and security. It is among the best UNIX MTAs for anti-spam (anti-UCE) work, and many companies even build anti-spam gateways on the Postfix code. An MTA's anti-spam function filters SMTP sessions as the MTA processes them. This filtering not only blocks spam sent to the server itself, but also prevents the server from being abused to send spam. Postfix implements all the major MTA filtering techniques. Postfix is MTA software developed by Wietse Venema at IBM and released under the IBM Public License. Compared with Sendmail, Postfix is faster, easier to manage, more flexible, and more secure, while remaining sufficiently compatible with Sendmail. Table 1 compares Sendmail and Postfix.
Table 1. Comparison between Sendmail and Postfix
MTA | Maturity | Security | Features | Performance | Sendmail compatibility | Modular design
Postfix | Medium | Medium | Medium | Medium | Supported | Yes
Sendmail | High | Low | Medium | Low | - | No
Spam is also called UCE (Unsolicited Commercial Email) or UBE (Unsolicited Bulk Email). The Internet Society of China provides a formal definition of spam: any email that meets one of the following four criteria can be called spam.
1) Emails that the recipient has not requested or agreed to receive, such as advertisements, electronic publications, and other promotional materials.
2) Emails that the recipient cannot refuse.
3) Emails that hide the sender's identity, address, subject, or other information.
4) Emails containing false information about the source, sender, route, or similar details.
1) SMTP User Authentication
A common and effective method today is to require SMTP authentication at the Mail Transport Agent (MTA) for mail users connecting from the Internet, outside the local network, so that only authenticated users are allowed to relay mail to remote destinations. This not only prevents spammers from abusing the mail server as a relay, but also accommodates employees who are traveling or working from home. If SMTP authentication is not adopted, setting up an Internet-facing webmail gateway is a feasible alternative that does not sacrifice security. In addition, if the SMTP and POP3 services run on the same server, the server can require a successful POP3 login before the user attempts to send mail (POP before SMTP), which is safer; however, few email clients currently support this authentication method.
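The relay rule described above can be sketched as a small decision function. This is only an illustration of the policy, not Postfix's implementation; the `Session` type, the `may_accept` function, and the network and domain values are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Hypothetical local networks; a real server would read these from its configuration.
LOCAL_NETWORKS = [ip_network("192.168.0.0/16"), ip_network("127.0.0.0/8")]
# Hypothetical domains that this server accepts mail for.
LOCAL_DOMAINS = {"example.com"}


@dataclass
class Session:
    client_ip: str       # address of the connecting client
    authenticated: bool  # True if SMTP authentication succeeded


def may_accept(session: Session, recipient: str) -> bool:
    """Return True if mail for this recipient should be accepted in this session."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    if domain in LOCAL_DOMAINS:
        return True  # final delivery to a local domain is not relaying
    addr = ip_address(session.client_ip)
    if any(addr in net for net in LOCAL_NETWORKS):
        return True  # clients on the local network may relay
    # Remote clients may relay to other domains only after authenticating.
    return session.authenticated
```

With these assumed values, an unauthenticated client at 203.0.113.5 can deliver to user@example.com but cannot relay to another domain, while the same client can relay once it has authenticated.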
2) Reverse Name Resolution
Whatever form authentication takes, its purpose is to prevent the mail server from being used by spammers; it does nothing about spam sent to local users. The simplest and most effective way to address this is a reverse name lookup on the sender's IP address: a DNS query determines whether the sender's IP address matches the name it claims. For example, if the sender claims to be mx.hotmail.com and connects from 255.200.200.200, the connection is rejected if the address does not match the DNS record. This method effectively filters out spam from dynamic IP addresses. Senders using dynamic domain names can also be blocked, depending on the situation. However, the method is still ineffective against spam sent through an open relay. A further technique is to assume that legitimate users send mail only through a mail server whose valid Internet name lies in their own domain. For example, if the sender's mail address is someone@yahoo.com, the Internet name of the mail server used should end in yahoo.com. This restriction does not comply with the SMTP protocol, but it is effective in most cases. Note that reverse name resolution requires a large number of DNS queries.
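The two checks in this section can be sketched as follows, assuming the check means that the reverse (PTR) record for the client address must contain the claimed name and that the name must resolve back to the same address. The function names are hypothetical, and the resolver functions are passed in as parameters so the logic can be exercised without live DNS.

```python
import socket
from typing import Callable, List


def reverse_lookup(ip: str) -> List[str]:
    """Return the host names DNS reports for ip, or [] if the lookup fails."""
    try:
        name, aliases, _ = socket.gethostbyaddr(ip)
    except OSError:
        return []
    return [name, *aliases]


def forward_lookup(name: str) -> List[str]:
    """Return the IPv4 addresses DNS reports for name, or [] if the lookup fails."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:
        return []


def client_name_matches(
    ip: str,
    claimed_name: str,
    reverse: Callable[[str], List[str]] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """True if the claimed name appears in the reverse record and maps back to ip."""
    claimed = claimed_name.rstrip(".").lower()
    names = [n.rstrip(".").lower() for n in reverse(ip)]
    if claimed not in names:
        return False
    return ip in forward(claimed)


def host_in_sender_domain(host: str, sender: str) -> bool:
    """True if the sending host's name lies inside the sender address's domain."""
    domain = sender.rsplit("@", 1)[-1].lower()
    host = host.rstrip(".").lower()
    return host == domain or host.endswith("." + domain)
```

Each call to `client_name_matches` with the default resolvers costs up to two DNS queries per connection, which is why the section warns about DNS load.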
3) Real-time Blacklist Filtering
The preceding measures are still ineffective against spammers who use their own valid domain names. A more effective method is to use a blacklist service: a database of domain names or IP addresses built from user complaints and collected samples. The best-known databases are RBL, DCC, and Razor. They store the host names or IP addresses that frequently send spam, and the MTA queries them in real time to decide whether to reject a message. However, it is difficult to guarantee the accuracy and timeliness of blacklist databases. For example, RBL and DCC in North America contain a large number of host names and IP addresses in China, some listed because of early open relays and some because of false positives. These errors have not been corrected promptly, which has to some extent hindered email exchange between China and North America and discouraged Chinese users from using these blacklist services. See Figure 1.
Figure 1. The process of filtering spam mail by using a real-time blacklist
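Many IP-based blacklists are queried over DNS: the client's IPv4 octets are reversed, the blacklist zone is appended, and an answer in 127.0.0.0/8 means the address is listed. The sketch below shows that convention only; the zone name `dnsbl.example.org` and the function names are hypothetical, and the resolver is passed in so the logic can be checked without network access.

```python
import socket
from ipaddress import IPv4Address
from typing import Callable, List


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    octets = str(IPv4Address(ip)).split(".")  # raises ValueError on a malformed address
    return ".".join(reversed(octets)) + "." + zone


def resolve_a(name: str) -> List[str]:
    """Return the IPv4 addresses for name, or [] if it does not resolve."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:
        return []


def is_listed(ip: str, zone: str, resolve: Callable[[str], List[str]] = resolve_a) -> bool:
    """True if the blacklist zone returns a 127.x.x.x answer for the address."""
    answers = resolve(dnsbl_query_name(ip, zone))
    return any(a.startswith("127.") for a in answers)
```

For example, `dnsbl_query_name("192.0.2.99", "dnsbl.example.org")` builds the name `99.2.0.192.dnsbl.example.org`. An MTA would typically reject or tag the message when `is_listed` returns True.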
4) Content Filtering
Even with the techniques in the previous sections, a considerable amount of spam still gets through. At that point, the most effective method is to filter on content, based on the mail subject or body. A simple approach uses a content scanning engine to match information such as the subject lines common in spam and the names, phone numbers, and web addresses of the parties the spam promotes. A more complex but more intelligent approach is content filtering based on Bayesian probability theory. This approach was popularized by Paul Graham (http://www.paulgraham.com/spam.html), who implemented it in Arc, a language of his own design. Its theoretical basis is to analyze the keywords common to a large body of spam, derive a statistical model of their distribution, and then calculate the probability that a target email is spam. The method is adaptive and self-learning to a degree, and it has been widely used. SpamAssassin is the best-known spam content filter. It is implemented in Perl, combines both of the filtering methods above, and can be integrated with the mainstream MTAs. Content filtering consumes the most computing resources of all the methods described here. When mail traffic is heavy, it must run on high-performance servers.
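The statistical idea can be illustrated with a plain naive Bayes classifier using add-one smoothing. This is a minimal sketch of the general technique, not Paul Graham's exact formula or SpamAssassin's implementation; the `BayesFilter` class, the tokenizer, and the training messages are all hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class BayesFilter:
    def __init__(self) -> None:
        self.counts = {"spam": Counter(), "ham": Counter()}  # token counts per class
        self.messages = {"spam": 0, "ham": 0}                # message counts per class

    def train(self, text: str, label: str) -> None:
        """Learn from one message labeled 'spam' or 'ham'."""
        self.counts[label].update(tokenize(text))
        self.messages[label] += 1

    def spam_probability(self, text: str) -> float:
        """Estimate P(spam | text) under the naive independence assumption."""
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total_msgs = sum(self.messages.values())
        tokens = tokenize(text)
        log_score = {}
        for label in ("spam", "ham"):
            total_tokens = sum(self.counts[label].values())
            # Smoothed log prior, so an untrained class never yields log(0).
            score = math.log((self.messages[label] + 1) / (total_msgs + 2))
            for tok in tokens:
                # Add-one smoothing, with one extra slot for unseen tokens.
                score += math.log(
                    (self.counts[label][tok] + 1) / (total_tokens + len(vocab) + 1)
                )
            log_score[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow in exp.
        diff = min(log_score["ham"] - log_score["spam"], 700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

A filter trained on a few spam and legitimate messages then scores new mail, and the MTA or a downstream filter compares the score against a threshold. The self-learning behavior mentioned above corresponds to calling `train` again whenever a user corrects a misclassified message.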