In fact, many small companies, especially those running e-commerce and online advertising websites, also need a highly available, load-balanced Linux cluster. Because of cost constraints, however, the boss will ask the system architect to design a solution that meets this requirement for as little money as possible. As system architects, how should we go about it?
The first decision is the data center. If the company has its own data center, that is the best option. If it does not, I suggest hosting the servers in a BGP data center, and if you have a choice, pick one with a hardware firewall for security. Next, how do we choose the servers? Since we will be running a load-balanced, highly available cluster, we can assemble the servers ourselves, which is the most cost-effective option. Branded servers from IBM and DELL offer guaranteed quality, but their price is often more than the boss will accept. Of course, stability comes first in every case.
Next, choose the load balancer. There are two options. One is hardware: expensive commercial load balancers such as NetScaler, F5, Radware, and Array. Their advantage is that a professional team maintains them; their disadvantage is the high cost, so small network services do not need them for the time being. The other is free, open-source, Linux-based load balancing software such as LVS, HAProxy, and Nginx. These are implemented at the software level, so the cost is very low, and small companies usually choose them first for that reason.
As for the high-availability load balancing architecture, I will introduce Nginx/HAProxy + Keepalived first. Many friends may wonder why I do not choose a cluster based on LVS + Keepalived. The reason is that the websites we deploy generally need static/dynamic separation and regular-expression-based distribution of requests. If we started with LVS + Keepalived, we would have to add at least one more load balancing layer behind it, which takes more machines and inevitably raises the cost of the whole website.
Many friends also worry that Nginx/HAProxy + Keepalived is less stable than LVS + Keepalived. This is a misunderstanding. After implementing a dozen projects and observing them for several years, we have found that these software load balancers are very stable and rarely go down under high concurrency. A commercial website we implemented recently uses HAProxy + Keepalived, and with daily traffic in the hundreds of millions HAProxy remains rock-solid. LVS does have the best raw performance, especially when there are more than 10 backend nodes (such as Web or MySQL database servers). In small companies, however, concurrency and traffic are generally modest, perhaps around 1 million per day, so I recommend Nginx/HAProxy + Keepalived here as well.
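To make the static/dynamic separation point concrete, here is a minimal Python sketch of the kind of decision a layer-7 balancer such as Nginx or HAProxy makes on each request, and which LVS, working at layer 4, cannot make on its own. It is only an illustration of the idea, not how either program is implemented. The pool addresses and the list of file extensions are assumptions of mine, not values from any real deployment.

```python
import re

# Hypothetical backend pools; these addresses are placeholders for illustration.
STATIC_POOL = ["192.168.1.21:80"]
DYNAMIC_POOL = ["192.168.1.22:80"]

# Assumed rule: a request is "static" if its path ends in one of these extensions.
STATIC_PATTERN = re.compile(r"\.(?:gif|jpe?g|png|css|js|ico)$", re.IGNORECASE)


def choose_pool(path):
    """Return the backend pool for a request path, ignoring any query string."""
    path_only = path.split("?", 1)[0]
    if STATIC_PATTERN.search(path_only):
        return STATIC_POOL
    return DYNAMIC_POOL
```

With these assumed rules, choose_pool("/css/site.css?v=2") returns the static pool and choose_pool("/cart.php?id=3") returns the dynamic pool. In production the same rule would be written as a location block in Nginx or an ACL in HAProxy rather than in application code.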
If the website is hosted in an IDC that has no hardware firewall in front of it, we should monitor traffic as carefully as we can. I usually install MRTG and nload on the primary Nginx/HAProxy machine. nload monitors traffic in real time and is very simple to install. First install RPMforge, a software repository for CentOS with more than 4,000 packages, which the CentOS community considers the safest and most stable repository; it is available at http://pkgs.repoforge.org/rmpforge-release. Once it is installed, nload can be installed with the command yum -y install nload. Running nload then displays the traffic status in real time. The upper half (Incoming) shows the traffic entering the NIC and the lower half (Outgoing) shows the traffic leaving it; each half lists the current (Curr), average (Avg), minimum (Min), maximum (Max), and total (Ttl) traffic. Because it is so intuitive, I use it in place of iptraf, the real-time traffic monitor I used before. The nload working interface is as follows:
Friends who are interested in clusters often ask me: if a website wants a load-balanced, highly available Linux cluster and the company wants to build it as cheaply as possible, how many servers are needed? My answer is four, the 2 + 2 architecture. The first pair is two Nginx/HAProxy + Keepalived machines. The second pair is two Web machines with higher specifications, and the MySQL database is deployed on these two Web machines in a master-slave arrangement. Nagios monitoring runs on the Nginx/HAProxy machines, and traffic monitoring, using MRTG + nload, usually runs on the master Nginx/HAProxy. For data synchronization between servers I use rsync + inotify, although more often I use plain rsync, which avoids frequent disk reads when large files on the website change. If your company has higher requirements for the file server (for example, regarding file types), consider adding two more servers and running DRBD + Heartbeat + NFS. If there are massive numbers of files to store, consider MFS, although that approach also consumes machines.
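As a rough illustration of the plain rsync approach, the following Python sketch builds and runs a one-way sync from one Web machine to the other. The function names are hypothetical, and the paths and host name in the usage note are my own assumptions. The flags used are standard rsync options: -a (archive mode), -z (compress during transfer), and --delete (remove files on the destination that no longer exist on the source).

```python
import subprocess


def build_rsync_command(source_dir, destination, delete=False):
    """Build an rsync argument list that mirrors source_dir to destination."""
    command = ["rsync", "-az"]
    if delete:
        command.append("--delete")
    # A trailing slash makes rsync copy the directory's contents,
    # not the directory itself.
    command.append(source_dir.rstrip("/") + "/")
    command.append(destination)
    return command


def sync(source_dir, destination, delete=False):
    """Run rsync once; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_command(source_dir, destination, delete), check=True)
```

For example, sync("/var/www/html", "web2:/var/www/html/") would push the document root to a second machine reachable as web2 (a placeholder host name). Run from cron, this gives the pure rsync method; triggering it from inotify events instead gives the rsync + inotify variant.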
With a small-company cluster architecture like the one above, how do we solve the session synchronization problem? We can use Nginx's ip_hash or HAProxy's balance source mechanism. Their principles are similar: a given client is sent to the same backend Web server for a long period, so the session is preserved. When we log in on a page of the website, we do not bounce between the two Web servers, so the website does not tell us, right after we have logged in, that we are not logged in and must log in again. For large projects or websites, you can use memcached instead.
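The principle behind both mechanisms can be sketched in a few lines of Python. This is a simplified illustration of source-address hashing in general, not the actual algorithm used by Nginx ip_hash or HAProxy balance source; the backend addresses are placeholders I made up.

```python
import hashlib

# Hypothetical backend Web servers; placeholders for illustration only.
BACKENDS = ["192.168.1.21:80", "192.168.1.22:80"]


def pick_backend(client_ip, backends=BACKENDS):
    """Map a client IP to one backend so repeat requests land on the same server."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    # Use the first 4 bytes of the hash as an integer, then reduce it
    # to an index into the backend list.
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]
```

Because the result depends only on the client address and the backend list, the same client always reaches the same Web server, and its session stays valid there. The trade-off is that changing the backend list remaps some clients, which is one reason larger sites move sessions into a shared store.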
In addition, small companies have at least two choices of Web server: Apache and Nginx. In environments with low traffic and low concurrency, we can choose Apache. Its ability to handle concurrency is not high, but its stability is the best, and many of my e-commerce websites run on Apache. In high-traffic, high-concurrency environments, I prefer Nginx.
MySQL is deployed with one master and one slave. Many friends think this design is too simple, but it has turned out to be the most stable. My e-commerce websites use this architecture as well, and in the past few years I have never lost an order because of a database fault. In the early stage after launch, we can have the PHP application use the slave machine as the entry point for back-office query functions, which greatly reduces the load on the master database. The slave is therefore more than a backup machine: the PHP application sends the complex back-office queries to it. Of course, monitoring the master-slave replication status of MySQL is also very important. I usually monitor it in two ways, with Nagios and with shell scripts.
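As an illustration of what such a replication check looks at, here is a minimal Python sketch that parses the vertical-format output of SHOW SLAVE STATUS and reports problems. It is not my production script, and the function names and the 60-second lag threshold are assumptions. The field names Slave_IO_Running, Slave_SQL_Running, and Seconds_Behind_Master are the ones MySQL prints for that statement.

```python
def parse_slave_status(text):
    """Parse vertical-format SHOW SLAVE STATUS output into a dict."""
    status = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip the row separator and blank lines
        key, value = line.split(":", 1)
        status[key.strip()] = value.strip()
    return status


def replication_problems(status, max_lag_seconds=60):
    """Return a list of problem descriptions; an empty list means healthy."""
    problems = []
    if status.get("Slave_IO_Running") != "Yes":
        problems.append("IO thread not running")
    if status.get("Slave_SQL_Running") != "Yes":
        problems.append("SQL thread not running")
    lag = status.get("Seconds_Behind_Master", "NULL")
    if not lag.isdigit():
        # MySQL reports NULL when the lag cannot be determined.
        problems.append("replication lag unknown")
    elif int(lag) > max_lag_seconds:
        problems.append("replication lag %s s exceeds %d s" % (lag, max_lag_seconds))
    return problems
```

A Nagios plugin built on this would exit with status 0 (OK) when the list is empty and 2 (CRITICAL) otherwise, since Nagios reads the plugin's exit code to decide the service state.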
Helping the company cut costs and save money is also part of a system administrator's or architect's job. I hope you will keep this in mind in your work.
This article is from the blog "Fuqin liquor cooking".