At work I ran into a scenario where hundreds of machines upload files to a cluster of about 10 machines, and I thought of using LVS to load-balance the FTP servers.
In NAT mode, however, both request and response packets pass through the director server, so the director becomes the bottleneck under high load. In DR mode and Tun mode the response packets bypass the director, but this workload is mostly file uploads with large data volumes, so the bottleneck is the heavy traffic carried by the request packets, which still pass through the director. How can that heavy request traffic be spread across the nodes of the cluster?
The idea is to use the FTP passive-mode feature that redirects the data connection to a different IP address; in vsftpd the corresponding parameter is pasv_address. Set it to the IP address of the real server in the LVS cluster (that FTP server's own address). After the FTP client establishes the control connection through port 21 on the LVS director server, it then opens the data connection to the real server's IP address.
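As a rough illustration of the passive-mode settings described above, the following sketch prints the vsftpd.conf lines for one real server. The function name emit_pasv_conf, the address 192.168.1.11 and the port range 30000-30100 are placeholders chosen for this example, not values from the original setup.

```shell
#!/bin/sh
# Print the passive-mode settings for one real server.
# Arguments (all placeholder values in this sketch):
#   $1 = the real server's own IP address
#   $2 = lowest passive data port, $3 = highest passive data port
emit_pasv_conf() {
    rip="$1"; min_port="$2"; max_port="$3"
    cat <<EOF
pasv_enable=YES
pasv_address=${rip}
pasv_min_port=${min_port}
pasv_max_port=${max_port}
EOF
}

# Example: settings for a hypothetical real server at 192.168.1.11.
emit_pasv_conf "192.168.1.11" 30000 30100
```

Each real server would get its own pasv_address, so the output has to be generated per node and appended to that node's vsftpd.conf.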
Because the data connection bypasses the LVS director server and goes straight to the FTP server on each cluster node, both uploads and downloads are load-balanced.
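For the control connection, the director only needs a virtual service on port 21. The sketch below prints ipvsadm commands for such a service. It assumes DR mode (the -g flag) and round-robin scheduling (-s rr), neither of which the original text specifies, and the function name build_ipvs_rules and all addresses are placeholders.

```shell
#!/bin/sh
# Print ipvsadm commands that balance only the FTP control port (21).
# Assumptions in this sketch: DR mode (-g) and round-robin scheduling (-s rr).
#   $1 = virtual IP of the director, remaining arguments = real server IPs
build_ipvs_rules() {
    vip="$1"; shift
    # Define the virtual service on the VIP, port 21.
    echo "ipvsadm -A -t ${vip}:21 -s rr"
    # Add each real server to the virtual service.
    for rip in "$@"; do
        echo "ipvsadm -a -t ${vip}:21 -r ${rip}:21 -g"
    done
}

# Example with a hypothetical VIP and two hypothetical real servers.
build_ipvs_rules "192.168.1.100" "192.168.1.11" "192.168.1.12"
```

The function only prints the commands so they can be reviewed first; piping the output to sh as root on the director would apply them.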
Note: vsftpd checks whether the destination IP address of the data connection matches that of the control connection. When they differ, it returns an RST packet to the FTP client and terminates the data connection. To get around this check, use iptables on the real server to rewrite the destination IP of incoming data-connection packets to the director server's address (the VIP):
iptables -t nat -A PREROUTING -p tcp --dport ${pasv_min_port}:${pasv_max_port} -j DNAT --to-destination ${VIP}
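The rule depends on three variables that must match the vsftpd passive port range and the director's address. A small sketch of how it might be assembled follows; the function name build_dnat_rule, the VIP 192.168.1.100 and the port range 30000-30100 are placeholders for illustration only.

```shell
#!/bin/sh
# Print the DNAT rule that rewrites the destination of passive data
# connections to the VIP. All values used here are placeholders.
#   $1 = VIP, $2 = lowest passive port, $3 = highest passive port
build_dnat_rule() {
    vip="$1"; min_port="$2"; max_port="$3"
    echo "iptables -t nat -A PREROUTING -p tcp --dport ${min_port}:${max_port} -j DNAT --to-destination ${vip}"
}

# Example: print the rule for a hypothetical VIP and port range.
build_dnat_rule "192.168.1.100" 30000 30100
```

Printing the rule instead of running it makes it easy to check before applying it on each real server as root, for example by piping the output to sh.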
Using LVS to load-balance FTP upload traffic