OpenVPN multi-processing: netns containers and the iptables CLUSTERIP target
If you are still savoring the earlier results and the sighs they provoked, pause to reflect on and distill them before moving on.
ip netns add vpn1
ip netns add vpn2
ip link add veth0_vpn1 type veth peer name veth_vpn1
ip link add veth0_vpn2 type veth peer name veth_vpn2
Then veth0_vpn1 is assigned to vpn1, and veth0_vpn2 to vpn2:
ip link set veth0_vpn1 netns vpn1
ip link set veth0_vpn2 netns vpn2
Connect veth_vpn1, veth_vpn2, and eth0 together:
brctl addbr br0
brctl addif br0 veth_vpn1
brctl addif br0 veth_vpn2
brctl addif br0 eth0
Now, run OpenVPN in the two namespaces:
ip netns exec vpn1 ifconfig veth0_vpn1 192.168.1.1/24
ip netns exec vpn2 ifconfig veth0_vpn2 192.168.1.1/24
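The steps above can be collected into a single script. This is only a sketch: the OpenVPN config path, the interface bring-up details, and the --daemon flag are illustrative assumptions, not taken from the original article.

```shell
#!/bin/sh
# Sketch of the full setup (requires root; names and paths are illustrative).
set -e

for i in 1 2; do
    ip netns add vpn$i
    ip link add veth0_vpn$i type veth peer name veth_vpn$i
    ip link set veth0_vpn$i netns vpn$i
    # The same address in every namespace -- isolation makes this legal.
    ip netns exec vpn$i ip addr add 192.168.1.1/24 dev veth0_vpn$i
    ip netns exec vpn$i ip link set veth0_vpn$i up
    ip link set veth_vpn$i up
done

# Bridge the veth peers together with eth0.
brctl addbr br0
brctl addif br0 veth_vpn1
brctl addif br0 veth_vpn2
brctl addif br0 eth0
ip link set br0 up

# One OpenVPN instance per namespace, same config file for both
# (the path is an assumption).
ip netns exec vpn1 openvpn --config /etc/openvpn/server.conf --daemon
ip netns exec vpn2 openvpn --config /etc/openvpn/server.conf --daemon
```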
The two OpenVPN processes read the same configuration file, but their networks are isolated. Note that the veth NIC addresses in the two namespaces are exactly the same; in fact, as far as network configuration goes, vpn1 and vpn2 are identical. Since the veth peers of both namespaces have been bridged with eth0, the next question is how to distribute incoming packets between the two namespaces. This is where the CLUSTERIP target of iptables helps. The current system is sketched in the diagram below:
3. The problem of building a distributed cluster is how to distribute packets between the two namespaces (more than two in a real deployment, depending on the number of CPUs). Do we need to build something like LVS at the bridge level? The idea is sound, but I will not do that, because then it would be better to run multiple services directly behind LVS and skip namespaces altogether. The real reason for using namespaces is that iptables provides a distributed cluster load-balancing model: it converts the centralized decision "which node handles this packet?" into the distributed decision "is this packet handled by me?". In other words, the computation is distributed and no longer happens at a single point. For the underlying idea, see my other article and the addressing philosophy of early full-broadcast Ethernet.
iptables -A INPUT -p udp --dport 1194 -j CLUSTERIP --new --hashmode sourceip --clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 1
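In this setup the rule has to be installed inside each namespace, with a different --local-node value; a sketch, reusing the multicast MAC and port from the rule above (the /proc runtime-control path is a CLUSTERIP feature, but the interface address used in it is illustrative):

```shell
# Each namespace claims a different node number of the same 2-node cluster
# (requires root; run after the namespaces and interfaces exist).
ip netns exec vpn1 iptables -A INPUT -p udp --dport 1194 \
    -j CLUSTERIP --new --hashmode sourceip \
    --clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 1
ip netns exec vpn2 iptables -A INPUT -p udp --dport 1194 \
    -j CLUSTERIP --new --hashmode sourceip \
    --clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 2

# CLUSTERIP exposes its node assignment under /proc, so a surviving node
# can take over a failed node's share at runtime, e.g.:
# echo "+2" > /proc/net/ipt_CLUSTERIP/192.168.1.1
```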
It is not important exactly how the CLUSTERIP target maps a hash value to a node number. What is important is that it maps the hash value of a flow to exactly one of the nodes 1..total-nodes, and always the same one. This fixed mapping ensures that a given data stream is always processed by the same namespace. The figure below shows:
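The distributed "is it mine?" decision can be modelled with a toy hash. Here cksum merely stands in for the kernel's hash, so the numbers will not match CLUSTERIP's; only the determinism of the mapping matters.

```shell
# Toy model of "is this packet handled by me?": every node independently
# runs the same hash over the source IP and compares the result with its
# own node number. cksum is a stand-in for the kernel's hash function.
node_for() {
    hash=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $(( hash % $2 + 1 ))
}

# Node 1 asks: does flow 10.0.0.7 map to me (in a 2-node cluster)?
mine=$(node_for 10.0.0.7 2)
if [ "$mine" -eq 1 ]; then
    echo "node 1 handles 10.0.0.7"
else
    echo "node 1 ignores 10.0.0.7"
fi
```

The same input always yields the same node, which is exactly the flow-affinity property the paragraph above relies on.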
Breadth of knowledge can come from forums, blogs, and the like, but depth of knowledge is largely self-developed, a matter of insight, and that insight is the real ability. A simple example is the impact of a large number of TIME_WAIT sockets on a TCP server. People have proposed many remedies, such as enabling recycle and reuse, and yes, those can relieve the symptom. But does anyone dig into what the problem caused by TIME_WAIT actually is? With a huge number of TW sockets, the socket count can exceed the ESTABLISHED count by several orders of magnitude. The usual answer is that system resources, namely socket resources, get used up. But in today's era of server hardware one-upmanship, hundreds of GB of memory are commonplace, so space is hardly the problem. Does anyone shift their thinking from space overhead to time overhead? A few do, but only a few. Why? Perhaps because people assume that upgrading memory is always more cost-effective than upgrading the CPU. Yet almost everyone knows that how a data structure is organized affects not only memory usage but also runtime efficiency.

You only need to think about how the TCP socket lookup is implemented to see the following fact: matching a packet to a socket requires a lookup. For TCP, the stack first checks whether the packet matches an existing socket; only if none is found does it check for a matching listen socket (what else could it do?). That is, the listen-socket lookup is done last. A new connection is first searched for among the ESTABLISHED-state sockets and not found there. If there are not many ESTABLISHED sockets, this overhead is negligible; even when there are many, the lookup is unavoidable routine cost. But the packet must also be checked against the TIME_WAIT sockets, and when there are a great many of them, that check is pure extra overhead. (The premise, of course, is that we still want the protection TIME_WAIT provides: understand every aspect of a mechanism before disabling it.) So a large number of TW sockets does not just consume space; it also reduces the efficiency of setting up new connections, with a large amount of time spent on table lookups. The effect on the data-transfer efficiency of already-established connections is insignificant, because an ESTABLISHED socket is found before the TIME_WAIT sockets are ever consulted.

If you understand how the TCP layer is implemented in Linux, the above is easy to analyze. If you have never read the source, you have to reason it out yourself. It is true that being familiar with interfaces without attending to details improves coding efficiency, but it is no myth that familiarity with implementation details improves troubleshooting efficiency once a problem occurs. Depth and breadth of knowledge are both indispensable; the question is which stage you are at.
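As a quick way to observe the symptom on a Linux box (not part of the original article; assumes the iproute2 ss tool is installed):

```shell
# Count sockets per TCP state; a TIME-WAIT count dwarfing ESTAB is the
# situation discussed above.
ss -tan | awk 'NR > 1 { count[$1]++ } END { for (s in count) print s, count[s] }'

# The commonly proposed mitigation (requires root):
#   sysctl -w net.ipv4.tcp_tw_reuse=1
# tcp_tw_recycle existed when this sort of advice circulated, but it was
# removed in Linux 4.12, so it is mentioned only as history.
```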