Linux NIC Bonding

Source: Internet
Author: User

2013-08-20 15:39:31

Many servers now ship with dual gigabit network ports. NIC bonding can increase network bandwidth and provide redundancy, and it is used in many scenarios. On Linux, the bonding driver provides a way to aggregate multiple network interface devices into a single logical interface for network load balancing and network redundancy. (On Windows, this function is not built into the operating system, so NIC vendors ship their own NIC-teaming software for NIC management.)

Our company builds a distributed file system, and many of our projects use NIC bonding to improve performance. I have gathered a lot of material online and run many tests; below I share my view of NIC bonding.

Bonding Applications

Network Load Balancing

For network load balancing, bonding is often used on file servers: for example, three NICs are bound as one to relieve a server that has a single IP address but heavy traffic and high network pressure. On an intranet, most file servers use a single IP address for ease of management and application. On a 100 Mbps local network, when many users access the file server at once the network pressure is high. The goal is to keep the same IP address while breaking through the traffic limit; after all, each cable and NIC has a ceiling on data throughput. With limited resources, bonding is the best way to achieve network load balancing.

Network Redundancy

For servers, the stability of network devices, especially NICs, also matters. Hardware redundancy is widely used to improve server reliability and safety, redundant power supplies being a common example. Bonding also supports NIC redundancy: several NICs are bound to one IP address, and when one NIC is physically damaged, the other can continue to provide normal service.

Bonding Principle

To explain bonding we should start with the NIC's promiscuous (promisc) mode. Under normal circumstances a network adapter only receives Ethernet frames whose destination hardware address (MAC address) is its own, and filters out all other frames to reduce the load on the driver. But a NIC also supports promiscuous mode, in which it receives every frame on the wire; tcpdump, for example, runs in this mode. Bonding also runs in this mode, and it modifies the MAC addresses in the drivers so that the bonded NICs share the same MAC address and can receive frames destined for that specific MAC. The frames are then handed to the bond driver for processing.
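As a concrete illustration (assuming a bond named bond0 with slaves eth0 and eth1 already exists; the interface names are examples), you can observe that the slaves share one MAC address:

```shell
# Both slaves report the same MAC address once enslaved (example names)
ip link show eth0 | grep link/ether
ip link show eth1 | grep link/ether

# The bonding driver also exposes its state, including each slave's
# permanent hardware address, via procfs
cat /proc/net/bonding/bond0
```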

Bonding Modes

Linux has seven NIC bonding modes:

Mode 0: balance-rr (Round-Robin Policy). Transmits packets sequentially from the first slave to the last. Provides load balancing and fault tolerance.

Mode 1: active-backup (Active-Backup Policy). Only one slave is active at a time; when it goes down, another slave immediately changes from backup to active. The bond's MAC address is externally visible on only one port. Provides fault tolerance.

Mode 2: balance-xor (XOR Policy). Selects the transmit device from an exclusive-OR of the source and destination MAC addresses. Provides load balancing and fault tolerance.

Mode 3: broadcast (Broadcast Policy). Transmits every packet on all slave interfaces. Provides fault tolerance.

Mode 4: 802.3ad (IEEE 802.3ad Dynamic Link Aggregation). Creates aggregation groups that share the same speed and duplex settings. The switch must also support 802.3ad mode. Provides fault tolerance.

Mode 5: balance-tlb (Adaptive Transmit Load Balancing). Distributes outgoing traffic across slaves according to the current load on each device; incoming traffic is received by the current slave. Requires no special switch support. Provides load balancing and fault tolerance.

Mode 6: balance-alb (Adaptive Load Balancing). Includes balance-tlb plus receive load balancing for IPv4 traffic, achieved through ARP negotiation: the driver intercepts ARP replies sent by the local system and rewrites the source hardware address with the hardware address of one of the slaves. Requires no special switch support.

 
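The mode is selected when the bonding driver is loaded. A minimal sketch, assuming two slave NICs named eth0 and eth1 (the names are examples, and your distribution may prefer its own network configuration files over manual commands):

```shell
# Load the bonding driver in mode 1 (active-backup) with MII
# link monitoring every 100 ms
modprobe bonding mode=active-backup miimon=100

# Bring the bond up and enslave two NICs (example names)
ip link set bond0 up
ifenslave bond0 eth0 eth1
```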

First: Bond0: round-robin (balance-rr)

Standard Document Description

Round-robin policy: Transmit packets in sequential order from the first available slave through the last. This mode provides load balancing and fault tolerance.

Features

  • Load balancing--- All links share the load. Packets are sent over each link in turn, on a per-packet basis: the sequence of transmitted packets is strictly sequential (the 1st packet goes out eth0, the next goes out eth1, and so on until the last slave, then the cycle repeats). If you ping an address from a machine with two bonded NICs, you will see traffic on both NICs; the load is spread across both links, which shows that round-robin transmission works per packet.
  • Fault tolerance--- This mode increases bandwidth and also supports fault tolerance: when one link fails, traffic is switched to the healthy link.
  • Performance problems--- If the packets of one connection or session leave from different interfaces and then traverse different paths, the client may receive them out of order, and out-of-order packets must be retransmitted, so network throughput drops. Under heavy network load, bond0 therefore does not deliver ideal performance gains.
  • Switch support--- In this mode all bonded NICs take on the same MAC address. When the switch receives a packet destined for that MAC address, it cannot tell which port to forward it from. To solve this, the switch ports should be aggregated: data is sent to the logical aggregation port, which then forwards it across the member ports.
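The per-packet round-robin scheduling described above can be sketched as simple modular arithmetic (a toy model, not the driver's actual code): packet i leaves on slave i mod N.

```shell
# Toy model of balance-rr: with N slaves, packet i is sent on slave (i mod N)
N=2
for i in 0 1 2 3; do
  echo "packet $i -> eth$((i % N))"
done
# packet 0 -> eth0
# packet 1 -> eth1
# packet 2 -> eth0
# packet 3 -> eth1
```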

 

Second: Bond1: active-backup

Standard Document Description

Active-backup policy: Only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch. This mode provides fault tolerance. The primary option affects the behavior of this mode.

Features

  • Fault tolerance--- Only one slave is active at a time; all other slaves are in the backup state, and a backup becomes active only when the active slave fails. In bonding 2.6.2 and later, when a failover occurs in active-backup mode, bonding issues one or more ARP requests on the new active slave: one for the bonding master interface and one for each VLAN interface configured on top of it, provided the interface has at least one IP address configured. ARP requests issued for VLAN interfaces are tagged with the corresponding VLAN id.
  • No load balancing--- The advantage of this mode is high availability of the network connection, but resource utilization is low: only one interface works at a time, so with N network interfaces the utilization is 1/N.
  • No switch support required--- The MAC address is externally visible on only one port, so from the outside the bond's MAC address is unique and the switch is not confused.
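An active-backup bond with a preferred primary slave might be set up like this (a sketch; interface names are examples):

```shell
# Mode 1 with eth0 preferred as the active slave
modprobe bonding mode=active-backup miimon=100 primary=eth0
ip link set bond0 up
ifenslave bond0 eth0 eth1

# The currently active slave can be read back at runtime
cat /sys/class/net/bond0/bonding/active_slave
```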

Third: Bond2: load balancing (balance-xor)

Standard Document Description

XOR policy: Transmit based on [(source MAC address XOR'd with destination MAC address) modulo slave count]. This selects the same slave for each destination MAC address. This mode provides load balancing and fault tolerance.

Features

  • Load balancing and fault tolerance--- Packets are transmitted according to the specified hash policy. The default policy is (source MAC address XOR destination MAC address) % number of slaves; other transmit policies can be selected via the xmit_hash_policy option.
  • Performance problems--- This mode pins traffic so that packets destined for a specific peer are always sent out the same interface. Since the destination is determined by MAC address, the mode works well when all traffic stays on the local network. If all traffic passes through a single router (for example a "gateway" network configuration with only one gateway), the source and destination MACs are fixed, the hash always selects the same line, and the mode adds little value, so it is not the best choice there.
  • Switch support--- As with balance-rr, the switch ports must be configured as a port channel. In this mode the source and destination MACs are used as the hash factors for the XOR routing algorithm.
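The default layer2 hash can be worked through by hand. In the kernel the XOR is effectively taken over the low byte of the two MAC addresses; a sketch with made-up example values:

```shell
# layer2 hash sketch: (src MAC low byte XOR dst MAC low byte) % slave count
src=0x55     # last byte of the source MAC, e.g. ...:55 (example value)
dst=0xA1     # last byte of the destination MAC (example value)
slaves=2
echo "slave index: $(( (src ^ dst) % slaves ))"
# slave index: 0
```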

Fourth: Bond3: fault tolerance (broadcast)

Standard Document Description

Broadcast policy: transmits everything on all slave interfaces. This mode provides fault tolerance.

Features

In this mode each packet is duplicated and a copy is sent on each interface under the bond. If the switch on one side fails, no outage is felt at all. The method consumes a lot of resources, but it has an excellent fault-tolerance mechanism. It is suitable for fields such as finance, which demand highly reliable networks and tolerate no interruptions.

 

Fifth: Bond4: dynamic link aggregation (802.3ad, LACP)

Standard Document Description

IEEE 802.3ad Dynamic link aggregation. Creates aggregation groups that share the same speed and duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification. Prerequisites: 1. Ethtool support in the base drivers for retrieving the speed and duplex of each slave. 2. A switch that supports IEEE 802.3ad Dynamic link aggregation. Most switches will require some type of configuration to enable 802.3ad mode.

Features

The 802.3ad mode is an IEEE standard, so all peers that implement 802.3ad interoperate well. The 802.3ad protocol includes automatic aggregation configuration, so only the switch needs manual configuration (and note that only some devices support 802.3ad). The 802.3ad standard also requires in-order frame delivery (to a certain extent), so a single connection usually sees no packet reordering. 802.3ad also has disadvantages: the standard requires all devices in the aggregation to run at the same speed and duplex, and, like every bonding load-balancing mode other than balance-rr, no single connection can use more than one interface's worth of bandwidth.

In addition, Linux bonding's 802.3ad implementation distributes traffic by peer (using the XOR of the MAC addresses), so in a "gateway" configuration all outgoing traffic uses the same device. Incoming traffic may also arrive on a single device, depending on the balancing policy of the peer's 802.3ad implementation. In a "local" configuration, traffic is distributed across the bonded devices.

Bond4 requires every port involved in the bond to run the 802.3ad protocol. The method is similar to bond0 but not identical: under 802.3ad, LACP automatically tells the switch which ports should be aggregated. After 802.3ad aggregation is configured, LACP sends LACPDUs to notify the switch that the adapters configured into the aggregation should be treated as one adapter on the switch, without user intervention. (The protocol should handle this, but on an H3C 5500-EI switch no command was found to enable 802.3ad or LACP separately, and static aggregation does not enable LACP on all ports in the group; so when operating in bond4 mode it is recommended to configure dynamic port aggregation on the switch manually, and to specify globally that packets are load-shared by source and destination MAC address.) According to the IEEE 802.3ad specification, packets destined for the same IP address are sent through the same adapter, so in 802.3ad mode traffic is distributed in the standard hash-based manner rather than round-robin.

 

 

Switch Configuration

interface AggregatePort 1          ! configure the aggregation port
interface GigabitEthernet 0/23
 port-group 1 mode active          ! enable LACP active mode
interface GigabitEthernet 0/24
 port-group 1 mode active

Prerequisites

Condition 1: ethtool support in the base drivers for retrieving the speed and duplex of each slave. Condition 2: a switch that supports IEEE 802.3ad dynamic link aggregation. Condition 3: most switches require specific configuration to enable 802.3ad mode.
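On the Linux side, the corresponding setup is a sketch like the following (interface names are examples; lacp_rate and xmit_hash_policy are optional tuning):

```shell
# Mode 4 (802.3ad) with fast LACPDUs and a layer3+4 transmit hash
modprobe bonding mode=802.3ad miimon=100 lacp_rate=fast xmit_hash_policy=layer3+4
ip link set bond0 up
ifenslave bond0 eth0 eth1

# Verify that the driver and switch agree on the aggregation
cat /proc/net/bonding/bond0   # shows LACP rate, aggregator ID, partner info
```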

Sixth: Bond5: transmit load balancing (balance-tlb)

Standard Document Description

Adaptive transmit load balancing: channel bonding that does not require any special switch support. The outgoing traffic is distributed according to the current load (computed relative to the speed) on each slave. Incoming traffic is received by the current slave. If the receiving slave fails, another slave takes over the MAC address of the failed receiving slave. Prerequisite: ethtool support in the base drivers for retrieving the speed of each slave.

Features

Balance-tlb balances outgoing traffic by peer. Since the balancing is based on MAC address, in a "gateway" configuration (as described above) this mode sends all traffic through a single device. In a "local" network configuration, however, it balances across multiple local peers in a relatively intelligent way (not by XOR as in balance-xor or 802.3ad), so unlucky MAC addresses (for example, ones that XOR to the same value) do not all cluster on the same interface.

Unlike 802.3ad, interfaces in this mode can run at different speeds without special switch configuration. The disadvantage is that all incoming traffic arrives on the same interface. The network device driver of each slave must have some ethtool support, and ARP monitoring is not available.
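The ethtool prerequisite can be checked per slave before building a balance-tlb bond (the interface name is an example):

```shell
# balance-tlb needs the driver to report link speed via ethtool
ethtool eth0 | grep -E 'Speed|Duplex'
```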

 

Seventh: Bond6: adaptive load balancing (balance-alb)


Features

This mode includes balance-tlb plus receive load balancing (rlb) for IPv4 traffic, and requires no switch support. Receive load balancing is implemented through ARP negotiation: the bonding driver intercepts ARP replies sent by the local machine and rewrites the source hardware address with the unique hardware address of one slave in the bond, so that different peers use different hardware addresses to talk to the server. All ports receive the peers' ARP request packets; when an ARP reply goes out, the bond driver intercepts it, selects a port according to its algorithm, and rewrites both the ARP reply's source MAC and the frame's sending source MAC to that port's MAC. Packet captures show one reply leaving from the first slave's port, the next from the second slave's port, and so on.

(A further detail: besides the reply sent from its own port, a port may also send replies on behalf of other ports, carrying its own MAC or the other ports' MACs.) In this way the traffic coming into the server is balanced.

When the local machine sends an ARP request, the bonding driver copies and saves the peer's IP information from the ARP packet. When the peer's ARP reply arrives, the bonding driver extracts its hardware address and initiates an ARP reply to that peer from one of the slaves in the bond (chosen by the same algorithm as above: for example, if port 1 was counted for one ARP request, the reply uses port 1's MAC). One problem with balancing receive load via ARP negotiation is that every ARP request broadcast uses the bond's hardware address, so once a peer learns that address, all of its traffic flows to the current slave. This is handled by sending updates (ARP replies) to all peers, each carrying a port's unique hardware address, which causes the traffic to be redistributed. Received traffic is also redistributed when a new slave is added to the bond, or when an inactive slave is re-activated. The receive load is distributed sequentially (round robin) among the group of highest-speed slaves in the bond.

When a link is reconnected, or a new slave joins the bond, the receive traffic is redistributed among all currently active slaves by sending an ARP reply with the selected MAC address to each client. The updelay parameter (described below) must be set to a value greater than or equal to the switch's forwarding delay, to ensure the ARP replies sent to the peers are not blocked by the switch.

When the cluster has few machines, or the machines are separated by VLANs or routers, mode 6 does not work very well. The difference between mode 6 and mode 0: with mode 6, eth0's bandwidth fills up first, then eth1 takes the overflow, and so on through ethX; with mode 0 the traffic on the two ports is stable and essentially equal. Under mode 6, the first port is observed to carry high traffic while the second port carries only a small share.

Prerequisites

Condition 1: ethtool support for retrieving the speed of each slave;

Condition 2: the base driver supports setting the hardware address of a device while it is up, so that there is always one slave (curr_active_slave) using the bond's hardware address, while each slave in the bond keeps a unique hardware address. If curr_active_slave fails, its hardware address is taken over by the newly elected curr_active_slave.
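Bringing up a balance-alb bond looks much like the other modes; a sketch (interface names are examples):

```shell
# Mode 6 (balance-alb): tlb plus receive load balancing via ARP rewriting
modprobe bonding mode=balance-alb miimon=100
ip link set bond0 up
ifenslave bond0 eth0 eth1

# curr_active_slave (the slave holding the bond's MAC) is visible here
grep -A1 'Currently Active Slave' /proc/net/bonding/bond0
```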

Bonding Parameters

max_bonds

Specifies the number of bonding devices the bonding driver creates. For example, if max_bonds is 3 and the bonding driver is not already loaded, then bond0, bond1 and bond2 will be created. The default value is 1.

lacp_rate

Specifies the rate at which LACPDU packets are transmitted to the peer in 802.3ad mode. Possible options:

slow or 0: request the peer to transmit LACPDUs every 30 seconds

fast or 1: request the peer to transmit LACPDUs every 1 second

The default value is slow.

downdelay

Specifies the time, in milliseconds (ms), to wait before disabling a slave after a link failure is detected. This option is only valid for miimon monitoring. The downdelay value should be an integer multiple of the miimon value; otherwise it is rounded to the nearest multiple. The default value is 0.

arp_ip_target

Specifies a group of IP addresses for ARP monitoring; only valid when arp_interval > 0. These IP addresses are the targets of the ARP requests used to determine whether the link to each target works. Separate multiple IP addresses with commas. At least one IP address must be given, and up to 16 can be specified. The default is no IP addresses.

arp_interval

Specifies the ARP link monitoring interval in milliseconds (ms). If ARP monitoring is used in an etherchannel-compatible mode (modes 0 and 2), the switch should be configured to distribute network packets evenly across all links. If the switch is configured to distribute packets in XOR fashion, all replies from the ARP targets will arrive on the same link, which will cause the other devices to be marked as failed. ARP monitoring should not be used together with miimon. A value of 0 disables ARP monitoring. The default value is 0.

miimon

Specifies the MII link monitoring interval in milliseconds (ms). This determines how often the driver checks each slave's link state. 0 disables MII link monitoring. 100 is a good initial value. The default value is 0.

mode

Specifies the bonding policy. The default is balance-rr (the round-robin policy).

primary

Specifies which slave becomes the primary device. The value is an interface name string, such as eth0 or eth1. As long as the specified device is available it will always be the active slave; another slave is used only when the primary device goes down. This is useful when one slave is preferred, for example when it has higher throughput. The primary option is only valid in active-backup mode.

updelay

Specifies the time, in milliseconds (ms), to wait before enabling a slave after a link recovery is detected. This option is only valid for miimon link monitoring. The updelay value should be an integer multiple of the miimon value; otherwise it is rounded down to the nearest multiple. The default value is 0.

use_carrier

Specifies whether miimon should use MII/ETHTOOL ioctls or netif_carrier_ok() to determine the link state. The MII/ETHTOOL ioctls are less efficient and use deprecated calling sequences in the kernel; netif_carrier_ok() relies on the device driver maintaining its state (carrier on/off), which, at this writing, most but not all device drivers support. If bonding always reports the link as up when it is actually down, your network device driver may not support netif_carrier_on/off: the default netif_carrier state is "carrier on", so if the driver does not maintain it, the link always appears normal. In that case, set use_carrier to 0 so that bonding uses the MII/ETHTOOL ioctls to determine the link state. A value of 1 uses netif_carrier_ok(); a value of 0 uses the deprecated MII/ETHTOOL ioctls. The default value is 1.

xmit_hash_policy

Selects the hash policy used for slave selection in the balance-xor and 802.3ad modes. Possible values:

layer2: Uses the XOR of the hardware MAC addresses to generate the hash. Formula: (source MAC address XOR destination MAC address) % number of slaves. This algorithm places all traffic to a given network peer on the same slave.

layer3+4: Uses upper-layer protocol information, when available, to generate the hash. This allows traffic to a particular network peer to span multiple slaves, although a single connection will not span multiple slaves. The formula for unfragmented TCP and UDP packets is: ((source port XOR destination port) XOR ((source IP XOR destination IP) AND 0xffff)) % number of slaves. For fragmented TCP or UDP packets and all other IP traffic, the source and destination port information is ignored; for non-IP traffic the layer2 hash is used. This policy is intended to mimic the behavior of certain switches, such as Cisco switches with PFC2 and some Foundry and IBM products. The algorithm is not fully 802.3ad-compliant: a single TCP or UDP session containing both fragmented and unfragmented packets will have its packets striped across two interfaces, which can lead to out-of-order delivery. Most traffic does not meet this condition, since TCP rarely fragments and most UDP traffic is not part of long-lived sessions. Other 802.3ad implementations may or may not tolerate this noncompliance.

The default value is layer2. This option was added in bonding 2.6.3; earlier versions have no such parameter and effectively always use the layer2 policy.
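The layer3+4 formula can be evaluated by hand with example values (all numbers below are made-up inputs, not captured traffic):

```shell
# layer3+4 hash sketch for an unfragmented TCP/UDP packet
sport=40000       # source port (example)
dport=80          # destination port (example)
ip_xor=0x0103     # (source IP XOR destination IP) AND 0xffff (example)
slaves=2
echo "slave index: $(( ((sport ^ dport) ^ ip_xor) % slaves ))"
# slave index: 1
```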
