Analysis on BGP Network Performance Optimization

Source: Internet
Author: User

As customer networks grow in scale and coverage, BGP is no longer used only in operator networks; more and more enterprise networks deploy it as well. BGP supports large networks, a rich set of routing policies, and very large routing tables. However, it also brings several performance bottlenecks:

1. BGP runs at the application layer and consumes more system resources than lower-layer protocols. It is therefore necessary to keep routes stable, control the number of routes, and provide capable routing hardware;

2. BGP route attributes are complex, and each route takes more bytes than an IGP route. It is therefore necessary to improve packet transmission efficiency and prevent packet fragmentation in the network;

3. BGP routes originate from different IGPs or even different autonomous systems, so they are numerous and their sources are complex. A single flapping route may affect the stability of network devices and the speed of route convergence. It is therefore necessary to limit the scope of route flapping so that one unstable route does not disturb the whole system;

4. BGP converges slowly. The BGP hold time is 180 s by default, so to speed up convergence the lower layers must provide a fast failure detection mechanism.

The following sections describe how to optimize BGP network performance in three areas: packet transmission, route updates, and fast failure detection.

1 BGP Neighbor PMTU Detection

BGP runs over TCP, so TCP parameter settings affect BGP performance. With a small number of routes, tuning TCP parameters may make little difference, but with a large number of routes it can improve performance significantly. The specific optimization is described below.

First, let's look at how BGP packets are sent, as shown in Figure 1.

Figure 1 BGP and TCP data transmission formats

BGP first hands the data to be sent to TCP, and TCP then segments it according to its length. The segment size is determined by the MSS (maximum segment size) negotiated by TCP, and each TCP segment corresponds to one outgoing IP packet. The MSS setting therefore plays a key role in BGP transmission performance. If the MSS is too large, an intermediate device may have to fragment packets at the IP layer. Because BGP transmission is end to end, fragmented data must be reassembled at the receiver, which burdens the receiver's CPU and reduces processing efficiency. If the MSS is too small, effective network utilization is very low: the sender and receiver must process in many passes data that could have been handled in one, which also reduces efficiency.
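The trade-off described above can be made concrete with a little arithmetic. The following is a minimal sketch, not a model of any particular router: it assumes IPv4 with 20-byte IP and TCP headers (no options), and the function names and the 100,000-byte update size are hypothetical values chosen for illustration.

```python
import math

IP_HEADER = 20   # bytes; assumes an IPv4 header without options
TCP_HEADER = 20  # bytes; assumes a TCP header without options


def tcp_segments(data_len: int, mss: int) -> int:
    """Number of TCP segments needed to carry data_len bytes of BGP data."""
    return math.ceil(data_len / mss)


def ip_fragments_per_segment(mss: int, link_mtu: int) -> int:
    """Number of IP packets one full-size segment becomes on a link with link_mtu."""
    packet_len = mss + TCP_HEADER + IP_HEADER
    if packet_len <= link_mtu:
        return 1  # fits as is, no fragmentation
    # Every fragment carries its own IP header, and the payload of each
    # non-final IPv4 fragment must be a multiple of 8 bytes.
    payload_per_fragment = (link_mtu - IP_HEADER) // 8 * 8
    ip_payload = mss + TCP_HEADER
    return math.ceil(ip_payload / payload_per_fragment)


if __name__ == "__main__":
    update_bytes = 100_000  # hypothetical amount of BGP update data
    for mss in (536, 1460):
        print(f"MSS {mss}: {tcp_segments(update_bytes, mss)} segments")
    # An MSS of 1460 fits a 1500-byte link but is fragmented on a 1400-byte link.
    print(ip_fragments_per_segment(1460, 1500), ip_fragments_per_segment(1460, 1400))
```

Under these assumptions a small MSS nearly triples the number of segments both ends must process, while an MSS that is too large for one link on the path doubles the number of IP packets and forces reassembly at the receiver.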

BGP neighbor PMTU detection solves this problem. Before establishing a BGP neighbor relationship, the router automatically sends PMTU probe packets to discover the path MTU, that is, the largest packet size that can traverse the path without fragmentation. TCP can then set the MSS based on this value to achieve the best transmission performance.
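As a rough illustration of how the discovered value maps to an MSS: the largest packet a path can carry unfragmented is bounded by the smallest link MTU along it, and the MSS is that value minus the IP and TCP headers. The sketch below assumes the link MTUs are already known and that both headers are 20 bytes with no options; a real router learns the path MTU by probing rather than from such a list, and the function names here are hypothetical.

```python
def path_mtu(link_mtus):
    """Largest packet that crosses every link unfragmented: the smallest link MTU."""
    return min(link_mtus)


def mss_for_path(link_mtus, ip_header=20, tcp_header=20):
    """MSS that fills the path MTU exactly (assumes headers without options)."""
    return path_mtu(link_mtus) - ip_header - tcp_header


if __name__ == "__main__":
    # Hypothetical path: two Ethernet links with a 1400-byte tunnel in between.
    links = [1500, 1400, 1500]
    print(path_mtu(links), mss_for_path(links))
```

With the hypothetical path above, the path MTU is 1400 bytes, so the MSS would be 1360 bytes instead of the 1460 bytes suggested by the local Ethernet interface alone.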

2 BGP Route Update Timer

RFC 4271 defines a timer for BGP route updates (MinRouteAdvertisementIntervalTimer). The timer applies only to routes with the same prefix in the same address family. Its main purpose is to keep a fluctuating route from being re-advertised too frequently, which also protects the CPU of the routing device.

Figure 2 route update timer description

As shown in Figure 2, the red and blue arrows represent routes with the same prefix that are learned from different neighbors, and the blue route is better than the red one. Assume that the update timer configured on RA is 30 seconds. The route convergence process toward RB is as follows:

• RA sends the red route to RB immediately after receiving it and starts the update timer (30 seconds) on RA;

• After 10 seconds, RA receives the better blue route. Because the timer has not yet expired, the route is not sent to RB for now, but RA's local routing table is updated, so RA completes convergence at the 10th second;

• The update timer on RA expires at the 30th second. RA then sends the blue route to RB, replacing the red route, and RB converges at the 30th second.

From this analysis, RB converges about 20 seconds later than RA. Because BGP is a distance-vector (more precisely, path-vector) routing protocol, this delay can propagate hop by hop and affect BGP routers across the whole network. You therefore need to trade off network stability against route convergence speed. When device performance permits and routes are generally stable, you can reduce the route update timer appropriately. The minimum value is 5 s.
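The timeline in this example can be reproduced with a small simulation. This is only a sketch of the behavior described here, not of any vendor's implementation: the `simulate` function, its event format, and the rule that each arriving route is better than the previous one are assumptions made for illustration.

```python
def simulate(events, interval):
    """Return the (time_s, route) advertisements RA sends to RB.

    events:   (time_s, route) arrivals at RA for one prefix, in time order;
              each new route is assumed to be better than the previous one.
    interval: update timer in seconds for that prefix.
    """
    sent = []
    timer_expires = None  # earliest time RA may advertise the prefix again
    pending = None        # best route held back while the timer is running
    for t, route in events:
        # If the timer ran out before this arrival, flush the held-back route.
        if pending is not None and timer_expires <= t:
            sent.append((timer_expires, pending))
            timer_expires += interval
            pending = None
        if timer_expires is None or t >= timer_expires:
            sent.append((t, route))        # timer idle: advertise immediately
            timer_expires = t + interval
        else:
            pending = route                # timer running: hold the update
    if pending is not None:
        sent.append((timer_expires, pending))
    return sent


if __name__ == "__main__":
    arrivals = [(0, "red"), (10, "blue")]
    print(simulate(arrivals, 30))  # RB learns the blue route at 30 s
    print(simulate(arrivals, 5))   # with a 5 s timer, RB learns it at 10 s
```

With the 30-second timer, RA has the blue route at 10 s but RB only receives it at 30 s, which is the 20-second gap discussed above. With a 5-second timer the gap disappears in this scenario, at the cost of more update messages when routes are unstable.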

BGP also has a route flap dampening mechanism (Dampening), which penalizes routes that fluctuate frequently. If a BGP route flaps more often than the configured threshold allows, the route is suppressed until it becomes stable again. For large networks, you can therefore combine the route update timer with route dampening to balance convergence speed and route stability.
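One common way to implement such a threshold is a penalty that grows with each flap and decays exponentially over time. The sketch below follows that general idea; the class name and the numeric values (1000 per flap, suppress above 2000, reuse below 750, 15-minute half-life) are illustrative assumptions, not values taken from this article, and real devices let you configure them.

```python
PENALTY_PER_FLAP = 1000  # assumed penalty added on each flap
SUPPRESS_LIMIT = 2000    # assumed threshold above which the route is suppressed
REUSE_LIMIT = 750        # assumed threshold below which the route is reused
HALF_LIFE_S = 900        # assumed half-life of the penalty, in seconds


class DampenedRoute:
    """Tracks the dampening penalty of a single route (illustrative only)."""

    def __init__(self):
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay_to(self, now):
        # The penalty halves every HALF_LIFE_S seconds of stability.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / HALF_LIFE_S)
        self.last_update = now
        if self.suppressed and self.penalty < REUSE_LIMIT:
            self.suppressed = False

    def flap(self, now):
        """Record one flap at time now (seconds)."""
        self._decay_to(now)
        self.penalty += PENALTY_PER_FLAP
        if self.penalty > SUPPRESS_LIMIT:
            self.suppressed = True

    def is_suppressed(self, now):
        self._decay_to(now)
        return self.suppressed


if __name__ == "__main__":
    route = DampenedRoute()
    for t in (0, 60, 120):  # three flaps, one minute apart
        route.flap(t)
    print(route.is_suppressed(120), route.is_suppressed(120 + 1800))
```

In this sketch, two flaps a minute apart are tolerated, the third pushes the penalty over the suppress limit, and the route becomes usable again once about half an hour of stability has decayed the penalty below the reuse limit.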

3 Linkage with the BFD Protocol

The functions and settings described above can only bring route convergence time down to the order of seconds. In a carrier (SP) network, changes in routing or BGP neighbor status often need to be detected much faster. IBGP neighbors, however, are usually not directly connected, so detecting a neighbor failure depends on IGP convergence or on BGP's own KEEPALIVE messages. This can take up to 180 seconds, which is intolerable for operators.

BFD (Bidirectional Forwarding Detection) is an independent, high-speed "Hello" protocol whose working mechanism is similar to the slower "hello" of routing protocols. BFD on one system establishes a peering relationship with adjacent systems, and each system then monitors the BFD packets from its peer at a negotiated rate, which can be set with millisecond granularity. When a system misses a preset number of consecutive BFD packets from its peer, it considers the software or hardware infrastructure protected by BFD to be faulty and notifies the upper-layer routing protocol, enabling fast route switchover and convergence. (Note: BFD currently has two versions, version 0 and version 1, which are not compatible with each other.)

With this fast detection capability, BGP can be configured to work in linkage with BFD. Once a BGP neighbor relationship is established, a BFD session is automatically associated with it and sends detection packets periodically, typically at intervals of tens of milliseconds. When more than five consecutive detection packets are missed, BFD notifies BGP to tear down the neighbor relationship, so that route convergence completes quickly.
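To get a feel for the difference, the detection time can be estimated as the packet interval multiplied by the number of missed packets that triggers a failure. The sketch below assumes a 50 ms interval (one possible value within "tens of milliseconds") and a threshold of five packets; both numbers and the function name are illustrative.

```python
def bfd_detection_time_ms(interval_ms: int, missed_packets: int) -> int:
    """Approximate time for BFD to declare the neighbor down."""
    return interval_ms * missed_packets


if __name__ == "__main__":
    hold_time_ms = 180 * 1000                 # default BGP hold time from the text
    detect_ms = bfd_detection_time_ms(50, 5)  # assumed: 50 ms interval, 5 missed packets
    print(f"BFD detects the failure in about {detect_ms} ms")
    print(f"Roughly {hold_time_ms / detect_ms:.0f} times faster than the hold timer")
```

Under these assumptions the failure is detected in about a quarter of a second, compared with up to three minutes when relying on the BGP hold timer alone.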

4 Conclusion

BGP is a very powerful routing protocol that carries the route exchange of large enterprise networks and even the entire Internet. Its processing efficiency and convergence speed are therefore crucial to the stability and performance of the core network. The optimization methods described in this article improve BGP networks at several levels: transmission efficiency, route stability, and fast route convergence. You can also optimize a BGP network more comprehensively by combining route aggregation, routing policies, and network structure design. These are by no means the only optimization methods, and as network technology continues to develop, more BGP-related optimizations will emerge.
