I. Introduction
Because the Internet relies on TCP's traffic control mechanisms, out-of-order data delivery has a severe impact on network performance. Hash-based load distribution and similar methods can keep the packets of a TCP flow in order, but they guarantee the performance of a parallel switching system only statistically, and that guarantee depends heavily on the traffic distribution. A better approach is a parallel processing mechanism based on fixed-length cells, which is not only easy to implement but also provides a deterministic, system-level ordering guarantee that is independent of the traffic distribution. However, a high-speed router processes variable-length IP packets; to build high-speed IP switching on a fixed-length-cell parallel switching system, IP packets must be segmented into fixed-length cell streams at line rate. At the same time, each cell must carry the corresponding control information, so that the cell stream is switched correctly and the IP packets can be reassembled at the destination port. Following the IP packet processing path of a high-speed router, this article analyzes the functional and performance requirements of each processing module in detail, and proposes a high-speed router architecture that combines parallel IP packet processing with parallel cell processing.
II. Router Packet Processing
To switch IP packets, a router generally performs header processing, route lookup, traffic control, buffering, queue scheduling, switching, and output buffering.
1. Input Packet Processing
The router's input packet processing mainly includes header processing, route lookup, traffic control, buffering, and queue scheduling. The header processing unit handles the IP header fields, including header verification, TTL update, error checking, header option recognition, and packet classification. IP packets that cannot be forwarded directly are placed in a separate queue and handled by the system processor. For packets to be forwarded, the TOS field is extracted and, according to the system configuration, mapped to the packet's priority level. At the same time, the data structure required by the route lookup module is generated and passed to that module for route lookup. The route lookup unit finds the corresponding output port from the forwarding table maintained by the system, based on the destination IP address in the packet. Because queue management and traffic control depend on each packet's flow and QoS requirements, the router must complete the lookup at line rate to ensure that packets are not queued before the lookup. At the current level of technology, a lookup engine implemented in a large-scale ASIC can achieve an average processing capability of 40 Mpps, while current table-lookup techniques achieve about 20 Mpps; on average, these hardware lookup techniques can meet the processing requirements of 40 Gbps and 20 Gbps line-rate interfaces respectively. Note that the forwarding-engine performance figures above do not account for forwarding table updates; such updates consume a large amount of memory access bandwidth and therefore reduce the actual packet forwarding rate. These lookup techniques not only consume a large amount of memory for the forwarding table, but their complicated data structures also make forwarding table updates inefficient.
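The route lookup step described above is a longest-prefix-match (LPM) search. As a minimal sketch, assuming nothing about the actual hardware engine, the following Python binary-trie lookup illustrates the operation (real routers use TCAMs or compressed multibit tries; all names here are illustrative):

```python
# Minimal longest-prefix-match sketch using a binary trie.
# Illustrates the route lookup step only; real engines use
# TCAMs or compressed multibit tries in hardware.

class TrieNode:
    __slots__ = ("children", "port")
    def __init__(self):
        self.children = [None, None]
        self.port = None  # output port if a prefix ends at this node

class ForwardingTable:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, prefix, length, port):
        """prefix: 32-bit int, length: prefix length in bits."""
        node = self.root
        for i in range(length):
            bit = (prefix >> (31 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = TrieNode()
            node = node.children[bit]
        node.port = port

    def lookup(self, addr):
        """Return the output port of the longest matching prefix."""
        node, best = self.root, None
        for i in range(32):
            if node.port is not None:
                best = node.port           # remember deepest match so far
            node = node.children[(addr >> (31 - i)) & 1]
            if node is None:
                break
        else:
            if node.port is not None:
                best = node.port
        return best

def ip(s):
    a, b, c, d = (int(x) for x in s.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

ft = ForwardingTable()
ft.insert(ip("10.0.0.0"), 8, port=1)    # 10.0.0.0/8  -> port 1
ft.insert(ip("10.1.0.0"), 16, port=2)   # 10.1.0.0/16 -> port 2
p1 = ft.lookup(ip("10.1.2.3"))          # longest match: the /16
p2 = ft.lookup(ip("10.9.9.9"))          # falls back to the /8
p3 = ft.lookup(ip("192.168.0.1"))       # no matching prefix
```

The trie also hints at the update problem the text raises: inserting or deleting a prefix touches one path of nodes, but compressed hardware structures must be partially rebuilt, which is why updates compete with lookups for memory bandwidth.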
According to the literature, because Internet routes are unstable, the peak route-table update rate on a backbone router can reach several hundred updates per second. How to increase the speed of forwarding table updates while preserving lookup speed therefore remains an open research problem.
The queue management and congestion control unit places packets into the queue buffer based on buffer status, queue status, packet classification, and the route lookup result. Under congestion, the traffic control unit applies the configured congestion control policy; to support IP QoS under the Differentiated Services architecture, packet discarding is controlled per service class and per flow. Buffer management generally allocates and releases buffer space in fixed-length blocks. This not only simplifies management and hardware implementation and avoids heavy fragmentation of the buffer, but also makes it easy to interface with a switching fabric whose transmission unit is a fixed-length block, and to transfer IP packets fairly. Because IP packets are variable-length, each packet must be segmented into fixed-length cells before entering the buffer. So that each cell can be scheduled and switched independently and then reassembled into a complete IP packet in the output buffer, each cell must carry the corresponding destination port and control information.
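The segmentation step can be sketched as follows. This is an illustration only: the 4-byte cell header layout (destination port, packet id, sequence number, first/last flags) and the 48-byte payload size are assumptions for the example, not a real cell format.

```python
# Sketch of segmenting a variable-length IP packet into fixed-length
# cells. The header layout and 48-byte payload are illustrative,
# not a real standard.

import struct

CELL_PAYLOAD = 48          # payload bytes per cell (assumed)
FIRST, LAST = 0x1, 0x2     # flags marking packet boundaries

def segment(packet: bytes, dest_port: int, pkt_id: int):
    """Cut a packet into fixed-length cells, each with a small header
    carrying the destination port and reassembly control information."""
    chunks = [packet[i:i + CELL_PAYLOAD]
              for i in range(0, len(packet), CELL_PAYLOAD)]
    cells = []
    for seq, chunk in enumerate(chunks):
        flags = (FIRST if seq == 0 else 0) | \
                (LAST if seq == len(chunks) - 1 else 0)
        header = struct.pack("!BBBB", dest_port, pkt_id, seq, flags)
        # pad the final chunk so every cell has the same length
        cells.append(header + chunk.ljust(CELL_PAYLOAD, b"\x00"))
    return cells

pkt = b"x" * 100                       # a 100-byte packet -> 3 cells
cells = segment(pkt, dest_port=3, pkt_id=7)
```

The output side trims the padding of the last cell using the total-length field of the reassembled IP header.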
The queue scheduling unit sends cells from the corresponding queues through the switching fabric to their destination ports, according to the arbitration structure and scheduling policy of the fabric controller. The switching fabric carries packets from source ports to destination ports. High-speed routers usually use either shared-memory switching or a crossbar switching matrix; thanks to its excellent scalability, the crossbar approach is increasingly common in high-speed router products.
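Crossbar arbitration amounts to computing a conflict-free input-to-output matching for each cell slot. The sketch below shows a simplified one-iteration round-robin matching; production schedulers such as iSLIP iterate several times and keep separate grant and accept pointers, so this is an illustration of the idea, not the full algorithm.

```python
# Simplified one-iteration round-robin crossbar matching.
# requests[i] is the set of output ports input i has cells for;
# grant_ptr[o] is output o's round-robin pointer, rotated after
# each grant so ties are broken fairly across slots.

def match(requests, n_ports, grant_ptr):
    grants = {}  # output -> granted input
    for o in range(n_ports):
        for k in range(n_ports):
            i = (grant_ptr[o] + k) % n_ports
            # grant to the nearest requester whose input is still free
            if o in requests[i] and i not in grants.values():
                grants[o] = i
                grant_ptr[o] = (i + 1) % n_ports
                break
    return {i: o for o, i in grants.items()}  # input -> output

# inputs 0 and 1 both want output 0; the rotating pointer shares it
reqs = [{0, 1}, {0}, {2}]
ptrs = [0, 0, 0]
m1 = match(reqs, 3, ptrs)   # first slot
m2 = match(reqs, 3, ptrs)   # second slot: the grant for output 0 rotates
```

In slot 1, input 0 wins output 0; in slot 2 the pointer has advanced, so input 1 wins it while input 0 is matched to output 1 instead.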
2. Output Packet Processing
The output side mainly performs packet reassembly, buffering, and output link interface control.
First, fixed-length cells are received from the switching fabric through the fabric interface and reassembled into complete IP packets. Because the fabric controller schedules transmission from each source port per transmission unit, the output side receives interleaved cell streams from multiple source ports at the same time. Logically, therefore, an independent buffer must be maintained for each source port: the cells of each IP packet are accumulated separately, and only once an entire packet has been received is the complete IP packet handed to the output buffer for transmission.
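The per-source reassembly logic can be sketched as follows, assuming (as cell switching guarantees) that cells from a given source arrive in order. The cell fields used here are illustrative.

```python
# Sketch of output-side reassembly: one buffer per source port,
# since cells from different sources interleave at the output.
# Assumes in-order delivery per source, which cell switching
# guarantees; the (src, payload, last) fields are illustrative.

from collections import defaultdict

class Reassembler:
    def __init__(self):
        self.buffers = defaultdict(list)  # source port -> payloads so far

    def receive(self, src, payload, last):
        """Accumulate a cell; return the whole packet once complete."""
        self.buffers[src].append(payload)
        if last:
            return b"".join(self.buffers.pop(src))  # to output buffer
        return None

r = Reassembler()
# cells from two source ports arrive interleaved
p1 = r.receive(0, b"AA", last=False)   # incomplete -> nothing emitted
p2 = r.receive(1, b"XX", last=False)   # different source, own buffer
p3 = r.receive(0, b"BB", last=True)    # completes source 0's packet
p4 = r.receive(1, b"YY", last=True)    # completes source 1's packet
```

With a single shared buffer instead, the interleaved streams would corrupt each other, which is exactly why the text requires a logically independent buffer per source port.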
For switching architectures that use input-buffered virtual output queueing (VOQ), the fabric transmission rate equals the link rate. The output buffer can then use a simple FIFO management mechanism, with a buffer size of N times the maximum packet length, where N is the number of router ports. However, input-buffered switching cannot guarantee full utilization of the output link: the VOQ mechanism achieves 100% throughput only when input traffic is uniformly distributed, and studies show that Internet traffic is highly non-uniform. Speedup of input-buffered switches has therefore become a hot topic in switching research, the goal being to make an input-buffered switch strictly emulate an output-buffered switch; this requires the fabric to run at twice the link rate. But fabric speedup means the fabric's output rate can exceed the output link rate, so packets must be queued again in the output buffer and may even overflow it. Buffer management, queue scheduling, and congestion control mechanisms must then also be provided at the output side, and the system's congestion control policy becomes tied to the mechanisms at both the input and the output. In a pure input-buffered or pure output-buffered switch, queue management and congestion control need to be implemented on only one side of the fabric; in a sped-up input-buffered switch, packets must be buffered at both input and output, which some papers call combined input/output queueing (CIOQ). There is as yet no ideal solution for queue scheduling and congestion control in this architecture.
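The key property of VOQ is that each input keeps one queue per output, so a cell waiting for a busy output never blocks cells headed elsewhere (no head-of-line blocking). A minimal sketch of the data structure, with illustrative names:

```python
# Sketch of virtual output queueing: one queue per output at each
# input port. The scheduler sees every non-empty queue, so a cell
# blocked behind a busy output cannot starve cells for other outputs.

from collections import deque

class InputPort:
    def __init__(self, n_outputs):
        self.voq = [deque() for _ in range(n_outputs)]

    def enqueue(self, cell, dest):
        self.voq[dest].append(cell)

    def requests(self):
        """Outputs this input has cells for (fed to the fabric scheduler)."""
        return {d for d, q in enumerate(self.voq) if q}

    def dequeue(self, dest):
        return self.voq[dest].popleft()

inp = InputPort(4)
inp.enqueue("cell-a", dest=2)
inp.enqueue("cell-b", dest=0)
reqs = inp.requests()         # both destinations visible at once
head = inp.dequeue(0)         # cell-b need not wait behind cell-a
```

With a single FIFO per input, "cell-b" would be stuck behind "cell-a" even if output 0 were idle; VOQ removes exactly that limitation, which is what makes 100% throughput achievable under uniform traffic.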
A simple solution is the back-pressure mechanism commonly used in ATM switches: place a small buffer at the output with a simple FIFO scheduling policy, and when the output buffer is close to overflowing, use back-pressure to make the switching fabric suspend data transmission. The performance of this solution remains to be analyzed further.
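The back-pressure scheme just described can be sketched as a small FIFO with a fill threshold; the capacity and threshold values here are illustrative assumptions.

```python
# Sketch of the back-pressure mechanism described above: a small
# output FIFO asserts a pause signal when it fills past a threshold,
# and the fabric stops sending until it drains. Thresholds are
# illustrative.

from collections import deque

class OutputBuffer:
    def __init__(self, capacity, pause_at):
        self.fifo = deque()
        self.capacity = capacity
        self.pause_at = pause_at   # assert back-pressure at this depth

    @property
    def backpressure(self):
        return len(self.fifo) >= self.pause_at

    def push(self, cell):
        if len(self.fifo) >= self.capacity:
            raise OverflowError("output buffer overflow")
        self.fifo.append(cell)

    def drain(self):
        """Output link removes one cell per link-rate slot."""
        return self.fifo.popleft() if self.fifo else None

out = OutputBuffer(capacity=8, pause_at=6)
for i in range(6):
    out.push(i)                # fabric speedup fills the buffer
bp_full = out.backpressure     # fabric must pause now
out.drain(); out.drain()       # link drains two cells
bp_after = out.backpressure    # fabric may resume
```

The gap between `pause_at` and `capacity` absorbs cells already in flight when the pause signal is asserted; sizing that gap is part of the performance analysis the text says is still needed.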
Because output packet processing is relatively simple, at the current level of technology a single processing module can handle an interface rate of about 2 Gbps. For interfaces of 10 Gbps and above, however, memory access speed and bus bandwidth constraints force a multi-module parallel processing architecture.