As networks grow in size, complexity, and traffic, the need for continuous and precise monitoring is greater than ever. Continuous monitoring is an important part of detecting security issues, misconfigurations, and equipment failures, and of carrying out traffic engineering.
Network telemetry is becoming a powerful way to support these needs. At the highest level, it is a push-based monitoring approach: data plane devices, such as switches and routers, stream data about traffic and performance to the software that performs the analysis.
Today's telemetry systems force users to choose between granularity and coverage. Packet-level systems stream every packet (or packet header) to software. This provides fine-grained visibility, but the cost of processing each packet in software makes high coverage impractical. Flow-level systems, such as NetFlow, aggregate packets into per-flow records before the data reaches the analysis software. This greatly reduces the workload and makes high coverage more practical, but at the expense of packet-level visibility.
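To make the trade-off concrete, here is a minimal sketch of flow-level aggregation. The Packet type, the flow_records function, and the (src, dst) flow key are hypothetical names chosen for illustration; a real NetFlow exporter keys on more header fields and is not implemented this way.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    # Hypothetical, simplified packet header used only for this sketch.
    src: str
    dst: str
    size: int   # bytes
    ts_us: int  # arrival timestamp in microseconds


def flow_records(packets):
    """Collapse packets into one record per (src, dst) flow."""
    records = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        rec = records[(p.src, p.dst)]
        rec["packets"] += 1
        rec["bytes"] += p.size
    return dict(records)


# Packet-level view: every packet, including its timestamp, is visible.
packets = [
    Packet("10.0.0.1", "10.0.0.2", 1500, 0),
    Packet("10.0.0.1", "10.0.0.2", 40, 5),
    Packet("10.0.0.3", "10.0.0.2", 1500, 7),
    Packet("10.0.0.1", "10.0.0.2", 1500, 900),
]

# Flow-level view: far fewer records, but per-packet detail such as
# the 895-microsecond gap inside the first flow can no longer be recovered.
records = flow_records(packets)
```

Four packets shrink to two records, which is why the analysis software has less work to do, while the timing of individual packets is lost.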
Next-generation telemetry systems can leverage advances in switch and server hardware to strike a better balance between granularity and performance. In the switch, the data plane can be reconfigured to support custom packet processing at a line rate of 1 billion packets per second. In the server, high-bandwidth memory and instruction-level parallelism make trillions of calculations per second possible.
Although both platforms are powerful, neither is good at everything a telemetry and analysis system needs. Determining the role each platform should play is therefore critical. Most network analysis tasks can be divided into three phases:
1. The selection phase extracts packet features from the data path.
2. The grouping phase groups those features by a subset of the header space.
3. Finally, the aggregation phase computes statistics for each group of packets.
The aggregation phase is better suited to the server because it is application-specific and may require calculations that are too complex for switch hardware. The selection and grouping phases are better suited to switch data plane hardware, which can directly access packet headers and perform basic grouping operations using low-latency on-chip memory.
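The three phases can be sketched as a small software pipeline. This is only an illustration under stated assumptions: the Packet fields, the queue_depth metadata, and the function names select, group, and aggregate are hypothetical, and in a real deployment the first two phases would run in switch hardware rather than in Python.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    # Hypothetical packet view; queue_depth stands in for data-path metadata.
    src: str
    dst: str
    dport: int
    size: int
    queue_depth: int


def select(packets):
    """Selection: extract only the features the analysis needs."""
    for p in packets:
        yield (p.src, p.dst, p.dport), (p.size, p.queue_depth)


def group(selected, key_fields=(0, 1)):
    """Grouping: bucket features by a subset of the header fields."""
    groups = defaultdict(list)
    for header, features in selected:
        key = tuple(header[i] for i in key_fields)
        groups[key].append(features)
    return dict(groups)


def aggregate(groups):
    """Aggregation: application-specific statistics for each group."""
    return {
        key: {
            "bytes": sum(size for size, _ in feats),
            "max_queue": max(depth for _, depth in feats),
        }
        for key, feats in groups.items()
    }


packets = [
    Packet("A", "B", 80, 100, 3),
    Packet("A", "B", 443, 300, 9),
    Packet("C", "B", 80, 200, 1),
]

# select and group would map to the switch; aggregate to the server.
stats = aggregate(group(select(packets)))
```

Here select and group perform only field extraction and keyed appends, the kind of simple operations that fit a switch data plane, while aggregate is free to run arbitrary application logic on the server.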