Linux System and Performance Monitoring (Network)

Date: 2009.07.21
Author: Darren Hoch
Translation: Tonnyom [AT] hotmail.com

This follows the first three articles in the series:
Linux System and Performance Monitoring (CPU)
Linux System and Performance Monitoring (Memory)
Linux System and Performance Monitoring (I/O)

8.0 Network Monitoring Introduction

Of all the subsystems, the network is the most difficult to monitor. This is mainly because the network is an abstract concept: when monitoring network performance on a system, too many factors outside the host come into play, including latency, collisions, congestion, and packet loss.

This article describes how to check the performance of Ethernet, IP, and TCP.

8.1 Ethernet Configuration Settings

Unless explicitly configured otherwise, almost all NICs auto-negotiate their network speed. When a network contains many different devices, they may each negotiate different rates and duplex modes.

Most commercial networks run at 100 or 1000BaseTX. Use ethtool to determine the rate at which a given interface is running.

In the following example, a system with a 100BaseTX NIC has auto-negotiated down to 10BaseTX.

# ethtool eth0
Settings for eth0:
Supported ports: [ TP MII ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
Advertised auto-negotiation: Yes
Speed: 10 Mb/s
Duplex: Half
Port: MII
PHYAD: 32
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: pumbg
Wake-on: d
Current message level: 0x00000007 (7)
Link detected: yes

The following example shows how to force the NIC speed to be adjusted to 100 BaseTX:

# ethtool -s eth0 speed 100 duplex full autoneg off

# ethtool eth0
Settings for eth0:
Supported ports: [ TP MII ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
Advertised auto-negotiation: No
Speed: 100 Mb/s
Duplex: Full
Port: MII
PHYAD: 32
Transceiver: internal
Auto-negotiation: off
Supports Wake-on: pumbg
Wake-on: d
Current message level: 0x00000007 (7)
Link detected: yes
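To keep the forced setting across reboots, it can usually also be placed in the interface configuration rather than run by hand. A minimal sketch for a Red Hat style system, assuming the interface is managed through /etc/sysconfig/network-scripts/ifcfg-eth0 (other distributions use different mechanisms):

# grep ETHTOOL_OPTS /etc/sysconfig/network-scripts/ifcfg-eth0
ETHTOOL_OPTS="speed 100 duplex full autoneg off"

The network initscripts pass this string to ethtool each time eth0 is brought up.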

8.2 Monitoring Network Throughput

A properly synchronized interface does not mean there are no bandwidth problems. It is not usually possible to control or tune every switch, cable, and router that sits between two hosts. The best way to test network throughput is to send data between the two systems and measure statistics such as latency and speed.

8.2.0 Using iptraf to View Local Throughput

The iptraf tool (http://iptraf.seul.org) provides a dashboard of throughput for each NIC.

# iptraf -d eth0
Figure 1: Monitoring for Network Throughput

The output shows that the system is transmitting at 61 Mbps, which is a little slow for a 100 Mbps network.
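For unattended collection, iptraf can also log the same interface statistics to a file instead of the interactive dashboard. A minimal sketch, assuming an iptraf build that supports the -B (background), -t (timeout in minutes), and -L (log file) options; the log path is only an example:

# iptraf -d eth0 -t 5 -B -L /var/log/iptraf-eth0.log

This gathers detailed eth0 statistics for 5 minutes in the background and writes them to the log file for later review.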

8.2.1 Using netperf to View Endpoint Throughput

Unlike iptraf, which passively monitors local traffic, netperf lets an administrator run more controlled throughput tests. It is very helpful for determining the throughput available between a client workstation and a heavily loaded server (such as a file or web server). The netperf tool runs in client/server mode.

To run a basic controlled throughput test, the netperf server must first be started on the server system:

server# netserver
Starting netserver at port 12865
Starting netserver at hostname 0.0.0.0 port 12865 and family AF_UNSPEC

The netperf tool can run several types of tests; the most basic is a standard throughput test. In the following example, a 30-second TCP throughput sample is taken from the client over the LAN:

As the output below shows, the network throughput is about 89 Mbps. The server (192.168.1.215) and client are on the same LAN. This is very good for a 100 Mbps network.

client# netperf -H 192.168.1.215 -l 30
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 192.168.1.230 (192.168.1.230) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    30.02      89.46

Moving from the LAN to a 54G wireless router (note: Wireless-G is the 54 Mbps 802.11g wireless standard) and testing at a range of 10 feet, the throughput drops sharply: with a theoretical maximum of 54 Mbits, the laptop achieves a total throughput of only 14 Mbits.

client# netperf -H 192.168.1.215 -l 30
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 192.168.1.215 (192.168.1.215) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    30.10      14.09

At a range of 50 feet, the throughput drops further, to about 5 Mbits.

# netperf -H 192.168.1.215 -l 30
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 192.168.1.215 (192.168.1.215) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    30.64       5.05

Moving from the LAN to the Internet, the throughput drops below 1 Mbit.

# netperf -H litemail.org -p 1500 -l 30
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to litemail.org (72.249.104.148) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    31.58       0.93

The last test is over a VPN connection, which gives the worst throughput of all the network environments.

# netperf -H 10.0.1.129 -l 30
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 10.0.1.129 (10.0.1.129) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    31.99       0.51

netperf can also help measure the total number of TCP request/response transactions per second. It establishes a single TCP connection and sends requests and responses over it in sequence (by default both the request and the response are 1 byte). This is a bit like an RDBMS executing multiple transactions, or a mail server piping multiple messages over the same connection.

The following example simulates TCP request/response traffic over 30 seconds:

client# netperf -t TCP_RR -H 192.168.1.230 -l 30
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 192.168.1.230 (192.168.1.230) port 0 AF_INET
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       30.00    4453.80
16384  87380

The output shows that this network can sustain about 4453 transactions (PSH/ACK exchanges) per second with 1-byte payloads. This is close to ideal; in real workloads most requests, and especially responses, are larger than 1 byte.
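To put that figure in perspective (a rough calculation, not part of the netperf output): 4453 transactions per second with 1-byte requests and 1-byte responses moves only about 4453 * 2 = 8906 bytes of payload per second, so this test is really measuring request/response turnaround rather than bandwidth.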

In a more realistic example, the client requests 2K and the server responds with 32K (set here with the -r option):

client# netperf -t TCP_RR -H 192.168.1.230 -l 30 -r 2048,32768
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 192.168.1.230 (192.168.1.230) port 0 AF_INET
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  2048     32768   30.00     222.37
16384  87380

The transaction rate drops to about 222 per second.
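As a rough cross-check (again an approximation, not from the netperf output): 222 transactions per second at 2048 bytes of request plus 32768 bytes of response is about 222 * (2048 + 32768) * 8 ≈ 62 Mbits/sec of payload, still a substantial share of the ~89 Mbits/sec stream throughput measured earlier; the lower transaction count simply reflects how much data each transaction now carries.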

8.2.2 Using iperf to Evaluate Network Efficiency

The iperf tool is similar to netperf in that it tests the connection between two endpoints. The difference is that iperf digs deeper into TCP/UDP efficiency by way of window sizes and QoS settings. The tool is tailored to administrators who need to tune TCP/IP stacks and test their efficiency.

iperf is a single binary that can run in either server or client mode. Port 5001 is used by default.

First, start iperf in server mode on the server (192.168.1.215):

server# iperf -s -D
Running Iperf Server as a daemon
The Iperf daemon process ID: 3655
--------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
--------------------

In the following example, the client repeatedly runs iperf over a wireless network to test throughput. The wireless network is assumed to be heavily used, with many hosts downloading ISO image files.

The client connects to the server (192.168.1.215) and runs a 60-second bandwidth test, sampling every 5 seconds.

client# iperf -c 192.168.1.215 -t 60 -i 5
--------------------
Client connecting to 192.168.1.215, TCP port 5001
TCP window size: 25.6 KByte (default)
--------------------
[3] local 192.168.224.150 port 51978 connected with 192.168.1.215 port 5001
[ID] Interval Transfer Bandwidth
[3] 0.0-5.0 sec 6.22 MBytes 10.4 Mbits/sec
[ID] Interval Transfer Bandwidth
[3] 5.0-10.0 sec 6.05 MBytes 10.1 Mbits/sec
[ID] Interval Transfer Bandwidth
[3] 10.0-15.0 sec 5.55 MBytes 9.32 Mbits/sec
[ID] Interval Transfer Bandwidth
[3] 15.0-20.0 sec 5.19 MBytes 8.70 Mbits/sec
[ID] Interval Transfer Bandwidth
[3] 20.0-25.0 sec 4.95 MBytes 8.30 Mbits/sec
[ID] Interval Transfer Bandwidth
[3] 25.0-30.0 sec 5.21 MBytes 8.74 Mbits/sec
[ID] Interval Transfer Bandwidth
[3] 30.0-35.0 sec 2.55 MBytes 4.29 Mbits/sec
[ID] Interval Transfer Bandwidth
[3] 35.0-40.0 sec 5.87 MBytes 9.84 Mbits/sec
[ID] Interval Transfer Bandwidth
[3] 40.0-45.0 sec 5.69 MBytes 9.54 Mbits/sec
[ID] Interval Transfer Bandwidth
[3] 45.0-50.0 sec 5.64 MBytes 9.46 Mbits/sec
[ID] Interval Transfer Bandwidth
[3] 50.0-55.0 sec 4.55 MBytes 7.64 Mbits/sec
[ID] Interval Transfer Bandwidth
[3] 55.0-60.0 sec 4.47 MBytes 7.50 Mbits/sec
[ID] Interval Transfer Bandwidth
[3] 0.0-60.0 sec 61.9 MBytes 8.66 Mbits/sec

Other traffic to and from this host also affects these bandwidth samples. As shown, over the full 60 seconds the bandwidth fluctuates between roughly 4 and 10 Mbits.

In addition to TCP tests, iperf's UDP tests mainly evaluate packet loss and jitter.

The next iperf test runs over the same 54 Mbit Wireless-G network. As the earlier examples showed, the usable throughput is only about 9 Mbits.

# iperf -c 192.168.1.215 -b 10M
WARNING: option -b implies udp testing
--------------------
Client connecting to 192.168.1.215, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 107 KByte (default)
--------------------
[3] local 192.168.224.150 port 33589 connected with 192.168.1.215 port 5001
[ID] Interval Transfer Bandwidth
[3] 0.0-10.0 sec 11.8 MBytes 9.90 Mbits/sec
[3] Sent 8420 datagrams
[3] Server Report:
[ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[3] 0.0-10.0 sec 6.50 MBytes 5.45 Mbits/sec 0.480 ms 3784/8419 (45%)
[3] 0.0-10.0 sec 1 datagrams received out-of-order

The output shows that when trying to transmit 10M of data, only 5.45 Mbits/sec is actually achieved and 45% of the packets are lost.

8.3 Individual Connections with tcptrace

The tcptrace tool provides detailed TCP-related information for individual connections. It uses libpcap to analyze specific TCP sessions and reports information that is otherwise difficult to pick out of a TCP stream, including:

1. TCP retransmissions
2. TCP window sizes - slow connections are often associated with small window sizes
3. Total throughput of the connection
4. Connection duration

8.3.1 Case Study - Using tcptrace

The tcptrace tool may already be packaged by some Linux distributions; the author downloaded the package from http://dag.wieers.com/rpm/packages/tcptrace. tcptrace takes a libpcap capture file as input. When run with no options, tcptrace lists each unique connection found in the capture.
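If a capture file does not already exist, one can be recorded with tcpdump and then fed to tcptrace. A minimal sketch, assuming eth0 is the interface to watch and bigstuff is simply the example filename used below:

# tcpdump -i eth0 -s 0 -w bigstuff

The -w option writes raw libpcap output that tcptrace can read; -s 0 captures whole packets rather than truncating them at the default snap length (older tcpdump versions truncate aggressively).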

The following example runs tcptrace against a libpcap capture file named bigstuff:

# tcptrace bigstuff
1 arg remaining, starting with 'bigstuff'
Ostermann's tcptrace - version 6.6.7 - Thu Nov 4, 2004

146108 packets seen, 145992 TCP packets traced
elapsed wallclock time: 0:00:01.634065, 89413 pkts/sec analyzed
trace file elapsed time: 0:09:20.358860
TCP connection info:
1: 192.168.1.60:pcanywherestat - 192.168.1.102:2571 (a2b) 404> 450<
2: 192.168.1.60:3356 - ftp.strongmail.net:21 (c2d) 35> 21<
3: 192.168.1.60:3825 - ftp.strongmail.net:65023 (e2f) 5> 4< (complete)
4: 192.168.1.102:1339 - 205.188.194:5190 (g2h) 6> 6<
5: 192.168.1.102:1490 - cs127.msg.mud.yahoo.com:5050 (i2j) 5> 5<
6: py-in-f111.google.com:993 - 192.168.1.102:3785 (k2l) 13> 14<

In the output above, each connection is listed with its source and destination host. Using the -l and -o options, tcptrace reports more detailed statistics for a single connection. The following result shows the statistics for connection #1 in the bigstuff file:

# tcptrace -l -o1 bigstuff
1 arg remaining, starting with 'bigstuff'
Ostermann's tcptrace - version 6.6.7 - Thu Nov 4, 2004

146108 packets seen, 145992 TCP packets traced
elapsed wallclock time: 0:00:00.529361, 276008 pkts/sec analyzed
trace file elapsed time: 0:09:20.358860
TCP connection info:
32 TCP connections traced:
TCP connection 1:
host a: 192.168.1.60:pcanywherestat
host b: 192.168.1.102:2571
complete conn: no (SYNs: 0) (FINs: 0)
first packet: Sun Jul 20 15:58:05.472983 2008
last packet: Sun Jul 20 16:00:04.564716 2008
elapsed time: 0:01:59.091733
total packets: 854
filename: bigstuff
a->b:                          b->a:
Total packets: 404 total packets: 450
Ack pkts sent: 404 ack pkts sent: 450
Pure acks sent: 13 pure acks sent: 320
Sack pkts sent: 0 sack pkts sent: 0
Dsack pkts sent: 0 dsack pkts sent: 0
Max sack blks/ack: 0 max sack blks/ack: 0
Unique bytes sent: 52608 unique bytes sent: 10624
Actual data pkts: 391 actual data pkts: 130
Actual data bytes: 52608 actual data bytes: 10624
Rexmt data pkts: 0 rexmt data pkts: 0
Rexmt data bytes: 0 rexmt data bytes: 0
Zwnd probe pkts: 0 zwnd probe pkts: 0
Zwnd probe bytes: 0 zwnd probe bytes: 0
Outoforder pkts: 0 outoforder pkts: 0
Pushed data pkts: 391 pushed data pkts: 130
SYN/FIN pkts sent: 0/0 SYN/FIN pkts sent: 0/0
Urgent data pkts: 0 pkts urgent data pkts: 0 pkts
Urgent data bytes: 0 bytes urgent data bytes: 0 bytes
Mss requested: 0 bytes mss requested: 0 bytes
Max segm size: 560 bytes max segm size: 176 bytes
Min segm size: 48 bytes min segm size: 80 bytes
Avg segm size: 134 bytes avg segm size: 81 bytes
Max win adv: 19584 bytes max win adv: 65535 bytes
Min win adv: 19584 bytes min win adv: 64287 bytes
Zero win adv: 0 times zero win adv: 0 times
Avg win adv: 19584 bytes avg win adv: 64949 bytes
Initial window: 160 bytes initial window: 0 bytes
Initial window: 2 pkts initial window: 0 pkts
Ttl stream length: NA
Missed data: NA
Truncated data: 36186 bytes truncated data: 5164 bytes
Truncated packets: 391 pkts truncated packets: 130 pkts
Data xmit time: 119.092 secs data xmit time: 116.954 secs
Idletime max: 441267.1 ms idletime max: 441506.3 ms
Throughput: 442 Bps throughput: 89 Bps

8.3.2 Case Study - Calculating the Retransmission Rate

It is not immediately obvious which connections have serious retransmission problems; they have to be identified through analysis. tcptrace can single out problem connections using filters and Boolean expressions. On a very busy network with many connections, almost every connection retransmits at some point; finding the ones that retransmit the most is the key to locating the problem.

In the following example, tcptrace finds connections that retransmitted more than 100 segments:

# tcptrace -f 'rexmit_segs>100' bigstuff
Output filter: ((c_rexmit_segs>100)OR(s_rexmit_segs>100))
1 arg remaining, starting with 'bigstuff'
Ostermann's tcptrace - version 6.6.7 - Thu Nov 4, 2004

146108 packets seen, 145992 TCP packets traced
elapsed wallclock time: 0:00:00.687788, 212431 pkts/sec analyzed
trace file elapsed time: 0:09:20.358860
TCP connection info:
16: ftp.strongmail.net:65014 - 192.168.1.60:2158 (ae2af) 18695> 9817<

In this output, connection #16 retransmitted more than 100 segments. Run the following command to view the rest of the statistics for that connection:

# tcptrace -l -o16 bigstuff
1 arg remaining, starting with 'bigstuff'
Ostermann's tcptrace - version 6.6.7 - Thu Nov 4, 2004

146108 packets seen, 145992 TCP packets traced
elapsed wallclock time: 0:00:01.355964, 107752 pkts/sec analyzed
trace file elapsed time: 0:09:20.358860
TCP connection info:
32 TCP connections traced:
================================
TCP connection 16:
host ae: ftp.strongmail.net:65014
host af: 192.168.1.60:2158
complete conn: no (SYNs: 0) (FINs: 1)
first packet: Sun Jul 20 16:04:33.257606 2008
last packet: Sun Jul 20 16:07:22.317987 2008
elapsed time: 0:02:49.060381
total packets: 28512
filename: bigstuff
ae->af:                        af->ae:

Unique bytes sent: 25534744 unique bytes sent: 0
Actual data pkts: 18695 actual data pkts: 0
Actual data bytes: 25556632 actual data bytes: 0
Rexmt data pkts: 1605 rexmt data pkts: 0
Rexmt data bytes: 2188780 rexmt data bytes: 0

Calculate the retransmission rate:
Rexmt/actual * 100 = Retransmission rate

1605/18695*100 = 8.5%

The reason this connection is slow is its 8.5% retransmission rate.
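The two counters used in this calculation can be pulled straight out of the tcptrace output; a minimal sketch (the grep pattern is only an example and matches both directions of the connection):

# tcptrace -l -o16 bigstuff | grep -iE 'rexmt data pkts|actual data pkts'

This prints the retransmitted and actual data packet counts needed for the percentage above.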

8.3.3 Case Study - Calculating Retransmission Time

The tcptrace tool has a series of modules that display data grouped by different attributes, including protocol, port, and time. The slice module makes it possible to observe TCP performance over intervals of time, so retransmissions can be lined up against other performance data to identify the bottleneck.

The following example demonstrates the tcptrace slice module:

# tcptrace -xslice bigfile

The preceding command creates a file named slice.dat in the current working directory. The file contains retransmission information broken into 15-second intervals:

# ls -l slice.dat
-rw-r--r-- 1 root 3430 Jul 10 slice.dat
# more slice.dat
Date segs bytes rexsegs rexbytes new active
-----------------------
22:19:41.913288 46 5672 0 0 1 1
22:19:56.913288 131 25688 0 0 0 1
22:20:11.913288 0 0 0 0 0 0
22:20:26.913288 5975 4871128 0 0 0 1
22:20:41.913288 31049 25307256 0 0 0 1
22:20:56.913288 23077 19123956 40 59452 0 1
22:21:11.913288 26357 21624373 5 7500 0 1
22:21:26.913288 20975 17248491 3 4500 12 13
22:21:41.913288 24234 19849503 10 15000 3 5
22:21:56.913288 27090 22269230 36 53999 0 2
22:22:11.913288 22295 18315923 12856 0 2
22:22:26.913288 8858 7304603 3 4500 0 1
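Because slice.dat is plain columnar text, the intervals that actually saw retransmissions can be picked out with a one-line awk sketch (the field positions assume the header shown above, where the fourth and fifth columns are rexsegs and rexbytes):

# awk 'NR > 2 && $4 > 0 {print $1, $4, $5}' slice.dat

This skips the two header lines and prints the timestamp, retransmitted segments, and retransmitted bytes for every 15-second interval that had at least one retransmission.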

8.4 Conclusion

Monitoring network performance consists of the following steps:

1. Check and confirm that all network interfaces are running at the proper rate.
2. Check the throughput of each NIC and confirm it is in line with the network speed while in service.
3. Monitor the types of network traffic and make sure the right traffic has priority.

Author "Never give up"

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.