(Repost) Linux System Troubleshooting 4: Network




Original: http://www.cnblogs.com/Security-Darren/p/4700387.html












Troubleshooting network failures on a Linux system.

Network troubleshooting generally follows a fixed line of thinking and order: starting from the specific symptom, check each segment where the fault may occur, one at a time, until the problem is finally pinned down.

So the first question to ask is: what exactly is the network problem? Is the network unreachable, or is it slow?



1. If the network is unreachable, locating the problem is generally a process of elimination: rule out the places that cannot be at fault until the root cause is found. The things to check are generally:

Whether the link is connected

Whether the corresponding network adapter is enabled

Whether the local network is connected

Whether DNS is working

Whether the target host can be routed to

Whether the remote port is open



2. If the network is slow, there are several ways to locate the source of the problem:

Whether DNS is the source of the problem

Which nodes along the route are the bottleneck

How the bandwidth is being used






I. The network is unreachable



In general, when a network failure occurs, information is collected at both the accessing end and the accessed end in order to determine which host or network segment has the problem. If A cannot access C but B can access C, the problem is clearly on A or on the network between A and C. If several machines A and B on the same subnet can access the network normally but none of them can access C, the problem is probably on the network path to C, or on C itself.

Once the host with the problem has been identified, the following steps gradually narrow the problem down until it is located:



1. Whether the link is connected



That is, check whether the NIC is physically connected to the network: the cable is plugged in and the link is usable. Often you cannot go to the machine room right away to verify the physical connection, so you can use the command:


# ethtool ethN


where ethN is the NIC connected to the failing network.



Example 1: Viewing the physical connection of eth0 using ethtool


 1 # ethtool eth0
 2 Settings for eth0:
 3         Supported ports: [ TP ]
 4         Supported link modes:   10baseT/Half 10baseT/Full
 5                                 100baseT/Half 100baseT/Full
 6                                 1000baseT/Full
 7         Supported pause frame use: No
 8         Supports auto-negotiation: Yes
 9         Advertised link modes:  10baseT/Half 10baseT/Full
10                                 100baseT/Half 100baseT/Full
11                                 1000baseT/Full
12         Advertised pause frame use: No
13         Advertised auto-negotiation: Yes
14         Speed: 1000Mb/s
15         Duplex: Full
16         Port: Twisted Pair
17         PHYAD: 1
18         Transceiver: internal
19         Auto-negotiation: on
20         MDI-X: Unknown
21         Supports Wake-on: g
22         Wake-on: g
23         Link detected: yes





Line 14 shows the current speed of the NIC; this is a gigabit NIC. Line 15 shows that the link is currently running in full duplex, and line 23 shows that the physical connection between the NIC and the network is up. Speed and duplex mode are usually negotiated automatically between the host and the device at the other end of the link; here line 8 shows that auto-negotiation is supported and line 19 shows that it is on. If the duplex mode on line 15 turns out to be Half, you can manually change it to full duplex:


# ethtool -s eth0 autoneg off duplex full
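
The three fields discussed above can also be pulled out of the ethtool output with a short script. This is only a minimal sketch that is not part of the original article: the function name link_summary is made up for illustration, and it assumes the "Key: value" layout shown in Example 1.

```shell
# link_summary: read `ethtool <iface>` output on stdin and print the
# speed, duplex mode and link state on a single line.
# Hypothetical helper; assumes the "Key: value" layout of Example 1.
link_summary() {
    awk -F': *' '
        /^[ \t]*Speed:/         { speed = $2 }
        /^[ \t]*Duplex:/        { duplex = $2 }
        /^[ \t]*Link detected:/ { link = $2 }
        END { printf "speed=%s duplex=%s link=%s\n", speed, duplex, link }
    '
}

# Usage (ethtool usually needs root):
#   ethtool eth0 | link_summary
```

A one-line summary like this is convenient when the same check has to be repeated on several hosts or interfaces.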





2. Whether the NIC is enabled properly



Physical link failures are only one possible cause; once they have been ruled out, the next step is to check the working status of the NIC.



Example 2: Checking the status of NIC eth1 using the ifconfig command


1 # ifconfig eth1
2 eth1      Link encap:Ethernet  HWaddr e4:1f:13:b5:b0:62  
3           inet addr:10.0.0.11  Bcast:10.0.0.255  Mask:255.255.255.0
4           inet6 addr: fe80::e61f:13ff:feb5:b062/64 Scope:Link
5           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
6           RX packets:74282478 errors:0 dropped:0 overruns:0 frame:0
7           TX packets:77425890 errors:0 dropped:0 overruns:0 carrier:0
8           collisions:0 txqueuelen:1000 
9           RX bytes:13948947045 (13.9 GB)  TX bytes:51073249506 (51.0 GB)


Line 3 of Example 2 shows the configuration of the NIC, including the IP address and subnet mask; check here for any mismatch. If this line is missing or wrong, the NIC has most likely not been configured or brought up correctly.


    • On Debian-based Linux, the (permanent) network configuration file is /etc/network/interfaces
    • On Red Hat-based Linux, the (permanent) network configuration file is /etc/sysconfig/network-scripts/ifcfg-<interface>





3. Whether the gateway is set up correctly



If the NIC has come up properly, the next thing to confirm is that the interface is configured with the correct gateway and that the connection between the host and the gateway works. The route and ping commands together cover this stage of troubleshooting.



Example 3: Using the route command to view the kernel routing table


# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         101.111.123.1   0.0.0.0         UG    0      0        0 eth0
10.0.0.0        0.0.0.0         255.255.255.0   U     0      0        0 eth1
101.111.123.0   0.0.0.0         255.255.255.0   U     0      0        0 eth0


route -n displays gateways and other entries as IP addresses instead of host names, which is faster and does not involve DNS. Use the route command to view the kernel routing table and verify that the NIC in question has a route to the destination network; then try to ping the gateway to check the connection to it.



If you cannot ping the gateway, the gateway may be blocking ICMP packets, or there may be a problem with the switch configuration.
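
The two steps above (find the default gateway, then ping it) can be combined. The following is a rough sketch rather than anything from the original article; default_gateway is a made-up helper name, and it assumes the column layout of route -n shown in Example 3.

```shell
# default_gateway: read `route -n` output on stdin and print the gateway
# of the default route (destination 0.0.0.0 with the G flag set).
# Hypothetical helper; assumes the column order shown in Example 3.
default_gateway() {
    awk '$1 == "0.0.0.0" && $4 ~ /G/ { print $2; exit }'
}

# Usage: ping the default gateway three times without resolving names.
#   gw=$(route -n | default_gateway)
#   ping -n -c 3 "$gw"
```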






4. Whether DNS is working



Many network problems are caused by DNS failures or misconfiguration. The nslookup and dig commands can be used to troubleshoot DNS problems.



Example 4: Using the nslookup command to check DNS resolution


# nslookup baidu.com
Server:        10.21.1.205
Address:    10.21.1.205#53

Non-authoritative answer:
Name:    baidu.com
Address: 220.181.57.217
Name:    baidu.com
Address: 123.125.114.144
Name:    baidu.com
Address: 180.149.132.47


Here the DNS server 10.21.1.205 is on the current LAN, and the nslookup result shows that DNS is working properly. If nslookup cannot resolve the target domain name, DNS is most likely misconfigured; check whether /etc/resolv.conf contains an entry for the name server:



Example 5: DNS configuration in the /etc/resolv.conf file, which takes effect immediately


# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 10.21.1.205


The settings in /etc/resolv.conf take effect immediately but are only temporary. To configure the DNS server address permanently on a Debian-based system, set the "dns-nameservers" field in /etc/network/interfaces:



Example 6: Permanent DNS configuration in the /etc/network/interfaces file


auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
        network ...
        netmask 255.255.255.0
        broadcast ...
        gateway ...
        address ...
        dns-nameservers 10.21.1.205


If the DNS server is inside our own subnet and cannot be pinged, the DNS server is probably down.
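
To check every configured name server in one pass, the nameserver entries can be read from the resolver configuration. This is an illustrative sketch only; list_nameservers is a hypothetical helper name, and it assumes the plain "nameserver ADDRESS" format shown in Example 5.

```shell
# list_nameservers [FILE]: print the nameserver addresses found in a
# resolv.conf-style file (default: /etc/resolv.conf). Comment lines are
# skipped because their first field is "#", not "nameserver".
# Hypothetical helper, not from the original article.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Usage: query each configured server directly, then ping it.
#   for ns in $(list_nameservers); do
#       dig +short @"$ns" baidu.com
#       ping -n -c 1 "$ns"
#   done
```

Querying a server directly with dig @server separates "this server is broken" from "the resolver configuration is wrong".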






5. Whether the remote host can be routed to



The Internet is made up of a large number of routers relaying traffic; a network request hops from node to node until it finally reaches its destination. The most direct and most commonly used command for checking connectivity is ping: if ping succeeds, routing is working properly. If ping fails, the traceroute command shows the complete hop-by-hop path from the current host to the target host. ping uses ICMP echo packets; traceroute on Linux sends UDP probes by default (ICMP echo with the -I option) and relies on the ICMP replies returned by each hop.



Example 7: Using traceroute to trace the route


 1 # traceroute www.baidu.com
 2 traceroute to www.baidu.com (220.181.111.188), 30 hops max, 60 byte packets
 3  1  123.123.123.1 (123.123.123.1)  1.844 ms  1.847 ms  2.102 ms
 4  2  1.1.1.6 (1.1.1.6)  0.389 ms  0.393 ms  0.542 ms
 5  3  localhost (10.1.150.1)  2.556 ms  3.730 ms  3.155 ms
 6  4  localhost (10.12.16.17)  1.214 ms  1.190 ms  1.196 ms
 7  5  localhost (10.12.30.105)  1.533 ms  1.541 ms localhost (10.12.30.101)  1.692 ms
 8  6  202.112.41.37 (202.112.41.37)  3.350 ms  2.998 ms  2.977 ms
 9  7  101.4.112.94 (101.4.112.94)  4.631 ms 101.4.117.82 (101.4.117.82)  3.846 ms 101.4.112.94 (101.4.112.94)  3.808 ms
10  8  101.4.112.89 (101.4.112.89)  3.120 ms  2.844 ms  2.857 ms
11  9  101.4.115.9 (101.4.115.9)  5.957 ms  5.912 ms  4.741 ms
12 10  101.4.117.110 (101.4.117.110)  2.080 ms  2.070 ms  2.036 ms
13 11  202.97.88.229 (202.97.88.229)  35.257 ms 202.97.57.45 (202.97.57.45)  35.373 ms 202.97.57.49 (202.97.57.49)  35.244 ms
14 12  * * *
15 13  * * *
16 14  * 220.181.17.18 (220.181.17.18)  35.869 ms 220.181.182.34 (220.181.182.34)  38.279 ms
17 15  * * *
18 16  * * *
19 17  * * *
20 18  * * *
21 19  * * *
22 20  * * *
23 21  * * *
24 22  * * *
25 23  * * *
26 24  * * *
27 25  * * *
28 26  * * *
29 27  * * *
30 28  * * *
31 29  * * *
32 30  * * *


Looking at line 3, the first hop reaches the gateway of the current subnet; the route then jumps to an address belonging to the Asia-Pacific Network Information Centre (APNIC) in Australia, and so on. traceroute shows where along the path the relay is interrupted and what the latency of each hop is. A "*" means that no reply was received for that probe, either because the hop is unreachable or because a gateway filters the probe or reply packets.






6. Whether the port on the remote host is open



The telnet command is a handy tool for checking whether a port is open; the nmap tool can also be used.



Example 8: Using telnet to check whether a port is open on a remote host


# telnet 220.181.111.188 80
Trying 220.181.111.188...
Connected to 220.181.111.188.
Escape character is '^]'.


telnet IP PORT shows whether the target port is open on the specified remote host. Here Baidu's front-end server has port 80 open, as required for web services.



However, telnet is quite limited: when a firewall is present it cannot show what is really happening, so a failed telnet connection has two possible explanations: (1) the port is not open, or (2) a firewall filtered the connection.



For example, let us try to telnet to port 22 of the Baidu front-end server:


# telnet 220.181.111.188 22
Trying 220.181.111.188...
telnet: Unable to connect to remote host: Connection timed out


The connection fails, but we cannot tell whether the port is closed or blocked by a firewall. The nmap tool is more powerful here:



Example 9: Using the nmap tool to check the state of a port


1 # nmap -p 22 220.181.111.188 
2 
3 Starting Nmap 6.40 ( http://nmap.org ) at 2015-08-10 20:45 CST
4 Nmap scan report for 220.181.111.188
5 Host is up (0.040s latency).
6 PORT   STATE    SERVICE
7 22/tcp filtered ssh


For the same server, nmap reports on line 7 that port 22 is in the filtered state: a firewall is dropping the probe packets, so nmap cannot tell whether a service is actually listening on the port. If the port were reachable but nothing were listening on it, the state on line 7 would be closed instead of filtered. An open port is reported as open.



As you can see, a port may be impossible to connect to either because it is closed or because a firewall filters the traffic.
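
When neither telnet nor nmap is installed, the same open / closed / filtered distinction can be approximated with bash alone. The sketch below is an assumption-laden illustration, not the article's method: probe_port and state_from_rc are made-up names, it relies on bash's /dev/tcp pseudo-device and the coreutils timeout command, and it treats "no answer within the time limit" as filtered, which is only a heuristic.

```shell
# state_from_rc: map the exit status of the probe below to a port state.
#   0    -> the TCP connection succeeded          (open)
#   124  -> `timeout` gave up waiting for a reply (probably filtered)
#   else -> the connection was refused or failed  (closed or unreachable)
state_from_rc() {
    case "$1" in
        0)   echo open ;;
        124) echo filtered ;;
        *)   echo closed ;;
    esac
}

# probe_port HOST PORT [SECONDS]: attempt a TCP connection through bash's
# /dev/tcp pseudo-device, giving up after SECONDS (default 3).
# Hypothetical helper; pass an IP address so that DNS is not involved.
probe_port() {
    local host=$1 port=$2 secs=${3:-3} rc=0
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" \
        2>/dev/null || rc=$?
    state_from_rc "$rc"
}

# Usage:
#   probe_port 220.181.111.188 80
#   probe_port 220.181.111.188 22
```

A refused connection returns immediately while a silently dropped one runs into the time limit, which is the same difference that nmap reports as closed versus filtered.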






7. Viewing listening ports on the local host



If you want to see whether a port is open locally, you can use the following command:


# netstat -lnp | grep PORT


The options are:


    • -l shows the sockets that are listening
    • -p shows the process ID and name of the process that owns each socket
    • -n shows addresses numerically





Example 10: Viewing the listening state of a specific local port


# netstat -lnp | grep :11211
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 10.0.0.11:11211         0.0.0.0:*               LISTEN      28911/memcached
udp        0      0 10.0.0.11:11211         0.0.0.0:*                           28911/memcached


Example 10 uses the memcached service to show how to view the listening state of a port. If netstat does not list the specified port, no process is listening on it.



The first column is the socket's protocol, columns 2 and 3 show the receive and send queues, column 4 is the local address the host is listening on (which shows which network the socket listens on), column 6 shows the current state of the socket, and the last column shows the process that opened the port.
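
Matching on the port column is more precise than a plain grep, which also matches the port number anywhere else on the line. The following is a small sketch under that assumption; listeners_on_port is a hypothetical helper name, and it assumes the column layout described above with a program name that contains no spaces.

```shell
# listeners_on_port PORT: read `netstat -lnp` output on stdin and print
# "protocol local-address pid/program" for every socket bound to PORT.
# Hypothetical helper; assumes the column layout of Example 10.
listeners_on_port() {
    awk -v port="$1" '
        $1 ~ /^(tcp|udp)/ {
            n = split($4, parts, ":")   # column 4 is the local address
            if (parts[n] == port)       # the text after the last ":" is the port
                print $1, $4, $NF       # the last column is PID/program
        }
    '
}

# Usage (needs root to see processes owned by other users):
#   netstat -lnp | listeners_on_port 11211
```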






8. Viewing firewall rules



Use the command


# iptables -L


to view the current host's firewall rules. The features of iptables are not covered here; a follow-up article will introduce them in detail.






II. Troubleshooting a slow network



Troubleshooting a slow network is actually more challenging than troubleshooting an unreachable one, because the cause often lies with the ISP, DNS, and so on. Such failures are usually outside our control; all we can do is collect evidence and report it or file a complaint.



To avoid being affected by DNS, add the -n option to the commands mentioned above (route, ping, traceroute, and netstat all accept it); -n stops the command from trying to resolve IP addresses to host names, thereby bypassing DNS.



1. traceroute



The traceroute command mentioned above not only verifies that the route is correct but also shows the latency of each hop, which makes it possible to locate the network segment with the highest latency.
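
On a long trace it can help to have the slowest hop picked out automatically. This is a minimal sketch, not something from the original article; slowest_hop is a made-up helper name, and it simply reports the hop with the largest single round-trip time, so a hop that merely deprioritizes its replies may show up as a false positive.

```shell
# slowest_hop: read traceroute output on stdin and print the hop number
# with the largest round-trip time seen, together with that time.
# Hypothetical helper; any field followed by "ms" is treated as an RTT.
slowest_hop() {
    awk '
        $1 ~ /^[0-9]+$/ {                    # hop lines start with a number
            for (i = 2; i < NF; i++)
                if ($(i + 1) == "ms" && $i + 0 > max) {
                    max = $i + 0
                    hop = $1
                }
        }
        END { if (hop != "") printf "hop %s: %.3f ms\n", hop, max }
    '
}

# Usage:
#   traceroute -n www.baidu.com | slowest_hop
```

Hops that answer only with "*" contribute no timing and are skipped.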



2. iftop



The iftop command is similar to the top command: it shows which network connections are consuming the most bandwidth.



Example 11: Using the iftop command to view the bandwidth consumed by each connection



[Figure: iftop output]



iftop sorts connections by bandwidth consumption, from high to low, so the bandwidth-intensive connections can be identified.



The scale on the top row is the bandwidth scale. Below it, the 1st column is the source IP and the 2nd column is the destination IP; the arrows indicate the direction in which data is being transferred. The last three columns are the average data transfer rates between the two hosts over the last 2 s, 10 s, and 40 s respectively.



TX and RX at the bottom show the statistics for data sent and received, and TOTAL is the total amount of data transferred.


    • The -n option displays the IP addresses of connections directly; what Example 11 shows is the result with addresses resolved to domain names.
    • The -i option specifies which NIC to watch; by default iftop uses the first NIC it finds.
    • Once in iftop's interactive interface, press p to toggle the display of ports, s to show or hide the source host, and d to show or hide the destination host.


3. tcpdump



When every other troubleshooting technique has failed to find the cause of a slow network or serious packet loss, the last resort is packet capture. Ideally, capture on both ends of the communication at the same time, so that the packets sent and the packets received can be checked against each other. tcpdump is a commonly used packet capture tool.



Example 12: A tcpdump packet capture


# tcpdump
23:47:43.326284 IP ISeR-Server1.ntp > 183.60.211.47.9579: NTPv2, Reserved, length 440
23:47:43.326288 IP 58.221.64.43.27777 > ISeR-Server1.ntp: NTPv2, Reserved, length 8


Example 12 shows only two lines of captured output as an illustration. With tcpdump you can see the time of each communication, the addresses of both sides (numeric with the -n option), the ports, the purpose of the communication, the packet length, and so on.



To stop capturing, press Ctrl-C; tcpdump then reports the packet counts:


14422 packets captured
1127345 packets received by filter
1109698 packets dropped by kernel





tcpdump has many commonly used options, some of which are listed below for reference; its detailed usage is not covered here. GUI users can of course also use Wireshark, a more specialized analysis tool.


# tcpdump -n port N                   # capture only the traffic of a specific port
# tcpdump -n port N1 or port N2       # capture the traffic of multiple ports
# tcpdump -w output.pcap              # dump the raw packets to output.pcap
# tcpdump -C 10 -w output.pcap        # start a new file once the current one exceeds 10 million bytes
# tcpdump -C 10 -W 5 -w output.pcap   # limit both the size of each file and the total number of files
# tcpdump -r output.pcap              # replay the saved packet record


In addition, VBird's "Linux Private Kitchen" offers a similar approach to network troubleshooting:



1. Is the NIC working, including hardware and driver: lspci, dmesg



2. Are the IP parameters set correctly: ifconfig



3. Is communication within the LAN normal: ping



4. Is the routing information normal: route -n



5. DNS status: dig, nslookup



6. Status and latency of routing nodes: traceroute



7. Listening ports of services: netstat -lnp



8. Firewall: iptables, SELinux



In short, this is very much in line with the approach of this article.


