A classic case study of MAC address drift (MAC flapping)
The network topology is as follows:
The two carrier access switches are virtualized with IRF (that is, the two switches operate as one logical switch), and the two load balancers run VRRP hot backup.
The network is Layer 2 throughout, and the gateway of each link is located at the carrier.
Port g2/0/11 on S5800-2 is connected to the test laptop (223.1.5.41).
Port g1/0/3 on S5800-1 is connected to the mobile ISP gateway (223.1.5.1).
Problem:
A large number of users reported packet loss when pinging the mobile gateway (223.1.5.1) from servers on the mobile line. Connections dropped frequently and the network was unstable.
With the problem confirmed, the search for the fault starts from the nearest network node. First, I connected a laptop configured with a mobile IP (223.1.5.41) to S5800-1 and pinged the mobile gateway. The ping was normal, which indicates that the optical fiber link from the mobile operator is fine.
Then I moved the laptop to S5800-2. Pings to the mobile gateway lost packets, while pings to the servers below were normal. So the loss appeared to happen between S5800-2 and S5800-1. Only one pair of optical fibers connects the two S5800s for IRF, so the problem might be there, and I replaced the IRF fiber and the optical modules. Incredibly, the problem persisted.
C:\Users\Administrator> ping 223.1.5.1 -t
Pinging 223.1.5.1 with 32 bytes of data:
Reply from 223.1.5.1: bytes=32 time=1ms TTL=254
Request timed out.
Request timed out.
Request timed out.
Reply from 223.1.5.1: bytes=32 time=1ms TTL=254
Request timed out.
C:\Users\Administrator> arp -a
Interface: 223.1.5.2 --- 0xb
Internet Address    Physical Address     Type
223.1.5.1           00-00-5e-00-01-65    dynamic
223.1.5.41          00-22-15-4c-5d-42    dynamic
This seemed impossible. Reasoning from known conditions should give a unique result, and a logical contradiction like this should not occur. There is only one pair of IRF fibers between the two S5800s, and data can only travel over that pair. If neither the fiber nor the optical modules are faulty, the only remaining explanation is that the data reaches S5800-1 over the IRF fiber and part of it is then lost inside the switch.
Good! Let's collect traffic statistics to verify this:
telnet 10.10.10.12 \ S5800 IP address
sys
acl number 3876
rule permit ip source 223.1.5.41 0 destination 223.1.5.1 0
rule permit ip source 223.1.5.1 0 destination 223.1.5.41 0
quit
traffic classifier aaa
if-match acl 3876
quit
traffic behavior aaa
accounting packet
quit
qos policy aaa
classifier aaa behavior aaa
quit
interface GigabitEthernet 2/0/11
qos apply policy aaa inbound
qos apply policy aaa outbound
quit
interface GigabitEthernet 1/0/3
qos apply policy aaa inbound
qos apply policy aaa outbound
quit
Test: from the laptop (223.1.5.41) we ran ping 223.1.5.1 -n 100 to send 100 packets, and received only 64 replies.
[5800] display qos policy interface GigabitEthernet 2/0/11
Interface: GigabitEthernet2/0/11
  Direction: Inbound
  Policy: aaa
    Classifier: aaa
      Operator: AND
      Rule(s): If-match acl 3876
      Behavior: aaa
        Accounting Enable:
          100 (Packets)
  Direction: Outbound
  Policy: aaa
    Classifier: aaa
      Operator: AND
      Rule(s): If-match acl 3876
      Behavior: aaa
        Accounting Enable:
          64 (Packets)
[5800] display qos policy interface GigabitEthernet 1/0/3
Interface: GigabitEthernet1/0/3
  Direction: Inbound
  Policy: aaa
    Classifier: aaa
      Operator: AND
      Rule(s): If-match acl 3876
      Behavior: aaa
        Accounting Enable:
          64 (Packets)
  Direction: Outbound
  Policy: aaa
    Classifier: aaa
      Operator: AND
      Rule(s): If-match acl 3876
      Behavior: aaa
        Accounting Enable:
          64 (Packets)
The packets are being lost inside the switch: 100 packets entered port g2/0/11 of S5800-2 in the inbound direction, but only 64 left port g1/0/3 of S5800-1 in the outbound direction. Where did the remaining 36 packets go? Were they really lost inside 5800-1? Good! Let me take you inside the switch to see where the 36 packets disappear.
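The arithmetic behind this conclusion can be written out as a small Python sketch. The counter values are copied from the two display outputs above; the dictionary layout and the function name are my own illustrative assumptions, not part of any switch API.

```python
# Packet counts copied from the two "display qos policy interface" outputs.
counters = {
    ("GigabitEthernet2/0/11", "inbound"): 100,  # echo requests entering from the laptop
    ("GigabitEthernet2/0/11", "outbound"): 64,  # echo replies delivered to the laptop
    ("GigabitEthernet1/0/3", "inbound"): 64,    # echo replies arriving from the gateway
    ("GigabitEthernet1/0/3", "outbound"): 64,   # echo requests forwarded to the gateway
}


def lost_between(ingress, egress, table):
    """Packets counted at the ingress point but not at the egress point."""
    return table[ingress] - table[egress]


# Laptop -> gateway direction: in at g2/0/11, out at g1/0/3.
requests_lost = lost_between(
    ("GigabitEthernet2/0/11", "inbound"),
    ("GigabitEthernet1/0/3", "outbound"),
    counters,
)
# Gateway -> laptop direction: in at g1/0/3, out at g2/0/11.
replies_lost = lost_between(
    ("GigabitEthernet1/0/3", "inbound"),
    ("GigabitEthernet2/0/11", "outbound"),
    counters,
)
print(requests_lost, replies_lost)  # 36 0
```

Every reply that came back from the gateway reached the laptop, so the loss is entirely in the request direction, between the two counted ports.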
[5800-1] en_diag \ enter hidden mode
[5800-1] debug port mapping 1 \ display the internal port that corresponds to each interface
[Interface] [Unit] [Port] [Name] [Combo?] [Active?] [IfIndex] [MID] [Link] [Attr]
==================================================================================
GE1/0/41 3 ge2 no no 0x0/10 down Bridge
GE1/0/41 2 ge1 no no 0x0/20 down Bridge
GE1/0/3 0 5 ge4 no no 0x900002 4 up Bridge
..
..
XGE1/0/41 26 xe0 no no 0xbc0018 4 up Bridge
XGE1/0/26 0 27 xe1 no no 0xbc0019 4 up Bridge
XGE1/0/42 28 xe2 no no 0xbc001a 4 up Bridge
XGE1/0/41 29 hg0 no no 0xbc001b 4 up Bridge
Internal port 5 of the switch corresponds to g1/0/3, and internal port 27 corresponds to XGE1/0/26.
Because a Layer 2 switch forwards packets based only on MAC addresses, let's see where the mobile gateway's MAC address 0x00005e000165 has been learned. (It helps to first understand how a Layer 2 switch forwards packets.)
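For readers who want that refresher, here is a minimal Python model of Layer 2 learning and forwarding. It is a teaching sketch only: the class, the port labels, and the MAC strings are made up for illustration, and real switches add VLANs, aging, and hardware tables that are left out here.

```python
class LearningSwitch:
    """Toy model of a Layer 2 switch MAC table (no VLANs, no aging)."""

    def __init__(self):
        # MAC address -> port where it was last seen as a source address
        self.mac_table = {}

    def receive(self, src_mac, dst_mac, in_port):
        # Learn (or re-learn) the source MAC on the ingress port.
        self.mac_table[src_mac] = in_port
        # Forward to the learned port, or flood if the destination is unknown.
        return self.mac_table.get(dst_mac, "flood")


GATEWAY_MAC = "gateway-mac"  # hypothetical labels, not real addresses
HOST_MAC = "host-mac"
BROADCAST = "broadcast"

sw = LearningSwitch()

# The gateway sends a frame, so its MAC is learned on port-A.
sw.receive(GATEWAY_MAC, BROADCAST, "port-A")
to_gateway = sw.receive(HOST_MAC, GATEWAY_MAC, "port-C")
print(to_gateway)  # port-A

# Another device sends with the same source MAC on port-B: the entry moves.
sw.receive(GATEWAY_MAC, BROADCAST, "port-B")
misdirected = sw.receive(HOST_MAC, GATEWAY_MAC, "port-C")
print(misdirected)  # port-B
```

The point of the model is that the table holds one port per MAC address, so whichever device sent most recently with that source MAC attracts the traffic.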
[5800-diagnose] bcm 1 0 l2/conflict/mac=0x00005e000165/vlan=5
(Slot1) (Layer 2/conflict/mac/vlan)
Conflict: mac=00:00:5e:00:01:65 vlan=5 modid=4 port=5/ge4 SDHit Group=Learnt
[5800-diagnose] bcm 1 0 l2/conflict/mac=0x00005e000165/vlan=5
Conflict: mac=00:00:5e:00:01:65 vlan=5 modid=4 port=5/ge4 SDHit Group=Learnt
[5800-diagnose] bcm 2 0 l2/conflict/mac=0x00005e000165/vlan=5
(Slot2) (Layer 2/conflict/mac/vlan)
Conflict: mac=00:00:5e:00:01:65 vlan=5 modid=4 port=5 SDHit Group=Learnt
[5800-diagnose] bcm 2 0 l2/conflict/mac=0x00005e000165/vlan=5
Conflict: mac=00:00:5e:00:01:65 vlan=5 modid=4 port=27 SDHit Group=Learnt
Note: four queries were run in total. The first two were on slot 1 (S5800-1), where the MAC address stayed on port=5 and did not drift.
The last two were on slot 2 (S5800-2), where the MAC address did drift: once it was on port=5 and once on port=27.
port=5 (g1/0/3) and port=27 (XGE1/0/26) show that, as seen from S5800-2, mac=0x00005e000165 appears alternately on port g1/0/3 (connected to the mobile gateway) and on port XGE1/0/26 (connected to the load balancer-1 device).
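The same check can be expressed as a short Python sketch: group the observed ports by slot and flag any slot where the MAC was seen on more than one port. The sample list is copied from the four queries above; the function name is an illustrative assumption.

```python
from collections import defaultdict

# (slot, port) observations for mac=0x00005e000165, from the four queries.
samples = [(1, 5), (1, 5), (2, 5), (2, 27)]


def drifting_slots(observations):
    """Return {slot: sorted ports} for slots that saw the MAC on several ports."""
    ports_by_slot = defaultdict(set)
    for slot, port in observations:
        ports_by_slot[slot].add(port)
    return {
        slot: sorted(ports)
        for slot, ports in ports_by_slot.items()
        if len(ports) > 1
    }


drift = drifting_slots(samples)
print(drift)  # {2: [5, 27]}
```

Two samples per slot is a very small data set, so in practice it is worth repeating the query several times before drawing a conclusion.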
How does mac=0x00005e000165 end up on the load balancer-1 device? Did all 36 lost packets go to load balancer-1?
I logged on to load balancer-1 and found that the virtual MAC address of one of its VRRP groups (VRID = 101) is indeed 0x00005e000165, the same as the MAC address of the mobile gateway. What was puzzling is that the load balancer configuration had not been changed for a year. So why had the MAC address on the mobile operator's side changed?
To avoid further service impact, bind the MAC address of the mobile gateway immediately.
Solution:
Bind the MAC address of the mobile gateway 223.1.5.1 to port g1/0/3.
telnet 10.10.10.12 \ log on to the S5800
sys
interface GigabitEthernet 1/0/3
mac-address static 0000-5e00-0165 vlan 5
I called the mobile operator and learned that the previous night they had added another BRAS device in the data center and set up master/backup VRRP with it. The VRID happened to be 101. When VRRP is set up, the virtual MAC address is not random but derived from the VRID: VRID 101 gives MAC 0000-5e00-0165, VRID 102 gives MAC 0000-5e00-0166, and so on.
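The derivation is easy to reproduce. The VRRP RFCs define the IPv4 virtual router MAC address as 00-00-5E-00-01-{VRID}, with the VRID as the last octet. The Python sketch below builds it and prints it in the H3C-style grouping; the function name is my own.

```python
def vrrp_virtual_mac(vrid):
    """IPv4 VRRP virtual MAC (00-00-5E-00-01-{VRID}), formatted as xxxx-xxxx-xxxx."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be between 1 and 255")
    octets = [0x00, 0x00, 0x5E, 0x00, 0x01, vrid]
    hex_str = "".join(f"{octet:02x}" for octet in octets)
    # Split the 12 hex digits into three groups of four.
    return "-".join(hex_str[i:i + 4] for i in range(0, 12, 4))


mac_101 = vrrp_virtual_mac(101)
mac_102 = vrrp_virtual_mac(102)
print(mac_101)  # 0000-5e00-0165
print(mac_102)  # 0000-5e00-0166
```

Since the address depends only on the VRID, any two VRRP groups with the same VRID in the same Layer 2 domain produce the same virtual MAC, regardless of vendor or owner.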
However, neither the BRAS device nor the load balancer uses the vrrp method real-mac option, which makes VRRP use the MAC address of the real interface, and this is what leads to the MAC address conflict.
Many devices today run VRRP hot backup, but either are not configured to use the real MAC address or do not support that function.
Careful readers may have noticed that this is a weakness in VRRP that can cause large-scale network faults!
This article uses some switch debugging and configuration commands that are hard to find online, such as the traffic statistics configuration and the debugging commands in H3C hidden mode. Feel free to learn from them.
I wrote this article to share a network troubleshooting method with my friends: reasoning from known conditions gives a unique result, so when a logical contradiction appears, the cause is that one of the given known conditions is wrong!