Fixing a kernel soft lockup when running Docker on CentOS 7.0
I run Docker 1.5 on CentOS 7.0. Recently, one server began crashing intermittently. The logs show that the crashes are caused by a kernel bug. The error message is as follows:
May 11 03:43:08 ip-10-10-29-201 kernel: BUG: soft lockup - CPU#4 stuck for 22s! [handler20:1542]
May 11 03:43:08 ip-10-10-29-201 kernel: Modules linked in: iptable_nat nf_nat_ipv4 iptable_filter ip_tables binfmt_misc ipmi_si vfat fat usb_storage mpt3sas mpt2sas raid_class scsi_transport_sas mptctl mptbase dell_rbu tcp_diag inet_diag veth bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio loop dm_mod openvswitch vxlan ip_tunnel gre libcrc32c xt_nat ipt_MASQUERADE xt_addrtype nf_nat xt_limit ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack sg nf_conntrack ipmi_devintf iTCO_wdt iTCO_vendor_support dcdbas coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr sb_edac edac_core ses enclosure ipmi_msghandler tg3 wmi acpi_power_meter ptp pps_core mei_me mei ntb lpc_ich mperf mfd_core shpchp ext4
May 11 03:43:08 ip-10-10-29-201 kernel: mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm ahci drm libahci libata i2c_core megaraid_sas [last unloaded: ip_tables]
May 11 03:43:08 ip-10-10-29-201 kernel: CPU: 4 PID: 1542 Comm: handler20 Tainted: G W -------------- 3.10.0-123.el7.x86_64 #1
May 11 03:43:08 ip-10-10-29-201 kernel: Hardware name: Dell Inc. PowerEdge R720/0X6FFV, BIOS 1.6.0 03/07/2013
May 11 03:43:08 ip-10-10-29-201 kernel: task: ffff880418adf1c0 ti: ffff8800c8d08000 task.ti: ffff8800c8d08000
May 11 03:43:08 ip-10-10-29-201 kernel: RIP: 0010:[<ffffffff815e90e7>] [<ffffffff815e90e7>] _raw_spin_lock+0x37/0x50
May 11 03:43:08 ip-10-10-29-201 kernel: RSP: 0018:ffff88041fc43ac8 EFLAGS: 00000206
May 11 03:43:08 ip-10-10-29-201 kernel: RAX: 000000000000108b RBX: 0000000000000000 RCX: 0000000000000000
May 11 03:43:08 ip-10-10-29-201 kernel: RDX: 0000000000000002 RSI: 0000000000000002 RDI: ffff88081609c318
May 11 03:43:08 ip-10-10-29-201 kernel: RBP: ffff88041fc43ac8 R08: ffff8801049856d8 R09: ffff88041fc43a00
May 11 03:43:08 ip-10-10-29-201 kernel: R10: 0000000000000000 R11: 00000000e1bec8f9 R12: ffff88041fc43a38
May 11 03:43:08 ip-10-10-29-201 kernel: R13: ffffffff815f2d9d R14: ffff88041fc43ac8 R15: ffff88081609c300
May 11 03:43:08 ip-10-10-29-201 kernel: FS: 00007fb082b8b700(0000) GS:ffff88041fc40000(0000) knlGS:0000000000000000
May 11 03:43:08 ip-10-10-29-201 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 11 03:43:08 ip-10-10-29-201 kernel: CR2: 00007f2a743e6000 CR3: 00000008183c9000 CR4: 00000000000407e0
May 11 03:43:08 ip-10-10-29-201 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 11 03:43:08 ip-10-10-29-201 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 11 03:43:08 ip-10-10-29-201 kernel: Stack:
May 11 03:43:08 ip-10-10-29-201 kernel: ffff88041fc43af8 ffffffffa042429f ffff88003714be00 ffffe8fbefc41540
May 11 03:43:08 ip-10-10-29-201 kernel: ffff880419070e80 ffff88041fc43b30 ffff88041fc43be0 ffffffffa04239a4
May 11 03:43:08 ip-10-10-29-201 kernel: 00000001b9ec8070 ffff88003714be00 ffff88041fc43b28 0000000000000246
May 11 03:43:08 ip-10-10-29-201 kernel: Call Trace:
May 11 03:43:08 ip-10-10-29-201 kernel: <IRQ>
May 11 03:43:08 ip-10-10-29-201 kernel:
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffffa042429f>] ovs_flow_stats_update+0x4f/0xd0 [openvswitch]
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffffa04239a4>] ovs_dp_process_received_packet+0x84/0x120 [openvswitch]
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffffa042a01a>] ovs_vport_receive+0x2a/0x30 [openvswitch]
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffffa042b4cd>] vxlan_rcv+0x6d/0x90 [openvswitch]
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffffa037b228>] vxlan_udp_encap_recv+0xb8/0x130 [vxlan]
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff81538bc2>] udp_queue_rcv_skb+0x162/0x3d0
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff815394bd>] __udp4_lib_rcv+0x19d/0x690
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff815094d0>] ? ip_rcv_finish+0x350/0x350
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff815399ca>] udp_rcv+0x1a/0x20
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff81509584>] ip_local_deliver_finish+0xb4/0x1f0
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff81509858>] ip_local_deliver+0x48/0x80
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff815091fd>] ip_rcv_finish+0x7d/0x350
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff81509ac4>] ip_rcv+0x234/0x380
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814cfdb6>] __netif_receive_skb_core+0x676/0x870
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814cffc8>] __netif_receive_skb+0x18/0x60
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814d0b7e>] process_backlog+0xae/0x180
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814d041a>] net_rx_action+0x15a/0x250
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff81067047>] __do_softirq+0xf7/0x290
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff815f3a5c>] call_softirq+0x1c/0x30
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff81014d25>] do_softirq+0x55/0x90
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff810673e5>] irq_exit+0x115/0x120
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff815f4358>] do_IRQ+0x58/0xf0
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff815e94ad>] common_interrupt+0x6d/0x6d
May 11 03:43:08 ip-10-10-29-201 kernel: <EOI>
May 11 03:43:08 ip-10-10-29-201 kernel:
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffffa0424465>] ? ovs_flow_stats_get+0x145/0x180 [openvswitch]
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffffa0424453>] ? ovs_flow_stats_get+0x133/0x180 [openvswitch]
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffffa04217b7>] ovs_flow_cmd_fill_info+0x1c7/0x320 [openvswitch]
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffffa0421c5c>] ovs_flow_cmd_build_info.constprop.25+0x6c/0xa0 [openvswitch]
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffffa0422155>] ovs_flow_cmd_new_or_set+0x4c5/0x520 [openvswitch]
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff8108ec58>] ? __wake_up_common+0x58/0x90
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814ffcd8>] genl_family_rcv_msg+0x258/0x3d0
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814ffe50>] ? genl_family_rcv_msg+0x3d0/0x3d0
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814ffee1>] genl_rcv_msg+0x91/0xd0
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814fdf99>] netlink_rcv_skb+0xa9/0xc0
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814fe4c8>] genl_rcv+0x28/0x40
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814fd5bd>] netlink_unicast+0xed/0x1b0
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814fd9a7>] netlink_sendmsg+0x327/0x760
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814fa874>] ? netlink_rcv_wake+0x44/0x60
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814fb92b>] ? netlink_recvmsg+0x1cb/0x3e0
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814b79b0>] sock_sendmsg+0xb0/0xf0
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814b807f>] ? sock_recvmsg+0xbf/0x100
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff8109b23e>] ? task_scan_min+0x3e/0x60
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff815e908b>] ? _raw_spin_unlock_bh+0x1b/0x40
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff814b7de9>] ___sys_sendmsg+0x3a9/0x3c0
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff811f7fa9>] ? ep_scan_ready_list.isra.9+0x1b9/0x1f0
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff811f8123>] ? ep_poll+0x123/0x370
May 11 03:43:08 ip-10-10-29-201 kernel: [<ffffffff81079af3>] ? getrusage+0x43/0x70
May 11 03:43:09 ip-10-10-29-201 kernel: [<ffffffff814b8cd1>] __sys_sendmsg+0x51/0x90
May 11 03:43:09 ip-10-10-29-201 kernel: [<ffffffff814b8d22>] SyS_sendmsg+0x12/0x20
May 11 03:43:09 ip-10-10-29-201 kernel: [<ffffffff815f2119>] system_call_fastpath+0x16/0x1b
May 11 03:43:09 ip-10-10-29-201 kernel: Code: 02 00 f0 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 83 e2 fe 0f b7 f2 b8 00 80 00 00 eb 0c 0f 1f 44 00 00 f3 90 83 e8 01 74 0a <0f> b7 0f 66 39 ca 75 f1 5d c3 66 66 66 90 66 66 90 eb da 66 0f
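To check whether a host has logged the same error, and which kernel modules appear on the stack, the log can be searched with standard tools. The following is a minimal sketch, not a definitive diagnostic: the function names `count_soft_lockups` and `trace_modules` are hypothetical, and the log path in the usage comment is an assumption (a given host may send kernel messages elsewhere).

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a saved kernel log file.

# count_soft_lockups FILE
# Prints the number of lines in FILE containing "BUG: soft lockup".
count_soft_lockups() {
    # grep -c prints the count of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's status at 0.
    grep -c 'BUG: soft lockup' "$1" || true
}

# trace_modules FILE
# Counts the module tags (for example "[openvswitch]") that end call
# trace lines in FILE, most frequent first.
trace_modules() {
    # Only tags made of letters, digits, and underscores are matched, so
    # "[handler20:1542]" and "[last unloaded: ip_tables]" are skipped.
    grep -oE '\[[A-Za-z0-9_]+\]$' "$1" | sort | uniq -c | sort -rn
}

# Usage (the path is an assumption; adjust it to your syslog setup):
# count_soft_lockups /var/log/messages
# trace_modules /var/log/messages
```

For the trace above, the second helper would mostly report openvswitch, with one vxlan frame.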
A search on Stack Overflow indicates that this is a kernel bug and that the fix is to upgrade the kernel.
The following steps upgrade the CentOS 7.0 kernel from the default version (3.10) to 4.0.2.
1. Import the GPG key of the yum repository
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
2. Install the yum repository
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
3. Install the new kernel
The ELRepo yum repository provides the mainline kernel (kernel-ml), here version 4.0.2.
[root@ip-10-10-29-201 ~]# yum --enablerepo=elrepo-kernel install kernel-ml-devel kernel-ml
Loaded plugins: fastestmirror
MooseFS                                           |  951 B  00:00:00
base                                              | 3.6 kB  00:00:00
elrepo                                            | 2.9 kB  00:00:00
elrepo-kernel                                     | 2.9 kB  00:00:00
extras                                            | 3.4 kB  00:00:00
updates                                           | 3.4 kB  00:00:00
(1/2): elrepo/primary_db                          | 233 kB  00:00:02
(2/2): elrepo-kernel/primary_db                   | 782 kB  00:00:04
MooseFS/primary                                   | 4.2 kB  00:00:00
Loading mirror speeds from cached hostfile
 * base: mirrors.yun-idc.com
 * elrepo: repos.lax-noc.com
 * elrepo-kernel: repos.lax-noc.com
 * extras: mirror.bit.edu.cn
 * updates: mirror.bit.edu.cn
MooseFS                                                        30/30
Resolving Dependencies
--> Running transaction check
---> Package kernel-ml.x86_64 0:4.0.2-1.el7.elrepo will be installed
---> Package kernel-ml-devel.x86_64 0:4.0.2-1.el7.elrepo will be installed
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================
 Package            Arch      Version               Repository          Size
================================================================================
Installing:
 kernel-ml          x86_64    4.0.2-1.el7.elrepo    elrepo-kernel       36 M
 kernel-ml-devel    x86_64    4.0.2-1.el7.elrepo    elrepo-kernel      9.5 M
Transaction Summary
================================================================================
Install  2 Packages
Total download size: 45 M
Installed size: 199 M
Is this ok [y/d/N]: y
Downloading packages:
(1/2): kernel-ml-4.0.2-1.el7.elrepo.x86_64.rpm         |  36 MB  00:00:11
(2/2): kernel-ml-devel-4.0.2-1.el7.elrepo.x86_64.rpm   | 9.5 MB  00:00:31
--------------------------------------------------------------------------------
Total                                        1.5 MB/s |  45 MB  00:00:31
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Warning: RPMDB altered outside of yum.
  Installing : kernel-ml-devel-4.0.2-1.el7.elrepo.x86_64                 1/2
  Installing : kernel-ml-4.0.2-1.el7.elrepo.x86_64                       2/2
  Verifying  : kernel-ml-4.0.2-1.el7.elrepo.x86_64                       1/2
  Verifying  : kernel-ml-devel-4.0.2-1.el7.elrepo.x86_64                 2/2
Installed:
  kernel-ml.x86_64 0:4.0.2-1.el7.elrepo    kernel-ml-devel.x86_64 0:4.0.2-1.el7.elrepo
Complete!
4. View the current kernel version
[root@ip-10-10-29-201 ~]# uname -r
3.10.0-123.el7.x86_64
Important: the running kernel is still the default version. If you reboot right after this step, the system boots 3.10 again rather than the new 4.0.2 kernel, because the default boot entry has not changed. Change the boot order first, as follows.
View the current boot order:
[root@ip-10-10-29-201 ~]# awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux (4.0.2-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux, with Linux 3.10.0-123.el7.x86_64
CentOS Linux, with Linux 0-rescue-18b184aa09434ecf9739a70c6b63638a
Boot entries are numbered from 0, and a newly installed kernel is inserted at the top of the list: 4.0.2 is now entry 0 and the old 3.10 kernel is entry 1. To boot the new kernel, set entry 0 as the default:
[root@ip-10-10-29-201 ~]# grub2-set-default 0
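Because the argument to grub2-set-default is a 0-based position in the menu, it can help to print each entry with its index before choosing one. The following is a minimal sketch under stated assumptions: the function name `list_boot_entries` is hypothetical, and it only matches top-level `menuentry` lines that are not indented, as in the listing above.

```shell
#!/bin/sh
# list_boot_entries FILE
# Prints each GRUB2 menu entry title in FILE prefixed with its 0-based
# index. Hypothetical helper; entries nested in submenus are not matched.
list_boot_entries() {
    # With ' as the field separator, a line such as
    #   menuentry 'CentOS Linux ...' --class centos {
    # has "menuentry " (with a trailing space) as field 1 and the
    # title as field 2.
    awk -F\' '$1 == "menuentry " { printf "%d: %s\n", n++, $2 }' "$1"
}

# Usage:
# list_boot_entries /etc/grub2.cfg
```

After running grub2-set-default, `grub2-editenv list` should report the saved entry, which is a quick way to confirm the change before rebooting.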
5. Restart
reboot
6. Check the kernel version after the restart
[root@ip-10-10-29-201 conf]# uname -r
4.0.2-1.el7.elrepo.x86_64
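When the same upgrade is rolled out to several servers, the check in this step can be scripted. The following is a minimal sketch, assuming GNU `sort` with the `-V` (version sort) option is available; the function name `kernel_at_least` is hypothetical.

```shell
#!/bin/sh
# kernel_at_least WANT HAVE
# Succeeds (exit status 0) when version string HAVE is greater than or
# equal to WANT under "sort -V" version ordering. Hypothetical helper.
kernel_at_least() {
    want=$1
    have=$2
    # The lower of the two versions sorts first; if that is WANT,
    # then HAVE is at least WANT.
    lowest=$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}

# Usage: report whether the running kernel is 4.0.2 or newer.
# if kernel_at_least 4.0.2 "$(uname -r)"; then
#     echo "running the upgraded kernel"
# else
#     echo "still on an older kernel"
# fi
```

Version sort is used instead of a plain string comparison so that, for example, 3.10 is ordered after 3.9.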
The problem did not recur in the 20 days after the upgrade. The crashes were therefore attributed to a kernel bug, which upgrading the kernel resolved.