Ceph deadlock failure under high IO


On a high-performance PC server, Ceph is used to store VM images. Under stress testing, all virtual machines on the server became inaccessible.

Symptoms:

1. A website service runs on the virtual machines, with Redis used as its cache server. When the load is high (roughly 8,000 accesses per second), all the VMs on the host become inaccessible.

2. When the fault occurs, some virtual machines cannot be pinged; others can be pinged but cannot be logged into over SSH.

At first we suspected a bridge fault. The virtio NIC problem under KVM is well known: when a bridge is used, a memory overflow can occur and cause the bridge to fail. The workaround recommended for Xen is to disable TSO support on the bridge.

(Run the command: ethtool --offload <network device> tso off)
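For reference, a minimal sketch of how this workaround can be checked and applied (the bridge name br0 is an assumption; substitute the actual bridge device on the host):

    # Show the current TCP segmentation offload setting on the bridge (br0 is assumed)
    ethtool -k br0 | grep tcp-segmentation-offload
    # Disable TSO on the bridge (short form of "ethtool --offload br0 tso off")
    ethtool -K br0 tso off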

However, after applying the workaround and restarting the network service, the fault did not disappear.

Therefore, a bridge fault was ruled out.

After the fault recurred several times, we found a virtual machine whose SSH session had not been disconnected. On it the cd command could still be executed, but the ls command failed with an input/output error, the kind of error a file system fault produces.

So I began to suspect that there was a problem with the file system.

The file system in question is Ceph. Checking the Ceph log shows that a large number of warnings were reported when the fault occurred:

16:36:28.493424 osd.0 172.23123123:6800/96711 9195 : cluster [WRN] 6 slow requests, 6 included below; oldest blocked for > 30.934796 secs

and

18:46:45.192215 osd.2 172.132131231:6800/68644 5936 : cluster [WRN] slow request 240.415451 seconds old, received at 18:42:44.776646: osd_op(13213213500 [stat, set-alloc-hint object_size 4194304 write_size 4194304, write 2269184~524288] 0.5652b278 ack+ondisk+write+known_if_redirected e48545) currently waiting for rw locks

The requests are blocked "waiting for rw locks": there is a deadlock.
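As a side note, a minimal sketch of how the blocked requests can be inspected (osd.2 is taken from the log above; the ceph daemon commands must be run on the node hosting that OSD):

    # Cluster-wide summary of slow/blocked requests
    ceph health detail
    # Ops currently in flight on the affected OSD
    ceph daemon osd.2 dump_ops_in_flight
    # Recently completed slow ops, with per-phase timing
    ceph daemon osd.2 dump_historic_ops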

Checking the disk IO records shows that the Redis server performs a large number of disk writes when the fault occurs: RDB persistence is being triggered at a very high frequency, generating heavy disk IO. This IO starves other disk operations of write time, which in turn leads to the Ceph deadlock on the OSD.
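A minimal sketch of how such an IO hot spot can be confirmed inside the guest (iostat comes from the sysstat package; iotop may need to be installed separately):

    # Per-device utilization and write throughput, refreshed every second
    iostat -x 1
    # Processes currently generating disk IO (run as root); Redis should stand out here
    iotop -o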

The immediate solution is to disable the RDB persistence of Redis.
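A minimal sketch of disabling RDB snapshots, either at runtime or in redis.conf (whether the cache can safely run without persistence is an assumption about this workload):

    # At runtime: remove all "save" rules so Redis stops writing RDB snapshots
    redis-cli CONFIG SET save ""
    # Or permanently in redis.conf: comment out the existing save lines and add
    save ""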

A long-term solution is to stop Redis from persisting data onto the Ceph partition altogether. More generally, avoid running high-IO reads or writes against virtual machine images stored on Ceph (not a very reassuring answer...).

Experience summary:

1. Ceph carries a risk of deadlock under high IO, and it provides no unlock mechanism. The official suggestion is simply not to place virtual machine images on Ceph... Speechless...

2. The storage network and the business network should be isolated from each other at design time. A system's services can be split across five networks: the Internet-facing network, the business network, the storage network, the heartbeat network, and the management network; a small Ceph example of such separation follows below.
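As one concrete example of storage-network separation, Ceph itself can keep client traffic and OSD replication traffic on different subnets via ceph.conf (the subnets below are placeholders, not values from this incident):

    [global]
    # client-facing (business) traffic
    public network = 10.10.10.0/24
    # OSD replication and recovery traffic
    cluster network = 10.10.20.0/24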
