Our working environment runs on a XenServer 6.5 cluster. One day I came in and found that a VM could not be reached. I assumed it only needed a restart from XenServer, but even a forced restart failed, so I logged in to the host and checked its disk space:
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              20G   20G     0 100% /
none                  7.8G  2.0M  7.8G   1% /dev/shm
The host's root disk is full. Fine, time to clear some space. Looking for what was using it gave the following:
# cd /
# du -sh *
5.7M    bin
24M     boot
2.1M    cli-rt
3.3M    dev
7.4M    etc
28K     EULA
4.0K    home
118M    lib
20M     lib64
16K     lost+found
4.0K    media
4.0K    mnt
554M    opt
du: cannot read directory 'proc/7020': No such file or directory
du: cannot read directory 'proc/7021': No such file or directory
0       proc
12K     Read_Me_First.html
102M    root
24M     sbin
4.0K    selinux
4.0K    srv
0       sys
1.6M    tftpboot
68K     tmp
542M    usr
2.6G    var
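The mismatch between the df and du reports can be put into a number. The following is a minimal sketch, assuming GNU df, du and awk; the function name unaccounted_kib is made up for illustration. A large positive result suggests space held by deleted-but-open files, although files hidden underneath mount points produce the same symptom. With the figures above (20G used, about 4G visible) it would be on the order of 16G.

```shell
#!/usr/bin/env bash
# Estimate space that df counts as used but du cannot see (in KiB).
# $1 is a directory on the filesystem of interest, normally its mount point.
# unaccounted_kib is a hypothetical helper name, not a standard tool.
unaccounted_kib() {
    local dir=$1 used visible
    # -P forces one output line per filesystem; column 3 is used space in KiB
    used=$(df -Pk "$dir" | awk 'NR == 2 { print $3 }')
    # -x keeps du on this one filesystem, so /proc, /sys and other mounts are skipped
    visible=$(du -skx "$dir" 2>/dev/null | awk '{ print $1 }')
    echo $(( used - visible ))
}

# Example: unaccounted_kib /
```

Run it as root, otherwise du cannot read every directory and the result is inflated.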
So the visible files do not fill the disk: du accounts for only about 4G, while df reports 20G used. Where is the rest? It is most likely held by files that were deleted while a process still had them open, so their space was never released. The following command shows which deleted files are still in use:
# ls -l /proc/[0-9]*/fd/* | grep delete
ls: /proc/29018/fd/255: No such file or directory
ls: /proc/29018/fd/3: No such file or directory
l-wx------ 1 root root 64 Nov 14 13:14 /proc/22020/fd/2 -> /tmp/stunnelbd3855.log (deleted)
l-wx------ 1 root root 64 Nov 14 13:27 /proc/24758/fd/2 -> /tmp/stunnel1bc930.log (deleted)
lrwx------ 1 root root 64 Nov 14 11:03 /proc/4555/fd/6 -> /tmp/tmpflfgwgg (deleted)
lrwx------ 1 root root 64 Nov 14 11:03 /proc/4556/fd/6 -> /tmp/tmpflfgwgg (deleted)
l-wx------ 1 root root 64 Nov 14 11:03 /proc/4587/fd/5 -> /var/run/openvswitch/ovs-xapi-sync.pid.tmp4587 (deleted)
l-wx------ 1 root root 64 Nov 14 11:03 /proc/4587/fd/12 -> /var/log/blktap/tapdisk.2345.log (deleted)
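The ls listing shows which deleted files are still open, but not how much space each one holds. A minimal sketch of one way to add sizes, assuming a Linux /proc and GNU stat; the function name list_deleted_open_files is made up for illustration:

```shell
#!/usr/bin/env bash
# Print "size-in-bytes <TAB> fd-path <TAB> link-target" for every open file
# descriptor whose target has been deleted. Run as root to see all processes.
# list_deleted_open_files is a hypothetical helper name.
list_deleted_open_files() {
    local fd target size
    for fd in /proc/[0-9]*/fd/*; do
        # readlink fails if the process or descriptor vanished in the meantime
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            /*" (deleted)")
                # stat -L follows the /proc link to the still-open inode
                size=$(stat -L -c %s "$fd" 2>/dev/null) || continue
                printf '%s\t%s\t%s\n' "$size" "$fd" "$target"
                ;;
        esac
    done
}

# Example, largest first:
#   list_deleted_open_files | sort -rn | head -n 20
```

Sorting by the first column puts the biggest offender at the top, which avoids having to try the candidates one by one.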
After going through the list, the most likely culprit is the last entry, /var/log/blktap/tapdisk.2345.log (deleted).
As the name suggests, tapdisk.2345.log is the log file of the tapdisk process with PID 2345. It mainly records tapdisk's monitoring of the disk image, for example:
Aug 21 17:55:06: [17:55:06.597] tapdisk_vbd_check_progress: vhd:/dev/VG_XenStorage-39d05ede-4cd6-6dd0-4263-f8dbe2949580/VHD-2e957900-09c5-4e8d-9ba1-c9e17f78f519: watchdog timeout: pending requests idle for 60 seconds
Aug 21 17:55:06: [17:55:06.597] tapdisk_vbd_check_progress: vhd:/dev/VG_XenStorage-39d05ede-4cd6-6dd0-4263-f8dbe2949580/VHD-2e957900-09c5-4e8d-9ba1-c9e17f78f519: watchdog timeout: pending requests idle for 60 seconds
Aug 21 17:55:06: [17:55:06.921] tapdisk_vbd_check_progress: vhd:/dev/VG_XenStorage-39d05ede-4cd6-6dd0-4263-f8dbe2949580/VHD-2e957900-09c5-4e8d-9ba1-c9e17f78f519: watchdog timeout: pending requests idle for 60 seconds
Aug 21 17:55:06: [17:55:06.925] tapdisk_vbd_check_progress: vhd:/dev/VG_XenStorage-39d05ede-4cd6-6dd0-4263-f8dbe2949580/VHD-2e957900-09c5-4e8d-9ba1-c9e17f78f519: watchdog timeout: pending requests idle for 60 seconds
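When a log is flooded with lines like these, a per-image count shows quickly which disk is affected. A minimal sketch, assuming the log keeps the layout of the sample lines (the image path is the field right after tapdisk_vbd_check_progress:); the function name count_watchdog_timeouts is made up for illustration:

```shell
#!/usr/bin/env bash
# Count "watchdog timeout" messages per disk image in one or more tapdisk logs.
# Output: "<count> <TAB> <image>". count_watchdog_timeouts is a hypothetical name.
count_watchdog_timeouts() {
    awk '
        /tapdisk_vbd_check_progress:/ && /watchdog timeout/ {
            for (i = 1; i < NF; i++) {
                if ($i == "tapdisk_vbd_check_progress:") {
                    image = $(i + 1)
                    sub(/:$/, "", image)   # drop the trailing colon
                    count[image]++
                    break
                }
            }
        }
        END {
            for (image in count) printf "%d\t%s\n", count[image], image
        }
    ' "$@"
}

# Example: count_watchdog_timeouts /var/log/blktap/tapdisk.2345.log
```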
So how does a hung Xen virtual machine lead to the problems seen at the start: the VM cannot be restarted, the host disk is full, and the log file has been deleted?
The answer: once the VM hangs, its tapdisk process on the host keeps writing to its log until the disk is full, and with the host's disk full the VM cannot be restarted. As for the deletion, when a log grows past the configured size, log rotation is triggered and the log is backed up; once the number of rotated copies exceeds the configured maximum, the oldest file is deleted. The tapdisk process still holds the deleted file open, however, so the file keeps growing and its space is not released. The log rotation package and configuration on the host:
# rpm -Vv elasticsyslog
........  c /etc/cron.d/logrotate.cron
........  c /etc/logrotate-xenserver.conf
........    /etc/sysconfig/syslog.elastic
........    /etc/sysconfig/syslog.patch
........    /opt/xensource/bin/delete_old_logs_by_space
........    /opt/xensource/bin/elasticsyslog
........    /opt/xensource/bin/logrotate-xenserver
........    /opt/xensource/bin/rotate_logs_by_size
# cat /etc/logrotate.conf
# see "man logrotate" for details
# rotate log files weekly
weekly
# keep 4 weeks worth of backlogs
rotate 4
# create new (empty) log files after rotating old ones
create
# uncomment this if you want your log files compressed
#compress
# RPM packages drop log rotation information into this directory
include /etc/logrotate.d
# no packages own wtmp -- we'll rotate them here
/var/log/wtmp {
    monthly
    minsize 1M
    create 0664 root utmp
    rotate 1
}
/var/log/btmp {
    missingok
    monthly
    minsize 1M
    create 0600 root utmp
    rotate 1
}
# system-specific logs may be also be configured here.
After all that, the fix is simple: release the deleted file by dealing with the process that holds it. For /var/log/blktap/tapdisk.2345.log (deleted) above, the process ID is 2345, so kill it:
# ps -ef | grep 2345
root     18165 15432  0 14:22 pts/37   00:00:00 grep 2345
root      2345     1  0 Jun01 ?        03:10:55 tapdisk
# kill 2345
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              20G  4.1G   15G  22% /
none                  7.8G  2.0M  7.8G   1% /dev/shm
The space is back. At this point the host returns to normal because it has free disk space again, and the virtual machine that had hung is now shut down.
Next, start the virtual machine. If the VM belongs to a cluster, the simplest option is to boot it on another host. If it is a standalone VM, or you want to boot it on the original host, you first need to start tapdisk for it, and that requires a number. It is best to note this number before you kill the VM's process. I know of no good way to look it up directly, so run the command below and save its output before the kill, then run it again after the kill; comparing the two shows which tapdisk worker process belonged to the virtual machine.
View all tapdisk processes:
# ps -ef | grep tap
Start the VM's own tapdisk process. Note that the number at the end is what I got by comparing the ps -ef | grep tap output from before and after the kill; it is not a fixed value:
# tapback -d -x 18
Once the tapdisk process for the VM is running, the virtual machine can be started normally.
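The before-and-after comparison described above can be scripted so that nothing has to be remembered by hand. A minimal sketch, assuming procps ps and coreutils comm; the function names and snapshot file names are made up for illustration:

```shell
#!/usr/bin/env bash
# Save a sorted snapshot of processes whose command line mentions "tap".
# Only pid and args are recorded, so unchanged processes produce identical lines.
# The [t] keeps grep from matching its own command line.
snapshot_tap_processes() {
    ps -eo pid,args | grep '[t]ap' | sort > "$1"
}

# Print lines present in the first snapshot but missing from the second.
# comm needs both inputs sorted, which snapshot_tap_processes guarantees.
processes_gone() {
    comm -23 "$1" "$2"
}

# Example (file names are hypothetical):
#   snapshot_tap_processes /root/tap.before
#   kill 2345
#   snapshot_tap_processes /root/tap.after
#   processes_gone /root/tap.before /root/tap.after
```

The sketch uses ps -eo pid,args instead of ps -ef because the CPU time column in ps -ef changes between runs, which would make every line look different.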
The following is supplementary material explaining what tapdisk is, for anyone who needs it. My English is only good enough to read it, so I have not attempted a translation:
URL: https://wiki.xen.org/wiki/blktap
Each tapdisk process in userspace is backed by one or several image files.
When xend is started the userspace daemon blktapctrl is started, too. When booting the guest VM the XenBus is initialized as described in XenSplitDrivers. The request for a new virtual disk is propagated to blktapctrl, which creates a new character device and two named pipes for communication with a newly forked tapdisk process.
After opening the character device, the shared memory is mapped to the fe_ring using the mmap system call. The tapdisk process opens the image file and sends information about the image's size back to blktapctrl, which stores it. After this initialization tapdisk executes a select system call on the named pipes. On an event it checks if the tap-fd is set and if it is, tries to read a request from the frontend ring.
The XenBus connection between DomU and Dom0 is used by XenStore to negotiate the backend/frontend connection. After the setup of both backend and frontend a shared ring page and an event channel are negotiated. These are used for any further communication between backend and frontend. I/O requests issued in the guest VM are handled in the guest OS and forwarded using both of these communication channels.
There is a trade-off between delay and throughput which is controlled by modifying the number of requests until the blktap driver is notified.
The blktap driver notifies the appropriate blktapctrl or tapdisk process depending on the event type by returning the poll and waking up the tapdisk process respectively. The shared frontend ring works as described in ring.h.
Tapdisk reads the request from the frontend ring and in case of synchronous I/O reads and immediately returns the request. In case of asynchronous I/O a batch of requests is submitted to the Linux AIO subsystem. Both mechanisms read from the image file. In the asynchronous case it is checked using the non-blocking system call io_getevents whether the I/O requests were completed.
The information about completed requests is propagated in the frontend ring. The blktap driver is notified by the tapdisk process via the ioctl system call.
Using the same XenSplitDevices mechanism the data is returned to the frontend of the guest VM.
Blktap diagram: https://wiki.xen.org/images/0/06/Blktap%24blktap_diagram_differentSymbols.png
This article is from the "Nano Dragon" blog, please be sure to keep this source http://arlen.blog.51cto.com/7175583/1872634
Xen virtual machine hang and apparent host freeze: the overall troubleshooting approach