How do I resolve common problems that arise in VMware ESX?

Source: Internet
Author: User

This article covers some of the more common problems that occur on ESX hosts. Many of them can be resolved with a few simple steps, but some require more in-depth troubleshooting.

Purple Screen Panic (PSOD, Purple Screen of Death)

Both ESX and ESXi hosts can suffer a failure called a purple screen panic, VMware's counterpart to the infamous Microsoft blue screen crash. A purple screen causes the ESX or ESXi host to crash suddenly and become unusable. Figure 10.6 shows an example of a purple screen, something you never want to see on your own hosts.

Figure 10.6: A purple screen panic on an ESX host

When a PSOD occurs, ESX crashes completely and stops responding. Typical causes are hardware problems (bad memory is the most common) or a bug in ESX. The only way to recover is to power the host off and restart it. The information on the screen is very useful, so try to record it first: take a photo with a camera phone or, if the host has a remote management console, capture the screen from there. You may not understand the captured information, but it is useful to VMware technical support. The screen shows the ESX version and build number, the exception type, a register dump, what each CPU was running at the time of the crash, a back-trace, the server uptime, error messages, memory hardware information, and so on.

After you restart the host following a PSOD, there should be a file whose name starts with vmkernel-zdump in the /root folder on the ESX host. This file is valuable to VMware technical support, and you can also use the vmkdump tool to extract the VMkernel log from it and look for clues about what caused the PSOD. To do so, enter vmkdump -l <dump file name>. As mentioned earlier, bad memory is a common cause of PSODs. You can use the dump file to identify the memory module that caused the problem and replace it.
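If several dump files have accumulated, it helps to pick the newest one before passing it to vmkdump. The following is only a sketch: the function name latest_zdump is hypothetical, and it assumes the dump files sit in /root as described above. The vmkdump call is left as a comment so the snippet can be tried on any machine.

```shell
#!/bin/sh
# Print the most recently modified vmkernel-zdump* file in a directory.
# latest_zdump is a hypothetical helper name; the directory defaults to /root.
latest_zdump() {
    dir="${1:-/root}"
    # ls -t lists newest first; the error is silenced when nothing matches,
    # in which case the function prints nothing.
    ls -t "$dir"/vmkernel-zdump* 2>/dev/null | head -n 1
}

# Example use on the service console:
#   dump="$(latest_zdump)"
#   [ -n "$dump" ] && vmkdump -l "$dump"
```

An empty result means no dump file was found in the directory that was searched.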

If you suspect that bad memory is causing PSODs, you can use a memory stress-test tool to check the host's memory. These tools require you to shut down the host and boot from a CD to run the test. A common tool is memtest86+, which performs extensive memory testing, including checking the interaction of adjacent memory cells to ensure that writing to one cell does not overwrite another. You can download this tool from www.memtest.org.

When you first deploy ESX on a host, it is a good idea to run a memory test, which can save you from a memory failure at some point in the future. Many memory problems are not obvious, and simple memory tests, such as the memory check during POST, may not find them. You can download memtest86+ as an ISO file of about 2MB, burn it to a CD, boot the host from the CD, and let the tool run for at least 24 hours so that it completes a variety of memory tests. The more memory a host has, the longer the tests take; a host with 32GB of memory takes about one day to complete them. In addition to system memory, memtest86+ also tests the CPU L1 and L2 caches. memtest86+ runs indefinitely, and the pass counter increments each time all the tests have been run.

Service Console Issues

Sometimes you may find that the service console hangs and does not let you log in locally. This can be caused by a hardware lockup or a hung (deadlocked) state, but it typically does not affect the virtual machines (VMs) running on the ESX host. Restarting the host is the usual fix, but before you restart you should either shut down the VMs or VMotion them to another ESX host. You can use any access method that still works to shut down or migrate the VMs: for example, the VI Client, an SSH login to the service console, or one of the alternate/emergency consoles (press Alt+F2 through Alt+F6). After the VMs have been migrated or shut down, use the reboot command to restart ESX. If none of the consoles respond, your only option is to press the power button and cold boot the host.

Network Problems

Sometimes a failure may wipe out all or part of the network configuration, or a configuration change may cause the service console to lose network connectivity. When the service console loses its network connection, you cannot connect to the ESX host remotely, whether with the VI Client or SSH. You can only repair the network configuration from the local service console with the esxcfg-* command-line tools. Here are some commands you can use to configure networking from the ESX CLI:

esxcfg-nics

This command displays a list of the physical network cards along with the driver, PCI device, and link status of each NIC. You can also use it to control the speed and duplex mode of a physical NIC. esxcfg-nics -l displays the NIC information, and esxcfg-nics -h displays the options available for the command. Here are some examples:

o Set the speed and duplex mode of the physical NIC vmnic2 to 100/full:
esxcfg-nics -s 100 -d full vmnic2
o Set the speed and duplex mode of the physical NIC vmnic2 to auto-negotiate:
esxcfg-nics -a vmnic2

esxcfg-vswif

Creates or updates the service console network settings, including the IP address and port group. esxcfg-vswif -l displays the current settings, and esxcfg-vswif -h displays the available options. Here are some examples:

o Change the IP address and subnet mask of the service console (vswif0):
esxcfg-vswif -i 172.20.20.5 -n 255.255.255.0 vswif0
o Add a service console interface (vswif0):
esxcfg-vswif -a vswif0 -p "Service Console" -i 172.20.20.40 -n 255.255.255.0

esxcfg-vswitch

Creates or updates virtual machine networks (vSwitches), including uplinks, port groups, and VLAN IDs. Enter esxcfg-vswitch -l to display the current vSwitches and esxcfg-vswitch -h to show all available options. Note that the options are case-sensitive. Here are some examples:

o Add the physical NIC (vmnic2) to a vSwitch (vSwitch1):
esxcfg-vswitch -L vmnic2 vSwitch1
o Remove the physical NIC (vmnic3) from a vSwitch (vSwitch0):
esxcfg-vswitch -U vmnic3 vSwitch0
o Add a port group (VM Network 3) on a vSwitch (vSwitch1):
esxcfg-vswitch -A "VM Network 3" vSwitch1
o Assign a VLAN ID (3) to the port group (VM Network 3) on a vSwitch (vSwitch1):
esxcfg-vswitch -v 3 -p "VM Network 3" vSwitch1

esxcfg-route

Sets or retrieves the default VMkernel gateway route. Enter esxcfg-route -l to display the current routing information and esxcfg-route -h to show all available options. Here are some examples:

o Set the default VMkernel gateway route:
esxcfg-route 172.20.20.1
o Add a route to the VMkernel:
esxcfg-route -a default 255.255.255.0 172.20.20.1

esxcfg-vmknic

Creates or updates the VMkernel TCP/IP settings used for VMotion, NAS, and iSCSI. Enter esxcfg-vmknic -l to display the VMkernel NICs and esxcfg-vmknic -h to show all available options. Here is an example:

o Add a VMkernel NIC and set its IP address and subnet mask:
esxcfg-vmknic -a "VM Kernel" -i 172.20.20.19 -n 255.255.255.0

In addition, you can restart the service console network with the service network restart command.
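The commands above can be strung together when repairing service console networking from the local console. The sketch below only prints a possible sequence and executes nothing. The function name print_sc_recovery_plan, the ordering of the steps, and the sample values (vmnic0, vSwitch0, the addresses) are illustrative assumptions, not a procedure prescribed by VMware.

```shell
#!/bin/sh
# Print (do not execute) one possible order of commands for restoring
# service console networking. print_sc_recovery_plan is a hypothetical
# helper, and the arguments passed below are placeholder values.
print_sc_recovery_plan() {
    nic="$1"; vswitch="$2"; ip="$3"; mask="$4"
    echo "esxcfg-nics -l"                        # check NIC link status
    echo "esxcfg-vswitch -l"                     # review vSwitches and port groups
    echo "esxcfg-vswitch -L $nic $vswitch"       # link the uplink NIC
    echo "esxcfg-vswif -i $ip -n $mask vswif0"   # set service console IP and mask
    echo "service network restart"               # apply the change
}

print_sc_recovery_plan vmnic0 vSwitch0 172.20.20.5 255.255.255.0
```

Printing the plan first lets you review each command against the host's actual configuration before typing it at the console.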

Other Problems

Sometimes restarting certain ESX services resolves a problem without affecting the running VMs. Two services that can be restarted, and that often resolve problems when they are, are hostd and vpxa. The hostd service runs in the service console and is responsible for managing most operations on the ESX host. To restart hostd, log in to the service console and enter service mgmt-vmware restart.

The vpxa service is the management agent that handles communication between the host and its clients, including vCenter Server and any VI Client connected to the ESX host. If a host appears as disconnected in vCenter Server, does not display current information, or shows any other strange behavior involving vCenter Server and the host, restarting the vpxa service may resolve it. To restart the service, log in to the service console and enter service vmware-vpxa restart. When you run into a problem, it is recommended that you restart both services, because doing so often resolves many issues.
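Restarting both services can be wrapped in a small helper. This is a minimal sketch: the function name restart_mgmt_agents and the DRY_RUN variable are hypothetical conveniences, and by default the function only prints the two commands named above rather than running them.

```shell
#!/bin/sh
# Restart hostd (mgmt-vmware) and vpxa (vmware-vpxa) in turn.
# DRY_RUN is a hypothetical switch: it defaults to 1, so the commands are
# only printed; set DRY_RUN=0 on the service console to execute them.
restart_mgmt_agents() {
    for svc in mgmt-vmware vmware-vpxa; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "service $svc restart"
        else
            service "$svc" restart
        fi
    done
}

restart_mgmt_agents
```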
