How Nova counts OpenStack compute resources

Source: Internet
Author: User
Tags: disk usage

Introduction

Operations colleagues often run into these four questions:

    • How does Nova count OpenStack compute resources?
    • Why can free_ram_mb and free_disk_gb be negative?
    • Even when free_ram_mb or free_disk_gb is negative, why can a virtual machine still be created successfully?
    • When resources are insufficient, why does creating a virtual machine through ordinary (random) scheduling fail, while creating it on a specified host succeeds?

This article starts from these four questions and, using the Kilo version of the Nova source code with the default hypervisor QEMU-KVM (resource statistics differ between hypervisors), explains how OpenStack counts and schedules resources.

What resources does Nova need to count

The essence of cloud computing is turning hardware resources into software so that they can be delivered quickly and on demand. The basic elements of computing, storage, and networking have not changed. For computing, CPU, RAM, and disk remain the core resources.

From the source code and the related database tables, the compute resources that Nova counts for a compute node fall into four categories:

    • CPU: vcpus (the number of physical CPU threads on the node), vcpus_used (the sum of the vCPUs of all virtual machines on the node)
    • RAM: memory_mb (total RAM of the node), memory_mb_used (the sum of the RAM of all virtual machines on the node), free_ram_mb (remaining RAM)
      NOTE: memory_mb = memory_mb_used + free_ram_mb
    • DISK: local_gb (total disk space available to virtual machines), local_gb_used (disk space used by virtual machines), free_disk_gb (remaining disk space)
      NOTE: local_gb = local_gb_used + free_disk_gb
    • Other: PCI devices, CPU topologies, NUMA topologies, and hypervisor information.

This article focuses on three of them: CPU, RAM, and disk.

How Nova collects Resources

As the source code shows, Nova counts resources once per minute, as follows:

    • CPU
      • vcpus: getInfo() in libvirt
      • vcpus_used: the sum over all virtual machines on the node, counted through dom.vcpus() in libvirt
    • RAM
      • memory_mb: getInfo() in libvirt
      • memory_mb_used: the available memory is calculated from /proc/meminfo, then subtracted from the total memory
    • DISK
      • local_gb: os.statvfs(CONF.instances_path)
      • local_gb_used: os.statvfs(CONF.instances_path)
    • Other
      • Hypervisor information: all obtained through libvirt
      • PCI: listDevices('pci', 0) in libvirt
      • NUMA: getCapabilities() in libvirt
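
The RAM and disk collection steps listed above can be approximated in a few lines of Python. This is a minimal sketch, not Nova's code: parse_meminfo_kb, memory_mb_used, and disk_gb are hypothetical helper names, MemTotal stands in for the total that Nova reads from libvirt, and treating MemFree + Buffers + Cached as the available memory is an assumption.

```python
import os


def parse_meminfo_kb(text):
    """Return a {field: value} dict from /proc/meminfo-style text."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


def memory_mb_used(meminfo_text):
    """Total memory minus available memory, in MB (assumed formula)."""
    info = parse_meminfo_kb(meminfo_text)
    # Assumption: "available" means free memory plus reclaimable caches.
    avail_kb = info["MemFree"] + info["Buffers"] + info["Cached"]
    return (info["MemTotal"] - avail_kb) // 1024


def disk_gb(path):
    """Current disk totals for the filesystem holding `path`, in GB."""
    st = os.statvfs(path)
    total = st.f_frsize * st.f_blocks
    free = st.f_frsize * st.f_bfree
    gib = 1024 ** 3
    return {"local_gb": total // gib, "local_gb_used": (total - free) // gib}


# Made-up sample values, used instead of reading the real /proc/meminfo.
SAMPLE = """MemTotal:       65536000 kB
MemFree:        20480000 kB
Buffers:         1024000 kB
Cached:         10240000 kB
"""

if __name__ == "__main__":
    print(memory_mb_used(SAMPLE))
    print(disk_gb("."))
```

Both functions only observe what is in use right now, which is why numbers collected this way cannot go below zero.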

So here is the puzzle: collected this way, free_ram_mb and free_disk_gb can never be negative! The answer is that, before the resources are reported to the database, nova-compute counts them a second time based on the virtual machines on the node.

Nova re-counts resources

First, consider why resources need to be counted again and which ones. According to the source code, Nova re-counts RAM, disk, and PCI resources based on the virtual machines on the node.

Why count RAM again? Take a virtual machine with 4 GB of memory as an example. Comparing the available memory on the host before and after the virtual machine starts shows that the free memory drops (by only 1 GB in this test), but not by the full 4 GB. If the virtual machine then runs a memory-hungry application, the available memory on the host shrinks rapidly. Now imagine a 64 GB server on which each 4 GB virtual machine consumes only 1 GB of host memory at startup: the server could successfully create 64 virtual machines, but once those virtual machines carry real workloads the server quickly runs out of memory. At best this hurts virtual machine performance; at worst it causes virtual machines to shut down. In addition, not all host memory can be given to virtual machines, because the operating system and other applications need memory too. RAM must therefore be re-counted as follows:
free_memory = total_memory - CONF.reserved_host_memory_mb - theoretical memory sum of the virtual machines
CONF.reserved_host_memory_mb: memory reserved on the host, for example for other applications
Theoretical memory sum of the virtual machines: the sum of the memory defined in the flavors of all virtual machines
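
To make the RAM formula concrete, here is a small worked example. It is a sketch under assumed numbers: the function name free_ram_mb, the 512 MB reservation, and the count of twenty 4 GB virtual machines are illustrative, not taken from a real deployment.

```python
def free_ram_mb(memory_mb, reserved_host_memory_mb, flavor_memory_mbs):
    """Re-counted free RAM: total minus reservation minus flavor memory."""
    return memory_mb - reserved_host_memory_mb - sum(flavor_memory_mbs)


if __name__ == "__main__":
    host_memory_mb = 64 * 1024   # a 64 GB host (assumed)
    reserved_mb = 512            # assumed reservation
    flavors = [4096] * 20        # twenty 4 GB virtual machines (assumed)
    print(free_ram_mb(host_memory_mb, reserved_mb, flavors))  # -16896
```

Twenty 4 GB flavors add up to 80 GB, more than the host physically has, so the re-counted value drops below zero even though the host may still report free memory.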

Why re-count disk resources? The reason is much the same as for RAM. To save space, QEMU-KVM commonly uses images in QCOW2 format (which relies on copy-on-write). Take a virtual machine with a 100 GB disk as an example: right after the virtual machine is created, its image file is often only a few hundred KB, but once the virtual machine writes a large amount of data to its disk, the image file on the host grows rapidly. os.statvfs reports the current disk usage and does not reflect the potential usage. Disk resources must therefore be re-counted as follows:
free_disk_gb = local_gb - CONF.reserved_host_disk_mb / 1024 - theoretical disk sum of the virtual machines
CONF.reserved_host_disk_mb: disk space reserved on the host
Theoretical disk sum of the virtual machines: the sum of the disk sizes defined in the flavors of all virtual machines

When overcommit is allowed, this way of counting can make free_ram_mb and free_disk_gb negative.
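
The gap between what the filesystem reports and what the flavors promise can be illustrated as follows. This is a sketch with invented numbers; Instance, observed_free_gb, and recounted_free_gb are hypothetical names, and the image file sizes are placeholders for freshly created, thinly provisioned images.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    flavor_disk_gb: int    # disk size promised by the flavor
    image_file_gb: float   # current size of the image file on the host


def observed_free_gb(local_gb, instances):
    """What a statvfs-style measurement would see: only real usage."""
    return local_gb - sum(i.image_file_gb for i in instances)


def recounted_free_gb(local_gb, reserved_host_disk_mb, instances):
    """The re-counted value: reservation and full flavor sizes are charged."""
    reserved_gb = reserved_host_disk_mb / 1024
    return local_gb - reserved_gb - sum(i.flavor_disk_gb for i in instances)


if __name__ == "__main__":
    # Six 100 GB virtual machines whose images are still almost empty.
    vms = [Instance(flavor_disk_gb=100, image_file_gb=0.5) for _ in range(6)]
    print(observed_free_gb(500, vms))          # 497.0
    print(recounted_free_gb(500, 20480, vms))  # -120.0
```

The host still has almost all of its 500 GB physically free, yet the re-counted figure is negative because the flavors could eventually claim 600 GB.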

Resource overcommit and scheduling

Even when free_ram_mb or free_disk_gb is negative, a virtual machine can still be created successfully. During scheduling, some nova-scheduler filters allow resources to be overcommitted, namely the CPU, RAM, and disk filters. Their default overcommit ratios are:

    • CPU: CONF.cpu_allocation_ratio = 16
    • RAM: CONF.ram_allocation_ratio = 1.5
    • DISK: CONF.disk_allocation_ratio = 1.0

Take ram_filter as an example. When filtering hosts by RAM, the rule is:
memory_limit = total_memory * ram_allocation_ratio
used_memory = total_memory - free_memory
When memory_limit - used_memory < flavor['ram'], the host does not have enough memory and is filtered out; otherwise the host is kept.

The relevant code is as follows (slightly streamlined):

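def host_passes(self, host_state, filter_properties):
    """Only return hosts with sufficient available RAM."""
    instance_type = filter_properties.get('instance_type')
    requested_ram = instance_type['memory_mb']
    free_ram_mb = host_state.free_ram_mb
    total_usable_ram_mb = host_state.total_usable_ram_mb

    # Upper bound on memory that may be promised, including overcommit.
    memory_mb_limit = total_usable_ram_mb * CONF.ram_allocation_ratio
    used_ram_mb = total_usable_ram_mb - free_ram_mb
    usable_ram = memory_mb_limit - used_ram_mb
    if not usable_ram >= requested_ram:
        LOG.debug("host does not have requested_ram")
        return False
    return True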
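
The same rule can be tried outside Nova with plain numbers. The following is a standalone sketch: ram_filter_passes is a hypothetical function that mirrors the arithmetic above, and the host values are assumed.

```python
def ram_filter_passes(total_usable_ram_mb, free_ram_mb, requested_ram_mb,
                      ram_allocation_ratio=1.5):
    """Return True if the host can still accept the requested RAM."""
    memory_mb_limit = total_usable_ram_mb * ram_allocation_ratio
    used_ram_mb = total_usable_ram_mb - free_ram_mb
    usable_ram = memory_mb_limit - used_ram_mb
    return usable_ram >= requested_ram_mb


if __name__ == "__main__":
    # A 64 GB host whose re-counted free RAM is already negative (assumed).
    total, free = 65536, -16896
    print(ram_filter_passes(total, free, 4096))        # True
    print(ram_filter_passes(total, free, 16384))       # False
    print(ram_filter_passes(total, free, 4096, 1.0))   # False
```

With a ratio of 1.5 the limit is 98304 MB, so 15872 MB can still be promised although free_ram_mb is negative; with a ratio of 1.0 the same host is rejected.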

From the previous section we know that the RAM and disk actually used on the host are often less than the theoretical RAM and disk of the virtual machines, so some RAM and disk really are left over. As long as enough actually remains, libvirt creates the virtual machine successfully.

A side note: overcommitting memory and disk lets a host offer more virtual machines, but when all virtual machines on the host are under heavy load, the result is at best degraded virtual machine performance and at worst the QEMU-KVM processes being killed, that is, the virtual machines being shut down. For workloads with high stability requirements, it is therefore recommended not to overcommit RAM and disk, and to overcommit CPU only moderately. Suggested settings:

    • CPU: CONF.cpu_allocation_ratio = 4
    • RAM: CONF.ram_allocation_ratio = 1.0
    • DISK: CONF.disk_allocation_ratio = 1.0
    • RAM reserve: CONF.reserved_host_memory_mb = 2048
    • DISK reserve: CONF.reserved_host_disk_mb = 20480
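
To see what such settings mean for capacity, the RAM filter rule can be solved for the number of virtual machines a host accepts. This is a rough sketch: max_instances_by_ram and max_vcpus are hypothetical helpers, the 64 GB / 32-thread host, the 4 GB flavor, and the 512 MB reservation in the first case are assumed values, and only RAM and vCPU limits are considered.

```python
def max_instances_by_ram(memory_mb, reserved_host_memory_mb,
                         ram_allocation_ratio, flavor_mb):
    """Largest n with n * flavor_mb <= memory_mb * ratio - reserved."""
    budget = memory_mb * ram_allocation_ratio - reserved_host_memory_mb
    return int(budget // flavor_mb)


def max_vcpus(vcpus, cpu_allocation_ratio):
    """vCPUs the scheduler is willing to promise on this host."""
    return int(vcpus * cpu_allocation_ratio)


if __name__ == "__main__":
    # Ratios 1.5 and 16 with an assumed 512 MB reservation.
    print(max_instances_by_ram(65536, 512, 1.5, 4096), max_vcpus(32, 16))
    # Ratios 1.0 and 4 with a 2048 MB reservation.
    print(max_instances_by_ram(65536, 2048, 1.0, 4096), max_vcpus(32, 4))
```

Under these assumptions the host drops from 23 to 15 virtual machines of 4 GB each and from 512 to 128 schedulable vCPUs, which is the price paid for stability.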

Specify a host when creating a virtual machine

This section answers the fourth question. When the resources of every host are used beyond the allowed overcommit limit (total_resource * allocation_ratio), nova-scheduler filters out all hosts. With no host left that meets the requirements, creating the virtual machine fails.

The API for creating a virtual machine supports specifying a host. When a host is specified, nova-scheduler takes a special path: it no longer checks whether the resources on that host meet the requirements and sends the request directly to nova-compute on that host.
The relevant code is as follows (slightly streamlined):

def get_filtered_hosts(self, hosts, filter_properties,
                       filter_class_names=None, index=0):
    """Filter hosts and return only ones passing all filters."""
    ...
    if ignore_hosts or force_hosts or force_nodes:
        ...
        if force_hosts or force_nodes:
            # NOTE(deva): Skip filters when forcing host or node
            if name_to_cls_map:
                return name_to_cls_map.values()
    return self.filter_handler.get_filtered_objects()
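
The effect of this shortcut can be reproduced with a toy scheduler. The sketch below is a simplification, not Nova's implementation: hosts are plain dicts, and ram_filter and this standalone get_filtered_hosts are hypothetical stand-ins that keep only the "skip filters when a host is forced" behaviour.

```python
def ram_filter(host, requested_ram_mb, ram_allocation_ratio=1.5):
    """Toy version of the RAM filter rule described earlier."""
    limit = host["total_usable_ram_mb"] * ram_allocation_ratio
    used = host["total_usable_ram_mb"] - host["free_ram_mb"]
    return limit - used >= requested_ram_mb


def get_filtered_hosts(hosts, filters, requested_ram_mb, force_hosts=()):
    """Return candidate hosts; forced hosts bypass every filter."""
    if force_hosts:
        hosts = [h for h in hosts if h["host"] in force_hosts]
        if hosts:
            return hosts  # filters are skipped for forced hosts
    return [h for h in hosts
            if all(f(h, requested_ram_mb) for f in filters)]


if __name__ == "__main__":
    # Both hosts have reached the 1.5x overcommit limit (assumed values).
    hosts = [
        {"host": "node1", "total_usable_ram_mb": 65536, "free_ram_mb": -32768},
        {"host": "node2", "total_usable_ram_mb": 65536, "free_ram_mb": -32768},
    ]
    print(get_filtered_hosts(hosts, [ram_filter], 4096))
    print(get_filtered_hosts(hosts, [ram_filter], 4096, force_hosts=["node1"]))
```

Ordinary scheduling returns an empty list because every host fails the filter, while the forced request returns node1 untouched and leaves the final decision to the compute node.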

As long as resources are actually available on the host, libvirt can still create the virtual machine successfully.
