Linux kernel support for CPU hotplug

CPU hotplug Support in Linux(tm) Kernel

Maintainers:

CPU Hotplug Core:

Rusty Russell <rusty@rustcorp.com.au>

Srivatsa Vaddagiri <vatsa@in.ibm.com>

i386:

Zwane Mwaikambo <zwane@arm.linux.org.uk>

ppc64:

Nathan Lynch <nathanl@austin.ibm.com>

Joel Schopp <jschopp@austin.ibm.com>

ia64/x86_64:

Ashok Raj <ashok.raj@intel.com>

s390:

Heiko Carstens

Authors: Ashok Raj <ashok.raj@intel.com>

Lots of feedback: Nathan Lynch <nathanl@austin.ibm.com>,

Joel Schopp <jschopp@austin.ibm.com>

Translation: Arethe Qin

Introduction

Modern advances in system architectures have introduced advanced error reporting and correction capabilities in processors. CPU architectures permit partitioning, where the compute resources of a single CPU can be made available to virtual machine environments. A couple of OEMs also support hot-pluggable NUMA hardware, where physical node insertion and removal require CPU hotplug support.

Such advances require that CPUs available to a kernel can be removed, either for provisioning reasons or for RAS purposes, to keep an offending CPU off the system execution path. Hence the need for CPU hotplug support in the Linux kernel.

A more novel use of CPU hotplug is suspend/resume support for SMP systems. Dual-core and HT support means that even a laptop now runs an SMP kernel, which previously did not support suspend/resume. SMP support for suspend/resume is still a work in progress.

General overview of CPU hotplug

Command-line settings

maxcpus=n

Restricts the number of CPUs brought up at boot time to n. For example, if you have 4 CPUs, maxcpus=2 boots only 2 of them. The remaining CPUs can be brought online later; see the FAQ for more information.

additional_cpus=n (*)

Use this option to limit the number of hot-pluggable CPUs. It determines the maximum number of CPUs the system can support:

cpu_possible_map = cpu_present_map + additional_cpus

cede_offline={"off","on"}

On supported pseries platforms, this option disables/enables putting offlined processors into an extended H_CEDE state. If nothing is specified, cede_offline is set to "on".

(*) Option valid only for the following architectures:

- ia64

ia64 uses the number of disabled local APICs in the ACPI MADT table to determine the number of potentially hot-pluggable CPUs. An implementation should rely on this only to count the CPUs, and must not rely on the APIC ID values that the table gives for disabled APICs. If the BIOS does not mark such hot-pluggable CPUs as disabled entries, the parameter "additional_cpus=x" can be used to represent those CPUs in cpu_possible_map.

possible_cpus=n

[s390, x86_64] Use this option to set the number of hot-pluggable CPUs. It sets possible_cpus bits in cpu_possible_map, so the number of bits set stays constant even if the machine is rebooted.
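
To see which of these switches the running kernel was actually booted with, the kernel command line can be inspected. The following is a minimal sketch, assuming a Linux system that exposes the boot command line in /proc/cmdline. The helper name cmdline_param is hypothetical and not part of any kernel tooling.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a name=value switch found in a
# kernel command line read from stdin. Returns 1 if the switch is absent.
cmdline_param() {
    tr -s ' \t' '\n\n' | awk -F= -v key="$1" '
        $1 == key { print substr($0, length(key) + 2); found = 1 }
        END { exit !found }'
}

# Usage: report maxcpus for the running kernel, if /proc/cmdline is readable.
if [ -r /proc/cmdline ]; then
    cmdline_param maxcpus < /proc/cmdline || echo "maxcpus was not given at boot"
fi
```

The function reads from stdin so that it can be tried on any saved command line, not only on the live system.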

CPU maps and related issues

[For more on cpumaps and the primitives used to manipulate them, see include/linux/cpumask.h, which has more descriptive text.]

cpu_possible_map: Bitmap of the CPUs that can ever be available in the system (a CPU need not be present yet; CPUs that may be inserted later are included). It is used to allocate boot-time memory for per-CPU variables that are not designed to grow or shrink as CPUs are added or removed. Once set during the boot-time discovery phase, the map is static: no bits are added or removed afterwards. Trimming it accurately for your system up front can save some boot-time memory. See below for how heuristics are used on x86_64 to keep this in check.

cpu_online_map: Bitmap of all CPUs currently online. A bit is set in __cpu_up() once a CPU is available for kernel scheduling and ready to receive device interrupts. The bit is cleared when a CPU is brought down with __cpu_disable(); before that happens, all OS services, including interrupts, are migrated to another target CPU.

cpu_present_map: Bitmap of the CPUs currently present in the system. Not all of them need be online. When a physical hotplug event is processed by the relevant subsystem (e.g. ACPI), bits are added to or removed from the map depending on whether the event is a hot-add or a hot-remove. There are currently no locking rules. Typical usage is to initialize topology during boot, at which time hotplug is disabled.
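
On kernels later than the one this document describes, the three maps are also exported to user space as CPU lists (for example "0-7") in the files possible, present and online under /sys/devices/system/cpu. The sketch below assumes those files exist and that sysfs is mounted on /sys. The helper cpulist_count is a hypothetical name introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count the CPUs named in a cpulist string such as
# "0-3,8" (comma-separated single CPUs and inclusive ranges).
cpulist_count() {
    printf '%s\n' "$1" | tr ',' '\n' | awk -F- '
        NF == 1 && $1 != "" { n += 1 }            # single CPU, e.g. "8"
        NF == 2             { n += $2 - $1 + 1 }  # range, e.g. "0-3"
        END { print n + 0 }'
}

# Usage: show each map and how many CPUs it contains, where available.
for map in possible present online; do
    f=/sys/devices/system/cpu/$map
    if [ -r "$f" ]; then
        list=$(cat "$f")
        echo "$map: $list ($(cpulist_count "$list") CPUs)"
    fi
done
```

Comparing the three counts gives a quick view of how many CPUs could still be added (possible minus present) and how many are present but offline (present minus online).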

You really don't need to manipulate any of the system CPU maps. They should be read-only for most uses. When setting up per-CPU resources, almost always use cpu_possible_map/for_each_possible_cpu() to iterate.

Never use anything other than cpumask_t to represent a bitmap of CPUs.

#include <linux/cpumask.h>

for_each_possible_cpu     - Iterate over cpu_possible_map
for_each_online_cpu       - Iterate over cpu_online_map
for_each_present_cpu      - Iterate over cpu_present_map
for_each_cpu_mask(x,mask) - Iterate over an arbitrary collection of CPUs described by mask

#include <linux/cpu.h>

get_online_cpus() and put_online_cpus():

The above calls are used to inhibit CPU hotplug operations. They manipulate cpu_hotplug.refcount: while it is non-zero, cpu_online_map will not change. If you merely need to prevent CPUs from going away, you can also wrap the critical section in preempt_disable()/preempt_enable(). Just remember that the critical section must not call any function that can sleep or schedule this process away. preempt_disable() works as long as stop_machine_run() is used to take a CPU down.

CPU Hotplug FAQ

Q: How do I enable CPU hotplug support in my kernel?

A: Enable CPU hotplug support when doing make defconfig:

"Processor type and Features" -> Support for Hotpluggable CPUs

Make sure that CONFIG_HOTPLUG and CONFIG_SMP are turned on as well. You also need to enable CONFIG_HOTPLUG_CPU for SMP suspend/resume support.

Q: Which architectures support CPU hotplug?

A: As of 2.6.14, the following architectures support CPU hotplug:

i386 (Intel), ppc, ppc64, parisc, s390, ia64 and x86_64

Q: How do I test whether a newly built kernel supports hotplug?

A: You should see an entry in sysfs.

First check whether sysfs is mounted, using the "mount" command. The output should contain an entry like the following:

....

none on /sys type sysfs (rw)

....
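
The mount check can also be scripted. This is a minimal sketch, assuming the "mount" command prints lines in the "device on mountpoint type fstype (options)" form shown in the sample; the function name sysfs_mounted is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the mount listing on stdin contains a
# sysfs filesystem mounted on /sys.
sysfs_mounted() {
    grep -Eq '^[^ ]+ on /sys type sysfs( |$)'
}

# Usage: check the live system.
if mount | sysfs_mounted; then
    echo "sysfs is mounted on /sys"
else
    echo "sysfs is not mounted on /sys"
fi
```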

If sysfs is not mounted, do the following:

# mkdir /sys

# mount -t sysfs sys /sys

Now you should see entries for all CPUs present in the system. The following is an example from an 8-way system:

# pwd
/sys/devices/system/cpu
# ls -l
total 0
drwxr-xr-x  10 root root 0 Sep 19 07:44 .
drwxr-xr-x  13 root root 0 Sep 19 07:45 ..
drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu0
drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu1
drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu2
drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu3
drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu4
drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu5
drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu6
drwxr-xr-x   3 root root 0 Sep 19 07:48 cpu7

Under each directory there is a file named "online", which is the control file used to logically online/offline a processor.
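
The state of every CPU that has an "online" control file can be read in one pass. The following is a minimal sketch; the function name cpu_online_states and its optional directory argument are assumptions added for illustration, and the default path is the sysfs CPU directory from the listing.

```shell
#!/bin/sh
# Hypothetical helper: print "cpuN online=<0|1>" for each CPU directory
# that contains a readable "online" control file.
cpu_online_states() {
    cpu_dir=${1:-/sys/devices/system/cpu}
    for d in "$cpu_dir"/cpu[0-9]*; do
        [ -r "$d/online" ] || continue   # skip CPUs without a control file
        printf '%s online=%s\n' "${d##*/}" "$(cat "$d/online")"
    done
}

# Usage: report the live system (prints nothing if no control files exist).
cpu_online_states
```

The pattern cpu[0-9]* is used so that other entries in the same directory are not mistaken for CPUs.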

Q: Does hot-add/hot-remove refer to the physical addition/removal of CPUs?

A: The terms hot-add/hot-remove are not used very consistently in the code. CONFIG_HOTPLUG_CPU enables logical online/offline capability in the kernel. To support physical addition/removal, some BIOS hooks are required, and the platform should have something like the attention button in PCI hotplug. CONFIG_ACPI_HOTPLUG_CPU enables ACPI support for the physical addition/removal of CPUs.

Q: How do I logically offline a CPU?

A: Do the following:

# echo 0 > /sys/devices/system/cpu/cpuX/online

Once the logical offline succeeds, check:

# cat /proc/interrupts

You should no longer see a column for the CPU you removed. The online file reports 0 when a CPU is offline and 1 when it is online.

# To display the current CPU state:

# cat /sys/devices/system/cpu/cpuX/online
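
The write-then-verify sequence above can be wrapped in a small function. This is a sketch only: the name set_cpu_online and its optional third argument (the directory holding the cpuN entries) are hypothetical, and on a real system the write requires root.

```shell
#!/bin/sh
# Hypothetical helper: write 0 or 1 to a CPU's "online" control file and
# confirm the new state by reading it back.
# Usage: set_cpu_online <cpu-number> <0|1> [cpu-directory]
set_cpu_online() {
    ctl="${3:-/sys/devices/system/cpu}/cpu$1/online"
    if [ ! -e "$ctl" ]; then
        echo "cpu$1: no online file, state cannot be changed" >&2
        return 1
    fi
    if ! echo "$2" > "$ctl"; then
        echo "cpu$1: write to online file failed" >&2
        return 1
    fi
    # Succeed only if the file now reports the requested state.
    [ "$(cat "$ctl")" = "$2" ]
}

# Example (as root): take cpu1 offline, then bring it back.
#   set_cpu_online 1 0 && set_cpu_online 1 1
```

The function is deliberately not invoked here, since running it would change the state of a live CPU.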

Q: Why can't I remove CPU0 on some systems?

A: Some architectures have a special dependency on a certain CPU.

For example, on ia64 platforms the platform can send interrupts to the OS, namely Corrected Platform Error Interrupts (CPEI). Current ACPI specifications provide no way to change the target CPU. Hence, if the current ACPI version does not support such redirection, that CPU is made non-removable. [In effect, some interrupts can only be delivered to a specific CPU.]

In such cases you will also notice that the online file is missing under cpu0.

Q: How do I find out whether a particular CPU is not removable?

A: This depends on the implementation. Some architectures indicate it by the absence of the "online" file for such CPUs. This is done when it can be determined ahead of time that the CPU cannot be removed.

In some situations this is a run-time check; for example, an attempt to remove the last CPU is not permitted. In that case the "echo" command fails with an error, so you can detect such failures by checking its return value.
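
The first case, a missing "online" file, is easy to detect ahead of time. The following is a minimal sketch under the same sysfs layout as above; the function name cpus_without_online_file and its optional directory argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every cpuN directory that has no
# "online" file, i.e. CPUs reported as not removable ahead of time.
cpus_without_online_file() {
    cpu_dir=${1:-/sys/devices/system/cpu}
    for d in "$cpu_dir"/cpu[0-9]*; do
        [ -d "$d" ] || continue
        [ -e "$d/online" ] || echo "${d##*/}"
    done
}

# Usage: list such CPUs on the live system.
cpus_without_online_file
```

The run-time case cannot be detected this way; it only shows up as a failing write to the "online" file.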

Q: What happens when a CPU is logically offlined?

A: The following happen, listed in no particular order:

- A notification is sent to in-kernel registered modules with the event CPU_DOWN_PREPARE or CPU_DOWN_PREPARE_FROZEN, depending on whether the CPU is being offlined while tasks are frozen because a suspend operation is in progress.

- All processes are migrated away from this outgoing CPU to new CPUs.
