Introduction to device namespace: device virtualization based on kernel namespaces

Source: Internet
Author: User

Demand for virtualization on mobile devices is growing, for several reasons. First, mobile hardware keeps getting more powerful, and some high-end configurations are close to desktop machines, which lays the groundwork for virtualization. Second, usage scenarios are becoming more diverse: mobile devices are now used for work as well as entertainment. Third, security and privacy issues are increasingly prominent. Mobile devices hold more and more private information, such as accounts and payment passwords, while viruses and Trojans are spreading rapidly to mobile platforms, so it is safer to run sensitive software in an isolated environment. Fourth, multi-user requirements have emerged. A phone, and especially a tablet, sometimes has several users, for example when it is handed to a child to play with and should run in a restricted environment.

Virtualization on desktop systems is mature: vendors provide hardware support, and various virtualization solutions are widely used. On mobile platforms, however, computing power is relatively limited, and processor support for virtualization is not as mature as on the desktop. Container technology based on kernel namespaces has therefore become a major research focus because it is lightweight. A distinguishing feature of mobile platforms is the variety of hardware devices. Kernel namespaces were previously used mainly on servers, where peripherals are few in both type and number and even the display can be omitted. A mobile phone, by contrast, has many devices: sensors, camera, WiFi, audio, display, radio, input and LEDs. So one of the big problems for a solution based on kernel namespaces is device virtualization.

Against this background, device namespace was proposed in the Cells research project (http://systems.cs.columbia.edu/projects/cells/) at Columbia University. It lets multiple Android systems on the same phone use the devices concurrently, and it is essentially an extension of kernel namespaces. In this solution, kernel-level virtualization takes three forms: 1) device driver wrappers, such as framebuffer; 2) namespace-aware subsystems, such as input; 3) namespace-aware device drivers, such as binder. In addition, some closed-source modules, such as RIL and WiFi, require device namespace handling in user space. The whole system starts from a trusted, minimal initial environment called the root namespace. Similar to dom0 in Xen, it is responsible for managing and switching the other virtual systems, and it is the one that actually accesses the closed-source modules. When another virtual system needs such access, it sends a request to the root namespace via IPC.

Subsequently, Cellrox released the device namespace patches, which are part of Cellrox's commercially licensed Thinvisor technology (https://github.com/Cellrox/devns-patches/wiki/devicenamespace). The patches fall into three parts: 1) The framework, which mainly contains the device namespace infrastructure, the handling of active device namespace switches, and other public APIs. 2) Traditional virtualization, which is mainly for isolation: it lets processes in different namespaces operate a device independently of each other, as with binder, alarm and logger. 3) Context-aware virtualization, which mainly supports the foreground-background model, as with input, framebuffer, LED and backlight. In this model the system hosts multiple virtual systems but only one is active, so in general only device operations from processes in the active namespace take effect and the others are ignored. Drivers here are further divided into stateful and stateless ones. A stateless driver is easy to handle: it only needs to check whether the caller is in the active device namespace and ignore the request if it is not. A stateful driver has to save virtual state for each device namespace.

The following uses the input system as an example to show the basic framework and working principle of device namespace. First, device namespace has to be added to the existing kernel namespaces. Specifically, a dev_namespace pointer is added to the nsproxy structure:

    struct nsproxy {
        ...
        struct dev_namespace *dev_ns;
    };

task_struct, the structure that represents a process, holds a pointer to an nsproxy structure. So for a given process, the dev_ns under its nsproxy identifies the device namespace the process lives in. An nsproxy can be shared between processes that are in the same namespaces, but when one of the namespaces in it is copied or unshared, the nsproxy is copied and becomes private to that process. This is a form of copy-on-write (COW).
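The copy-on-unshare behaviour described above can be modeled in user space. The following is a minimal sketch, not kernel code: `struct task`, `task_share_ns()` and `task_unshare_dev_ns()` are hypothetical names, and the plain integer `count` stands in for whatever reference counting the kernel actually uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are not the kernel definitions. */
struct dev_namespace { int id; };

struct nsproxy {
    int count;                      /* number of tasks sharing this proxy */
    struct dev_namespace *dev_ns;
};

struct task {
    struct nsproxy *nsproxy;
};

/* A child that stays in the parent's namespaces shares the same proxy. */
static void task_share_ns(struct task *child, struct task *parent)
{
    child->nsproxy = parent->nsproxy;
    child->nsproxy->count++;
}

/*
 * Move a task into another device namespace. If the proxy is shared,
 * it is copied first so the change stays private to this task.
 * Returns 0 on success, -1 if the copy cannot be allocated.
 */
static int task_unshare_dev_ns(struct task *t, struct dev_namespace *new_ns)
{
    struct nsproxy *old = t->nsproxy;

    assert(old != NULL && old->count > 0);
    if (old->count > 1) {
        struct nsproxy *copy = malloc(sizeof(*copy));

        if (!copy)
            return -1;
        *copy = *old;               /* inherit the other namespaces */
        copy->count = 1;
        old->count--;
        t->nsproxy = copy;
    }
    t->nsproxy->dev_ns = new_ns;
    return 0;
}
```

After `task_unshare_dev_ns()` the parent still sees its original device namespace, while the child owns a private proxy that points at the new one.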


The core structure, dev_namespace, is as follows:

    struct dev_namespace {
        bool active;
        ...
        pid_t init_pid;
        ...
        struct blocking_notifier_head notifiers;
        ...
        struct dev_ns_info *info[DEV_NS_DESC_MAX];
    };

Here active indicates whether this is the active device namespace. init_pid is the PID of the init process in the device namespace. info is an array of pointers to dev_ns_info structures, each representing a device or subsystem in this device namespace (referred to simply as a device below). notifiers is the head of a notifier_block list that links together the nb members of the dev_ns_info structures. Through the Linux notifier chain mechanism, it lets every registered device have its handler invoked when the active device namespace switches. A dev_ns_info stands for a single device in a single device namespace; it is created on demand the first time the device is used in that device namespace.

    struct dev_ns_info {
        struct dev_namespace *dev_ns;
        struct list_head list;
        struct notifier_block nb;
        atomic_t count;
    };

The list member links the dev_ns_info structures that belong to the same device onto the dev_ns_desc structure that represents that device.

The initial dev_namespace is init_dev_ns, the device namespace of the init process. The global variable active_dev_ns points to the currently active device namespace; by default it is init's device namespace. dev_ns_desc is a global array, and each element represents one device that needs device namespace support.

    struct dev_ns_desc {
        char *name;
        struct dev_ns_ops *ops;
        struct list_head head;
    };

    static struct dev_ns_desc dev_ns_desc[DEV_NS_DESC_MAX];

dev_ns_ops defines the interface through which the device namespace framework calls into a specific device driver; it is implemented in the driver. Each device is given an entry in dev_ns_desc when it is initialized. The driver embeds the dev_ns_info structure in its driver-specific xxx_dev_ns structure (such as evdev_dev_ns), which ties the common dev_ns_desc to the specific device driver. In addition, the device namespace framework defines a set of helper functions for drivers through DEFINE_DEV_NS_INFO:

    #define DEFINE_DEV_NS_INFO(X) \
        _dev_ns_id(X)             \
        _dev_ns_find(X)           \
        _dev_ns_get(X)            \
        _dev_ns_get_cur(X)        \
        _dev_ns_put(X)

This macro has to be invoked, for example as DEFINE_DEV_NS_INFO(alarm), in each kernel module that needs device namespace. Taking the evdev module as an example, DEFINE_DEV_NS_INFO(evdev) generates the following:

    static int evdev_ns_id;

    static inline struct evdev_dev_ns *get_evdev_ns(struct dev_namespace *dev_ns)
    {
        struct dev_ns_info *info;

        info = get_dev_ns_info(evdev_ns_id, dev_ns, 1, 1);
        return info ? container_of(info, struct evdev_dev_ns, dev_ns_info) : NULL;
    }

    static inline struct evdev_dev_ns *find_evdev_ns(struct dev_namespace *dev_ns)
    {
        struct dev_ns_info *info;

        info = get_dev_ns_info(evdev_ns_id, dev_ns, 0, 0);
        return info ? container_of(info, struct evdev_dev_ns, dev_ns_info) : NULL;
    }

    static inline struct evdev_dev_ns *get_evdev_ns_cur(void)
    {
        struct dev_ns_info *info;

        info = get_dev_ns_info_task(evdev_ns_id, current);
        return info ? container_of(info, struct evdev_dev_ns, dev_ns_info) : NULL;
    }

    static inline void put_evdev_ns(struct evdev_dev_ns *evdev_ns)
    {
        put_dev_ns_info(evdev_ns_id, &evdev_ns->dev_ns_info, 1);
    }

Here evdev_ns_id is the index of this device's element in both the dev_ns_desc array and the info array of dev_namespace. It is assigned in evdev_init() when evdev is initialized.
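The generated helpers rely on two C techniques: token pasting to stamp out per-driver names, and container_of to get from the embedded dev_ns_info back to the driver-specific wrapper. The sketch below shows both in user space. `DEFINE_NS_ACCESSOR`, the `*_ns_from_info()` helpers and the field layouts are hypothetical simplifications for illustration, not the macros from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-in for the framework's per-device state. */
struct dev_ns_info { int count; };

/* Hypothetical driver-specific wrappers embedding dev_ns_info. */
struct evdev_dev_ns { int nclients; struct dev_ns_info dev_ns_info; };
struct alarm_dev_ns { long pending; struct dev_ns_info dev_ns_info; };

/*
 * Simplified generator: for a driver name X it defines X_ns_id and a
 * typed accessor that maps a dev_ns_info back to struct X_dev_ns.
 */
#define DEFINE_NS_ACCESSOR(X)                                        \
    static int X##_ns_id = -1;                                       \
    static inline struct X##_dev_ns *X##_ns_from_info(               \
        struct dev_ns_info *info)                                    \
    {                                                                \
        return info ? container_of(info, struct X##_dev_ns,          \
                                   dev_ns_info) : NULL;              \
    }

DEFINE_NS_ACCESSOR(evdev)
DEFINE_NS_ACCESSOR(alarm)
```

Because the framework only ever handles `struct dev_ns_info` pointers, it needs no knowledge of the driver types; each driver recovers its own structure through the accessor generated for it.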
The registration in evdev_init() looks like this:

    ret = DEV_NS_REGISTER(evdev, "event dev");
    if (ret < 0) {
        input_unregister_handler(&evdev_handler);
        return ret;
    }

This essentially calls register_dev_ns_ops() to register a new device in dev_ns_desc and initialize it. An important step of the initialization is to record the module's evdev_ns_ops structure in dev_ns_desc. evdev implements this interface so that the device namespace framework can later call back into the evdev subsystem.

    static struct dev_ns_ops evdev_ns_ops = {
        .create = evdev_devns_create,
        .release = evdev_devns_release,
    };

As mentioned earlier, each element of dev_ns_desc represents one device, so the array acts as a global registry of devices. Registration does a linear search for the first empty slot and returns its index, which becomes evdev_ns_id. At this point the device is not yet in use, so no dev_ns_info structure has been created for it and the list headed by the element's head member is empty.
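The first-free-slot registration described above can be sketched as follows. This is a userspace illustration under stated assumptions: `register_dev_ns_ops_sketch()` is a hypothetical name, a NULL name is assumed to mark a free slot, the table size of 4 is arbitrary, and -1 stands in for whatever negative error code the real function returns.

```c
#include <assert.h>
#include <stddef.h>

#define DEV_NS_DESC_MAX 4   /* assumed small value for the sketch */

struct dev_ns_ops { int unused; };  /* placeholder for the callback table */

struct dev_ns_desc {
    const char *name;       /* NULL marks a free slot (assumption) */
    struct dev_ns_ops *ops;
};

static struct dev_ns_desc dev_ns_desc[DEV_NS_DESC_MAX];

/*
 * Linear search for the first empty slot. Returns the slot index, which
 * the driver keeps as its xxx_ns_id, or -1 when the table is full.
 */
static int register_dev_ns_ops_sketch(const char *name, struct dev_ns_ops *ops)
{
    assert(name != NULL);
    for (int i = 0; i < DEV_NS_DESC_MAX; i++) {
        if (dev_ns_desc[i].name == NULL) {
            dev_ns_desc[i].name = name;
            dev_ns_desc[i].ops = ops;
            return i;
        }
    }
    return -1;
}
```

A driver would store the returned index and use it later to find both its dev_ns_desc entry and its slot in each namespace's info array.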

Later, when a process in the system opens a device in the evdev subsystem, evdev_open() is called, which in turn calls evdev_ns_track_client(client):

    static int evdev_ns_track_client(struct evdev_client *client)
    {
        struct evdev_dev_ns *evdev_ns;

        evdev_ns = get_evdev_ns_cur();
        ...
        client->evdev_ns = evdev_ns;
        ...
        list_add(&client->list, &evdev_ns->clients);
        ...
    }

This function obtains an evdev_dev_ns structure, creating it if necessary. As mentioned earlier, every device that uses device namespace has to define such an xxx_dev_ns structure; it is the bridge between the device driver and the device namespace framework, and it embeds the dev_ns_info structure. Each time an evdev device is opened, an evdev_client object is created. All evdev_client objects in the same device namespace are linked onto the clients member of the evdev_dev_ns structure that represents evdev in that device namespace.

The get_evdev_ns_cur() call above calls get_dev_ns_info_task() and then get_dev_ns_info(). The latter checks whether the device already has a dev_ns_info in the device namespace of the process that is opening it. If so, it is returned directly; otherwise new_dev_ns_info() is called to create one. Because the dev_ns_info structure is embedded in a driver-specific xxx_dev_ns structure, the previously registered callback, here evdev_devns_create(), is invoked to initialize the outer structure first. Once evdev_dev_ns has been initialized, the dev_ns_info inside it is returned and linked onto the dev_ns_desc entry that represents the device. evdev_devns_create() creates the driver-specific device namespace structure evdev_dev_ns and then registers the notifier that is called back when the active device namespace switches:

    dev_ns_info->nb = evdev_ns_switch_notifier;
    dev_ns_register_notify(dev_ns, &dev_ns_info->nb);

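The create-on-first-use lookup described above can be sketched in user space. The code below is an illustration only: `get_info_sketch()` and `registered_ops` are hypothetical names, the single `create` flag simplifies the two flags seen in the generated helpers, and `count` is assumed to behave as a reference count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEV_NS_DESC_MAX 4   /* assumed small value for the sketch */

struct dev_namespace;

struct dev_ns_info {
    struct dev_namespace *dev_ns;
    int count;              /* assumed to be a reference count */
};

struct dev_ns_ops {
    struct dev_ns_info *(*create)(struct dev_namespace *dev_ns);
    void (*release)(struct dev_ns_info *info);
};

struct dev_namespace {
    struct dev_ns_info *info[DEV_NS_DESC_MAX];
};

/* Hypothetical registry: ops pointer per device id, NULL if unregistered. */
static struct dev_ns_ops *registered_ops[DEV_NS_DESC_MAX];

/* Return the per-namespace state for device `id`, creating it on demand. */
static struct dev_ns_info *get_info_sketch(int id, struct dev_namespace *ns,
                                           int create)
{
    assert(id >= 0 && id < DEV_NS_DESC_MAX && ns != NULL);
    struct dev_ns_info *info = ns->info[id];

    if (!info && create && registered_ops[id]) {
        /* The driver callback allocates the outer xxx_dev_ns structure. */
        info = registered_ops[id]->create(ns);
        if (info) {
            info->dev_ns = ns;
            info->count = 0;
            ns->info[id] = info;
        }
    }
    if (info)
        info->count++;
    return info;
}

/* Driver side: evdev wraps dev_ns_info in its own structure. */
struct evdev_dev_ns {
    int nclients;
    struct dev_ns_info dev_ns_info;
};

static int evdev_create_calls;  /* instrumentation for the sketch */

static struct dev_ns_info *evdev_devns_create(struct dev_namespace *dev_ns)
{
    struct evdev_dev_ns *e = calloc(1, sizeof(*e));

    (void)dev_ns;
    if (!e)
        return NULL;
    evdev_create_calls++;
    return &e->dev_ns_info;
}

static void evdev_devns_release(struct dev_ns_info *info)
{
    free((char *)info - offsetof(struct evdev_dev_ns, dev_ns_info));
}

static struct dev_ns_ops evdev_ns_ops = {
    .create = evdev_devns_create,
    .release = evdev_devns_release,
};
```

The driver callback runs only once per device namespace; later lookups in the same namespace return the cached structure, and a different namespace gets its own instance.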
The structure of this notifier chain is as follows:

Putting the data structures above together, the following diagram shows roughly how they relate. In this example there are two device namespaces and two devices, evdev and alarm. evdev is used in both device namespaces, and in one of them it has two clients. alarm is used in only one device namespace.



Next, set_active_dev_ns() is called when the active device namespace switches. User space tells the kernel to switch the active device namespace through /proc files. This is only for demonstration purposes; a real product could use another interface. dev_namespace_init() creates /proc/dev_ns/active_ns_pid and /proc/dev_ns/ns_tag, whose file_operations structures are active_ns_fileops and ns_tag_fileops respectively. Taking active_ns_pid as an example, a write to it triggers proc_active_ns_write(), then dev_ns_proc_write(), then set_active_dev_ns(), which invokes the previously registered notifier functions. The flow is straightforward. When switching from namespace A to namespace B, for example, a DEV_NS_EVENT_DEACTIVATE event first notifies the input driver that namespace A is moving to the background, then the active device namespace is set to B, and finally a DEV_NS_EVENT_ACTIVATE event notifies the input driver that namespace B has been activated.

    void set_active_dev_ns(struct dev_namespace *next_ns)
    {
        ...
        (void) blocking_notifier_call_chain(&prev_ns->notifiers,
                                            DEV_NS_EVENT_DEACTIVATE, prev_ns);
        (void) blocking_notifier_call_chain(&dev_ns_notifiers,
                                            DEV_NS_EVENT_DEACTIVATE, prev_ns);
        ...
        next_ns->active = true;
        ...
        active_dev_ns = next_ns;
        ...
        (void) blocking_notifier_call_chain(&next_ns->notifiers,
                                            DEV_NS_EVENT_ACTIVATE, next_ns);
        (void) blocking_notifier_call_chain(&dev_ns_notifiers,
                                            DEV_NS_EVENT_ACTIVATE, next_ns);
        ...
    }

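The ordering of the switch (deactivate the old namespace, update the active pointer, activate the new one) can be modeled in user space. This sketch is a simplification: `struct notifier`, `chain_register()`, `chain_call()` and `set_active_sketch()` are hypothetical stand-ins for the kernel's blocking notifier chains, the event values are assumed, the global dev_ns_notifiers chain is omitted, and the point at which the previous namespace's active flag is cleared is a guess because that part of the function is elided in the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Event values are assumed for the sketch. */
enum { DEV_NS_EVENT_ACTIVATE = 1, DEV_NS_EVENT_DEACTIVATE = 2 };

/* Minimal stand-in for a notifier block on a singly linked chain. */
struct notifier {
    void (*call)(struct notifier *nb, int event, void *data);
    struct notifier *next;
};

struct dev_namespace {
    bool active;
    struct notifier *notifiers;
};

static struct dev_namespace *active_dev_ns;

static void chain_register(struct dev_namespace *ns, struct notifier *nb)
{
    nb->next = ns->notifiers;
    ns->notifiers = nb;
}

static void chain_call(struct dev_namespace *ns, int event)
{
    for (struct notifier *nb = ns->notifiers; nb; nb = nb->next)
        nb->call(nb, event, ns);
}

/* Deactivate the previous namespace, switch, then activate the next. */
static void set_active_sketch(struct dev_namespace *next_ns)
{
    struct dev_namespace *prev_ns = active_dev_ns;

    assert(next_ns != NULL);
    if (prev_ns == next_ns)
        return;
    if (prev_ns) {
        chain_call(prev_ns, DEV_NS_EVENT_DEACTIVATE);
        prev_ns->active = false;
    }
    next_ns->active = true;
    active_dev_ns = next_ns;
    chain_call(next_ns, DEV_NS_EVENT_ACTIVATE);
}

/* Example listener that records the order of events it receives. */
static int event_log[8];
static void *ns_log[8];
static int event_log_len;

static void log_event(struct notifier *nb, int event, void *data)
{
    (void)nb;
    if (event_log_len < 8) {
        event_log[event_log_len] = event;
        ns_log[event_log_len] = data;
        event_log_len++;
    }
}
```

Drivers registered on namespace A see the deactivate event before any driver on namespace B sees the activate event, which is what lets a stateful driver release hardware state for A before claiming it for B.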
In set_active_dev_ns(), blocking_notifier_call_chain() ultimately calls the callback registered earlier, evdev_ns_switch_callback(), which first finds the evdev_dev_ns structure that corresponds to the device namespace being switched. The clients list in this evdev_dev_ns holds all sessions that use the device in that device namespace, each represented by an evdev_client. As described for the earlier figure, the evdev_client members that serve device namespace are:

    struct evdev_dev_ns *evdev_ns;
    struct list_head list;
    bool grab;

Note that grab here is a virtual state. It records whether the session has requested exclusive access to the device. When there is more than one virtual system it represents only the request, not the true grab state, because the real grab also depends on the device namespace logic. Once the evdev_dev_ns structure is found, the callback walks the clients member to visit every open session in that device namespace. If the event being handled is DEV_NS_EVENT_ACTIVATE, meaning the device namespace is coming to the foreground, and the session was previously set to exclusive, evdev_grab() is called to make it truly exclusive. If the event is DEV_NS_EVENT_DEACTIVATE, meaning the device namespace is going to the background, and the session currently holds a true grab, evdev_ungrab() is called to release it. The process above can be summarized as follows:



With this information in place, namespace-aware logic can be applied when input events are read and written. When an event is broadcast, the device namespace is checked, and the event is ignored if the namespace is not active. evdev_write(), the write handler of the device, behaves similarly. In evdev_do_ioctl(), requests from a non-active device namespace are often not handled either. In the EVIOCGRAB ioctl handling, for a session in an inactive device namespace, client->grab is set to the requested state but the grab is not actually performed; as described above, it is applied when that device namespace becomes active.
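The interplay between the virtual grab state, the switch callback and event filtering can be illustrated with a small userspace model. Everything here is a hypothetical simplification: `struct client`, `client_set_grab()`, `ns_switch()` and `deliver_event()` are invented names, the `hw_grab` flag stands in for the effect of evdev_grab() and evdev_ungrab(), and conflicts between several clients grabbing the same device are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-namespace evdev state, reduced to the active flag. */
struct evdev_dev_ns_sketch { bool active; };

struct client {
    struct evdev_dev_ns_sketch *ns;
    bool grab;       /* virtual state: the session asked for the grab */
    bool hw_grab;    /* stands in for the real grab on the device */
    int delivered;   /* number of events this client received */
};

/* EVIOCGRAB-like request: always record it, apply it only if active. */
static void client_set_grab(struct client *c, bool want)
{
    c->grab = want;
    c->hw_grab = want && c->ns->active;
}

/* Switch callback for one namespace: apply or drop the deferred grabs. */
static void ns_switch(struct evdev_dev_ns_sketch *ns, struct client *clients,
                      size_t n, bool activate)
{
    ns->active = activate;
    for (size_t i = 0; i < n; i++) {
        if (clients[i].ns != ns)
            continue;
        clients[i].hw_grab = activate && clients[i].grab;
    }
}

/* Event broadcast: clients in an inactive namespace are skipped. */
static bool deliver_event(struct client *c)
{
    if (!c->ns->active)
        return false;
    c->delivered++;
    return true;
}
```

Keeping the request in `grab` and the effect in `hw_grab` is what makes the driver stateful in the sense used earlier: a background session's request is remembered and replayed on activation instead of being rejected.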

As a container solution for mobile platforms, device namespace still has areas that need to be extended, such as support for multiple active device namespaces, but it provides a viable solution for lightweight device virtualization. Other device virtualization approaches, such as multi-session support in systemd (https://dvdhrm.wordpress.com/2013/08/25/sane-session-switching/), can avoid changes to the kernel. In practice, several approaches can be combined according to the requirements and the device type.

