copy_from_user & copy_to_user

Kernel study constantly brings you into contact with interesting functions, and pulling on one function tends to drag a whole chain of related ones along with it. copy_to_user() and copy_from_user() are two functions you meet frequently when writing drivers. Because kernel space and user space cannot access each other's memory directly, copy_from_user() is used to copy data from user space into kernel space, and copy_to_user() copies data from kernel space out to user space. Let's take a closer look at the ins and outs of these two functions.
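
Before digging into the sources, here is a minimal sketch of how a driver typically calls these two functions. Everything here is hypothetical (the device buffer dev_buf, the handler names, the fixed size); it only illustrates the usual read/write pattern, not code from the kernel tree analyzed below. On the 2.6-era kernels discussed here the header is <asm/uaccess.h>; newer kernels use <linux/uaccess.h>.

/* Hypothetical character-driver read/write pair, for illustration only. */
#include <linux/fs.h>
#include <linux/uaccess.h>              /* <asm/uaccess.h> on older kernels */

static char dev_buf[128];               /* made-up device buffer */

static ssize_t demo_read(struct file *filp, char __user *buf,
                         size_t count, loff_t *ppos)
{
        if (count > sizeof(dev_buf))
                count = sizeof(dev_buf);
        /* kernel -> user: returns the number of bytes NOT copied */
        if (copy_to_user(buf, dev_buf, count))
                return -EFAULT;
        return count;
}

static ssize_t demo_write(struct file *filp, const char __user *buf,
                          size_t count, loff_t *ppos)
{
        if (count > sizeof(dev_buf))
                count = sizeof(dev_buf);
        /* user -> kernel */
        if (copy_from_user(dev_buf, buf, count))
                return -EFAULT;
        return count;
}
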
First, let's take a look at how these two functions are defined in the source code file:

arch/i386/lib/usercopy.c


unsigned long
copy_to_user(void __user *to, const void *from, unsigned long n)
{
        might_sleep();
        BUG_ON((long) n < 0);
        if (access_ok(VERIFY_WRITE, to, n))
                n = __copy_to_user(to, from, n);
        return n;
}
EXPORT_SYMBOL(copy_to_user);

From the annotations we can see that the function copies a block of data from kernel space to user space. Because it may sleep, it can only be used in process context, never from interrupt handlers or other atomic contexts. It takes three parameters:
to: the destination address, in user space;
from: the source address, in kernel space;
n: the number of bytes to copy.
If all of the data is copied successfully, zero is returned; otherwise, the number of bytes that could not be copied is returned.
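
Because the return value is "bytes left uncopied", a caller can report partial progress instead of failing outright. The helper below is a hypothetical sketch of that idea, not kernel code:

/* Hypothetical helper: propagate a partial copy to the caller instead of
 * collapsing everything into -EFAULT. */
#include <linux/uaccess.h>

static ssize_t copy_out(char __user *ubuf, const void *kbuf, size_t count)
{
        unsigned long not_copied = copy_to_user(ubuf, kbuf, count);

        if (not_copied == count)
                return -EFAULT;         /* nothing at all was copied */
        return count - not_copied;      /* bytes actually delivered */
}
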
So much for the external description; next, let's look at what happens inside the function.

The parameter to carries the __user qualifier, which is defined in include/linux/compiler.h as follows:


# define __user     __attribute__((noderef, address_space(1)))

It marks the pointer as referring to a user-space address, i.e. memory that belongs to user space. The __attribute__ syntax may look puzzling at first, but it is easy to look up.
__attribute__ is a GNU C extension that lets developers attach attributes to declared functions or variables so that extra error checking can be performed. The noderef and address_space attributes above are actually consumed by sparse, the kernel's static checker, rather than by gcc itself.
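
To illustrate what the annotation buys you, the hypothetical fragment below dereferences a __user pointer directly. When the file is run through sparse (make C=1), the noderef/address_space attributes trigger a warning on the commented-out dereference, while copy_from_user() is the accepted way to reach that memory. The function name and parameters are made up for the example.

/* Hypothetical fragment showing the effect of the __user annotation. */
#include <linux/uaccess.h>

static int read_flag(const int __user *uptr, int *out)
{
        int value;

        /* value = *uptr;      <- sparse warns: dereference of noderef
         *                        (user address space) expression        */
        if (copy_from_user(&value, uptr, sizeof(value)))
                return -EFAULT;         /* the sanctioned way */

        *out = value;
        return 0;
}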

For more information, see:


http://unixwiz.net/techtips/gnu-c-attributes.html


Next, let's look at might_sleep(). It has two implementations, a debug version and a non-debug version:
In the debug version, the function announces that the caller may sleep; if it is executed in an atomic context it prints a warning with a stack trace. This is done through __might_sleep(__FILE__, __LINE__), after which might_resched() is called to allow rescheduling.
In the non-debug version, might_resched() is called directly.

The implementation, in include/linux/kernel.h, looks like this:


#ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
void __might_sleep(char *file, int line);
# define might_sleep() \
        do { __might_sleep(__FILE__, __LINE__); might_resched(); } while (0)
#else
# define might_sleep() do { might_resched(); } while (0)
#endif
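
A practical consequence of might_sleep() is that copy_to_user()/copy_from_user() must be called from process context only, never while holding a spinlock or from an interrupt handler. The hypothetical sketch below (the lock, state variable, and function are made up) shows the safe ordering:

/* Hypothetical example: keep user-space copies outside spinlocked regions. */
#include <linux/spinlock.h>
#include <linux/uaccess.h>

static DEFINE_SPINLOCK(demo_lock);
static int shared_state;

static int demo_get_state(int __user *uptr)
{
        int snapshot;

        spin_lock(&demo_lock);
        snapshot = shared_state;        /* take a snapshot under the lock */
        spin_unlock(&demo_lock);

        /* copy_to_user() may sleep, so it runs only after unlocking;
         * calling it inside the locked region would trip might_sleep(). */
        if (copy_to_user(uptr, &snapshot, sizeof(snapshot)))
                return -EFAULT;
        return 0;
}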

Next comes a macro that sanity-checks the parameter:

BUG_ON((long) n < 0);

It is implemented in include/asm-generic/bug.h. The macro evaluates the condition and, if it is true, prints a diagnostic and panics:


#ifdef CONFIG_BUG
#ifndef HAVE_ARCH_BUG
#define BUG() do { \
        printk("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __FUNCTION__); \
        panic("BUG!"); \
} while (0)
#endif
#ifndef HAVE_ARCH_BUG_ON
#define BUG_ON(condition) do { if (unlikely((condition)!=0)) BUG(); } while(0)
#endif

Next is another macro:

access_ok(VERIFY_WRITE, to, n)

It checks whether the pointer to the user-space data block is valid: it returns a non-zero value if the pointer is valid and zero otherwise. Its implementation is in include/asm-i386/uaccess.h:


#define access_ok(type,addr,size) (likely(__range_ok(addr,size) == 0))
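
This split between the check and the copy also explains a pattern sometimes seen in drivers: validate the whole user buffer once with access_ok() and then use the unchecked __copy_to_user() for the individual pieces. Below is a hedged, hypothetical sketch of that pattern (the function and its parameters are invented; on this 2.6-era i386 kernel access_ok() still takes the VERIFY_WRITE/VERIFY_READ type argument):

/* Hypothetical sketch: one access_ok() check, several unchecked copies. */
static int fill_report(char __user *ubuf, const char *hdr, size_t hdr_len,
                       const char *body, size_t body_len)
{
        if (!access_ok(VERIFY_WRITE, ubuf, hdr_len + body_len))
                return -EFAULT;

        /* __copy_to_user() skips the range check that was just performed. */
        if (__copy_to_user(ubuf, hdr, hdr_len))
                return -EFAULT;
        if (__copy_to_user(ubuf + hdr_len, body, body_len))
                return -EFAULT;
        return 0;
}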


__range_ok(addr, size) itself is implemented with inline assembly, also in include/asm-i386/uaccess.h:


#define __range_ok(addr,size) ({ \
        unsigned long flag,sum; \
        __chk_user_ptr(addr); \
        asm("addl %3,%1 ; sbbl %0,%0; cmpl %1,%4; sbbl $0,%0" \
                :"=&r" (flag), "=r" (sum) \
                :"1" (addr),"g" ((int)(size)),"g" (current_thread_info()->addr_limit.seg)); \
        flag; })

In effect, it evaluates the following expression:

(u32)addr + (u32)size > (u32)current_thread_info()->addr_limit.seg

If the expression is false (and the addition does not overflow), the address range is valid and __range_ok() returns zero; otherwise it returns a non-zero value.
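
The same check can be restated in plain C. The program below is ordinary user-space code, for illustration only, with a made-up constant standing in for addr_limit.seg; it shows the overflow-aware comparison that the assembly performs:

/* User-space re-statement of the __range_ok() logic, illustration only. */
#include <stdio.h>

#define FAKE_ADDR_LIMIT 0xC0000000UL    /* stand-in for addr_limit.seg */

static int range_ok(unsigned long addr, unsigned long size)
{
        unsigned long end = addr + size;

        /* end < addr means the addition wrapped around (the "sbbl" case). */
        if (end < addr || end > FAKE_ADDR_LIMIT)
                return 0;               /* invalid, like a non-zero flag */
        return 1;                       /* valid, like flag == 0 */
}

int main(void)
{
        printf("%d\n", range_ok(0x08048000UL, 4096));   /* 1: valid             */
        printf("%d\n", range_ok(0xBFFFFFF0UL, 4096));   /* 0: crosses the limit */
        return 0;
}
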
The next call is the most important one; it actually performs the copy:

__copy_to_user(to, from, n)

It is implemented in include/asm-i386/uaccess.h as follows:


static __always_inline unsigned long __must_check
__copy_to_user(void __user *to, const void *from, unsigned long n)
{
        might_sleep();
        return __copy_to_user_inatomic(to, from, n);
}

Here __always_inline expands to inline (with the always_inline attribute, forcing the function to be inlined), and __must_check expands, on gcc 3 and gcc 4, to

__attribute__((warn_unused_result))

so the compiler warns if the return value is ignored. might_sleep() behaves exactly as described above.

The final call, __copy_to_user_inatomic(to, from, n), completes the copy. It is implemented in include/asm-i386/uaccess.h as follows:


static __always_inline unsigned long __must_check
__copy_to_user_inatomic(void __user *to, const void *from, unsigned long n)
{
        if (__builtin_constant_p(n)) {
                unsigned long ret;

                switch (n) {
                case 1:
                        __put_user_size(*(u8 *)from, (u8 __user *)to, 1, ret, 1);
                        return ret;
                case 2:
                        __put_user_size(*(u16 *)from, (u16 __user *)to, 2, ret, 2);
                        return ret;
                case 4:
                        __put_user_size(*(u32 *)from, (u32 __user *)to, 4, ret, 4);
                        return ret;
                }
        }
        return __copy_to_user_ll(to, from, n);
}

Here __builtin_constant_p(n) is a GCC built-in that determines whether its argument is a compile-time constant: it returns 1 if the value of n is known at compile time, and 0 otherwise. Many operations can be optimized much better when a parameter is a constant, and GNU C code often uses this built-in to select a constant-specialized path when possible and fall back to a general version otherwise. The result is code that stays generic yet compiles to the optimal form when the parameter happens to be constant.
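
A small stand-alone program (illustrative only, not kernel code) makes the behaviour of __builtin_constant_p visible; compile it with gcc and run it:

/* Illustration of the GCC built-in __builtin_constant_p. */
#include <stdio.h>

#define DESCRIBE(n) \
        printf("%-12s -> compile-time constant? %d\n", #n, __builtin_constant_p(n))

int main(int argc, char **argv)
{
        int runtime = argc;     /* value only known at run time */

        (void)argv;
        DESCRIBE(4);            /* 1: literal                        */
        DESCRIBE(sizeof(int));  /* 1: sizeof is a constant expression */
        DESCRIBE(runtime);      /* 0: plain run-time variable         */
        return 0;
}
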

If n is a constant of 1, 2 or 4, the corresponding switch case performs the copy through the following macro (in include/asm-i386/uaccess.h):


#ifdef CONFIG_X86_WP_WORKS_OK
#define __put_user_size(x,ptr,size,retval,errret)                        \
do {                                                                     \
        retval = 0;                                                      \
        __chk_user_ptr(ptr);                                             \
        switch (size) {                                                  \
        case 1: __put_user_asm(x,ptr,retval,"b","b","iq",errret); break; \
        case 2: __put_user_asm(x,ptr,retval,"w","w","ir",errret); break; \
        case 4: __put_user_asm(x,ptr,retval,"l","","ir",errret);  break; \
        case 8: __put_user_u64((__typeof__(*ptr))(x),ptr,retval); break; \
        default: __put_user_bad();                                       \
        }                                                                \
} while (0)
#else
#define __put_user_size(x,ptr,size,retval,errret)                        \
do {                                                                     \
        __typeof__(*(ptr)) __pus_tmp = x;                                \
        retval = 0;                                                      \
                                                                         \
        if (unlikely(__copy_to_user_ll(ptr, &__pus_tmp, size) != 0))     \
                retval = errret;                                         \
} while (0)
#endif

__put_user_asm is itself a macro; the actual copy is done by the following inline assembly (in include/asm-i386/uaccess.h):


#define __put_user_asm(x, addr, err, itype, rtype, ltype, errret)      \
        __asm__ __volatile__(                                           \
                "1:     mov"itype" %"rtype"1,%2\n"                      \
                "2:\n"                                                  \
                ".section .fixup,\"ax\"\n"                              \
                "3:     movl %3,%0\n"                                   \
                "       jmp 2b\n"                                       \
                ".previous\n"                                           \
                ".section __ex_table,\"a\"\n"                           \
                "       .align 4\n"                                     \
                "       .long 1b,3b\n"                                  \
                ".previous"                                             \
                : "=r"(err)                                             \
                : ltype (x), "m"(__m(addr)), "i"(errret), "0"(err))

 
These macros handle small, fixed-size copies such as a single char or int.
If n is not one of those small constants, a block copy is performed instead. The implementation is in arch/i386/lib/usercopy.c:


unsigned long __copy_to_user_ll(void __user *to, const void *from, unsigned long n)
{
        BUG_ON((long) n < 0);
#ifndef CONFIG_X86_WP_WORKS_OK
        if (unlikely(boot_cpu_data.wp_works_ok == 0) &&
                        ((unsigned long )to) < TASK_SIZE) {
                /*
                 * CPU does not honor the WP bit when writing
                 * from supervisory mode, and due to preemption or SMP,
                 * the page tables can change at any time.
                 * Do it manually. Manfred <manfred@colorfullife.com>
                 */
                while (n) {
                        unsigned long offset = ((unsigned long)to)%PAGE_SIZE;
                        unsigned long len = PAGE_SIZE - offset;
                        int retval;
                        struct page *pg;
                        void *maddr;

                        if (len > n)
                                len = n;

survive:
                        down_read(&current->mm->mmap_sem);
                        retval = get_user_pages(current, current->mm,
                                        (unsigned long )to, 1, 1, 0, &pg, NULL);

                        if (retval == -ENOMEM && current->pid == 1) {
                                up_read(&current->mm->mmap_sem);
                                blk_congestion_wait(WRITE, HZ/50);
                                goto survive;
                        }

                        if (retval != 1) {
                                up_read(&current->mm->mmap_sem);
                                break;
                        }

                        maddr = kmap_atomic(pg, KM_USER0);
                        memcpy(maddr + offset, from, len);
                        kunmap_atomic(maddr, KM_USER0);
                        set_page_dirty_lock(pg);
                        put_page(pg);
                        up_read(&current->mm->mmap_sem);

                        from += len;
                        to += len;
                        n -= len;
                }
                return n;
        }
#endif
        if (movsl_is_ok(to, from, n))
                __copy_user(to, from, n);
        else
                n = __copy_user_intel(to, from, n);
        return n;
}
EXPORT_SYMBOL(__copy_to_user_ll);

 
The implementation of the copy_from_user function is as follows:


unsigned long
copy_from_user(void *to, const void __user *from, unsigned long n)
{
        might_sleep();
        BUG_ON((long) n < 0);
        if (access_ok(VERIFY_READ, from, n))
                n = __copy_from_user(to, from, n);
        else
                memset(to, 0, n);
        return n;
}
EXPORT_SYMBOL(copy_from_user);

Its structure mirrors copy_to_user(); the only notable difference is that when access_ok() fails, the destination buffer is zeroed with memset() so the caller never sees uninitialized kernel memory. The rest is not repeated here.
That is how copy_to_user() and copy_from_user() work. The above is only a quick analysis and trace; the details deserve further study.
copy_to_user compared with mmap
copy_to_user() has to validate the user pointer on every call, i.e. confirm that the user-space address really belongs to the calling process rather than pointing somewhere it shouldn't. In addition, the data is copied on every call, so memory is accessed constantly; and because only the virtual addresses are guaranteed to be contiguous (the physical pages behind them need not be), the CPU cache is invalidated frequently, which slows things down.
mmap, by contrast, builds the page tables for the process only when the mapping is first set up, i.e. it maps physical addresses to virtual addresses once. Later accesses are not explicitly re-validated (illegal accesses are caught by the CPU's page-protection exceptions). On the other hand, the mapped memory can be operated on directly from the kernel without repeated copying: the kernel can work through a pointer on the shared pages instead of filling a separate kernel buffer and then copying it out. mmap maps physically contiguous memory to a contiguous virtual range, or several physically contiguous segments each to its own contiguous virtual range; either way, each segment is physically contiguous, so the CPU cache is not invalidated nearly as often, which saves a great deal of time.
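
To contrast with copy_to_user(), here is a hedged sketch of the usual way a driver implements mmap() using the real kernel helper remap_pfn_range(). The buffer dev_buf, its allocation, and the function name are hypothetical, and details such as size checks and caching flags are simplified:

/* Hypothetical mmap() implementation: map a kernel buffer into user space
 * once, so later accesses need no per-call copy or pointer check. */
#include <linux/mm.h>
#include <linux/fs.h>
#include <asm/io.h>                     /* virt_to_phys() */

static char *dev_buf;                   /* assume a page-aligned, physically
                                           contiguous buffer, e.g. kmalloc() */

static int demo_mmap(struct file *filp, struct vm_area_struct *vma)
{
        unsigned long size = vma->vm_end - vma->vm_start;
        unsigned long pfn  = virt_to_phys(dev_buf) >> PAGE_SHIFT;

        /* Build the page tables for the whole range up front. */
        if (remap_pfn_range(vma, vma->vm_start, pfn, size, vma->vm_page_prot))
                return -EAGAIN;
        return 0;
}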
