How the Linux kernel handles endianness

Source: Internet
Author: User

I have been working with the MPC8309 recently, and it was not clear to me where the kernel converts between big-endian and little-endian data.

Today I came across an article on the concrete implementation of readl and writel.

This article analyzes how readl/writel read and write registers while swapping bytes efficiently. We take readl as the example and look at how register data is handled on a big-endian processor.

In the kernel, readl is defined in include/asm-generic/io.h as follows:

    #define readl(addr) __le32_to_cpu(__raw_readl(addr))

__raw_readl is the lowest-level register read function and is very simple: it fetches the data directly from the register. Now look at __le32_to_cpu, whose implementation depends on the byte order. For little-endian processors it is defined in include/linux/byteorder/little_endian.h as follows:

    #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))

This only changes the type and does no work at run time. For big-endian processors it is defined in include/linux/byteorder/big_endian.h as follows:

    #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))

As the name suggests, __swab32 reverses the byte order of the data. We will analyze its implementation shortly; it is the heart of this topic.
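To make the two definitions concrete, here is a minimal user-space sketch rather than kernel code. It assumes GCC or Clang, which predefine __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__. The names model_swab32, model_le32_to_cpu and model_readl are hypothetical stand-ins for the kernel helpers, and memcpy plays the role of __raw_readl.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __swab32: reverse the four bytes of x. */
static inline uint32_t model_swab32(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) <<  8) |
           ((x & 0x00ff0000u) >>  8) |
           ((x & 0xff000000u) >> 24);
}

/*
 * Hypothetical stand-in for __le32_to_cpu: nothing to do on a
 * little-endian host, a byte swap on a big-endian host.
 * Assumption: if the compiler does not define __BYTE_ORDER__,
 * the host is treated as little-endian.
 */
static inline uint32_t model_le32_to_cpu(uint32_t x)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return model_swab32(x);
#else
    return x;
#endif
}

/*
 * Hypothetical stand-in for readl: "reg" points at four bytes that
 * hold a little-endian register value.
 */
static inline uint32_t model_readl(const unsigned char *reg)
{
    uint32_t raw;

    memcpy(&raw, reg, sizeof(raw)); /* raw load, host byte order */
    return model_le32_to_cpu(raw);
}
```

With the bytes 0x78, 0x56, 0x34, 0x12 stored in that order, model_readl returns 0x12345678 on either kind of host, which is the property readl provides for little-endian device registers.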

Before that, consider one question: for different CPUs such as ARM, MIPS and PPC, how is little_endian.h or big_endian.h chosen?

The answer is that each processor platform has its own arch/xxx/include/asm/byteorder.h header. Here are the byteorder.h files for ARM, MIPS and PPC.

arch/arm/include/asm/byteorder.h

    /*
     * arch/arm/include/asm/byteorder.h
     *
     * ARM endian-ness. In little endian mode, the data bus is connected such
     * that byte accesses appear as:
     *  0 = d0...d7, 1 = d8...d15, 2 = d16...d23, 3 = d24...d31
     * and word accesses (data or instruction) appear as:
     *  d0...d31
     *
     * When in big endian mode, byte accesses appear as:
     *  0 = d24...d31, 1 = d16...d23, 2 = d8...d15, 3 = d0...d7
     * and word accesses (data or instruction) appear as:
     *  d0...d31
     */
    #ifndef __ASM_ARM_BYTEORDER_H
    #define __ASM_ARM_BYTEORDER_H

    #ifdef __ARMEB__
    #include <linux/byteorder/big_endian.h>
    #else
    #include <linux/byteorder/little_endian.h>
    #endif

    #endif

arch/mips/include/asm/byteorder.h

    /*
     * This file is subject to the terms and conditions of the GNU General Public
     * License. See the file "COPYING" in the main directory of this archive
     * for more details.
     *
     * Copyright (C) 1996, 2003 by Ralf Baechle
     */
    #ifndef _ASM_BYTEORDER_H
    #define _ASM_BYTEORDER_H

    #if defined(__MIPSEB__)
    #include <linux/byteorder/big_endian.h>
    #elif defined(__MIPSEL__)
    #include <linux/byteorder/little_endian.h>
    #else
    # error "MIPS, but neither __MIPSEB__, nor __MIPSEL__???"
    #endif

    #endif /* _ASM_BYTEORDER_H */

arch/powerpc/include/asm/byteorder.h

    #ifndef _ASM_POWERPC_BYTEORDER_H
    #define _ASM_POWERPC_BYTEORDER_H

    /*
     * This program is free software; you can redistribute it and/or
     * modify it under the terms of the GNU General Public License
     * as published by the Free Software Foundation; either version
     * 2 of the License, or (at your option) any later version.
     */
    #include <linux/byteorder/big_endian.h>

    #endif /* _ASM_POWERPC_BYTEORDER_H */


As these headers show, the kernel lets ARM and MIPS select either byte order, while the PPC header shown here supports only big-endian. (The PPC hardware itself can also select its byte order.)

Each platform's byteorder.h adds one more layer on top of little_endian.h/big_endian.h, so a driver does not need to care about the processor's byte order; it only needs to include byteorder.h.
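The compile-time selection made by these headers can be imitated in user space. The sketch below is only an illustration: it assumes GCC or Clang (for __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__), and MODEL_BIG_ENDIAN and host_is_little_endian are hypothetical names. The macro is fixed by the compiler in the same way __ARMEB__ or __MIPSEB__ is, and the function probes memory at run time to confirm it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Compile-time choice, analogous to the #ifdef __ARMEB__ test above.
 * Assumption: a compiler that defines no __BYTE_ORDER__ is treated
 * as targeting a little-endian machine.
 */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define MODEL_BIG_ENDIAN 1
#else
#define MODEL_BIG_ENDIAN 0
#endif

/* Run-time probe: returns 1 if the lowest address holds the low byte. */
static inline int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a host built with GCC or Clang the two answers should agree, that is, host_is_little_endian() equals !MODEL_BIG_ENDIAN.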

Next, look at the most important piece, __swab32, defined in include/linux/swab.h:

    /**
     * __swab32 - return a byteswapped 32-bit value
     * @x: value to byteswap
     */
    #define __swab32(x) \
        (__builtin_constant_p((__u32)(x)) ? \
        ___constant_swab32(x) : \
        __fswab32(x))


Expanded, the macro is a conditional expression.

__builtin_constant_p is a GCC built-in that determines whether a value is known to be constant at compile time: it returns 1 if the argument is a constant and 0 otherwise.
If the data is a constant, ___constant_swab32 is used, implemented as follows:
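The dispatch can be tried outside the kernel. The following is a sketch under assumptions, not the kernel's code: it needs GCC or Clang, the MODEL_ and model_ names are hypothetical, and __builtin_bswap32 stands in for an architecture-specific fast path.

```c
#include <assert.h>
#include <stdint.h>

/* Pure-expression swap, usable where a constant is required. */
#define MODEL_CONSTANT_SWAB32(x) ((uint32_t)(            \
    (((uint32_t)(x) & 0x000000ffu) << 24) |              \
    (((uint32_t)(x) & 0x0000ff00u) <<  8) |              \
    (((uint32_t)(x) & 0x00ff0000u) >>  8) |              \
    (((uint32_t)(x) & 0xff000000u) >> 24)))

/* Run-time path; the compiler builtin stands in for __arch_swab32. */
static inline uint32_t model_fswab32(uint32_t val)
{
    return __builtin_bswap32(val);
}

/*
 * Same shape as __swab32: constants take the expression path,
 * everything else goes through the function.
 */
#define MODEL_SWAB32(x)                                  \
    (__builtin_constant_p((uint32_t)(x)) ?               \
     MODEL_CONSTANT_SWAB32(x) :                          \
     model_fswab32(x))
```

Both paths must give the same answer, so MODEL_SWAB32(0x12345678u) and MODEL_SWAB32 applied to a run-time variable holding 0x12345678 each yield 0x78563412.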

    #define ___constant_swab32(x) ((__u32)( \
        (((__u32)(x) & (__u32)0x000000ffUL) << 24) | \
        (((__u32)(x) & (__u32)0x0000ff00UL) <<  8) | \
        (((__u32)(x) & (__u32)0x00ff0000UL) >>  8) | \
        (((__u32)(x) & (__u32)0xff000000UL) >> 24)))

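For constants, the ordinary shift-and-combine method is used. The kernel's explanation is that this form is necessary for constants, which I did not fully understand at first. A likely reason is that a plain C expression on a constant can be evaluated entirely by the compiler, whereas inline assembly cannot.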
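One way to see why a pure expression matters is to use it where C requires a constant expression. The sketch below is an assumption-labelled illustration, not kernel code: MODEL_CONSTANT_SWAB32, magic_swapped, MODEL_TAG and classify are hypothetical names. A function built on inline assembly could not appear in any of these three places.

```c
#include <assert.h>
#include <stdint.h>

/* Same shift-and-combine expression as the macro above. */
#define MODEL_CONSTANT_SWAB32(x) ((uint32_t)(            \
    (((uint32_t)(x) & 0x000000ffu) << 24) |              \
    (((uint32_t)(x) & 0x0000ff00u) <<  8) |              \
    (((uint32_t)(x) & 0x00ff0000u) >>  8) |              \
    (((uint32_t)(x) & 0xff000000u) >> 24)))

/* 1. File-scope initializer: computed by the compiler, no run-time code. */
static const uint32_t magic_swapped = MODEL_CONSTANT_SWAB32(0x12345678u);

/* 2. Enumeration constant: 0x01000000 byte-swapped is 1. */
enum { MODEL_TAG = MODEL_CONSTANT_SWAB32(0x01000000u) };

/* 3. Case label: compare a value in the opposite byte order directly. */
static inline int classify(uint32_t wire)
{
    switch (wire) {
    case MODEL_CONSTANT_SWAB32(0x12345678u):
        return 1;   /* matches the swapped magic number */
    default:
        return 0;
    }
}
```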

If the data is only known at run time, __fswab32 is used, implemented as follows:

    static inline __attribute_const__ __u32 __fswab32(__u32 val)
    {
    #ifdef __arch_swab32
        return __arch_swab32(val);
    #else
        return ___constant_swab32(val);
    #endif
    }

If __arch_swab32 is not defined, the data is still swapped with ___constant_swab32. ARM, MIPS and PPC, however, each define their own __arch_swab32 to swap efficiently on their platform, as follows:

arch/arm/include/asm/swab.h

    static inline __attribute_const__ __u32 __arch_swab32(__u32 x)
    {
        __asm__ ("rev %0, %1" : "=r" (x) : "r" (x));
        return x;
    }

arch/mips/include/asm/swab.h

    static inline __attribute_const__ __u32 __arch_swab32(__u32 x)
    {
        __asm__(
        "wsbh %0, %1\n"
        "rotr %0, %0, 16\n"
        : "=r" (x)
        : "r" (x));
        return x;
    }


arch/powerpc/include/asm/swab.h

    static inline __attribute_const__ __u32 __arch_swab32(__u32 value)
    {
        __u32 result;

        __asm__("rlwimi %0,%1,24,16,23\n\t"
            "rlwimi %0,%1,8,8,15\n\t"
            "rlwimi %0,%1,24,0,7"
            : "=r" (result)
            : "r" (value), "0" (value >> 24));
        return result;
    }

As you can see, ARM uses one instruction (rev, byte reverse), MIPS uses two (wsbh and rotr) and PPC uses three (rlwimi, rotate and insert under mask) to reorder the four bytes of a 32-bit value. This is more efficient than the generic shift-and-combine method.
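The MIPS and PPC sequences are easier to follow when each instruction is modelled in plain C. This is a sketch of my reading of the instructions, not kernel code, and every name in it is hypothetical. It assumes that wsbh swaps the two bytes inside each 16-bit half, that rotr rotates right, and that rlwimi rotates its source left and inserts the bits selected by a mask written in IBM numbering (bit 0 is the most significant bit).

```c
#include <assert.h>
#include <stdint.h>

/* Rotate helpers; n must be in the range 1..31. */
static inline uint32_t rotl32(uint32_t v, unsigned n)
{
    return (v << n) | (v >> (32 - n));
}

static inline uint32_t rotr32(uint32_t v, unsigned n)
{
    return (v >> n) | (v << (32 - n));
}

/* MIPS model: wsbh, then rotr by 16 to exchange the two halves. */
static inline uint32_t model_mips_swab32(uint32_t x)
{
    uint32_t wsbh = ((x & 0x00ff00ffu) << 8) | ((x & 0xff00ff00u) >> 8);

    return rotr32(wsbh, 16);
}

/* rlwimi model: insert rotl(rs, sh) into ra under the given mask. */
static inline uint32_t model_rlwimi(uint32_t ra, uint32_t rs,
                                    unsigned sh, uint32_t mask)
{
    return (ra & ~mask) | (rotl32(rs, sh) & mask);
}

/* PPC model: start from value >> 24 (the "0" input constraint). */
static inline uint32_t model_ppc_swab32(uint32_t value)
{
    uint32_t result = value >> 24;

    result = model_rlwimi(result, value, 24, 0x0000ff00u); /* bits 16..23 */
    result = model_rlwimi(result, value,  8, 0x00ff0000u); /* bits 8..15  */
    result = model_rlwimi(result, value, 24, 0xff000000u); /* bits 0..7   */
    return result;
}
```

For the input 0x12345678 the MIPS model goes through 0x34127856 after the wsbh step and ends at 0x78563412; the PPC model builds the same result one byte at a time: 0x00000012, 0x00003412, 0x00563412, 0x78563412.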

The name __fswab32 itself also hints at this: it is the fast swap.
