How the Linux kernel handles big-endian and little-endian byte order

Source: Internet
Author: User

I have recently been porting the kernel from a little-endian processor (ARM) to a big-endian processor (PPC). The kernel now boots to a stable console, so the basic work is done, but many ideas from the port still need to be written down. Today I summarize how the kernel handles big-endian and little-endian byte order.


I wrote up my thoughts on byte order in an earlier article: http://blog.csdn.net/skyflying2012/article/details/42065427.

Following that earlier understanding, byte order can be seen as the processor's subjective point of view, much like the way a person looks at things. Processors are either big-endian or little-endian, but for memory reads and writes there is no byte-order problem as long as the same data type is used for both.
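A minimal sketch of this point, in user-space C rather than kernel code: a 32-bit value that is written and read back with the same type always comes back unchanged, while the byte found at the lowest address depends on the processor. The function names are hypothetical and chosen only for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Write a 32-bit value to memory and read it back with the same type.
 * The result is the original value on any byte order. */
uint32_t store_then_load_u32(uint32_t value)
{
    unsigned char memory[sizeof(uint32_t)];
    uint32_t out;

    memcpy(memory, &value, sizeof value);  /* 32-bit store */
    memcpy(&out, memory, sizeof out);      /* 32-bit load  */
    return out;
}

/* Read only the byte at the lowest address of the stored value.
 * This is where byte order becomes visible: for 0x12345678 it is
 * 0x78 on a little-endian CPU and 0x12 on a big-endian CPU. */
unsigned char lowest_address_byte(uint32_t value)
{
    unsigned char memory[sizeof(uint32_t)];

    memcpy(memory, &value, sizeof value);
    return memory[0];
}
```

Only the second function mixes data types (a 32-bit store followed by an 8-bit load), and only there does the processor's byte order show through.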

So I think the biggest byte-order difference shows up when reading and writing registers, because peripheral registers are little-endian (this follows from the kernel code, as explained in detail below).

According to my earlier byte-order article, there are two ways to handle the register read/write difference:

(1) Solve it in hardware: for a 32-bit CPU, reverse the 32-bit data bus. This can cause problems for accesses narrower than 32 bits, not every module can be reversed (memory, for example), and the compiler is also involved.

(2) Solve it in software: swap the data in the lowest-level register read/write functions.

As a software person, I care most about the feasibility of the second approach. Swapping the data during register access makes the access more complex: work that a single load/store instruction used to do may now need extra swap instructions, so the atomicity of the register operation can no longer be guaranteed. In a high-performance, highly concurrent system this may lead to race conditions.

Therefore, the data swap and the register read/write must be completed with as few instructions as possible to keep the Linux system running normally and stably.

When porting the bootloader, I did the swap while moving the data. Races were not a problem there because the bootloader runs as a single process.


That was my concern going into the kernel port, but it turns out the kernel already provides generic register-access functions that work on both big-endian and little-endian processors: readl/writel (taking 32-bit register access as the example).

Driver developers do not need to care about the processor's byte order; they can access registers directly with readl/writel.

Many articles on the Internet mention readl/writel, but they do not analyze the implementation in detail.

Today's main topic is how readl/writel combine an efficient data swap with the register access. Taking readl as the example, let's see how register data is handled on a big-endian processor.

In the kernel, readl is defined in include/asm-generic/io.h as follows:

#define readl(addr) __le32_to_cpu(__raw_readl(addr))

__raw_readl is the lowest-level register read function. It is very simple: it fetches the register data directly from the given address. Now look at __le32_to_cpu, which has a different implementation for each byte order. For little-endian processors it is in include/linux/byteorder/little_endian.h:

#define __le32_to_cpu(x) ((__force __u32)(__le32)(x))

This is equivalent to doing nothing. For big-endian processors it is in include/linux/byteorder/big_endian.h:

#define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))

As the name suggests, __swab32 byte-swaps the data. We will analyze the implementation of __swab32 shortly; that function is the heart of the matter.
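To make the composition concrete, here is a user-space sketch of the same idea, not the kernel code itself. The `model_` names are hypothetical, the register is simulated by four bytes in memory, and the byte-order test relies on the `__BYTE_ORDER__` macro that GCC and Clang predefine, which is an assumption of this sketch rather than the kernel's own selection mechanism.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __raw_readl: a plain 32-bit load in the
 * CPU's native byte order, with no conversion. */
uint32_t model_raw_readl(const void *addr)
{
    uint32_t v;

    memcpy(&v, addr, sizeof v);
    return v;
}

/* Hypothetical stand-in for __le32_to_cpu. If __BYTE_ORDER__ is not
 * predefined, this sketch assumes a little-endian CPU. */
uint32_t model_le32_to_cpu(uint32_t x)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    /* Big-endian CPU: reverse the four bytes. */
    return (x << 24) | ((x & 0x0000ff00u) << 8) |
           ((x & 0x00ff0000u) >> 8) | (x >> 24);
#else
    /* Little-endian CPU: nothing to do. */
    return x;
#endif
}

/* Same composition as the readl macro: raw load, then conversion. */
uint32_t model_readl(const void *addr)
{
    return model_le32_to_cpu(model_raw_readl(addr));
}
```

Given a simulated register holding the bytes 0x78, 0x56, 0x34, 0x12 from the lowest address upward, `model_readl` returns 0x12345678 whether the code is built for a little-endian or a big-endian CPU.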

But first consider one question: for different CPUs such as ARM, MIPS, and PPC, how is the choice between little_endian.h and big_endian.h made?

The answer is that each processor platform has its own arch/xxx/include/asm/byteorder.h header. Let's look at the byteorder.h for ARM, MIPS, and PPC in turn.

arch/arm/include/asm/byteorder.h

/*
 *  arch/arm/include/asm/byteorder.h
 *
 * ARM Endian-ness.  In little endian mode, the data bus is connected such
 * that byte accesses appear as:
 *  0 = d0...d7, 1 = d8...d15, 2 = d16...d23, 3 = d24...d31
 * and word accesses (data or instruction) appear as:
 *  d0...d31
 *
 * When in big endian mode, byte accesses appear as:
 *  0 = d24...d31, 1 = d16...d23, 2 = d8...d15, 3 = d0...d7
 * and word accesses (data or instruction) appear as:
 *  d0...d31
 */
#ifndef __ASM_ARM_BYTEORDER_H
#define __ASM_ARM_BYTEORDER_H

#ifdef __ARMEB__
#include <linux/byteorder/big_endian.h>
#else
#include <linux/byteorder/little_endian.h>
#endif

#endif


arch/mips/include/asm/byteorder.h

/*
 * This file is subject to the terms and conditions of the GNU General Public
 * License.  See the file "COPYING" in the main directory of this archive
 * for more details.
 *
 * Copyright (C) 1996, 2003 by Ralf Baechle
 */
#ifndef _ASM_BYTEORDER_H
#define _ASM_BYTEORDER_H

#if defined(__MIPSEB__)
#include <linux/byteorder/big_endian.h>
#elif defined(__MIPSEL__)
#include <linux/byteorder/little_endian.h>
#else
# error "MIPS, but neither __MIPSEB__, nor __MIPSEL__???"
#endif

#endif /* _ASM_BYTEORDER_H */


arch/powerpc/include/asm/byteorder.h

#ifndef _ASM_POWERPC_BYTEORDER_H
#define _ASM_POWERPC_BYTEORDER_H

/*
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version
 * 2 of the License, or (at your option) any later version.
 */
#include <linux/byteorder/big_endian.h>

#endif /* _ASM_POWERPC_BYTEORDER_H */

As these headers show, the kernel lets ARM and MIPS select the processor byte order through a compiler-defined macro, while the PPC header shown here supports big-endian only. (In fact, PPC processors can also select their byte order.)

With byteorder.h in place, we do not need to care about the processor's byte order when writing a driver; we only need to include byteorder.h.
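The same compile-time selection can be sketched outside the kernel. The example below is only an illustration under stated assumptions: it uses the `__BYTE_ORDER__` macro predefined by GCC and Clang instead of `__ARMEB__` or `__MIPSEB__`, and the names `SKETCH_HOST_BIG_ENDIAN` and `sketch_probe_big_endian` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Compile-time choice, in the spirit of the #ifdef in byteorder.h.
 * Assumption: the compiler predefines __BYTE_ORDER__. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define SKETCH_HOST_BIG_ENDIAN 1
#else
#define SKETCH_HOST_BIG_ENDIAN 0
#endif

/* Run-time probe, used only to cross-check the compile-time choice. */
int sketch_probe_big_endian(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char first;

    memcpy(&first, &probe, 1);   /* byte at the lowest address */
    return first == 0x01;        /* most significant byte stored first? */
}
```

The decision is made once by the preprocessor, so code that depends on it carries no run-time check; the probe function exists only to confirm that the macro matches the machine.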

Next, look at the most important piece, the __swab32 function.

In include/linux/swab.h:

/**
 * __swab32 - return a byteswapped 32-bit value
 * @x: value to byteswap
 */
#define __swab32(x)                             \
    (__builtin_constant_p((__u32)(x)) ?         \
    ___constant_swab32(x) :                     \
    __fswab32(x))

Expanded, the macro is a conditional expression.

__builtin_constant_p is a GCC built-in function that determines whether a value is known to be constant at compile time: it returns 1 if the argument is a constant and 0 otherwise.
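A small sketch of the two outcomes, assuming GCC or Clang (the built-in is not part of standard C). The function names are hypothetical. A volatile object is used for the run-time case because the compiler must actually read it and so cannot treat its value as a constant.

```c
/* Requires GCC or Clang: __builtin_constant_p is a compiler built-in. */

/* A literal is a compile-time constant, so the built-in yields 1. */
int constant_p_of_literal(void)
{
    return __builtin_constant_p(0x12345678u);
}

/* A volatile object has to be read at run time, so the compiler
 * cannot prove its value constant and the built-in yields 0. */
int constant_p_of_volatile_read(void)
{
    volatile unsigned int device_value = 0x12345678u;

    return __builtin_constant_p(device_value);
}
```

For an ordinary variable or function argument the result can depend on inlining and optimization level, which is why the sketch avoids that case.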
If the data is a constant, ___constant_swab32 is used, implemented as follows:

#define ___constant_swab32(x) ((__u32)(                     \
    (((__u32)(x) & (__u32)0x000000ffUL) << 24) |            \
    (((__u32)(x) & (__u32)0x0000ff00UL) <<  8) |            \
    (((__u32)(x) & (__u32)0x00ff0000UL) >>  8) |            \
    (((__u32)(x) & (__u32)0xff000000UL) >> 24)))

For constant data, the ordinary approach of shifting the bytes and then OR-ing them together is used. For constants this cost is considered necessary (that is the kernel's explanation, which I do not fully understand). A plausible reason is that a plain C expression on a constant can be evaluated by the compiler at build time, so nothing is left to compute at run time.
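The shift-and-OR form can be checked with a worked value. The sketch below uses standard C types instead of the kernel's `__u32`, and `SKETCH_CONST_SWAB32` is a hypothetical name. The static initializer illustrates the build-time evaluation: C only accepts constant expressions there.

```c
#include <stdint.h>

/* Same shift-and-OR pattern as ___constant_swab32, with standard types. */
#define SKETCH_CONST_SWAB32(x) ((uint32_t)(                 \
    (((uint32_t)(x) & (uint32_t)0x000000ffUL) << 24) |      \
    (((uint32_t)(x) & (uint32_t)0x0000ff00UL) <<  8) |      \
    (((uint32_t)(x) & (uint32_t)0x00ff0000UL) >>  8) |      \
    (((uint32_t)(x) & (uint32_t)0xff000000UL) >> 24)))

/* For 0x12345678 the four terms are
 *   0x78000000 | 0x00560000 | 0x00003400 | 0x00000012 = 0x78563412.
 * A static initializer must be a constant expression, so this line
 * compiles only because the macro can be evaluated at build time. */
static const uint32_t swapped_at_build_time = SKETCH_CONST_SWAB32(0x12345678UL);

uint32_t get_swapped_at_build_time(void)
{
    return swapped_at_build_time;
}
```

An inline-assembly swap could not be used in that initializer, which is one reason to keep a pure C path for constants.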

If the data is computed at run time, __fswab32 is used, implemented as follows:

static inline __attribute_const__ __u32 __fswab32(__u32 val)
{
#ifdef __arch_swab32
    return __arch_swab32(val);
#else
    return ___constant_swab32(val);
#endif
}

If __arch_swab32 is not defined, the data is still swapped with the ___constant_swab32 method. However, ARM, MIPS, and PPC each define their own __arch_swab32 to get an efficient swap on their platform, as shown below for each one:

arch/arm/include/asm/swab.h

static inline __attribute_const__ __u32 __arch_swab32(__u32 x)
{
    __asm__ ("rev %0, %1" : "=r" (x) : "r" (x));
    return x;
}


arch/mips/include/asm/swab.h

static inline __attribute_const__ __u32 __arch_swab32(__u32 x)
{
    __asm__(
    "   wsbh    %0, %1          \n"
    "   rotr    %0, %0, 16      \n"
    : "=r" (x)
    : "r" (x));

    return x;
}

arch/powerpc/include/asm/swab.h

static inline __attribute_const__ __u32 __arch_swab32(__u32 value)
{
    __u32 result;

    __asm__("rlwimi %0,%1,24,16,23\n\t"
        "rlwimi %0,%1,8,8,15\n\t"
        "rlwimi %0,%1,24,0,7"
        : "=r" (result)
        : "r" (value), "0" (value >> 24));
    return result;
}

As you can see, ARM uses one instruction (rev, a byte-reverse instruction), MIPS uses two (wsbh and rotr, a byte-swap and a rotate instruction), and PPC uses three (rlwimi, a rotate-and-insert instruction) to swap the 32-bit data. This is more efficient than the ordinary shift-and-OR method!
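To see why the MIPS and PPC sequences produce a full byte swap, the instructions can be emulated in portable C. This is a sketch for following the data flow, not kernel code: the `emu_` names are hypothetical, and it assumes a rotate amount of 16 for rotr and an initial PowerPC result of value >> 24. The ARM rev instruction reverses all four bytes in one step, so it needs no emulation here.

```c
#include <stdint.h>

/* Rotate left by n bits, 0 < n < 32. */
uint32_t emu_rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32u - n));
}

/* MIPS wsbh: swap the two bytes inside each 16-bit halfword. */
uint32_t emu_wsbh(uint32_t x)
{
    return ((x & 0x00ff00ffu) << 8) | ((x >> 8) & 0x00ff00ffu);
}

/* MIPS sequence: wsbh, then rotate by 16 (left and right are the same). */
uint32_t emu_mips_swab32(uint32_t x)
{
    return emu_rotl32(emu_wsbh(x), 16);
}

/* PowerPC rlwimi: rotate src left by sh, then insert the bits selected
 * by the mask mb..me into dst. PowerPC numbers bits from the most
 * significant (bit 0) to the least significant (bit 31); this sketch
 * requires mb <= me. */
uint32_t emu_rlwimi(uint32_t dst, uint32_t src,
                    unsigned sh, unsigned mb, unsigned me)
{
    uint32_t mask = (0xffffffffu >> mb) & (0xffffffffu << (31u - me));

    return (dst & ~mask) | (emu_rotl32(src, sh) & mask);
}

/* PowerPC sequence: start from value >> 24, then three inserts. */
uint32_t emu_ppc_swab32(uint32_t value)
{
    uint32_t result = value >> 24;                   /* 0x00000012 */

    result = emu_rlwimi(result, value, 24, 16, 23);  /* 0x00003412 */
    result = emu_rlwimi(result, value, 8, 8, 15);    /* 0x00563412 */
    result = emu_rlwimi(result, value, 24, 0, 7);    /* 0x78563412 */
    return result;
}
```

The values in the comments trace the input 0x12345678. On MIPS the intermediate result after wsbh is 0x34127856, and the rotate by 16 then exchanges the two halfwords.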

In fact, the name __fswab itself suggests a fast swap.
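The fast-path dispatch inside __fswab32 can be sketched in user space as well. Here the GCC/Clang built-in `__builtin_bswap32` stands in for a platform-specific __arch_swab32; that substitution and the `sketch_` names are assumptions of this example, not what the kernel headers above do.

```c
#include <stdint.h>

/* Hypothetical arch hook. __builtin_bswap32 is a GCC/Clang built-in,
 * used here as a stand-in for a platform-specific fast swap. */
#if defined(__GNUC__)
#define sketch_arch_swab32(x) __builtin_bswap32(x)
#endif

/* Generic shift-and-OR fallback. */
uint32_t sketch_generic_swab32(uint32_t x)
{
    return (x << 24) | ((x & 0x0000ff00u) << 8) |
           ((x & 0x00ff0000u) >> 8) | (x >> 24);
}

/* Same shape as __fswab32: prefer the arch hook when it is defined,
 * otherwise fall back to the generic version. */
uint32_t sketch_fswab32(uint32_t val)
{
#ifdef sketch_arch_swab32
    return sketch_arch_swab32(val);
#else
    return sketch_generic_swab32(val);
#endif
}
```

Both paths must return the same value for every input; the hook only changes how many instructions it takes to get there.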

Working backwards: the kernel does no processing on register data for little-endian processors but swaps it for big-endian processors, which shows that peripheral register data is laid out in little-endian order.
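This conclusion can be illustrated with a simulated register. The sketch below deliberately uses a different technique from the kernel: instead of a native load or store plus a conditional swap, it assembles the value byte by byte, which gives the little-endian register layout on any processor. The `sketch_` names and the four-byte array standing in for a register are hypothetical.

```c
#include <stdint.h>

/* Store a 32-bit value into a simulated register, least significant
 * byte at the lowest address (little-endian layout). */
void sketch_writel(uint32_t value, unsigned char reg[4])
{
    reg[0] = (unsigned char)(value & 0xffu);
    reg[1] = (unsigned char)((value >> 8) & 0xffu);
    reg[2] = (unsigned char)((value >> 16) & 0xffu);
    reg[3] = (unsigned char)((value >> 24) & 0xffu);
}

/* Load a 32-bit value from the same little-endian layout. */
uint32_t sketch_readl(const unsigned char reg[4])
{
    return (uint32_t)reg[0] |
           ((uint32_t)reg[1] << 8) |
           ((uint32_t)reg[2] << 16) |
           ((uint32_t)reg[3] << 24);
}
```

Writing 0x12345678 always leaves 0x78 in the first byte of the simulated register, and reading it back returns 0x12345678, regardless of the byte order of the CPU running the code.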











