How the Linux kernel handles endianness

Last Update:2016-06-02 Source: Internet

Author: User


I am currently working with the MPC8309, and it was not clear to me where the kernel performs the endianness conversion.

Today I read an article about the concrete implementation of readl and writel.

This post mainly analyzes how readl/writel achieve efficient data swapping together with register reads and writes. Taking readl as an example, we look at how register data is handled on a big-endian processor.

In the kernel, readl is defined as follows, in include/asm-generic/io.h:

#define readl(addr) __le32_to_cpu(__raw_readl(addr))

__raw_readl is the lowest-level register read function. It is very simple and fetches the register data directly. Next, look at the implementation of __le32_to_cpu, which differs depending on byte order. For little-endian processors it is defined in include/linux/byteorder/little_endian.h as follows:

#define __le32_to_cpu(x) ((__force __u32)(__le32)(x))

This is equivalent to doing nothing. For big-endian processors it is defined in include/linux/byteorder/big_endian.h as follows:

#define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))

As the name suggests, __swab32 swaps the bytes of the data. We will analyze the implementation of __swab32 shortly, since it is the heart of the matter.
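To make the two definitions above concrete, here is a minimal user-space sketch of the same idea. All `sketch_*` names are hypothetical stand-ins rather than kernel API, the "register" is just a byte buffer, and the byte-order test relies on the `__BYTE_ORDER__` macro that GCC and Clang predefine (other compilers would fall through to the little-endian branch, which is an assumption).

```c
#include <stdint.h>
#include <string.h>

/* Plain shift-and-mask byte swap (hypothetical helper, not the kernel's). */
static inline uint32_t sketch_plain_swab32(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) <<  8) |
           ((x & 0x00ff0000u) >>  8) |
           ((x & 0xff000000u) >> 24);
}

/* Big-endian hosts swap, little-endian hosts do nothing. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define sketch_le32_to_cpu(x) sketch_plain_swab32((uint32_t)(x))
#else
#define sketch_le32_to_cpu(x) ((uint32_t)(x))
#endif

/* Simulated "raw" read: copies the 4 bytes exactly as they are stored. */
static inline uint32_t sketch_raw_readl(const void *addr)
{
    uint32_t v;
    memcpy(&v, addr, sizeof(v));
    return v;
}

/* Raw read followed by little-endian-to-host conversion. */
static inline uint32_t sketch_readl(const void *addr)
{
    return sketch_le32_to_cpu(sketch_raw_readl(addr));
}
```

With a simulated little-endian register holding the bytes 78 56 34 12, `sketch_readl` should yield 0x12345678 on either kind of host, because only the big-endian build pays for the swap.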

Before analyzing __swab32, first consider a question: for different CPUs, such as ARM, MIPS and PPC, how does the kernel choose between little_endian.h and big_endian.h?

The answer is that each processor platform has its own arch/xxx/include/asm/byteorder.h header. Let us look at byteorder.h for ARM, MIPS and PPC in turn.

arch/arm/include/asm/byteorder.h

/*
 * arch/arm/include/asm/byteorder.h
 *
 * ARM Endian-ness. In little endian mode, the data bus is connected such
 * that byte accesses appear as:
 *  0 = d0...d7, 1 = d8...d15, 2 = d16...d23, 3 = d24...d31
 * and word accesses (data or instruction) appear as:
 *  d0...d31
 *
 * When in big endian mode, byte accesses appear as:
 *  0 = d24...d31, 1 = d16...d23, 2 = d8...d15, 3 = d0...d7
 * and word accesses (data or instruction) appear as:
 *  d0...d31
 */
#ifndef __ASM_ARM_BYTEORDER_H
#define __ASM_ARM_BYTEORDER_H

#ifdef __ARMEB__
#include <linux/byteorder/big_endian.h>
#else
#include <linux/byteorder/little_endian.h>
#endif

#endif

arch/mips/include/asm/byteorder.h

/*
 * This file is subject to the terms and conditions of the GNU General Public
 * License. See the file "COPYING" in the main directory of this archive
 * for more details.
 *
 * Copyright (C) 1996, 2003 by Ralf Baechle
 */
#ifndef _ASM_BYTEORDER_H
#define _ASM_BYTEORDER_H

#if defined(__MIPSEB__)
#include <linux/byteorder/big_endian.h>
#elif defined(__MIPSEL__)
#include <linux/byteorder/little_endian.h>
#else
# error "MIPS, but neither __MIPSEB__, nor __MIPSEL__???"
#endif

#endif /* _ASM_BYTEORDER_H */

arch/powerpc/include/asm/byteorder.h

#ifndef _ASM_POWERPC_BYTEORDER_H
#define _ASM_POWERPC_BYTEORDER_H

/*
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version
 * 2 of the License, or (at your option) any later version.
 */
#include <linux/byteorder/big_endian.h>

#endif /* _ASM_POWERPC_BYTEORDER_H */

As can be seen, for ARM and MIPS the kernel supports both byte orders, and the byte order of the processor can be selected. For PPC, this header only supports big-endian. (In fact, PPC hardware also supports selecting the byte order.)

The byteorder.h of each processor platform wraps little_endian.h/big_endian.h in another layer, so we do not need to care about the processor's byte order when writing a driver. We only need to include byteorder.h.
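The per-architecture headers above key off macros that the compiler predefines for the target (`__ARMEB__`, `__MIPSEB__`, and so on). As a rough user-space illustration of the same compile-time selection, the sketch below uses the generic `__BYTE_ORDER__` macro from GCC and Clang and compares it with a run-time probe. Both function names are hypothetical, and the fallback to "little-endian" when the macro is missing is an assumption.

```c
#include <stdint.h>

/* Compile-time answer, taken from a compiler-predefined macro. */
static inline int sketch_big_endian_static(void)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return 1;
#else
    return 0; /* assumed little-endian when the macro says so or is absent */
#endif
}

/* Run-time probe: inspect which byte of a known word is stored first. */
static inline int sketch_big_endian_probe(void)
{
    const uint32_t word = 0x01020304u;
    const uint8_t *bytes = (const uint8_t *)&word;

    return bytes[0] == 0x01; /* most significant byte first => big-endian */
}
```

On a GCC or Clang build the two functions should agree. The kernel only needs the compile-time form, which is why a single header include is enough for driver code.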

Next, take a look at the most critical piece, __swab32, which is defined as follows in include/linux/swab.h:

/**
 * __swab32 - return a byteswapped 32-bit value
 * @x: value to byteswap
 */
#define __swab32(x) \
    (__builtin_constant_p((__u32)(x)) ? \
    ___constant_swab32(x) : \
    __fswab32(x))

Expanded, the macro is a conditional expression.

__builtin_constant_p is a GCC built-in function that determines whether a value is known to be constant at compile time. If the argument is a constant it returns 1, otherwise it returns 0.
If the data is constant, ___constant_swab32 is used, which is implemented as follows:

#define ___constant_swab32(x) ((__u32)( \
    (((__u32)(x) & (__u32)0x000000ffUL) << 24) | \
    (((__u32)(x) & (__u32)0x0000ff00UL) <<  8) | \
    (((__u32)(x) & (__u32)0x00ff0000UL) >>  8) | \
    (((__u32)(x) & (__u32)0xff000000UL) >> 24)))

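For constant data, the ordinary shift-and-combine method is used. For constants this overhead is considered acceptable (this is the kernel's explanation, which I do not fully understand); presumably, since the value is known at compile time, the compiler can evaluate the whole expression then, leaving no work for run time.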
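As a worked example of the shift-and-combine method, the sketch below splits the swap into its four terms so each one can be checked by hand. The helper names are hypothetical and the code is ordinary user-space C, not the kernel macro itself.

```c
#include <stdint.h>

/* Each term isolates one byte of x and moves it to its mirrored position. */
static inline uint32_t sketch_term0(uint32_t x) { return (x & 0x000000ffu) << 24; }
static inline uint32_t sketch_term1(uint32_t x) { return (x & 0x0000ff00u) <<  8; }
static inline uint32_t sketch_term2(uint32_t x) { return (x & 0x00ff0000u) >>  8; }
static inline uint32_t sketch_term3(uint32_t x) { return (x & 0xff000000u) >> 24; }

/* OR-ing the four terms yields the byte-swapped word. */
static inline uint32_t sketch_shift_swab32(uint32_t x)
{
    return sketch_term0(x) | sketch_term1(x) |
           sketch_term2(x) | sketch_term3(x);
}
```

For x = 0x12345678 the four terms are 0x78000000, 0x00560000, 0x00003400 and 0x00000012, which combine to 0x78563412. That is four masks, four shifts and three ORs when done at run time.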

If the data is computed at run time, __fswab32 is used, which is implemented as follows:

static inline __attribute_const__ __u32 __fswab32(__u32 val)
{
#ifdef __arch_swab32
    return __arch_swab32(val);
#else
    return ___constant_swab32(val);
#endif
}
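The dispatch between the constant path and the run-time path can be tried in user space with a small sketch like the one below. It assumes GCC or Clang (for `__builtin_constant_p`), and every `SKETCH_`/`sketch_` name is hypothetical. `SKETCH_ARCH_SWAB32` stands in for an architecture hook and is deliberately left undefined here, so the fallback is taken.

```c
#include <stdint.h>

/* Shift-and-mask swap, suitable for compile-time constants. */
#define SKETCH_CONSTANT_SWAB32(x) ((uint32_t)(            \
    (((uint32_t)(x) & 0x000000ffu) << 24) |               \
    (((uint32_t)(x) & 0x0000ff00u) <<  8) |               \
    (((uint32_t)(x) & 0x00ff0000u) >>  8) |               \
    (((uint32_t)(x) & 0xff000000u) >> 24)))

/* Run-time path: use an arch hook if one is defined, else fall back. */
static inline uint32_t sketch_fswab32(uint32_t val)
{
#ifdef SKETCH_ARCH_SWAB32
    return SKETCH_ARCH_SWAB32(val);
#else
    return SKETCH_CONSTANT_SWAB32(val);
#endif
}

/* Constants take the macro path, everything else calls the function. */
#define sketch_dispatch_swab32(x)                         \
    (__builtin_constant_p((uint32_t)(x)) ?                \
        SKETCH_CONSTANT_SWAB32(x) :                       \
        sketch_fswab32(x))
```

A literal such as `sketch_dispatch_swab32(0x12345678u)` goes through the macro branch, while a value read from a `volatile` variable cannot be proven constant and so goes through `sketch_fswab32`. Both give the same result, only the route differs.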

If __arch_swab32 is not defined, the data is still swapped with the ___constant_swab32 method. However, ARM, MIPS and PPC each define their own __arch_swab32 to achieve an efficient swap on their platform, as follows:

arch/arm/include/asm/swab.h

static inline __attribute_const__ __u32 __arch_swab32(__u32 x)
{
    __asm__ ("rev %0, %1" : "=r" (x) : "r" (x));
    return x;
}

arch/mips/include/asm/swab.h

static inline __attribute_const__ __u32 __arch_swab32(__u32 x)
{
    __asm__(
    "   wsbh    %0, %1          \n"
    "   rotr    %0, %0, 16      \n"
    : "=r" (x)
    : "r" (x));
    return x;
}

arch/powerpc/include/asm/swab.h

static inline __attribute_const__ __u32 __arch_swab32(__u32 value)
{
    __u32 result;

    __asm__("rlwimi %0,%1,24,16,23\n\t"
        "rlwimi %0,%1,8,8,15\n\t"
        "rlwimi %0,%1,24,0,7"
        : "=r" (result)
        : "r" (value), "0" (value >> 24));
    return result;
}

As you can see, ARM uses 1 instruction (rev, the byte-reverse instruction), MIPS uses 2 instructions (wsbh and rotr, a byte swap within halfwords followed by a rotate), and PPC uses 3 instructions (rlwimi, rotate left then insert under mask) to byte-swap a 32-bit value. This is more efficient than the ordinary shift-and-combine method.
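To see why the MIPS and PPC sequences produce a full byte swap, the instructions can be emulated in portable C. This is only a sketch of their effect on a 32-bit value: the `emu_*` names are hypothetical, and the rlwimi bit ranges (MB..ME, counted from the most significant bit) are written out as explicit masks.

```c
#include <stdint.h>

/* Rotate helpers; n must be in the range 1..31 here. */
static inline uint32_t emu_rotl32(uint32_t x, unsigned n) { return (x << n) | (x >> (32 - n)); }
static inline uint32_t emu_rotr32(uint32_t x, unsigned n) { return (x >> n) | (x << (32 - n)); }

/* MIPS wsbh: swap the two bytes inside each 16-bit halfword. */
static inline uint32_t emu_wsbh(uint32_t x)
{
    return ((x & 0x00ff00ffu) << 8) | ((x >> 8) & 0x00ff00ffu);
}

/* MIPS sequence: wsbh, then rotr by 16 to exchange the halfwords. */
static inline uint32_t emu_mips_swab32(uint32_t x)
{
    return emu_rotr32(emu_wsbh(x), 16);
}

/* PPC rlwimi: rotate src left by sh, insert the masked bits into dst. */
static inline uint32_t emu_rlwimi(uint32_t dst, uint32_t src,
                                  unsigned sh, uint32_t mask)
{
    return (dst & ~mask) | (emu_rotl32(src, sh) & mask);
}

/* PPC sequence: start from value >> 24, then insert the other three bytes. */
static inline uint32_t emu_ppc_swab32(uint32_t value)
{
    uint32_t result = value >> 24;                        /* the "0" input operand */

    result = emu_rlwimi(result, value, 24, 0x0000ff00u);  /* bits 16..23 */
    result = emu_rlwimi(result, value,  8, 0x00ff0000u);  /* bits  8..15 */
    result = emu_rlwimi(result, value, 24, 0xff000000u);  /* bits  0..7  */
    return result;
}
```

For 0x12345678, the emulated wsbh gives 0x34127856 and the rotate turns that into 0x78563412. The PPC path builds the same result one byte at a time: 0x00000012, 0x00003412, 0x00563412, 0x78563412.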

In fact, the name __fswab itself already suggests a fast swap.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
