ARM EABI (English description/Chinese description)

Why ARM's EABI matters

By Andres Calderon and Nelson Castillo

It's common nowadays to hear of the new ARM EABI (Embedded Application Binary Interface) Linux port. There are several reasons to start using it, but there is one we especially like: it's much faster for floating point operations. Since many ARM cores lack a hardware FPU (floating point unit), any software acceleration is more than welcome.

It might be hard to switch to EABI, though. For instance, for the Debian distribution, EABI is actually considered a new port.

Without EABI

The ARM EABI improves floating point performance. This is not surprising once you read how many cycles your processor is wasting now. From the Debian ARM-EABI wiki:

The current Debian port creates hardfloat FPA instructions. FPA comes from "Floating Point Accelerator." Since the FPA floating point unit was implemented only in very few ARM cores, these days FPA instructions are emulated in kernel via illegal instruction faults. This is of course very inefficient: about 10 times slower than -msoftfloat for a FIR test program. The FPA unit also has the peculiarity of having mixed-endian doubles, which is usually the biggest grief for ARM porters, along with structure packing issues.

So, what does this mean? It means that the compilers usually generate instructions for a piece of hardware, namely a floating point unit, that is not actually there! When you perform a floating point operation, such as 3.58 * x, the CPU runs into an illegal instruction and raises an exception. The kernel catches this specific exception, performs the intended floating point operation, and then resumes executing the program. This is slow because it implies a context switch.
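To make the 3.58 * x example concrete, here is a minimal C sketch of code whose inner loop is a single floating point multiply. The function names `scale` and `scale_all` are hypothetical and are not taken from the authors' benchmark. The comments describe, as a general expectation rather than a measured fact, what such a multiply turns into under each ABI.

```c
#include <assert.h>
#include <stddef.h>

/* The article's example operation: one floating point multiply.
 * With FPA code generation (old ABI) this becomes an FPA instruction,
 * which traps into the kernel on cores that lack the unit.
 * In a soft-float build it is expected to become a call to a run-time
 * helper instead (for example __aeabi_dmul from the ARM run-time ABI),
 * so no exception is raised. */
double scale(double x)
{
    return 3.58 * x;
}

/* Apply the multiply to every element of v (hypothetical helper).
 * Under trap-based emulation, each iteration pays for an exception
 * in addition to the arithmetic itself. */
void scale_all(double *v, size_t n)
{
    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        v[i] = scale(v[i]);
}
```

Calling `scale_all` on a buffer of n doubles performs n multiplies, so the cost of any per-operation trap grows linearly with n.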

The benchmark

We decided to make a simple benchmark using our open hardware Free ECB_AT91 ARM (ARMv4T) development board, based on an Atmel AT91RM9200 processor.






The ECB_AT91, top and bottom

We used a simple benchmark we have used before: the dot product of two given vectors, the Euclidean distance of the vectors, and the FFT (Fast Fourier Transform) algorithm (complex valued, Cooley and Tukey radix-2). The source code we used is available here (GPL).

It's common to use the number of floating point operations per second (FLOPS) performed by a given program for benchmarking purposes. However, this can be misleading, because some operations (e.g., division) take more time than others (e.g., addition). To ensure uniformity, we ran the same program in both setups, with similar compiler flags.
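As a rough illustration of how a FLOPS figure is usually derived, the sketch below divides an operation count by a measured time. The helper names `mflops` and `time_kernel` are hypothetical, and counting 2n operations for a length-n dot product (one multiply and one add per element) is an assumption of this example, not something the article specifies.

```c
#include <assert.h>
#include <time.h>

/* Convert an operation count and an elapsed time into MFLOPS
 * (millions of floating point operations per second). */
double mflops(double flop_count, double seconds)
{
    assert(seconds > 0.0);
    return flop_count / seconds / 1e6;
}

/* Run a kernel `reps` times and return the elapsed CPU time in seconds,
 * measured with the standard clock() function. */
double time_kernel(void (*kernel)(void), unsigned reps)
{
    clock_t start = clock();
    for (unsigned i = 0; i < reps; i++)
        kernel();
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Because every operation is weighted equally, two programs with the same MFLOPS value can do very different amounts of useful work, which is why the same program was run in both setups.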

First we tried the old ABI using the Debian distribution (Debian Sid), and an image that we bootstrapped. Then, for the EABI test, we used the Angstrom Distribution, part of the OpenEmbedded project.

Results




EABI vs. OABI, floating point benchmark (Free ECB_AT91 V1.5, AT91RM9200)
(Source: emQbit)




EABI/OABI speed-up, floating point benchmark (Free ECB_AT91 V1.5, AT91RM9200)

In each context switch, both the data and instruction caches are flushed, and this hurts the old ABI's performance. You will notice it in the graphs because the performance with the old ABI does not depend on the size (n) of the input data, whereas with EABI the impact of the cache on performance is seen clearly. The dot-product performance only goes down when n > 4096 (when we use more than 16 KB of memory); the Atmel processor we're using has a 16 KB data cache.
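The n > 4096 threshold can be checked with a little arithmetic. The sketch below assumes 4-byte single-precision elements and counts a single vector; both are assumptions, since the article does not state the element type or how many arrays it includes in the 16 KB figure. The names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* 16 KB data cache, as stated in the article. */
#define DCACHE_BYTES ((size_t)16 * 1024)

/* Bytes occupied by one vector of n elements of the given size. */
size_t vector_bytes(size_t n, size_t elem_size)
{
    assert(elem_size > 0);
    return n * elem_size;
}

/* Nonzero if one such vector fits entirely in the data cache. */
int fits_in_dcache(size_t n, size_t elem_size)
{
    return vector_bytes(n, elem_size) <= DCACHE_BYTES;
}
```

Under these assumptions, a vector of 4096 floats occupies exactly 16384 bytes, so anything larger no longer fits in the cache.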

 

Brief Chinese description:

 

When compiling source code, the cross compiler by default generates FPA (Floating Point Accelerator, that is, hardware floating point) instructions for floating point operations. On CPUs without an FPA unit, such as Samsung's S3C2440, those instructions have to be handled through FPE (floating point emulation, that is, emulation in software), which greatly limits execution speed. EABI (Embedded Application Binary Interface) improves on this. Because ARM EABI avoids this emulation and supports VFP (Vector Floating Point) where the hardware provides it, it greatly improves the performance of programs that rely on floating point operations.

The latest Linux system (2.6.29) uses a single cross compiler that complies with the EABI standard and uses the glibc 2.8 library.

For users who use Linux cross compilers, there is no need to worry about switching between different compilers.

 

 

 
