An introduction to the basic usage and syntax of inline assembly for beginners, through two scenarios

Source: Internet
Author: User

For C/C++ programmers, inline assembly is not a new feature, and it can help us make the most of the machine's computing power. However, most programmers rarely have the opportunity to actually use it. In fact, inline assembly serves only specific needs, especially in high-level programming languages.

This article describes two scenarios for the IBM Power processor architecture. Using the examples provided here, we can see where inline assembly is useful.

Scenario 1: Better libraries

The C/C++ programming languages support bitwise logical operations. In this scenario, the user works with the bit as the basic unit and has written an algorithm to calculate the number of bits occupied by a 32-bit variable.

Code A: Calculating the number of bits occupied

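inline int bit_taken(int data)
{
    /* Shift an unsigned copy so the loop also terminates for negative inputs. */
    unsigned int bits = (unsigned int)data;
    int taken = 0;
    while (bits) {
        bits = bits >> 1;   /* drop the lowest bit */
        taken++;            /* count one occupied bit position */
    }
    return taken;
}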

This code uses a loop and shift operations. If the user compiles it with the highest level of optimization (-O3 for GCC, -O5 for XLC), the user may find that some optimizations (such as loop unrolling and constant propagation) are applied automatically to generate faster code. But the basic idea of the algorithm has not changed: it still examines the value one bit at a time.
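To make the behavior of Code A concrete, here is a minimal sketch of the same shift-and-count idea. The name bit_taken_loop and the use of std::uint32_t are assumptions made for illustration and are not part of the original code. The unsigned type is chosen so that the right shift always fills with zeros.

```cpp
#include <cstdint>

// Hypothetical helper (not from the original article): counts how many
// bits are needed to represent `data` by shifting right until it is zero.
inline int bit_taken_loop(std::uint32_t data) {
    int taken = 0;
    while (data != 0) {
        data >>= 1;   // drop the lowest bit
        ++taken;      // one more bit position was occupied
    }
    return taken;
}
```

Under these assumptions, bit_taken_loop(0) is 0, bit_taken_loop(1) is 1, bit_taken_loop(15) is 4, and bit_taken_loop(0x80000000u) is 32. As written, the loop performs one iteration per occupied bit.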

Listing A: Description of cntlzw

cntlzw (Count Leading Zeros Word) instruction

Purpose

Places the number of leading zeros of the source general-purpose register into a general-purpose register.

The cntlzw instruction returns the number of leading zeros. Take the number 15 as an example: in binary it is 0000 0000 0000 0000 0000 0000 0000 1111, so cntlzw reports 28 leading zeros in total. Subtracting that count from the 32-bit width gives the number of bits occupied, here 32 - 28 = 4. After rethinking the problem, the user decides to simplify the algorithm, as shown in Code B.
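The effect of cntlzw can be modeled in portable C++ to check the example above. The sketch below is only a software illustration, assuming a 32-bit operand. The name count_leading_zeros32 is hypothetical, and the real instruction does this work in hardware.

```cpp
#include <cstdint>

// Hypothetical software model of cntlzw for a 32-bit word: scans from the
// most significant bit down and counts zeros until the first 1 bit.
inline int count_leading_zeros32(std::uint32_t word) {
    int zeros = 0;
    for (std::uint32_t mask = 0x80000000u; mask != 0; mask >>= 1) {
        if (word & mask) {
            break;    // found the highest set bit
        }
        ++zeros;
    }
    return zeros;     // 32 when word == 0
}
```

With this model, count_leading_zeros32(15) is 28, and 32 - 28 = 4 is the number of bits that 15 occupies.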

Code B: Calculating the number of bits occupied with inline assembly

#ifdef __arch_ppc__
inline int bit_taken(int data)
{
    int taken;
    /* cntlzw writes the number of leading zeros of data into taken */
    asm("cntlzw %0, %1\n\t"
        : "=b" (taken)
        : "b" (data)
       );
    return sizeof(data) * 8 - taken;
}
#else
...
#endif

The macro named __arch_ppc__ guards the new code so that it is compiled only for the PowerPC architecture. Compared to Code A, the new code has no loop and no shift. The user may then be pleased to see that the performance of bit_taken has improved: it runs faster on PowerPC, and applications that depend on bit_taken perform better as well.
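One way to keep such a function portable is to fall back to another implementation when not building for PowerPC. The following is a sketch only. It assumes GCC or Clang, tests the compiler-predefined macro __powerpc__ rather than the article's __arch_ppc__, and fills the non-PowerPC path with the __builtin_clz builtin. None of these choices comes from the original code, and the name bit_taken_portable is likewise hypothetical.

```cpp
#include <cstdint>

// Hypothetical portable variant; assumes GCC or Clang.
inline int bit_taken_portable(std::uint32_t data) {
#if defined(__powerpc__)
    // PowerPC path: same instruction and constraints as Code B.
    int zeros;
    asm("cntlzw %0, %1" : "=b"(zeros) : "b"(data));
    return 32 - zeros;
#else
    // Other targets: __builtin_clz is undefined for 0, so handle it first.
    if (data == 0) {
        return 0;
    }
    return 32 - __builtin_clz(data);
#endif
}
```

The preprocessor check keeps the assembly out of builds for other targets, so callers use a single function name on every platform.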

This story shows not only that the user can improve an algorithm by using the processor's rich instruction set, but also that inline assembly is a good way to improve performance. Because the assembly code is embedded directly in the C/C++ source, the effort needed to modify the existing code is kept to a minimum.
