SIMD (MMX/SSE/AVX) variable naming rules

Source: Internet
Author: User

When you use intrinsic functions to work with the SIMD instruction sets (MMX, SSE, AVX, and so on), you deal with SIMD data types of several widths, each of which can hold several packed formats. I have therefore designed a set of naming rules for SIMD variables to improve code readability.

1. Introduction to SIMD Data Types

SIMD data types include:
__m64: 64-bit packed integer (MMX).
__m128: 128-bit packed single-precision (SSE).
__m128d: 128-bit packed double-precision (SSE2).
__m128i: 128-bit packed integer (SSE2).
__m256: 256-bit packed single-precision (AVX).
__m256d: 256-bit packed double-precision (AVX).
__m256i: 256-bit packed integer (AVX).
Note: Packed integers can be 8-bit, 16-bit, 32-bit, or 64-bit, and either signed or unsigned.

These data types correspond to registers as follows:
64-bit MMX registers (mm0 to mm7): __m64.
128-bit SSE registers (xmm0 to xmm15): __m128, __m128d, and __m128i.
256-bit AVX registers (ymm0 to ymm15): __m256, __m256d, and __m256i.
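
As a quick check of the widths listed above, the following C sketch reports the size in bytes of the 128-bit and 256-bit types and the number of elements that fit in one register for a given element size. The function names simd_type_bytes and simd_lane_count are hypothetical helpers introduced only for illustration. The sketch assumes an x86 compiler that provides <immintrin.h>, and it leaves out __m64 because some 64-bit toolchains do not provide the MMX types.

```c
#include <stddef.h>
#include <immintrin.h>  /* declares __m128, __m128d, __m128i, __m256, __m256d, __m256i */

/* Hypothetical helper: size in bytes of the SIMD types for a register
   width given in bits. Returns 0 for widths this sketch does not cover. */
size_t simd_type_bytes(int register_bits)
{
    switch (register_bits) {
    case 128: return sizeof(__m128i);  /* same size as __m128 and __m128d */
    case 256: return sizeof(__m256i);  /* same size as __m256 and __m256d */
    default:  return 0;
    }
}

/* Hypothetical helper: number of elements of element_bits that fit in
   one register of register_bits. Returns 0 for invalid element sizes. */
size_t simd_lane_count(int register_bits, int element_bits)
{
    if (element_bits <= 0 || element_bits % 8 != 0) return 0;
    return simd_type_bytes(register_bits) * 8 / (size_t)element_bits;
}
```

With these helpers, simd_lane_count(128, 32) reports that one 128-bit register holds four 32-bit elements, and simd_lane_count(256, 16) reports sixteen 16-bit elements in a 256-bit register.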

2. SIMD variable naming rules

Following Hungarian notation, add a type prefix before the variable name.
The type prefix is three lowercase letters: the first letter gives the register width, and the last two letters give the packed data type.

Register width (first letter):
m: 64-bit MMX register. Corresponds to __m64.
x: 128-bit SSE register. Corresponds to __m128, __m128d, and __m128i.
y: 256-bit AVX register. Corresponds to __m256, __m256d, and __m256i.

Packed data type (last two letters):
mb: 8-bit data. Used when only the element width is known and the specific packed format is not. (b: byte)
mw: 16-bit data. (w: word)
md: 32-bit data. (d: doubleword)
mq: 64-bit data. (q: quadword)
mo: 128-bit data. (o: octaword)
mh: 256-bit data. (h: hexword)
ub: 8-bit unsigned integer.
uw: 16-bit unsigned integer.
ud: 32-bit unsigned integer.
uq: 64-bit unsigned integer.
ib: 8-bit signed integer.
iw: 16-bit signed integer.
id: 32-bit signed integer.
iq: 64-bit signed integer.
fh: 16-bit floating-point number, that is, half precision. (h: half)
fs: 32-bit floating-point number, that is, single precision. (s: single)
fd: 64-bit floating-point number, that is, double precision. (d: double)

For example:
mub: 64-bit packed unsigned bytes (a 64-bit MMX register holding eight 8-bit unsigned integers).
xfs: 128-bit packed single precision (a 128-bit SSE register holding four single-precision floating-point numbers).
xid: 128-bit packed signed doublewords (a 128-bit SSE register holding four 32-bit signed integers).
yfd: 256-bit packed double precision (a 256-bit AVX register holding four double-precision floating-point numbers).
yfh: 256-bit packed half precision (a 256-bit AVX register holding 16 half-precision floating-point numbers).
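
To show how the prefixes read in practice, here is a minimal sketch that converts four signed 32-bit integers to single-precision floats and scales them. The function scale4_i32_to_f32 and the variable names xidIn, xfsScale, and xfsOut are hypothetical and chosen only to illustrate the rules above. The sketch assumes an x86 target with SSE2.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical example: convert four signed 32-bit integers to floats
   and multiply each by scale.
   Prefixes: xid = 128-bit register, signed doublewords;
             xfs = 128-bit register, single-precision floats. */
void scale4_i32_to_f32(const int *src, float scale, float *dst)
{
    /* Unaligned load of four 32-bit signed integers. */
    __m128i xidIn = _mm_loadu_si128((const __m128i *)src);
    /* Broadcast scale into all four float lanes. */
    __m128 xfsScale = _mm_set1_ps(scale);
    /* Convert int32 lanes to float lanes, then multiply lane by lane. */
    __m128 xfsOut = _mm_mul_ps(_mm_cvtepi32_ps(xidIn), xfsScale);
    /* Unaligned store of the four results. */
    _mm_storeu_ps(dst, xfsOut);
}
```

The prefix makes the conversion visible at the call site: an xid value goes into _mm_cvtepi32_ps and an xfs value comes out, so a mismatch between a variable and the intrinsic suffix (epi32, ps, pd) is easier to spot.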

 

3. Sample code

For example, an SSE routine that sums an array of integers:

#include <stddef.h>     // NULL
#include <tmmintrin.h>  // SSSE3 (_mm_hadd_epi32); also pulls in the SSE2 intrinsics

int sum3_intrinsics(const int *a, int size)
{
    if (NULL == a) return 0;
    if (size < 0) return 0;

    int s = 0;                              // return value
    __m128i xidsum = _mm_setzero_si128();   // accumulator. [SSE2] initialize to 0
    __m128i xidload;                        // loaded data
    int cntblock = size / 4;                // number of blocks; one SSE register holds 4 DWORDs
    int cntrem = size & 3;                  // number of leftover elements
    const __m128i *p = (const __m128i *)a;

    for (int i = 0; i < cntblock; ++i) {
        xidload = _mm_loadu_si128(p);             // [SSE2] unaligned load, since a may not be 16-byte aligned
        xidsum = _mm_add_epi32(xidsum, xidload);  // [SSE2] packed signed 32-bit addition
        ++p;
    }

    // Process the leftover elements.
    const int *q = (const int *)p;
    for (int i = 0; i < cntrem; ++i) s += q[i];

    // Combine the partial sums.
    xidsum = _mm_hadd_epi32(xidsum, xidsum);  // [SSSE3] packed signed 32-bit horizontal addition
    xidsum = _mm_hadd_epi32(xidsum, xidsum);
    s += _mm_cvtsi128_si32(xidsum);           // [SSE2] extract the low 32 bits
    return s;
}

Code from:
http://topic.csdn.net/u/20120102/01/fc8d7aa4-bffc-4d9a-a34a-5056c6d27b54.html
(reply #9)
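
The sample above needs SSSE3 for _mm_hadd_epi32. As an alternative, the final reduction can be sketched with SSE2 shuffles only. This variant is not part of the quoted code, and the helper name hsum_epi32_sse2 and the variable xidswap are hypothetical names that follow the same prefix rules.

```c
#include <emmintrin.h>  /* SSE2 only; no SSSE3 required */

/* Hypothetical helper: add the four signed 32-bit lanes of xidsum
   and return the total, using only SSE2 instructions. */
int hsum_epi32_sse2(__m128i xidsum)
{
    /* Swap the upper and lower 64-bit halves, then add:
       lanes become (a0+a2, a1+a3, a2+a0, a3+a1). */
    __m128i xidswap = _mm_shuffle_epi32(xidsum, _MM_SHUFFLE(1, 0, 3, 2));
    xidsum = _mm_add_epi32(xidsum, xidswap);

    /* Swap adjacent 32-bit lanes, then add:
       every lane now holds a0+a1+a2+a3. */
    xidswap = _mm_shuffle_epi32(xidsum, _MM_SHUFFLE(2, 3, 0, 1));
    xidsum = _mm_add_epi32(xidsum, xidswap);

    /* Extract the low 32 bits. */
    return _mm_cvtsi128_si32(xidsum);
}
```

In the routine above, the two _mm_hadd_epi32 calls and the _mm_cvtsi128_si32 call could then be replaced by s += hsum_epi32_sse2(xidsum), at the cost of one extra helper.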
