MMX instruction set generation II

Source: Internet
Author: User

4. Data shift command

PsllwMm, mm/M64

PsllwMm, imm8

Specify the number of digits of the destination register by words in the source memory (or imm8 immediate count)Logical left shiftThe removed bits are lost.
The characters removed from the lower-level domain name are not moved into the higher-level domain name.
Example:
When mm0 = 0 xFFFF FFFF, run psllw mm0, 1
Mm0 = 0 xfffe fffe

PsrlwMm, mm/M64

PsrlwMm, imm8

Specify the number of digits of the destination register by words in the source memory (or imm8 immediate count)Right Shift of LogicThe removed bits are lost.
The characters with high characters removed will not be moved into the lower ones.
Example:
When mm0 = 0 xFFFF FFFF, run psrlw mm0, 1
Mm0 = 0x7fff 7fff 7fff 7fff

PslldMm, mm/M64

PslldMm, mm imm8

Specify the number of digits in the source memory (or imm8 immediate number) for the destination register by double-CharacterLogical left shiftThe removed bits are lost.
The bit removed from the low double-word will not be moved into the high double-word.
Example:
When mm0 = 0 xffffffff ffffff, run pslld mm0, 1
Mm0 = 0 xfffffffe fffffffe

 PsrldMm, mm/M64

 PsrldMm, imm8
 

Specify the number of digits in the source memory (or imm8 immediate number) for the destination register by double-CharacterRight Shift of LogicThe removed bits are lost.
The bit removed from the double-byte won't be moved into the double-byte.
Example:
When mm0 = 0 xffffffff ffffff, run psrld mm0, 1
Mm0 = 0x7fffffff 7 fffffff

5. Multiplication instructions

Pmullw mm, mm/M64

In parallel, 16 bits are multiplied by words. The result is 16 bits low, and the corresponding words are put into the destination register.

Example:
When mm0 = 0x0000 0002 ACFE

MM1 = 0x0000 0009 0000 CeF3, execute pmullw,

Then mm0 = 0x0000 0000 0012 991a
2*9 = 18, 18 = 0000 0012 h. The result is 16-bit 0012 lower.
0x0acfe =-21250, 0xcef3 =-12557,-21250 *-12557 = 266836250 = 0x 0fe7 991a. The result is 16-bit 991a lower.

Pmulhw mm, mm/M64
In parallel, 16 bits are multiplied by words. The result is 16 bits in height and the corresponding words in the destination register are obtained.

Example:
When mm0 = 0x0000 0002 ACFE

MM1 = 0x0000 0009 0000 CeF3, execute pmulhw,

Mm0 = 0x0000 0000 0000 0fe7
2*9 = 18, 18 = 0000 0012 h. The result is 16-bit 0000 in height.
0x0acfe =-21250, 0xcef3 =-12557,-21250 *-12557 = 266836250 = 0x 0fe7 991a, the result is 16-bit high 0fe7.

Pmaddwd mm, mm/M64
Align the point multiplication of signed vectors by words.
High 32-bit | low 32-bit
Destination register: A0 | A1 | A2 | A3
Source register: B0 | B1 | B2 | B3
Destination register result: A0 * b0 + A1 * B1 | A2 * B2 + A3 * B3

 

Summary:

1. The shift Command performs parallel shift based on 16-bit or 32-bit.

2. Shift commands are divided into logical left shift and logical right shift.

3. There are only three multiplication commands. The data unit of parallel multiplication is the 16-bit signed number.

 

 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.