SSE provides eight independent 128-bit registers (xmm0 to xmm7). Conventions used in the instruction descriptions:
MM refers to a 64-bit MMX register.
XMM refers to a 128-bit XMM register.
m32 refers to a 32-bit memory operand.
m64 refers to a 64-bit memory operand.
m128 refers to a 128-bit memory operand.
1. Data transfer instructions
movaps XMM, XMM/m128    movaps XMM/m128, XMM
Copies 128 bits from the source operand to the destination operand. When an m128 operand is used, its address must be 16-byte aligned.
movups XMM, XMM/m128    movups XMM/m128, XMM
Produces the same result as movaps, but the memory address does not have to be 16-byte aligned. In exchange, it can be slower than movaps.
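The difference between the aligned and unaligned forms can be sketched in C with the SSE intrinsics from `<xmmintrin.h>`. This is a minimal illustration, not a definitive mapping: the function names are hypothetical, and although compilers typically translate `_mm_load_ps` to movaps and `_mm_loadu_ps` to movups, the final instruction selection is up to the compiler.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Sums the four floats at p. Assumption: p is 16-byte aligned,
   as the aligned load (movaps-style) requires. */
float sum4_aligned(const float *p) {
    __m128 v = _mm_load_ps(p);   /* aligned 128-bit load */
    float out[4];
    _mm_storeu_ps(out, v);       /* unaligned 128-bit store */
    return out[0] + out[1] + out[2] + out[3];
}

/* Same computation, but p may have any alignment (movups-style load). */
float sum4_unaligned(const float *p) {
    __m128 v = _mm_loadu_ps(p);  /* unaligned 128-bit load */
    float out[4];
    _mm_storeu_ps(out, v);
    return out[0] + out[1] + out[2] + out[3];
}
```

Calling `sum4_aligned` with a pointer that is not 16-byte aligned may fault at run time, which is why the unaligned variant exists.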
movlps XMM, m64
Loads 64 bits from memory into the low 64 bits of the destination register; the high 64 bits are unchanged. The memory address does not need to be 16-byte aligned.
movhps XMM, m64
Loads 64 bits from memory into the high 64 bits of the destination register; the low 64 bits are unchanged. The memory address does not need to be 16-byte aligned.
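As a rough sketch of the two half-register loads, the following C function fills one XMM value from two separate pairs of floats. It assumes the intrinsics `_mm_loadl_pi` and `_mm_loadh_pi`, which correspond to movlps and movhps; the function name `combine_pairs` is made up for the example.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Writes {lo[0], lo[1], hi[0], hi[1]} to out. Neither lo nor hi
   needs to be 16-byte aligned. */
void combine_pairs(const float lo[2], const float hi[2], float out[4]) {
    __m128 v = _mm_setzero_ps();
    v = _mm_loadl_pi(v, (const __m64 *)lo);  /* low 64 bits replaced, high 64 kept */
    v = _mm_loadh_pi(v, (const __m64 *)hi);  /* high 64 bits replaced, low 64 kept */
    _mm_storeu_ps(out, v);
}
```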
movhlps XMM, XMM
Copies the high 64 bits of the source register to the low 64 bits of the destination register; the high 64 bits of the destination register are unchanged.
movlhps XMM, XMM
Copies the low 64 bits of the source register to the high 64 bits of the destination register; the low 64 bits of the destination register are unchanged.
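The two register-to-register half moves can be illustrated with the intrinsics `_mm_movehl_ps` and `_mm_movelh_ps`. In this sketch the first argument plays the role of the destination register and the second the source; the function name `half_moves` is hypothetical.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* With a = {a0, a1, a2, a3} and b = {b0, b1, b2, b3}:
   hl receives {b2, b3, a2, a3}  (movhlps: high half of b into low half of a)
   lh receives {a0, a1, b0, b1}  (movlhps: low half of b into high half of a) */
void half_moves(const float a[4], const float b[4], float hl[4], float lh[4]) {
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(hl, _mm_movehl_ps(va, vb));
    _mm_storeu_ps(lh, _mm_movelh_ps(va, vb));
}
```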
movss XMM, XMM/m32
Copies the low 32 bits of the source operand to the low 32 bits of the destination register. If the source is a memory operand, the remaining 96 bits of the destination register are cleared to zero; if the source is a register, they are unchanged.
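The two behaviours of movss can be seen side by side in a small C sketch. It assumes `_mm_load_ss` for the memory form and `_mm_move_ss` for the register form; the function name `movss_forms` is invented for illustration.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* from_mem receives {*src, 0, 0, 0}: loading from memory clears the upper 96 bits.
   from_reg receives {*src, dst[1], dst[2], dst[3]}: a register-to-register
   move keeps the upper 96 bits of the destination. */
void movss_forms(const float dst[4], const float *src,
                 float from_mem[4], float from_reg[4]) {
    __m128 d = _mm_loadu_ps(dst);
    __m128 s = _mm_load_ss(src);                 /* memory form */
    _mm_storeu_ps(from_mem, s);
    _mm_storeu_ps(from_reg, _mm_move_ss(d, s));  /* register form */
}
```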
2. Single-precision floating-point arithmetic instructions
SSE floating-point arithmetic instructions come in two forms: packed and scalar. A packed instruction performs the same operation on all four floating-point numbers in an XMM register at once, while a scalar instruction operates only on the floating-point number in the lowest 32 bits of the XMM register and leaves the upper 96 bits unchanged. For example:
addps XMM, XMM/m128
addss XMM, XMM/m32
subps XMM, XMM/m128
subss XMM, XMM/m32
mulps XMM, XMM/m128
mulss XMM, XMM/m32
divps XMM, XMM/m128
divss XMM, XMM/m32
maxps XMM, XMM/m128
maxss XMM, XMM/m32
minps XMM, XMM/m128
minss XMM, XMM/m32
rcpps XMM, XMM/m128
rcpss XMM, XMM/m32
rsqrtps XMM, XMM/m128
rsqrtss XMM, XMM/m32
Key point: for the instructions ending with the ps suffix, a memory source operand must be 16-byte aligned. The instructions ending with the ss suffix have no such restriction.
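The packed versus scalar distinction is easiest to see with addition. The following is a minimal sketch using `_mm_add_ps` (addps) and `_mm_add_ss` (addss); the function name `add_forms` is hypothetical, and unaligned loads are used so the inputs need no special alignment.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* packed receives {a0+b0, a1+b1, a2+b2, a3+b3}.
   scalar receives {a0+b0, a1, a2, a3}: only the lowest element is added,
   the upper 96 bits come from the first operand unchanged. */
void add_forms(const float a[4], const float b[4],
               float packed[4], float scalar[4]) {
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(packed, _mm_add_ps(va, vb));
    _mm_storeu_ps(scalar, _mm_add_ss(va, vb));
}
```

The other arithmetic pairs listed above follow the same pattern, for example `_mm_mul_ps` and `_mm_mul_ss` for mulps and mulss.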
3. Bitwise logical instructions
andps XMM, XMM/m128
Performs a bitwise AND of the 128 bits of the source operand and the 128 bits of the destination register, and stores the result in the destination register. A memory operand must be 16-byte aligned.
orps XMM, XMM/m128
Performs a bitwise OR of the 128 bits of the source operand and the 128 bits of the destination register, and stores the result in the destination register. A memory operand must be 16-byte aligned.
xorps XMM, XMM/m128
Performs a bitwise XOR of the 128 bits of the source operand and the 128 bits of the destination register, and stores the result in the destination register. A memory operand must be 16-byte aligned.
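One common use of these bitwise instructions, shown here only as an illustrative sketch, is manipulating the sign bit of four floats at once: AND with 0x7FFFFFFF clears it (absolute value), and XOR with 0x80000000 flips it (negation). The helper names `splat_bits`, `abs4` and `negate4` are hypothetical; `_mm_and_ps` and `_mm_xor_ps` correspond to andps and xorps.

```c
#include <stdint.h>
#include <string.h>
#include <xmmintrin.h>  /* SSE intrinsics */

/* Broadcasts a raw 32-bit pattern into all four lanes of an XMM value. */
static __m128 splat_bits(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);  /* reinterpret the bit pattern as a float */
    return _mm_set1_ps(f);
}

/* out[i] = |in[i]|: AND clears the sign bit of every lane. */
void abs4(const float in[4], float out[4]) {
    __m128 mask = splat_bits(0x7FFFFFFFu);
    _mm_storeu_ps(out, _mm_and_ps(_mm_loadu_ps(in), mask));
}

/* out[i] = -in[i]: XOR flips the sign bit of every lane. */
void negate4(const float in[4], float out[4]) {
    __m128 sign = splat_bits(0x80000000u);
    _mm_storeu_ps(out, _mm_xor_ps(_mm_loadu_ps(in), sign));
}
```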