Here we briefly describe several important arithmetic instructions.
1. Horizontal addition instructions
The SSSE3 instruction set adds instructions that add integers horizontally, similar to the floating-point horizontal-add instructions of SSE3.
PHADDD
Adds adjacent pairs of 32-bit integers horizontally within each register.
PHADDW
Adds adjacent pairs of 16-bit integers horizontally within each register.
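To make "horizontal" concrete, here is a minimal Python sketch of what PHADDD and PHADDW compute on 128-bit operands. It is a model of the semantics only, not SSSE3 code; the function name `horizontal_add` and the lowest-lane-first list layout are assumptions made for illustration.

```python
def horizontal_add(dst, src, bits):
    """Model PHADDD (bits=32) or PHADDW (bits=16) on 128-bit operands.

    dst and src are lists of lane values, lowest lane first.
    Returns the new contents of the destination register.
    """
    lanes = 128 // bits
    if len(dst) != lanes or len(src) != lanes:
        raise ValueError("expected %d lanes per operand" % lanes)
    mask = (1 << bits) - 1

    def add_pairs(values):
        # Add each adjacent pair of lanes, wrapping around on overflow.
        return [(values[i] + values[i + 1]) & mask for i in range(0, lanes, 2)]

    # The low half of the result comes from dst, the high half from src.
    return add_pairs(dst) + add_pairs(src)
```

For example, `horizontal_add([1, 2, 3, 4], [10, 20, 30, 40], 32)` pairs up lanes inside each operand rather than adding the two operands lane by lane.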
MRS and MSR instructions
The MRS (move PSR to general-purpose register) instruction reads the status flags of the program status register (PSR) into a general-purpose register; MSR (move general-purpose register to PSR) writes flags to the program status register. Both instructions can operate on the current program status register (CPSR) or the saved program status register (SPSR). For example:
MRS R0, CPSR
MRS R1, SPSR
MSR PSR, R7
BIC instruction
The BIC is used to place the specifie
While recently learning about Linux virtual machines, I used the following commands; here is what each one does.
pwd -> print the current directory path
ls -> list the contents of the current directory
cd -> change directory
mkdir -> create a directory
tar -> pack and unpack archives
gunzip -> decompress .gz archives
rm -> delete files
mv -> move (cut) files to a destination
cp -> copy files to a destination
vi/vim -> writing t
Only VS2002 and later support the SSE intrinsic function library.
Currently, most CPUs on the market (Intel and AMD) support the SSE instruction set.
Code that uses the SSE intrinsic functions must include the following header file:
#include
The SSE instructions are not covered in detail here; we only discuss the batch (packed) computation functions.
Note, however, that each batch operation processes only four 32-bit values at a time.
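Because each batch operation covers only four 32-bit values, longer arrays are usually processed in groups of four, with a plain loop for any leftover elements. The following is a minimal Python sketch of that pattern. It models the control flow only and uses no SSE functions; the name `add_in_batches_of_four` is hypothetical.

```python
def add_in_batches_of_four(a, b):
    """Add two equal-length sequences, four elements per step plus a tail."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = []
    full = len(a) - len(a) % 4  # number of elements covered by whole batches
    for i in range(0, full, 4):
        # One "packed" step: four lanes are added together.
        out.extend(x + y for x, y in zip(a[i:i + 4], b[i:i + 4]))
    for i in range(full, len(a)):
        # Scalar tail for the remaining 0 to 3 elements.
        out.append(a[i] + b[i])
    return out
```

With six elements per input, the first four are handled by one batch step and the last two by the scalar tail.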
The processor implements the ARMv6-M Thumb instruction set, including a number of 32-bit instructions that use Thumb-2 technology. Table 7-22 lists the Cortex-M0 instructions and their cycle counts. The cycle counts assume a zero wait-state system.
Table 7-22 Cortex-M0 instructions and their cycle counts
Operation | Description | Assembler | Cycles
Move | 8-bit immediate | MOVS Rd,
Cmd.exe
path path\executable-file-name: sets a search path for the executable file.
cmd: launches a Win2K command interpreter window. Parameters: /E:OFF and /E:ON disable and enable command extensions; for more details see cmd /?
regedit /s registry-file-name: imports a registry file; the /s parameter means silent mode, with no prompts.
regedit /e registry-file-name: exports the registry.
cacls file-name parameters: displays or modifies the file access control list (ACL)-when it is f
Some IL instruction explanations:
Jump instructions
Public static field Beq: if the two values are equal, control is transferred to the target instruction.
Public static field Beq_S: if the two values are equal, control is transferred to the target instruction (short form).
Public static field Bge: if the first value is greater than or equal to the second value, control is transferred to the target instruction.
Public static field Bge_S: if the first value
An application cannot write directly to R700 local memory, but it can command the R700 to copy programs and data between system memory and R700 memory. There are two ways for the CPU to write to GPU memory:
1. Request the GPU's DMA engine to write the data, by pointing it to the location of the source data in CPU memory and then to the offset in GPU memory where the data is to be written.
2. Load a kernel and run it on the shader. The shader accesses the memory through the PCIe connection, processes
int, if the first is greater than the second, jumps
if_icmpge branchbyte1, branchbyte2
Pops two ints from the stack; jumps if the first is greater than or equal to the second
lcmp
Pops two longs from the stack and pushes the comparison result (-1, 0, or 1) onto the stack
fcmpg
Pops two floats from the stack and pushes the comparison result (-1, 0, or 1) onto the stack
fcmpl
Pops two floats from the stack and pushes the comparison result (-1, 0, or 1) onto the stack
dcmpg
Pops two doubles from
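The comparison instructions listed above can be modelled in a few lines of Python. This is a sketch of the semantics, not JVM code: `value1` stands for the operand pushed first and `value2` for the operand on top of the stack, and the helper `_three_way` is a name introduced here for illustration. The one difference between fcmpg and fcmpl, which the descriptions above do not show, is the result when either operand is NaN: fcmpg pushes 1 and fcmpl pushes -1.

```python
import math


def _three_way(value1, value2):
    # Returns 1 if value1 > value2, 0 if equal, -1 if value1 < value2.
    return (value1 > value2) - (value1 < value2)


def if_icmpge(value1, value2):
    """Return True when the branch would be taken (value1 >= value2)."""
    return value1 >= value2


def lcmp(value1, value2):
    """Three-way comparison of two longs."""
    return _three_way(value1, value2)


def fcmpg(value1, value2):
    """Three-way float comparison; yields 1 if either operand is NaN."""
    if math.isnan(value1) or math.isnan(value2):
        return 1
    return _three_way(value1, value2)


def fcmpl(value1, value2):
    """Three-way float comparison; yields -1 if either operand is NaN."""
    if math.isnan(value1) or math.isnan(value2):
        return -1
    return _three_way(value1, value2)
```

The two float variants let a compiler choose the one whose NaN result makes the following branch fall through to the "comparison failed" path.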
Registers:
Register number | Symbolic name | Purpose
0 | always 0 | looks like a waste, but is actually very useful
1 | at | reserved for use by the assembler
2-3 | v0, v1 | function return values
4-7 | a0-a3 | the first few function arguments
8-15 | t0-t7 | temporary registers; a subroutine may use them without saving them
24-25 | t8, t9 | same as above
16-23 | s0-s7 | register variables; a subroutine must save them before use and restore them before returning, to preserve the values the caller needs
26-27 | k0, k1 | reserved for exception handlers
GP global po
(f) registers.
Previous vector (PV): R; no; 1; 128 (4 x 32-bit); register that contains the results of the previous ALU.[X, Y, Z, W] operations. This state persists only within one ALU clause.
Previous scalar (PS): R; no; 1; 32; register that contains the result of the previous ALU.Trans operation. This state persists only within one ALU clause.
Local data share (LDS): R/W; read: 16 KB; write: 16 to 256 bytes. Each thread can both read and write; no; on-chip memory shared within each SIMD
For more information about realboard, visit the official website www.hugacy.com.
The fastest ARM instruction set simulator (twice the performance of QEMU); it can directly run ELF and WinCE programs. (Test code included.)
Speed test results at 3.0 GHz on XP:
E:/work/
4. Data shuffle instructions
UNPCKHPS XMM, XMM/m128
Interleaves the doublewords in the high 64 bits of the source memory operand and of the destination register, and stores the result in the destination register. A memory operand must be 16-byte aligned.
High 64 bits | low 64 bits
Destination register: A0 | A1 | A2 | A3
Source memory: B0 | B1 | B2 | B3
Destination register result: B0 | A0 | B1 | A1
Example:
When xmm0 = 0x 0c517e0
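The interleaving pattern in the diagram above can be expressed as a small Python sketch. It follows the diagram's convention of listing the four 32-bit lanes from highest to lowest; the function name `unpckhps` and the list representation are assumptions for illustration, not real SSE code.

```python
def unpckhps(dst, src):
    """Model UNPCKHPS on two operands of four 32-bit lanes each.

    Lanes are listed from highest to lowest, as in the diagram
    (A0 | A1 | A2 | A3), so index 0 and 1 form the high 64 bits.
    """
    if len(dst) != 4 or len(src) != 4:
        raise ValueError("expected four lanes per operand")
    # Interleave the two high lanes of src and dst; the low lanes are unused.
    return [src[0], dst[0], src[1], dst[1]]
```

Calling `unpckhps(["A0", "A1", "A2", "A3"], ["B0", "B1", "B2", "B3"])` reproduces the "Destination register result" row of the diagram.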