Sometimes a program needs very high execution efficiency in a few places, or has to implement low-level system functionality. In these critical parts we can use inline assembly to write assembly instructions directly. The following techniques are offered for discussion.
1. Interleaving VC statements with inline assembly:
Inline assembly is very convenient in VC; you only need to follow this format:
__asm{
Assembly statement
}
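As a minimal sketch of this format, the hypothetical helper below (add_ints is not from the article) adds two integers inside an __asm block. It assumes the 32-bit x86 VC compiler, the only target where this syntax is accepted; the #else branch is a plain C++ fallback added here so the sketch still builds elsewhere.

```cpp
// Hypothetical helper for illustration only; not part of the article's source.
int add_ints(int lhs, int rhs)
{
#if defined(_MSC_VER) && defined(_M_IX86)
    int result;
    __asm {
        mov eax, lhs      // load the first operand
        add eax, rhs      // add the second operand
        mov result, eax   // store the sum in a C++ local
    }
    return result;
#else
    return lhs + rhs;     // portable fallback when __asm is unavailable
#endif
}
```

Local variables and parameters can be referenced by name inside the block, which is what makes the format convenient.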
Consider the following sample code:
void CAlcmemDlg::OnButton3()
{
    DWORD d=(m_size*1024*1024)/sizeof(DWORD); // number of DWORDs in the buffer
    DWORD*p=(DWORD*)m_p;
    DWORD s;
    m_pr.SetMin(0);
    m_pr.SetMax((float)d);
    m_pr.SetEnabled(TRUE);
    if(NULL!=m_p){
        __asm{
            mov ecx,d
            mov eax,0
        L:  mov edx,DWORD ptr p
            mov DWORD ptr [edx+eax*4],1 // write arbitrary data; here, 1
            inc eax
            mov s,eax
            pushad // save the general-purpose registers before the C++ call
        }
        m_pr.SetValue((float)s);
        __asm{
            popad // restore the registers saved above
            loop L
        }
    }
}
Note the PUSHAD and POPAD instructions in the two __asm blocks of the sample code. PUSHAD saves the register environment and POPAD restores it, so any effect that the statement m_pr.SetValue((float)s); has on the registers is cancelled out, and any other statement could be called in its place. However, it is best to interrupt an inline assembly block as little as possible, to reduce the time spent saving and restoring the register environment. In the author's test, after deleting the m_pr.SetValue((float)s) call, merging the two __asm blocks, and removing PUSHAD and POPAD, the code ran significantly faster. Such interruptions are usually not worth the cost.
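Which parts of the register environment need to be saved, for example the flags register, depends on the circumstances. Note that PUSHAD saves only the eight general-purpose registers; if the flags must survive, they have to be saved separately with PUSHFD and POPFD.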
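One way to keep a progress display without interrupting the assembly loop on every element is to fill the buffer in chunks and report once per chunk. The sketch below is an illustration under assumptions, not the author's code: fill_dwords, fill_with_progress, the callback type, and the default chunk size are all hypothetical, and the __asm path assumes the 32-bit x86 VC compiler (other compilers take the plain C++ fallback).

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical helper: fills n_dwords 32-bit words at p with fill_value in one
// uninterrupted loop, with no PUSHAD/POPAD and no calls inside the loop.
void fill_dwords(std::uint32_t* p, std::size_t n_dwords, std::uint32_t fill_value)
{
    if (p == nullptr || n_dwords == 0) return; // LOOP with ECX == 0 would wrap around
#if defined(_MSC_VER) && defined(_M_IX86)
    __asm {
        mov ecx, n_dwords                  // loop counter
        mov edx, p                         // base address
        mov ebx, fill_value                // value to store
        xor eax, eax                       // element index
    fill_next:
        mov dword ptr [edx + eax*4], ebx   // store one DWORD
        inc eax
        loop fill_next                     // decrement ECX, repeat until zero
    }
#else
    for (std::size_t i = 0; i < n_dwords; ++i) p[i] = fill_value; // portable fallback
#endif
}

// Hypothetical wrapper: reports progress once per chunk instead of once per
// element. The default chunk size is an arbitrary assumption.
void fill_with_progress(std::uint32_t* p, std::size_t n_dwords, std::uint32_t fill_value,
                        const std::function<void(std::size_t)>& report,
                        std::size_t chunk = 65536)
{
    if (p == nullptr) return;
    if (chunk == 0) chunk = 1;
    for (std::size_t done = 0; done < n_dwords; ) {
        const std::size_t n = (n_dwords - done < chunk) ? (n_dwords - done) : chunk;
        fill_dwords(p + done, n, fill_value);
        done += n;
        if (report) report(done);          // e.g. forward to a progress control
    }
}
```

In the dialog from the article, the callback could forward the count to m_pr.SetValue. The register save and restore then happens once per chunk rather than once per DWORD.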
2. Free use of FPU, MMX and other instructions:
void CAlcmemDlg::OnButton4()
{
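    float f_t=.132f;
    float f_s=0;
    __asm{
        fld f_s
        fld f_s
        fld f_s
        fld f_t
        fadd f_t // ST(0) = f_t + f_t
        fst f_t // a breakpoint here shows the FPU register stack
        // fadd fs
        // pop the four values pushed above so the FPU stack is left empty
        fstp st(0)
        fstp st(0)
        fstp st(0)
        fstp st(0)
    }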
}
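For a variant that returns its result to C++ code, a small helper along the following lines may be easier to experiment with than the handler above. It is only a sketch: fpu_add is a hypothetical name, the __asm path assumes the 32-bit x86 VC compiler, and the #else branch is a plain C++ fallback for other compilers.

```cpp
// Hypothetical helper for illustration only; not part of the article's source.
float fpu_add(float lhs, float rhs)
{
#if defined(_MSC_VER) && defined(_M_IX86)
    float result;
    __asm {
        fld lhs        // push lhs onto the FPU stack: ST(0) = lhs
        fadd rhs       // ST(0) = lhs + rhs
        fstp result    // store the sum and pop, leaving the FPU stack balanced
    }
    return result;
#else
    return lhs + rhs;  // portable fallback when __asm is unavailable
#endif
}
```

Using fstp rather than fst pops the value as it is stored, so nothing is left behind on the FPU register stack when the function returns.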
You can set breakpoints to observe the FPU registers. Code compiled by VC usually does not use the instructions of a particular instruction-set extension, although Microsoft states that the compiler supports them. So you have to use inline assembly to invoke these instructions, optimize the program, and make the most of the hardware. The code in the example invokes FPU instructions, putting the processor's floating-point capability to full use. You can of course also call 3DNow!, SSE, SSE2 and other instructions, but the author has not tried them. If you discover anything new, please share it. Thank you!
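The author did not try SSE, so the following is only a sketch of what such a call might look like, not tested code from the article. The hypothetical helper add4_sse adds four floats at once with ADDPS; the __asm path assumes the 32-bit x86 VC compiler and a CPU that supports SSE, and other compilers take the scalar fallback.

```cpp
// Hypothetical helper for illustration only: dst[i] = lhs[i] + rhs[i] for i in 0..3.
void add4_sse(const float* lhs, const float* rhs, float* dst)
{
#if defined(_MSC_VER) && defined(_M_IX86)
    __asm {
        mov eax, lhs
        mov ecx, rhs
        mov edx, dst
        movups xmm0, [eax]   // load four floats (unaligned load)
        movups xmm1, [ecx]
        addps xmm0, xmm1     // four additions in one instruction
        movups [edx], xmm0   // store the four sums
    }
#else
    for (int i = 0; i < 4; ++i) dst[i] = lhs[i] + rhs[i]; // scalar fallback
#endif
}
```

MOVUPS is used so that the arrays need no particular alignment; MOVAPS would require 16-byte aligned addresses.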
In general, inline assembly is used to improve speed. In game programming especially, one should make heavy use of inline assembly to squeeze the most out of the CPU. The disadvantage is poor compatibility: some low-end machines cannot run the resulting code. Microsoft also states that the compiler does not optimize the assembly code you write; it simply translates it into the equivalent machine code, so the optimization is left to you. You therefore have to be a master not only of C++ but also of assembly. As far as I know, that kind of person is a hacker. I do not encourage you to become one; this article is offered only as commentary.
Supporting source code for this article