Memory usage matters a great deal to programmers. In large-scale data processing in particular, getting as close as possible to the maximum memory bandwidth is a goal programmers constantly pursue.
Starting with this section, we will explore the topic through a series of examples. The series uses assembly language and sticks to a single example, memory copy, because it is the easiest to understand and the easiest to demonstrate.
The code provided in this section is the baseline, that is, the worst method. Later sections will present much better methods, using techniques such as simpler instructions, memory prefetching, MMX registers, and block copying, so I hope readers will keep following along. The series has the following goals:
(1) After reading this short series, you should understand some basic assembly and memory instructions well and be able to apply them.
(2) You should gain a deeper understanding of how memory and registers work.
To keep some suspense, a detailed explanation of the code is left to the following sections. Compile the code below with: gcc -g -o test main.c. It works on 64-bit machines only.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#if defined(__i386__)
static __inline__ unsigned long long rdtsc(void)
{
    unsigned long long x;
    /* 0x0f 0x31 is the RDTSC opcode; "=A" reads the 64-bit result from EDX:EAX */
    __asm__ volatile (".byte 0x0f, 0x31" : "=A" (x));
    return x;
}
#elif defined(__x86_64__)
static __inline__ unsigned long long rdtsc(void)
{
    unsigned hi, lo;
    /* RDTSC returns the low 32 bits in EAX and the high 32 bits in EDX */
    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((unsigned long long)lo) | (((unsigned long long)hi) << 32);
}
#endif

/* Prototypes for the copy routines defined in the asm statements below.
   System V x86-64 calling convention: to in %rdi, from in %rsi,
   element count in %rdx. */
void m_b_64(void *to, const void *from, unsigned long cnt);
void m_b_32(void *to, const void *from, unsigned long cnt);
void m_b_16(void *to, const void *from, unsigned long cnt);
void m_b_8(void *to, const void *from, unsigned long cnt);

/* Copy cnt quad words (8 bytes each) */
asm(".text");
asm(".type m_b_64, @function");
asm("m_b_64: push %rbp");
asm("mov %rsp, %rbp");
asm("mov %rdx, %rcx");   /* rep uses %rcx as its counter */
asm("rep movsq");
asm("leaveq");
asm("retq");

/* Copy cnt double words (4 bytes each) */
asm(".text");
asm(".type m_b_32, @function");
asm("m_b_32: push %rbp");
asm("mov %rsp, %rbp");
asm("mov %rdx, %rcx");
asm("rep movsl");        /* AT&T mnemonic for the string instruction Intel calls movsd */
asm("leaveq");
asm("retq");

/* Copy cnt words (2 bytes each) */
asm(".text");
asm(".type m_b_16, @function");
asm("m_b_16: push %rbp");
asm("mov %rsp, %rbp");
asm("mov %rdx, %rcx");
asm("rep movsw");
asm("leaveq");
asm("retq");

/* Copy cnt bytes */
asm(".text");
asm(".type m_b_8, @function");
asm("m_b_8: push %rbp");
asm("mov %rsp, %rbp");
asm("mov %rdx, %rcx");
asm("rep movsb");
asm("leaveq");
asm("retq");

int main(void)
{
    int bytes_cnt = 32 * 1024 * 1024;  // 32M bytes
    int word_cnt = bytes_cnt / 2;      // 16M words
    int dword_cnt = word_cnt / 2;      // 8M double words
    int qdword_cnt = dword_cnt / 2;    // 4M quad words
    char *from = (char *) malloc(bytes_cnt);
    char *to = (char *) malloc(bytes_cnt);
    memset(from, 0x2, bytes_cnt);
    memset(to, 0x0, bytes_cnt);
    unsigned long long start;
    unsigned long long end;
    int i;
    for (i = 0; i < 10; ++i)
    {
        start = rdtsc();
        m_b_8(to, from, bytes_cnt);
        end = rdtsc();
        printf("m_b_8 use time:\t%llu\n", end - start);
    }
    for (i = 0; i < 10; ++i)
    {
        start = rdtsc();
        m_b_16(to, from, word_cnt);
        end = rdtsc();
        printf("m_b_16 use time:\t%llu\n", end - start);
    }
    for (i = 0; i < 10; ++i)
    {
        start = rdtsc();
        m_b_32(to, from, dword_cnt);
        end = rdtsc();
        printf("m_b_32 use time:\t%llu\n", end - start);
    }
    for (i = 0; i < 10; ++i)
    {
        start = rdtsc();
        m_b_64(to, from, qdword_cnt);
        end = rdtsc();
        printf("m_b_64 use time:\t%llu\n", end - start);
    }
    /* use to make sure the copy is OK ******
    int sum = 0;
    for (i = 0; i < bytes_cnt; ++i)
        sum += to[i];
    printf("%d\n", sum);
    *********************************/
    return 0;
}
This article is continued at: http://blog.csdn.net/pennyliang/archive/2011/03/10/6238448.aspx
Other articles in this series: http://blog.csdn.net/pennyliang/category/746545.aspx