To make the code easier to follow, we modify it slightly, turning some of the macro definitions into functions:
#include "stdafx.h"
#include <stdio.h>
#include <stdlib.h>   /* malloc, free */

/* fast strcpy -- Copyright (C) 2003 Thomas M. Ogrisegg <tom@hi-tek.fnord.at> */

// #include <string.h>
// #include "dietfeatures.h"
// #include "dietstring.h"

// ---- The following is the content of dietstring.h.
// #include <endian.h>

// #define MKW(x) (x|x<<8|x<<16|x<<24)
/* Replicates the low byte of x into all four bytes of a 32-bit word.
   Unsigned, so that 0x80 << 24 does not shift into a sign bit. */
unsigned long MKW(unsigned long x)
{
    x = x | x << 8 | x << 16 | x << 24;
    return x;
}

// #define STRALIGN(x) (((unsigned long)x&3)?4-((unsigned long)x&3):0)
/* Number of bytes from xPtr to the next 4-byte boundary (0 if aligned). */
unsigned long STRALIGN(unsigned long xPtr)
{
    unsigned long xRet = (unsigned long)xPtr & 3;
    if (xRet)
        xRet = 4 - ((unsigned long)xPtr & 3);
    else
        xRet = 0;
    return xRet;
}

/* GFC(x)    - returns first character */
/* INCSTR(x) - moves to next character */
#define GFC(x) ((x) & 0xff)
#define INCSTR(x) do { x >>= 8; } while (0)

// #define UNALIGNED(x,y) (((unsigned long)x & (sizeof(unsigned long)-1)) ^ ((unsigned long)y & (sizeof(unsigned long)-1)))
/* Nonzero when the two addresses have different offsets within a word. */
unsigned long MyUnaligned(unsigned long xPtr, unsigned long yPtr)
{
    unsigned long valN1 = sizeof(unsigned long) - 1;
    unsigned long xVal = (unsigned long)xPtr & valN1;
    unsigned long yVal = (unsigned long)yPtr & valN1;
    unsigned long retVal = xVal ^ yVal;
    return retVal;
}
// ---- The above is the content of dietstring.h.
char *strcpy2(char *s1, const char *s2)
{
    char *res = s1;
    int tmp;
    unsigned long l;

    /* Different offsets within a word: plain byte-by-byte copy. */
    if (MyUnaligned((unsigned long)s1, (unsigned long)s2)) {
        while ((*s1++ = *s2++));
        return (res);
    }

    /* Same offset but not on a boundary: copy bytes up to the boundary. */
    if ((tmp = STRALIGN((unsigned long)s1))) {
        while (tmp-- && (*s1++ = *s2++));
        if (tmp != -1) return (res);   /* hit the terminating '\0' early */
    }

    while (1) {
        unsigned long key1 = MKW(0x1ul);
        unsigned long key2 = MKW(0x80ul);

        l = *(const unsigned long *)s2;

        /* Nonzero if the word l contains a zero byte. */
        if (((l - key1) & ~l) & key2) {
            while ((*s1++ = GFC(l))) INCSTR(l);
            return (res);
        }

        *(unsigned long *)s1 = l;
        s2 += sizeof(unsigned long);
        s1 += sizeof(unsigned long);
    }
}
int _tmain(int argc, _TCHAR *argv[])
{
    char *p = (char *)malloc(50 * sizeof *p);
    const char *str = "aaaabbbbbcccccc";
    strcpy2(p, str);
    free(p);
    return 0;
}

Note that I renamed the function from strcpy to strcpy2.
If you compile and run the program above, you will find that the reason this strcpy is "fast" lies in the following lines at the end of the while loop in strcpy2:
*(unsigned long *)s1 = l;
s2 += sizeof(unsigned long);
s1 += sizeof(unsigned long);
Anyone familiar with C pointers will recognize that the first line treats s1 as a pointer to unsigned long and stores the value of the unsigned long variable l at that address.
On my i386 PC, the second and third lines advance the s2 and s1 pointers by 4 (rather than by 1 with ++, as in the usual implementation). The effect is to copy four chars (that is, one unsigned long) at a time instead of a single char.
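The word-at-a-time step can be isolated in a small sketch. The helper name copy_words is hypothetical and not part of dietlibc, and it assumes both pointers are already aligned. It also swaps the pointer cast used in the article's code for a fixed-size memcpy for each load and store, which keeps the sketch within standard C aliasing rules; the pointer arithmetic is the same.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from dietlibc): copy n_words machine words from
 * src to dst, one unsigned long per iteration. Assumes both pointers are
 * aligned for unsigned long. */
static void copy_words(char *dst, const char *src, size_t n_words)
{
    assert((uintptr_t)dst % sizeof(unsigned long) == 0);
    assert((uintptr_t)src % sizeof(unsigned long) == 0);

    while (n_words--) {
        unsigned long l;
        memcpy(&l, src, sizeof l);      /* load one word  */
        memcpy(dst, &l, sizeof l);      /* store one word */
        src += sizeof(unsigned long);   /* advance by 4 on i386, not by 1 */
        dst += sizeof(unsigned long);
    }
}
```

Unlike strcpy2, this sketch takes an explicit word count and does not look for the terminating '\0'; it only illustrates why each iteration moves sizeof(unsigned long) bytes.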
The code that precedes these lines, including the helper functions defined before strcpy2, is there to ensure that this word-sized copy can be performed correctly.
Let's first look at the function MyUnaligned (originally the UNALIGNED macro in dietlibc).
It first computes sizeof(unsigned long) - 1, then ANDs both the source string pointer and the destination string pointer with that value (xPtr & valN1), and finally XORs the two results (xVal ^ yVal).
Put simply, xPtr & valN1 is equivalent to a modulo operation. On an i386 CPU valN1 is 3, so the result of the AND can be 0, 1, 2, or 3, and it is 0 exactly when xPtr (or yPtr) is a multiple of 4. The two results are then XORed: each result bit is 0 when the corresponding input bits are both 0 or both 1, so the return value is 0 only when the two offsets are equal. That is, whenever the offsets differ (for instance, one pointer is aligned and the other is not), strcpy2 simply copies one char at a time (*s1++ = *s2++) and returns.
The purpose of this check is to make sure that the xPtr and yPtr pointers can both be aligned in memory. If a pointer is not aligned and four chars are written at a time, it may cause memory access errors (see http://en.wikipedia.org/wiki/Data_structure_alignment).
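The check can be tried on a few sample addresses with a minimal sketch. The names word_offset and offsets_differ are hypothetical stand-ins for the two halves of MyUnaligned, and uintptr_t is used instead of unsigned long so the sketch does not assume a particular pointer width.

```c
#include <stdint.h>

/* Hypothetical helper: offset of addr within an unsigned long sized word,
 * i.e. addr modulo sizeof(unsigned long). */
static unsigned long word_offset(uintptr_t addr)
{
    return (unsigned long)(addr & (sizeof(unsigned long) - 1));
}

/* Hypothetical counterpart of MyUnaligned: nonzero when the two addresses
 * sit at different offsets within a word, zero when the offsets match. */
static unsigned long offsets_differ(uintptr_t x, uintptr_t y)
{
    return word_offset(x) ^ word_offset(y);
}
```

For example, offsets_differ(0x1000, 0x2001) is nonzero, so the byte-by-byte path would be taken, while offsets_differ(0x1000, 0x2000) is 0. Note that offsets_differ(0x1001, 0x2001) is also 0, even though neither address is a multiple of 4.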
Some readers will already have spotted a problem: if the source and destination pointers are both unaligned, yet the XOR result is zero, won't the copy go wrong?
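OK, but there is one more piece of code, isn't there? STRALIGN takes the destination string pointer's address modulo 4 and, if the remainder is nonzero, returns 4 minus that remainder, which is the number of bytes left before the next 4-byte boundary. strcpy2 copies that many chars one at a time before it switches to word copies.
To see this in action, we manually add 1 to the addresses in s1 and s2 while the program is running. Starting a debug run and looking at the addresses of p and str, we can see that both are aligned on an unsigned long boundary (p & 3 must be 0).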
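To make the STRALIGN step concrete, here is a small sketch. The name bytes_to_boundary is hypothetical; it generalizes the hard-coded 4 to sizeof(unsigned long), which is an assumption beyond the article's i386 setting.

```c
#include <stdint.h>

/* Hypothetical generalization of STRALIGN: number of bytes needed to move
 * addr forward to the next unsigned long boundary (0 if already aligned). */
static unsigned long bytes_to_boundary(uintptr_t addr)
{
    unsigned long rem = (unsigned long)(addr & (sizeof(unsigned long) - 1));
    return rem ? (unsigned long)sizeof(unsigned long) - rem : 0;
}
```

With a 4-byte unsigned long, addresses at offsets 1, 2, and 3 give 3, 2, and 1 respectively. Because strcpy2 only reaches this step when both pointers share the same offset, advancing both by this many bytes should leave both on a word boundary.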
We edit the addresses directly in the Autos window, adding one to each:
Now neither pointer is aligned. Continue running: