Analysis of the PHP kernel function natsort

Today I found that PHP has a natural sorting function: natsort. This was the first time I had heard of an algorithm called "natural sorting", so out of curiosity I looked it up in the official manual (http://us.php.net/manual/en/function.natsort.php):
bool natsort ( array &$array )

This function implements a sort algorithm that orders alphanumeric strings in the way a human being would while maintaining key/value associations. This is described as a "natural ordering". An example of the difference between this algorithm and the regular computer string sorting algorithms (used in sort()) can be seen in the example below.
According to the official manual, the following results can be obtained:
img1.png img2.png img10.png img12.png
Obviously, this is well suited to sorting similar file names. Judging from the result, my guess was that the natural algorithm strips the non-numeric head and tail of each string and then sorts by the numeric part that remains. Is that true? Let's take a look at the PHP source code.
// From ext/standard/array.c; the extracted code is as follows:

static int php_array_natural_general_compare(const void *a, const void *b, int fold_case) /* {{{ */
{
    Bucket *f, *s;
    zval *fval, *sval;
    zval first, second;
    int result;

    /* The sort passes pointers to array slots, and each slot holds a Bucket *. */
    f = *((Bucket **) a);
    s = *((Bucket **) b);

    fval = *((zval **) f->pData);
    sval = *((zval **) s->pData);
    first = *fval;
    second = *sval;

    /* Non-string values are copied and converted to strings before comparing. */
    if (Z_TYPE_P(fval) != IS_STRING) {
        zval_copy_ctor(&first);
        convert_to_string(&first);
    }
    if (Z_TYPE_P(sval) != IS_STRING) {
        zval_copy_ctor(&second);
        convert_to_string(&second);
    }

    result = strnatcmp_ex(Z_STRVAL(first), Z_STRLEN(first), Z_STRVAL(second), Z_STRLEN(second), fold_case);

    /* Release the temporary string copies made above. */
    if (Z_TYPE_P(fval) != IS_STRING) {
        zval_dtor(&first);
    }
    if (Z_TYPE_P(sval) != IS_STRING) {
        zval_dtor(&second);
    }

    return result;
}
/* }}} */

static int php_array_natural_compare(const void *a, const void *b TSRMLS_DC) /* {{{ */
{
    return php_array_natural_general_compare(a, b, 0);
}
/* }}} */

static void php_natsort(INTERNAL_FUNCTION_PARAMETERS, int fold_case) /* {{{ */
{
    zval *array;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "a", &array) == FAILURE) {
        return;
    }

    if (fold_case) {
        /* php_array_natural_case_compare is the case-insensitive counterpart
           in array.c; it is not part of this excerpt. */
        if (zend_hash_sort(Z_ARRVAL_P(array), zend_qsort, php_array_natural_case_compare, 0 TSRMLS_CC) == FAILURE) {
            return;
        }
    } else {
        if (zend_hash_sort(Z_ARRVAL_P(array), zend_qsort, php_array_natural_compare, 0 TSRMLS_CC) == FAILURE) {
            return;
        }
    }

    RETURN_TRUE;
}
/* }}} */

/* {{{ proto void natsort(array &array_arg)
   Sort an array using natural sort */
PHP_FUNCTION(natsort)
{
    php_natsort(INTERNAL_FUNCTION_PARAM_PASSTHRU, 0);
}
/* }}} */
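The double dereference *((Bucket **) a) in the comparator is easy to misread, so here is a small stand-alone sketch of the same pattern using the standard qsort. The names demo_bucket, demo_bucket_compare and demo_sort_buckets are made up for illustration and are not part of PHP; the sketch only assumes that the sort routine works on an array of pointers, so the comparator receives a pointer to each pointer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a hash table bucket: only a payload is modeled. */
typedef struct demo_bucket {
    const char *data;
} demo_bucket;

/* qsort hands the comparator pointers to the array elements. The elements
 * are themselves pointers (demo_bucket *), so each argument is really a
 * demo_bucket ** and needs one extra dereference, in the same way that
 * php_array_natural_general_compare uses *((Bucket **) a). */
static int demo_bucket_compare(const void *a, const void *b)
{
    const demo_bucket *f = *(const demo_bucket *const *) a;
    const demo_bucket *s = *(const demo_bucket *const *) b;

    return strcmp(f->data, s->data);
}

/* Sorts the array of pointers in place; the buckets themselves never move. */
static void demo_sort_buckets(demo_bucket **buckets, size_t count)
{
    qsort(buckets, count, sizeof(demo_bucket *), demo_bucket_compare);
}
```

Calling demo_sort_buckets on an array of demo_bucket pointers reorders only the pointers. Sorting pointers rather than the records themselves keeps each swap cheap, which is presumably why the PHP comparator has to unwrap two levels before it reaches the zval.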
Although this was my first time reading the PHP kernel code, after years of reading code it was easy to see that the core of this natural sorting algorithm is the function strnatcmp_ex (located in ext/standard/strnatcmp.c).
/* {{{ compare_right
 */
static int
compare_right(char const **a, char const *aend, char const **b, char const *bend)
{
    int bias = 0;

    /* The longest run of digits wins.  That aside, the greatest
       value wins, but we can't know that it will until we've scanned
       both numbers to know that they have the same magnitude, so we
       remember it in BIAS. */
    for(;; (*a)++, (*b)++) {
        if ((*a == aend || !isdigit((int)(unsigned char)**a)) &&
            (*b == bend || !isdigit((int)(unsigned char)**b)))
            return bias;
        else if (*a == aend || !isdigit((int)(unsigned char)**a))
            return -1;
        else if (*b == bend || !isdigit((int)(unsigned char)**b))
            return +1;
        else if (**a < **b) {
            if (!bias)
                bias = -1;
        } else if (**a > **b) {
            if (!bias)
                bias = +1;
        }
    }

    return 0;
}
/* }}} */

/* {{{ compare_left
 */
static int
compare_left(char const **a, char const *aend, char const **b, char const *bend)
{
    /* Compare two left-aligned numbers: the first to have a
       different value wins. */
    for(;; (*a)++, (*b)++) {
        if ((*a == aend || !isdigit((int)(unsigned char)**a)) &&
            (*b == bend || !isdigit((int)(unsigned char)**b)))
            return 0;
        else if (*a == aend || !isdigit((int)(unsigned char)**a))
            return -1;
        else if (*b == bend || !isdigit((int)(unsigned char)**b))
            return +1;
        else if (**a < **b)
            return -1;
        else if (**a > **b)
            return +1;
    }

    return 0;
}
/* }}} */

/* {{{ strnatcmp_ex
 * call in array.c: strnatcmp_ex(Z_STRVAL(first), Z_STRLEN(first), Z_STRVAL(second), Z_STRLEN(second), fold_case);
 */
PHPAPI int strnatcmp_ex(char const *a, size_t a_len, char const *b, size_t b_len, int fold_case)
{
    char ca, cb;
    char const *ap, *bp;
    char const *aend = a + a_len,
               *bend = b + b_len;
    int fractional, result;

    if (a_len == 0 || b_len == 0)
        return a_len - b_len;

    ap = a;
    bp = b;
    while (1) {
        ca = *ap; cb = *bp;

        /* skip over leading spaces or zeros */
        while (isspace((int)(unsigned char)ca) || (ca == '0' && (ap+1 < aend) && (*(ap+1)!='.')))
            ca = *++ap;

        while (isspace((int)(unsigned char)cb) || (cb == '0' && (bp+1 < bend) && (*(bp+1)!='.')))
            cb = *++bp;

        /* process run of digits */
        if (isdigit((int)(unsigned char)ca) && isdigit((int)(unsigned char)cb)) {
            fractional = (ca == '0' || cb == '0');

            if (fractional)
                result = compare_left(&ap, aend, &bp, bend);
            else
                result = compare_right(&ap, aend, &bp, bend);

            if (result != 0)
                return result;
            else if (ap == aend && bp == bend)
                /* End of the strings. Let caller sort them out. */
                return 0;
            else {
                /* Keep on comparing from the current point. */
                ca = *ap; cb = *bp;
            }
        }

        if (fold_case) {
            ca = toupper((int)(unsigned char)ca);
            cb = toupper((int)(unsigned char)cb);
        }

        if (ca < cb)
            return -1;
        else if (ca > cb)
            return +1;

        ++ap; ++bp;
        if (ap >= aend && bp >= bend)
            /* The strings compare the same.  Perhaps the caller
               will want to call strcmp to break the tie. */
            return 0;
        else if (ap >= aend)
            return -1;
        else if (bp >= bend)
            return 1;
    }
}
/* }}} */
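To convince myself of the "longest run of digits wins" idea in compare_right, it helps to strip the algorithm down. The following is a simplified sketch, not the PHP implementation: demo_natcmp and demo_natcmp_qsort are names I made up, the strings are assumed to be NUL-terminated, and whitespace skipping, zero stripping, fractional numbers and case folding are all left out.

```c
#include <ctype.h>
#include <stdlib.h>

/* Simplified natural comparison: each run of digits is compared as a number,
 * everything else is compared character by character. */
static int demo_natcmp(const char *a, const char *b)
{
    while (*a && *b) {
        if (isdigit((unsigned char) *a) && isdigit((unsigned char) *b)) {
            int bias = 0;

            /* Walk both digit runs in step and remember the first difference. */
            while (isdigit((unsigned char) *a) && isdigit((unsigned char) *b)) {
                if (!bias && *a != *b)
                    bias = (*a < *b) ? -1 : 1;
                a++;
                b++;
            }

            /* The longer run of digits is the larger number. */
            if (isdigit((unsigned char) *a))
                return 1;
            if (isdigit((unsigned char) *b))
                return -1;

            /* Same length: the first differing digit decides. */
            if (bias)
                return bias;
            continue;
        }

        if (*a != *b)
            return ((unsigned char) *a < (unsigned char) *b) ? -1 : 1;
        a++;
        b++;
    }

    /* One string is a prefix of the other: the shorter one sorts first. */
    if (*a)
        return 1;
    if (*b)
        return -1;
    return 0;
}

/* Adapter so that qsort can sort an array of C strings with demo_natcmp. */
static int demo_natcmp_qsort(const void *a, const void *b)
{
    return demo_natcmp(*(const char *const *) a, *(const char *const *) b);
}
```

With this sketch, demo_natcmp("img2.png", "img10.png") is negative because the digit run "10" is longer than "2", while a plain strcmp puts "img10.png" first because '1' sorts before '2'. Passing demo_natcmp_qsort to qsort on the four file names from the manual gives the order img1.png, img2.png, img10.png, img12.png.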
From the strnatcmp_ex function:
    while (isspace((int)(unsigned char)ca) || (ca == '0' && (ap+1 < aend) && (*(ap+1)!='.')))
        ca = *++ap;

    while (isspace((int)(unsigned char)cb) || (cb == '0' && (bp+1 < bend) && (*(bp+1)!='.')))
        cb = *++bp;
Therefore, I think that whitespace characters at the front of the string (starting from the current position) and any '0' in front of a number (unless the zero is the last character or is immediately followed by '.') are skipped and never compared. The comparison results should then match the ones shown at:
http://us.php.net/manual/en/function.natsort.php
http://sourcefrog.net/projects/natsort/example-out.txt
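The skipping rule is small enough to isolate and try on its own. Below is a minimal sketch of it; demo_skip_leading is a hypothetical helper of mine, not a PHP function. It follows the same condition as the two while loops in strnatcmp_ex, with one addition that is my own assumption: an explicit p < end check, so that the sketch never reads past the buffer it is given.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns the first position in [p, end) that would actually be compared:
 * whitespace is skipped, and so is a '0' that is followed by at least one
 * more character that is not '.'. */
static const char *demo_skip_leading(const char *p, const char *end)
{
    while (p < end &&
           (isspace((unsigned char) *p) ||
            (*p == '0' && p + 1 < end && p[1] != '.'))) {
        p++;
    }
    return p;
}
```

For "  007" the helper stops at the '7'; for "0.5" it does not move at all, because the zero is followed by '.'; for a lone "0" it also stays put, because there is no following character. For "00.5" only the first zero is dropped, which leaves "0.5" for the fractional comparison in compare_left.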
Summary: in one comparison, my reading of the code says the former should be greater than the latter, but on my PHP 5.2.9 the former is smaller than the latter. The reason is not clear yet. It may be a bug in 5.2.9, or I may not have fully understood the source code. Next time I set up an environment I will test this and digest it further.
Two important data structures in array.c are worth noting:
Bucket: http://www.phpchina.cn/bbs/viewthread.php?tid=88505
zval: http://www.laruence.com/2008/08/22/412.html