Recently I have been thinking about the efficiency of string operations. Taking a substring cannot avoid the cost of allocating new memory. I also read the source code of the explode function, and I am sharing my own analysis of it here. When we need to split a string into an array on a character or a string, explode is a pleasure to use, but do you know how explode actually works?
First of all, explode certainly allocates memory as well; there is no doubt about that.
The code is as follows:
// File 1: ext/standard/string.c
// First, let's look at the source code of explode.
PHP_FUNCTION(explode)
{
    char *str, *delim;
    int str_len = 0, delim_len = 0;
    long limit = LONG_MAX; /* No limit */
    zval zdelim, zstr;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "ss|l", &delim, &delim_len, &str, &str_len, &limit) == FAILURE) {
        return;
    }

    if (delim_len == 0) {
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Empty delimiter");
        RETURN_FALSE;
    }

    // An array is created to store the split data.
    array_init(return_value);

    // This is why explode('|', '') is legal: it returns array(0 => '').
    if (str_len == 0) {
        if (limit >= 0) {
            add_next_index_stringl(return_value, "", sizeof("") - 1, 1);
        }
        return;
    }

    // The next two lines wrap the original string and the delimiter in _zval_struct structures.
    // The last argument (duplicate) is 0 here, so ZVAL_STRINGL only stores the pointer;
    // it allocates (through estrndup) only when duplicate is non-zero. Its source is pasted below.
    ZVAL_STRINGL(&zstr, str, str_len, 0);
    ZVAL_STRINGL(&zdelim, delim, delim_len, 0);

    // limit is the optional third parameter of explode; it may be positive or negative.
    if (limit > 1) {
        php_explode(&zdelim, &zstr, return_value, limit);
    } else if (limit < 0) {
        php_explode_negative_limit(&zdelim, &zstr, return_value, limit);
    } else {
        add_index_stringl(return_value, 0, str, str_len, 1);
    }
}
The code is as follows:
// Source code of ZVAL_STRINGL:
// File 2: zend/zend_API.h
#define ZVAL_STRINGL(z, s, l, duplicate) { \
    const char *__s = (s); int __l = l; \
    Z_STRLEN_P(z) = __l; \
    Z_STRVAL_P(z) = (duplicate ? estrndup(__s, __l) : (char *) __s); \
    Z_TYPE_P(z) = IS_STRING; \
}
....
// estrndup is the main course:
// File 3: zend/zend_alloc.h
#define estrndup(s, length) _estrndup((s), (length) ZEND_FILE_LINE_CC ZEND_FILE_LINE_EMPTY_CC)
....
// Implementation of _estrndup: zend/zend_alloc.c
ZEND_API char *_estrndup(const char *s, uint length ZEND_FILE_LINE_DC ZEND_FILE_LINE_ORIG_DC)
{
    char *p;

    // Allocate length + 1 bytes (one extra for the terminating NUL)
    p = (char *) _emalloc(length + 1 ZEND_FILE_LINE_RELAY_CC ZEND_FILE_LINE_ORIG_RELAY_CC);
    if (UNEXPECTED(p == NULL)) {
        return p;
    }
    memcpy(p, s, length); // copy the bytes into the newly allocated space
    p[length] = 0;
    return p;
}
// In addition, the ZVAL_STRING used in substr, strrchr and strstr also relies on the implementation above.
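The difference between borrowing a pointer (duplicate = 0) and copying (duplicate = 1) can be sketched in plain C outside of PHP. This is only an illustration: `struct str_ref`, `dup_n` and `str_ref_set` are hypothetical names, and `malloc` stands in for `_emalloc`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string zval: a pointer plus a length. */
struct str_ref {
    char  *val;
    size_t len;
};

/* Analogue of _estrndup: allocate length + 1 bytes, copy, NUL-terminate. */
char *dup_n(const char *s, size_t length)
{
    char *p;

    assert(s != NULL);
    p = malloc(length + 1);
    if (p == NULL) {
        return NULL;
    }
    memcpy(p, s, length);   /* the copy; the allocation happened in malloc */
    p[length] = '\0';
    return p;
}

/* Analogue of ZVAL_STRINGL: copy only when duplicate is non-zero,
 * otherwise just point at the caller's buffer. */
void str_ref_set(struct str_ref *z, const char *s, size_t len, int duplicate)
{
    assert(z != NULL);
    z->len = len;
    z->val = duplicate ? dup_n(s, len) : (char *) s;
}
```

With duplicate = 0 the structure shares the caller's buffer and nothing is allocated, which is what explode does for zstr and zdelim. With duplicate = 1 a fresh copy is made, which is what happens for every piece added to the result array.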
Which function is called next depends on the third parameter of explode, limit. The three cases correspond to the final if / else if / else chain in explode.
Note: The default value of limit is LONG_MAX, so a call without limit falls into case 1.
1. limit > 1:
The php_explode function is called. It is also found in ext/standard/string.c, right next to the explode implementation (this is convenient: when a function in this file calls another function from the same file, the callee almost always sits right beside the caller ^_^).
The code is as follows:
PHPAPI void php_explode(zval *delim, zval *str, zval *return_value, long limit)
{
    char *p1, *p2, *endp;

    // Pointer to the end of the source string
    endp = Z_STRVAL_P(str) + Z_STRLEN_P(str);

    // Record the start position
    p1 = Z_STRVAL_P(str);
    // Find the position of the delimiter in str. strrpos and strpos also use this function to locate a substring.
    p2 = php_memnstr(Z_STRVAL_P(str), Z_STRVAL_P(delim), Z_STRLEN_P(delim), endp);

    if (p2 == NULL) {
        // This is why explode('|', 'abc') is legal: the output is array(0 => 'abc')
        add_next_index_stringl(return_value, p1, Z_STRLEN_P(str), 1);
    } else {
        // Keep looking for the next delimiter until the end of the string or until limit is reached
        do {
            // Take the substring between the previous position and this delimiter
            // (the first time around, the previous position is the start of the string)
            add_next_index_stringl(return_value, p1, p2 - p1, 1);
            // Move past the delimiter: p2 + the length of the delimiter
            // For example, with delimiter '|' and string 'ab|c', p2 is at offset 2, so p1 moves to offset 2 + 1 = 3
            p1 = p2 + Z_STRLEN_P(delim);
        } while ((p2 = php_memnstr(p1, Z_STRVAL_P(delim), Z_STRLEN_P(delim), endp)) != NULL &&
                 --limit > 1);

        // Put the rest of the string (everything after the last delimiter consumed) in the result array
        // explode('|', 'avc|sdf'); => array(0 => 'avc', 1 => 'sdf')
        if (p1 <= endp)
            add_next_index_stringl(return_value, p1, endp - p1, 1);
    }
}
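To try the same loop outside of PHP, here is a minimal standalone sketch in plain C. It is not the Zend code: `find_n` is a naive, hypothetical stand-in for php_memnstr, `copy_piece` plays the role of the duplicating add_next_index_stringl, and `split_limit` writes into a caller-supplied `out` array instead of a PHP array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for php_memnstr: first occurrence of needle in
 * [haystack, end), or NULL. A naive scan, for illustration only. */
static const char *find_n(const char *haystack, const char *needle,
                          size_t needle_len, const char *end)
{
    const char *p;

    for (p = haystack; (size_t)(end - p) >= needle_len; p++) {
        if (memcmp(p, needle, needle_len) == 0) {
            return p;
        }
    }
    return NULL;
}

/* Copies len bytes into a fresh NUL-terminated buffer (NULL on failure). */
static char *copy_piece(const char *s, size_t len)
{
    char *p = malloc(len + 1);

    if (p != NULL) {
        memcpy(p, s, len);
        p[len] = '\0';
    }
    return p;
}

/* Splits str on delim the way the limit > 1 branch does.
 * out must have room for min(limit, occurrences + 1) pointers.
 * Returns the number of pieces; the caller frees each piece. */
size_t split_limit(const char *str, const char *delim, long limit, char **out)
{
    size_t delim_len = strlen(delim);
    const char *endp = str + strlen(str);
    const char *p1 = str;
    const char *p2;
    size_t n = 0;

    assert(limit > 1 && delim_len > 0);

    p2 = find_n(str, delim, delim_len, endp);
    if (p2 == NULL) {
        out[n++] = copy_piece(p1, (size_t)(endp - p1));
    } else {
        do {
            out[n++] = copy_piece(p1, (size_t)(p2 - p1));
            p1 = p2 + delim_len;
        } while ((p2 = find_n(p1, delim, delim_len, endp)) != NULL &&
                 --limit > 1);

        /* Whatever is left goes into the last piece, delimiters included. */
        if (p1 <= endp) {
            out[n++] = copy_piece(p1, (size_t)(endp - p1));
        }
    }
    return n;
}
```

For example, calling `split_limit("a|b|c", "|", 2, out)` stores "a" and "b|c": once limit drops to 1 the loop stops, and the remainder is added as a single piece.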
2. limit < 0:
The php_explode_negative_limit function is called.
The code is as follows:
PHPAPI void php_explode_negative_limit(zval *delim, zval *str, zval *return_value, long limit)
{
#define EXPLODE_ALLOC_STEP 64
    char *p1, *p2, *endp;

    endp = Z_STRVAL_P(str) + Z_STRLEN_P(str);

    p1 = Z_STRVAL_P(str);
    p2 = php_memnstr(Z_STRVAL_P(str), Z_STRVAL_P(delim), Z_STRLEN_P(delim), endp);

    if (p2 == NULL) {
        // Nothing is added here, so explode('|', 'abc', -1) returns an empty array.
        /*
        do nothing since limit <= -1, thus if only one chunk - 1 + (limit) <= 0
        by doing nothing we return empty array
        */
    } else {
        int allocated = EXPLODE_ALLOC_STEP, found = 0;
        long i, to_return;
        char **positions = emalloc(allocated * sizeof(char *));

        // Note the declaration of positions: this array saves the start position of every substring.
        positions[found++] = p1; // of course, the start of the string must be saved as well
        // Two loops follow. The first walks over every delimiter in the string and saves the start position of the next substring.
        do {
            if (found >= allocated) {
                allocated = found + EXPLODE_ALLOC_STEP; /* make sure we have enough memory */
                positions = erealloc(positions, allocated * sizeof(char *));
            }
            positions[found++] = p1 = p2 + Z_STRLEN_P(delim);
        } while ((p2 = php_memnstr(p1, Z_STRVAL_P(delim), Z_STRLEN_P(delim), endp)) != NULL);

        // The number of substrings to return; the second loop reads them from the positions array.
        to_return = limit + found;
        /* limit is at least -1 therefore no need of bounds checking : i will be always less than found */
        for (i = 0; i < to_return; i++) { /* this checks also for to_return > 0 */
            add_next_index_stringl(return_value, positions[i],
                    (positions[i + 1] - Z_STRLEN_P(delim)) - positions[i],
                    1
                );
        }
        efree(positions); // Releasing the memory is very important.
    }
#undef EXPLODE_ALLOC_STEP
}
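The two-pass approach (record every piece start first, then copy all but the last |limit| pieces) can also be tried in plain C. As before this is only a sketch under assumptions: `find_n`, `copy_piece`, `split_negative` and the caller-supplied `out` array are hypothetical, and `malloc`/`realloc`/`free` replace emalloc/erealloc/efree.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ALLOC_STEP 64   /* same growth step as EXPLODE_ALLOC_STEP */

/* Hypothetical stand-in for php_memnstr (naive scan). */
static const char *find_n(const char *haystack, const char *needle,
                          size_t needle_len, const char *end)
{
    const char *p;

    for (p = haystack; (size_t)(end - p) >= needle_len; p++) {
        if (memcmp(p, needle, needle_len) == 0) {
            return p;
        }
    }
    return NULL;
}

/* Copies len bytes into a fresh NUL-terminated buffer (NULL on failure). */
static char *copy_piece(const char *s, size_t len)
{
    char *p = malloc(len + 1);

    if (p != NULL) {
        memcpy(p, s, len);
        p[len] = '\0';
    }
    return p;
}

/* Splits str on delim and drops the last -limit pieces.
 * out must have room for every piece that is kept.
 * Returns the number of pieces; the caller frees each piece. */
size_t split_negative(const char *str, const char *delim, long limit, char **out)
{
    size_t delim_len = strlen(delim);
    const char *endp = str + strlen(str);
    const char *p1 = str;
    const char *p2;
    size_t n = 0;

    assert(limit < 0 && delim_len > 0);

    p2 = find_n(str, delim, delim_len, endp);
    if (p2 != NULL) {
        size_t allocated = ALLOC_STEP, found = 0;
        long i, to_return;
        const char **positions = malloc(allocated * sizeof(*positions));

        if (positions == NULL) {
            return 0;
        }
        positions[found++] = p1;
        /* Pass 1: remember where every piece starts. */
        do {
            if (found >= allocated) {
                const char **grown;

                allocated = found + ALLOC_STEP;
                grown = realloc(positions, allocated * sizeof(*positions));
                if (grown == NULL) {
                    free(positions);
                    return 0;
                }
                positions = grown;
            }
            positions[found++] = p1 = p2 + delim_len;
        } while ((p2 = find_n(p1, delim, delim_len, endp)) != NULL);

        /* Pass 2: copy all but the last -limit pieces. */
        to_return = limit + (long) found;
        for (i = 0; i < to_return; i++) {
            out[n++] = copy_piece(positions[i],
                (size_t)(positions[i + 1] - delim_len - positions[i]));
        }
        free(positions);
    }
    return n;
}
```

Each piece's length is computed from the start of the next piece minus the delimiter length, which is why the loop only ever needs `positions[i + 1]` and never touches the end of the string.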
3. limit = 1 or limit = 0:
When neither of the first two conditions is met, this branch is taken. It simply puts the whole source string into the output array: explode('|', 'avc|sd', 1) or explode('|', 'avc|sd', 0) returns array(0 => 'avc|sd').
The code is as follows:
// Source code of add_next_index_stringl (add_index_stringl, used in branch 3, is analogous but takes an explicit index)
// File 4: zend/zend_API.c
ZEND_API int add_next_index_stringl(zval *arg, const char *str, uint length, int duplicate) /* {{{ */
{
    zval *tmp;

    MAKE_STD_ZVAL(tmp);
    ZVAL_STRINGL(tmp, str, length, duplicate);

    return zend_hash_next_index_insert(Z_ARRVAL_P(arg), &tmp, sizeof(zval *), NULL);
}
// zend_hash_next_index_insert
// zend/zend_hash.h
#define zend_hash_next_index_insert(ht, pData, nDataSize, pDest) \
    _zend_hash_index_update_or_next_insert(ht, 0, pData, nDataSize, pDest, HASH_NEXT_INSERT ZEND_FILE_LINE_CC)
// zend/zend_hash.c
// The implementation is too long to paste here.
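From this we can see (not counting memory allocation and copying, and treating each delimiter search and array insertion as one step):
When limit > 1, the cost is O(N) [N is the limit value]; the loop also stops as soon as no delimiter is left, so it never runs more than M + 1 times [M is the number of times the delimiter appears],
When limit < 0, the cost is O(N + M) [N is the limit value, M is the number of times the delimiter appears]: the first loop visits all M delimiters and the second inserts at most M pieces,
When limit = 1 or limit = 0, the cost is O(1): a single insertion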