Understanding PHP's internal function definitions

Welcome to the second part of the "PHP Source for PHP developers" series.

In the previous article, ircmaxell explained where you can find the PHP source code, how its directories are laid out, and gave a short introduction to C (because PHP is written in C). If you missed that article, you may want to read it before continuing with this one.

In this article, we look at how PHP's internal functions are defined and how to read those definitions.

How to find the definition of a function

As a start, let's try to find the definition of the strpos function.

The first step is to go to the PHP 5.4 root directory and enter strpos in the search box at the top of the page. The result is a long list of every place where strpos occurs in the PHP source code.

Because this result is not very helpful, we use a little trick: instead of strpos we search for "PHP_FUNCTION strpos" (don't leave out the double quotes, they are important).

Now we get just two results:

/PHP_5_4/ext/standard/
    php_string.h          PHP_FUNCTION(strpos);
    string.c        1789  PHP_FUNCTION(strpos)

The first thing to note is that both locations are in the ext/standard folder. This is what we expect, because strpos (like most string, array and file functions) is part of the standard extension.

Now open both links in new tabs and see what code is hidden behind them.

The first link takes you to the php_string.h file, which contains the following code:

// ...
PHP_FUNCTION(strpos);
PHP_FUNCTION(stripos);
PHP_FUNCTION(strrpos);
PHP_FUNCTION(strripos);
PHP_FUNCTION(strrchr);
PHP_FUNCTION(substr);
// ...

This is what a typical header file (a file ending in .h) looks like: a plain list of function declarations whose definitions live elsewhere. We are not really interested in it, because we already know what we are looking for.
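To make the difference between the two search results more tangible, here is a minimal sketch in plain C. MY_FUNCTION and the my_fn_ prefix are made up for illustration and are not PHP's real macro; the point is only that the same macro appears once with a trailing semicolon (the declaration, as in a header) and once with a body (the definition).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PHP_FUNCTION: it expands a short name into the
 * signature of a C function with a fixed prefix. */
#define MY_FUNCTION(name) long my_fn_##name(const char *arg)

/* What a header would contain: a declaration only, ending in ';'. */
MY_FUNCTION(length);

/* What the .c file would contain: the same macro, followed by a body. */
MY_FUNCTION(length)
{
    assert(arg != NULL);
    return (long) strlen(arg);
}
```

Calling my_fn_length("abc") returns 3. The entry without the semicolon is the one that carries the actual code.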

The second link is more interesting: it takes us to the string.c file, which contains the actual source code for the function.

Before I take you step by step through this function, I recommend that you try to understand it yourself. It is a very simple function, and even if you do not know the details, most of the code should look quite clear.

Skeleton of a PHP function

All PHP functions share the same basic structure. The variables are declared at the top of the function, then zend_parse_parameters is called, and then comes the main logic, interspersed with RETURN_*** and php_error_docref calls.

So let's start with the variable declarations:

zval *needle;
char *haystack;
char *found = NULL;
char  needle_char[2];
long  offset = 0;
int   haystack_len;

The first line declares needle as a pointer to a zval. A zval is PHP's internal representation of an arbitrary PHP variable. What it really is will be discussed in the next article.

The second line declares haystack as a pointer to a single character. At this point you need to remember that in C a string is an array of characters and is handled through a pointer to its first element. So the haystack variable will point to the first character of the $haystack string that you passed. haystack + 1 points to the second character, haystack + 2 to the third, and so on. By incrementing the pointer step by step, you can read the entire string.

The problem is that PHP needs to know where the string ends. Otherwise it would keep incrementing the pointer without ever stopping. To solve this, PHP also stores the length explicitly, in the haystack_len variable.

Also of interest in the declarations above is the offset variable, which holds the third argument of the function: the offset at which to start the search. It is declared as a long, which, like int, is an integer data type. The difference between the two is not important right now; what you need to know is that in PHP, integer values are stored as long and string lengths are stored as int.

Now take a look at the following three lines:
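As a small illustration of walking a string by pointer and length, here is a sketch in plain C. The function name count_char is my own and does not exist in PHP; it simply shows that the explicit length, not a terminator, decides where reading stops.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often `c` occurs in a buffer of `len` bytes. The explicit
 * length, not a terminator, decides where the walk stops. */
int count_char(const char *str, int len, char c)
{
    int count = 0;
    int i;

    assert(str != NULL && len >= 0);
    for (i = 0; i < len; i++) {
        if (*(str + i) == c) {   /* str + i points at the (i+1)-th character */
            count++;
        }
    }
    return count;
}
```

Because the length is passed in, the walk also works on data that contains a zero byte in the middle, which a terminator-based loop would stop at.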

if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "sz|l", &haystack, &haystack_len, &needle, &offset) == FAILURE) {
    return;
}

What these three lines of code do is get the arguments passed to the function and store them in the variables declared above.

The first argument to zend_parse_parameters is the number of arguments that were passed. This number is provided by the ZEND_NUM_ARGS() macro.

Next comes the TSRMLS_CC macro, which is a peculiarity of PHP. You will find this strange macro scattered all over the PHP code base. It is part of the Thread Safe Resource Manager (TSRM), which ensures that PHP does not mix up variables between threads. This is not very important to us, and whenever you see TSRMLS_CC (or TSRMLS_DC) in the code, you can simply ignore it. (One oddity you may notice is that there is no comma before it. That is because, depending on whether PHP is built thread-safe or not, the macro expands either to nothing or to ", tsrm_ls". So the comma is part of the macro.)

Now we come to the important part: the "sz|l" string describes the parameters that the function accepts:

s  // the first parameter is a string
z  // the second parameter is a zval, i.e. an arbitrary variable
|  // the parameters that follow are optional
l  // the third parameter is a long (integer)

Besides s, z and l there are more type specifiers, but most of them are easy to guess from the character. For example, b is a boolean, d is a double (floating-point number), a is an array, f is a callback (function) and o is an object.

The remaining arguments, &haystack, &haystack_len, &needle and &offset, specify the variables that the parsed parameters are written to. As you can see, they are all passed by reference (&), meaning that what is passed is not the variables themselves but pointers to them.

After this function call, haystack contains the haystack string, haystack_len is the length of that string, needle is the needle value and offset is the start offset.

Furthermore, the result of the call is compared against FAILURE (which occurs when you pass invalid arguments to the function, for example an array where a string is expected). In that case zend_parse_parameters throws a warning and the function returns immediately (returning null to the userland PHP code).
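To show how such a specifier string can be read, here is a toy routine that works out how many arguments a spec requires and how many it allows. count_spec_args is my own name, it only knows the letters mentioned in this article, and it is a simplification rather than PHP's actual parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Computes the minimum and maximum argument count described by a type
 * spec such as "sz|l". Only the letters named in the text are treated
 * as types; this is a simplification, not PHP's actual parser. */
int count_spec_args(const char *spec, int *min_args, int *max_args)
{
    int seen_optional = 0;
    const char *p;

    assert(spec != NULL && min_args != NULL && max_args != NULL);
    *min_args = 0;
    *max_args = 0;

    for (p = spec; *p != '\0'; p++) {
        if (*p == '|') {
            seen_optional = 1;          /* everything after '|' is optional */
        } else if (strchr("szlbdafo", *p) != NULL) {
            (*max_args)++;
            if (!seen_optional) {
                (*min_args)++;
            }
        } else {
            return -1;                  /* unknown character in the spec */
        }
    }
    return 0;
}
```

For "sz|l" this reports two required arguments and three at most, which matches strpos: haystack and needle are mandatory, offset is optional.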
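The pattern of filling variables through pointers and bailing out on failure can be sketched in a few lines of plain C. All names here (MY_SUCCESS, MY_FAILURE, parse_string_arg, string_length_or_minus_one) are invented for the example, and a NULL input merely stands in for an invalid argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MY_SUCCESS 0
#define MY_FAILURE (-1)

/* Hypothetical parser: on success writes the string and its length through
 * the pointers it was given; on failure leaves them untouched. */
int parse_string_arg(const char *input, const char **str, int *str_len)
{
    assert(str != NULL && str_len != NULL);
    if (input == NULL) {
        return MY_FAILURE;          /* stands in for an invalid argument */
    }
    *str = input;
    *str_len = (int) strlen(input);
    return MY_SUCCESS;
}

/* Caller in the style of the PHP code: bail out early on failure. */
int string_length_or_minus_one(const char *input)
{
    const char *s;
    int s_len;

    if (parse_string_arg(input, &s, &s_len) == MY_FAILURE) {
        return -1;
    }
    return s_len;
}
```

The caller never reads s or s_len unless the parser reported success, which is the same guarantee the early return gives the PHP function.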

After the parameter parsing is complete, the main function body begins:

if (offset < 0 || offset > haystack_len) {
    php_error_docref(NULL TSRMLS_CC, E_WARNING, "Offset not contained in string");
    RETURN_FALSE;
}

What this code does is fairly obvious: if offset is out of bounds, an E_WARNING level error is thrown through the php_error_docref function, and the function returns false using the RETURN_FALSE macro.

php_error_docref is an error function that you will find throughout the extension code (i.e. the ext folder). Its name comes from the fact that it includes a reference to the documentation in the error message (i.e. to the function that did not work properly). There is also a zend_error function, which is mainly used by the Zend engine but also shows up in extension code quite often.

Both functions accept sprintf-like formatting, so the error message can contain placeholders, which are filled in from the arguments that follow. Here's an example:

php_error_docref(NULL TSRMLS_CC, E_WARNING, "Failed to write %d bytes to %s", Z_STRLEN_PP(tmp), filename);
// %d is filled with Z_STRLEN_PP(tmp)
// %s is filled with filename
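If you have not seen how a C function can take a format string plus a variable number of arguments, the following sketch may help. format_warning is a made-up helper that writes into a buffer instead of raising a PHP error; only the placeholder mechanism is the same.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical error helper: formats "Warning: <message>" into `buf`.
 * The variadic arguments fill the placeholders in `fmt`, as with printf. */
int format_warning(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int prefix_len, body_len;

    assert(buf != NULL && fmt != NULL && size > 0);
    prefix_len = snprintf(buf, size, "Warning: ");
    if (prefix_len < 0 || (size_t) prefix_len >= size) {
        return -1;                  /* prefix did not fit */
    }

    va_start(args, fmt);
    body_len = vsnprintf(buf + prefix_len, size - (size_t) prefix_len, fmt, args);
    va_end(args);

    return body_len < 0 ? -1 : prefix_len + body_len;
}
```

Calling format_warning(buf, sizeof buf, "Failed to write %d bytes to %s", 12, "out.txt") leaves "Warning: Failed to write 12 bytes to out.txt" in buf, provided the buffer is large enough.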

Let's continue to parse the code:

if (Z_TYPE_P(needle) == IS_STRING) {
    if (!Z_STRLEN_P(needle)) {
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Empty delimiter");
        RETURN_FALSE;
    }

    found = php_memnstr(haystack + offset,
                        Z_STRVAL_P(needle),
                        Z_STRLEN_P(needle),
                        haystack + haystack_len);
}

The first five lines are quite clear: this branch is only executed if needle is a string, and an error is thrown if it is empty. Then comes the more interesting part: php_memnstr is called, and this function does the main work. As always, you can click the function name to view its source code.

php_memnstr returns a pointer to the first occurrence of needle in haystack (this is why the found variable is declared as char *, i.e. a pointer to a character). From there, the offset can be computed by simple subtraction, as can be seen at the end of the function:

RETURN_LONG(found - haystack);
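To see the search and the subtraction working together, here is a simplified sketch in plain C. find_first and position_of are my own names, and the naive loop is only a stand-in for php_memnstr, whose real implementation may differ; what matters is the shape of the arguments (start pointer, needle, needle length, end pointer) and the pointer subtraction at the end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, length-based substring search. Returns a pointer to the first
 * occurrence of needle in [haystack, end), or NULL if there is none. */
const char *find_first(const char *haystack, const char *needle,
                       int needle_len, const char *end)
{
    const char *p;

    assert(haystack != NULL && needle != NULL && end >= haystack);
    if (needle_len <= 0) {
        return NULL;
    }
    for (p = haystack; end - p >= needle_len; p++) {
        if (memcmp(p, needle, (size_t) needle_len) == 0) {
            return p;
        }
    }
    return NULL;
}

/* Returns the position as an offset from the start of the string, or -1. */
long position_of(const char *haystack, int haystack_len,
                 const char *needle, int needle_len, long offset)
{
    const char *found;

    if (offset < 0 || offset > haystack_len) {
        return -1;
    }
    found = find_first(haystack + offset, needle, needle_len,
                       haystack + haystack_len);
    return found ? (long) (found - haystack) : -1;   /* pointer subtraction */
}
```

Note that the search starts at haystack + offset, but the subtraction uses haystack itself, so the result is always counted from the beginning of the string.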

Finally, let's take a look at the branch where needle is not a string:

else {
    if (php_needle_char(needle, needle_char TSRMLS_CC) != SUCCESS) {
        RETURN_FALSE;
    }
    needle_char[1] = 0;

    found = php_memnstr(haystack + offset,
                        needle_char,
                        1,
                        haystack + haystack_len);
}

Here I will just quote the manual: "If needle is not a string, it is converted to an integer and applied as the ordinal value of a character." This basically means that instead of writing strpos($str, 'A') you can also write strpos($str, 65), because the character code of A is 65.

If you look at the variable declarations again, you can see that needle_char is declared as char needle_char[2], i.e. a string of two characters, and php_needle_char writes the actual character (here 'A') into needle_char[0]. The strpos function then sets needle_char[1] to 0. The reason is that in C, strings are NUL-terminated, i.e. the last character is set to NUL (the character with code 0). In PHP's own context this would not be necessary, because PHP stores the length of every string (so it does not need a terminator to find the end), but it is still done internally to stay compatible with C functions.
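The two-element array can be reproduced in a couple of lines. code_to_needle is a made-up helper, not php_needle_char, and it assumes an ASCII-compatible character set.

```c
#include <assert.h>
#include <stddef.h>

/* Turns a character code into a one-character, NUL-terminated C string. */
void code_to_needle(long code, char needle_char[2])
{
    assert(needle_char != NULL);
    needle_char[0] = (char) code;   /* e.g. 65 becomes 'A' in ASCII */
    needle_char[1] = 0;             /* NUL terminator for C string functions */
}
```

After code_to_needle(65, buf), the buffer holds 'A' followed by a zero byte, so it can be handed both to length-based code (with a length of 1) and to ordinary C string functions.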

Zend functions

I'm tired of strpos by now, so let's look at another function: strlen. We use the same method as before:

Start by searching for strlen from the PHP 5.4 source root.

You will see a bunch of unrelated uses of the function, so search for "PHP_FUNCTION strlen" instead. When you do, something strange happens: there are no results.

The reason is that strlen is one of the few functions that are defined by the Zend engine rather than by a PHP extension. In that case the function is not defined with PHP_FUNCTION(strlen) but with ZEND_FUNCTION(strlen). So we search for "ZEND_FUNCTION strlen" instead.

As we already know, we need to click the link without the trailing semicolon to jump to the definition. This link takes us to the following function definition:

ZEND_FUNCTION(strlen)
{
    char *s1;
    int s1_len;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &s1, &s1_len) == FAILURE) {
        return;
    }

    RETVAL_LONG(s1_len);
}

This function is so simple that I don't think it needs any further explanation.

Methods

We'll talk about how classes and objects work in more detail in other articles, but as a little spoiler: you can find object methods by searching for ClassName::methodName in the search box. For example, try searching for SplFixedArray::getSize.

Next section

The next part of this series will be published separately. It will cover what zvals are, how they work, and how they are used in the source (all those Z_*** macros).


About the Author: hoohack
