Article from: http://www.aintnot.com/2016/02/04/phps-source-code-for-php-developers-ch
Original: http://blog.ircmaxell.com/2012/03/phps-source-code-for-php-developers.html
As a developer, I find myself looking at PHP's source code more and more in my daily work. It is useful for understanding what is going on behind the scenes, for figuring out strange edge cases, and for seeing why something is happening that should not be. It is also useful when the documentation is missing, incomplete, or incorrect. So I have decided to share what I know in a series of articles that gives PHP developers enough knowledge to actually read PHP's C source code. You don't need a background in C (we'll cover some basics), but it helps if you have one.
This is the first article in the series. In it, we'll cover the basics of the PHP codebase: where to find it, how the code is structured, and a few of the most basic C concepts. To be clear, the goal of the series is reading comprehension of the source code. That means some concepts are simplified, and some points skipped, rather than described in their full complexity. This makes no noticeable difference for reading, but if you want to contribute to the source, there is more to learn. I will try to point out these simplifications as I make them.
This series is based on the 5.4 version of the source code. Most of the concepts are the same across versions, but fixing one version here should make the articles easier to follow once newer versions are released.
So, can we get started?
Where to find the source code for PHP
The simplest way to download the PHP source code is through PHP's SVN repository. For this article, we would check out the 5.4 branch. A checkout is great if you want to stay on the cutting edge of PHP or actually develop PHP (fixing bugs, implementing features, and so on). Note that, at the time of writing, the PHP community is migrating the source code to a git repository. Once the migration is complete, I will update this article accordingly. (Translator's note: by the time of this translation, PHP had already migrated to git.)
In fact, downloading the source code is not very useful for our purposes. We don't want to edit it; we just want to read it and trace how it works. We could download it and import it into a good IDE that lets us click through to function definitions and declarations, but I find that slightly more difficult than it sounds. I have a better solution.
It turns out that the PHP community maintains a very good tool for exactly this: lxr.php.net. It is essentially an automatically generated, searchable listing of the source code, with syntax highlighting and with every function linked. It is the only tool I use to browse the C source, and it is that good (even when I write patches, I still refer to LXR rather than the codebase I'm working in). We won't go into how to search it efficiently here; instead, let's talk about the PHP core.
From here on, we'll be talking about PHP 5.4. To that end, we'll use this LXR link as the base for the rest of the series. When I mention the "root of 5.4", I mean this page.
So, now that we can browse the source tree, let's talk about what's in it.
PHP Source code structure
When you look at the files and directories listed in the root of 5.4, there is a lot to take in. I want you to focus on just two directories: ext and Zend. The other files and directories are important to PHP and its development, but for our purposes we can ignore them completely. So why are these two directories so important?
PHP is divided into, you guessed it, two main parts. The first is the Zend engine, which provides the runtime environment for PHP code. It handles all the "language level" features that PHP provides, including variables, expressions, parsing, code execution, and error handling. Without the engine, there would be no PHP. The engine's source lives in the Zend directory.
The second core part of PHP is the set of extensions bundled with it. These extensions provide every core function we can call from PHP (such as strpos, substr, array_diff, mysql_connect, etc.). They also provide the core classes (MySQLi, SplFixedArray, PDO, etc.).
The easiest way to decide where in the core code to look for a given piece of functionality is to check the front page of the PHP manual. The manual is also divided into two main sections (for our purposes): a language reference and a function reference. As a broad generalization, if what you want to see is described in the language reference, you will most likely find it in the Zend folder. If it is in the function reference, you will find it in the ext folder.
Some basic C-language concepts
This section is not meant as an introduction to C, but as a "companion guide" for reading it. It covers the following concepts:
Variables
In C, variables are statically and strongly typed. This means a variable must be declared with a type before it can be used. Once declared, its type cannot change (you can convert its value to another type later, but you need a different variable to hold the result). The reason is that in C, variables don't really exist: they are just convenient labels for memory addresses. Because of this, C has no references in the PHP sense. Instead, it has pointers. For our purposes, think of a pointer as a variable that points to another variable, somewhat like a variable variable in PHP.
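As a minimal sketch of the points above, the following shows a variable keeping its type, a conversion that needs a second variable, and a pointer used as a label for another variable's memory address. The names `bump_through_pointer` and `typed_variable_demo` are made up for illustration and do not come from the PHP source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: adds 1 to whatever int the pointer refers to. */
void bump_through_pointer(int *target)
{
    assert(target != NULL);  /* a pointer holds an address; NULL means "no address" */
    *target = *target + 1;   /* *target reads and writes the pointed-to variable */
}

/* Hypothetical demo: returns the final value of the local variable count. */
int typed_variable_demo(void)
{
    int count = 41;               /* count is an int for its whole lifetime */
    double ratio = (double)count; /* converting needs a second, differently typed variable */
    int *alias = &count;          /* alias stores the memory address of count */

    bump_through_pointer(alias);  /* modifies count itself, not a copy */
    return (ratio == 41.0) ? count : -1;
}
```

Calling `typed_variable_demo()` returns 42, because the helper changed `count` through its address rather than working on a copy.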
With that in mind, let's talk about variable syntax. C does not use any prefix to mark variables, so the only way to tell them apart (for our purposes) is to look at their declarations. If you see a name following a type and whitespace at the top of a function (or in the function's signature), it is a variable. One key point is that the variable name can have one or more symbols in front of it. One asterisk (*) means the variable is a pointer to that type (a reference). Two asterisks mean the variable is a pointer to a pointer. Three asterisks mean the variable is a pointer to a pointer to a pointer.
This indirection matters because PHP uses a lot of double pointers internally. The engine needs to be able to pass around blocks of data (PHP variables) while supporting all the interesting behaviors such as PHP references, copy-on-write, and object references. So just be aware that **ptr means we are using two levels of reference (not a reference to a value, but a reference to a reference to a value). This can be a little confusing, and if pointers are completely new to you, I suggest reading up on them (although learning C itself is not our goal). It will help.
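To make the two levels of indirection concrete, here is a small sketch. It is not taken from the PHP source; `retarget`, `double_pointer_demo`, and the two static variables are hypothetical names chosen for this example. It shows why a function needs a pointer to a pointer when it wants to change what the caller's pointer refers to.

```c
#include <assert.h>
#include <stddef.h>

static int first_value = 1;
static int second_value = 2;

/* Hypothetical helper: makes the caller's pointer refer to second_value.
 * It takes int ** because it changes the pointer itself, not the int. */
void retarget(int **slot)
{
    assert(slot != NULL);
    *slot = &second_value;
}

/* Hypothetical demo: returns the int reached through both levels. */
int double_pointer_demo(void)
{
    int *ptr = &first_value;  /* one level: ptr refers to an int */
    int **pptr = &ptr;        /* two levels: pptr refers to ptr */

    retarget(pptr);           /* ptr now refers to second_value */
    return **pptr;            /* follow both levels to reach the int */
}
```

Here `double_pointer_demo()` returns 2: after `retarget` runs, `ptr` no longer refers to `first_value`, even though the function never received `ptr` directly.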
Another thing to understand about pointers is how they relate to arrays in C (C arrays, not PHP arrays). Because a pointer is a memory address, we can define an array by allocating a chunk of memory and then iterate over it by incrementing the pointer. Normally, the char data type, which represents a single 8-bit character in C, stores one character of a string. But we can also treat it as an array and access the bytes that follow. So instead of storing the whole string in a variable, we store a pointer to its first byte. We can then increment the pointer (increasing its memory address) to walk through the entire string.
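char *foo = "test";
// foo points to the "t" at the start of the memory segment that stores "test"
// To access "e", any one of the following works:
char e1 = foo[1];
char e2 = *(foo + 1);
char e3 = *(++foo); // note: this form also advances foo itself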
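The pointer-walking idea described above can be sketched as a loop. This example relies on one C detail the text does not cover: a C string ends with a zero byte ('\0'), which is how the loop knows where to stop. `count_chars` is a hypothetical name for illustration; the standard library's strlen does the same job.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the bytes before the terminating '\0'. */
size_t count_chars(const char *str)
{
    size_t length = 0;

    assert(str != NULL);
    while (*str != '\0') {  /* C strings end with a zero byte */
        length++;
        str++;              /* move the pointer to the next byte */
    }
    return length;
}
```

For the string "test", `count_chars` visits 't', 'e', 's', 't' in turn and returns 4.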
For more on variables and pointers in C, check out this very good free book.
Preprocessor directives
C runs a step called "preprocessing" before compilation. This step handles optimizations and dynamically includes parts of the code based on the options you pass to the compiler. We'll look at the two main kinds of preprocessor directives: conditionals and macros.
Conditionals allow code to be included in the compiled output, or left out, based on what is defined. They look much like the following example. This lets different code be used for different operating systems (so the same program works well on both Windows and Linux even though they use different APIs). It also lets subsets of the code be included or excluded based on defined flags. In fact, this is how the configure step of compiling PHP works.
#define FOO 1
#if FOO
    // FOO is defined and not 0
#else
    // FOO is not defined or is 0
#endif
#ifdef FOO
    // FOO is defined
#else
    // FOO is not defined
#endif
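As a rough illustration of conditional compilation, the sketch below selects a path separator per operating system and switches a feature on or off with a flag. `_WIN32` is a macro that compilers targeting Windows predefine; everything prefixed with `DEMO_` or `demo_` is a hypothetical name invented for this example and is not how PHP itself names things.

```c
#include <assert.h>
#include <stdio.h>

/* Pick platform-specific code at compile time. */
#ifdef _WIN32
#define DEMO_PATH_SEPARATOR '\\'
#else
#define DEMO_PATH_SEPARATOR '/'
#endif

/* Hypothetical feature flag; it could be set with -DDEMO_HAVE_FEATURE=1. */
#ifndef DEMO_HAVE_FEATURE
#define DEMO_HAVE_FEATURE 0
#endif

/* Only one of the two return statements reaches the compiler. */
int demo_feature_enabled(void)
{
#if DEMO_HAVE_FEATURE
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: writes "dir<sep>file" into out and returns
 * the number of characters snprintf wanted to write. */
int demo_join_path(char *out, size_t size, const char *dir, const char *file)
{
    assert(out != NULL && dir != NULL && file != NULL);
    return snprintf(out, size, "%s%c%s", dir, DEMO_PATH_SEPARATOR, file);
}
```

The unused branch of each conditional never reaches the compiler at all, which is why one source tree can be built for platforms with different APIs.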
The other directive is what I'll call a macro. Macros are mini-functions that simplify code. They are not real functions: the preprocessor performs a simple text substitution before compilation, so a macro does not actually call anything. You can even write a macro that produces a function definition (in fact, that's what PHP does, but we'll dig into that in a later article). The point is that macros allow simpler code to be written, and they are resolved at preprocessing time.
#define FOO(a) ((a) + 1)
int b = FOO(1); // Expands to: int b = ((1) + 1);
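Because macros are plain text substitution, the parentheses in a definition like FOO matter. The sketch below compares a parenthesized macro with an unparenthesized variant; both `DEMO_` macros and the two functions are hypothetical names for illustration. The header assert.h is included only so the results can be checked with assert.

```c
#include <assert.h>

#define DEMO_INC(a) ((a) + 1)  /* same shape as FOO, fully parenthesized */
#define DEMO_INC_BAD(a) a + 1  /* hypothetical unparenthesized variant */

/* Expands to ((1) + 1) * 2 */
int macro_good(void)
{
    return DEMO_INC(1) * 2;
}

/* Expands to 1 + 1 * 2, so the multiplication binds first */
int macro_bad(void)
{
    return DEMO_INC_BAD(1) * 2;
}
```

`macro_good()` returns 4 while `macro_bad()` returns 3, which is a direct consequence of the substitution being textual rather than a real function call.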
Source files
In this last section, we need to understand the two types of files used in C source code: .c and .h files. The .c files contain the source code that gets compiled. Typically, .c files contain the implementations of private functions that are not shared with other files. The .h files (header files) define the functions that other .c files are allowed to see, along with preprocessor macros. A header file defines a public API by redeclaring a function's signature without its body (similar to interfaces and abstract methods in PHP). In this way, source files are linked together through header files.
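The split between header and source files can be sketched in a single listing. The file names demo_math.h and demo_math.c, and every identifier below, are hypothetical; the comments mark which part would live in which file. The `static` keyword, which the text does not cover, is the usual C way to keep a function private to its .c file.

```c
#include <assert.h>

/* --- would live in a hypothetical demo_math.h: the public API --- */
#define DEMO_DOUBLE(x) ((x) * 2)   /* macros are shared through headers */
int demo_add_twice(int a, int b);  /* signature only, no body */

/* --- would live in a hypothetical demo_math.c: the implementation --- */
static int demo_add(int a, int b)  /* static: private to this .c file */
{
    return a + b;
}

int demo_add_twice(int a, int b)
{
    return DEMO_DOUBLE(demo_add(a, b));
}
```

Another .c file that includes the header could call `demo_add_twice(1, 2)` and get 6, but it could not call `demo_add`, because that function is never declared in the header.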
Next section
In the next installment of this series, we'll discuss how internal functions are defined in C, so that you can jump to any internal function (like strlen) and see how it is defined and how it works. Stay tuned.
PHP Source for PHP Developers - Part I - Source structure