I had the privilege of attending the 2015 PHP Technology Summit (PHPCon), where it was exciting to hear Xinchen's talk on the new features and performance optimizations of PHP7. Brother Bird is among the most authoritative PHP experts, and his talk contained a great deal of valuable material. I have organized the content of his PPT, collected related information, and compiled it into this explanatory technical article. I hope it is of some help to those doing PHP development.
PHP has 20 years of history behind it. As of today, PHP7 has reached the RC stage, and the official release is reportedly due around November 2015. Compared with the earlier PHP 5.* series, PHP7 is a large-scale overhaul, above all in performance, where it achieves a leap forward.
PHP is a web development language used widely around the world, and the changes in PHP7 will certainly have a deep effect on these web services. Here is a chart from Brother Bird's PPT (82% of websites use PHP as a development language):
(Note: A Web site can use multiple languages as its development language)
(Note: This article contains many images from Brother Bird's PPT; the copyright of those images belongs to Brother Bird.)
Let's take a look at two exciting performance test result graphs:
Benchmark comparison (image from PPT):
PHP7 benchmark results: the elapsed time drops from 2.991 to 1.186, a substantial decrease of about 60%.
WordPress's QPS pressure test (image from PPT):
In the WordPress test, the QPS of PHP7 is 2.77 times that of PHP 5.6.
Having seen these exciting performance results, let's get to the point. PHP7 has many new features, but we will focus on the major changes.
First, new features and changes
1. Scalar types and return type declarations (Scalar Type Declarations & Return Type Declarations)
A very important characteristic of the PHP language is its "weak typing", which makes PHP programs very easy to write and lets newcomers get started quickly, but which has also been a source of controversy. Support for declaring variable types is a fundamental change: PHP begins to support type definitions in an optional way. In addition, a switch directive, declare(strict_types=1), is introduced. Once this directive is turned on, the code in the current file is forced to follow strict function parameter types and return types.
For example, an add function plus a type definition can be written like this:
Combined with the strict-type directive, it can be written like this:
If strict_types is not enabled, PHP will try to convert the value to the required type. Once it is enabled, PHP no longer performs the conversion, and a type mismatch throws an error. This is great news for those who like "strongly typed" languages.
A more detailed introduction:
PHP7 Scalar Type Declarations RFC [translation]
2. More Errors become catchable Exceptions
PHP7 implements a global Throwable interface. The original Exception and some Errors both implement this interface, and the inheritance structure of exceptions is defined through it. As a result, more Errors in PHP7 become Exceptions that can be caught and handed back to the developer: if left uncaught they remain Errors, and if caught they become Exceptions that the program can handle. The Errors that can be caught this way are usually ones that do no fatal damage to the program, such as calling a function that does not exist. This gives developers more convenience and more control over their programs. By default an Error aborts the program directly; PHP7 provides the ability to catch and handle it so that the program can continue to run, which gives programmers a more flexible choice.
For example, to call a function that may not exist, the PHP5-compatible approach is to check with function_exists before the call, whereas PHP7 also supports catching the resulting exception.
As in this example (from the PPT):
3. AST (Abstract Syntax Tree)
The AST acts as an intermediate layer in the PHP compilation process. It replaces the original approach, in which the parser emitted opcodes directly, and decouples the parser from the compiler. This removes some hack code and makes the implementation easier to understand and maintain.
PHP5:
PHP7:
More AST information:
https://wiki.php.net/rfc/abstract_syntax_tree
4. Native TLS (native thread-local storage)
When PHP runs in multithreaded mode (for example, the worker and event modes of the Apache web server are multithreaded), it has to solve the thread safety (TS, Thread Safe) problem. Threads share the memory space of their process, so each thread needs some way to build a private space for its own data and avoid interfering with other threads. PHP5 does this by maintaining a large global array that allocates a separate storage area for each thread; each thread accesses its part of this global array through its own key value.
In PHP5, this unique key value has to be passed to every function that needs to use global variables. The PHP7 developers consider this way of passing it unfriendly and problematic, so they try to hold the key value in a global thread-specific (thread-local) variable instead.
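The difference between passing a key everywhere and keeping it in thread-local storage can be sketched in C. This is only an illustration under stated assumptions: thread_globals, globals_table, bump_with_key, thread_startup and bump_native_tls are hypothetical names, not PHP's real TSRM API, and the sketch uses the C11 _Thread_local keyword.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THREADS 4

/* Hypothetical per-thread data; PHP's real globals are far larger. */
typedef struct {
    int request_count;
} thread_globals;

/* One slot per thread, all living in a single global table. */
static thread_globals globals_table[MAX_THREADS];

/* PHP5-style: the caller must pass the thread's key to every function. */
int bump_with_key(size_t thread_key) {
    assert(thread_key < MAX_THREADS);
    return ++globals_table[thread_key].request_count;
}

/* Native TLS idea: the key is kept in a thread-local variable, so each
   thread reads its own copy and nothing has to be passed around. */
static _Thread_local size_t current_thread_key;

/* Called once when a thread starts, to record which slot it owns. */
void thread_startup(size_t thread_key) {
    assert(thread_key < MAX_THREADS);
    current_thread_key = thread_key;
}

int bump_native_tls(void) {
    return ++globals_table[current_thread_key].request_count;
}
```

In the second style the function signatures stay clean, which is the usability problem the paragraph above describes; the sketch does not create real threads, it only shows where the key lives.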
More on native TLS:
https://wiki.php.net/rfc/native-tls
5. Other new features
PHP7 has many other new features and changes; we will not go through all of them in detail here.
(1) Int64 support: integer length is unified across platforms, and strings and file uploads larger than 2GB are supported.
(2) Uniform variable syntax (Uniform Variable Syntax).
(3) Consistent foreach behavior (Consistent foreach Behaviors).
(4) New operators: <=> and ??
(5) Unicode character escape format support (\u{xxxxx})
(6) Anonymous class support (Anonymous Classes)
... ...
Second, a leap forward in performance: full speed ahead
1. JIT and performance
Just in Time (JIT) compilation is a software optimization technique that compiles bytecode into machine code at runtime. Intuitively, machine code that the computer recognizes and executes directly should be more efficient than having Zend read and execute opcodes one by one. HHVM (HipHop Virtual Machine, a PHP virtual machine open-sourced by Facebook) uses a JIT, which improved its PHP benchmark results by an order of magnitude and produced startling test results. This reinforces the intuition that JIT is a powerful technique with a Midas touch.
In fact, in 2013, Brother Bird and Dmitry (one of the PHP language core developers) made a JIT attempt on PHP 5.5 (never released). In the original PHP 5.5 execution flow, PHP code goes through lexical and syntax analysis and is compiled into opcode bytecode (its format is somewhat like assembly); the Zend Engine then reads these opcode instructions and interprets and executes them one by one.
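To make "reads the opcodes and executes them one by one" concrete, here is a toy interpreter loop in C. It is a minimal sketch, not the Zend VM: the opcode names, the instruction layout and the tiny operand stack are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_MAX 16

/* Hypothetical instruction set for a tiny stack machine. */
typedef enum { OP_PUSH, OP_ADD, OP_MUL, OP_RETURN } opcode;

typedef struct {
    opcode op;
    long operand;   /* only used by OP_PUSH */
} instruction;

/* Fetch one instruction, decode it in the switch, execute it, repeat.
   The per-instruction dispatch is the overhead a JIT tries to remove. */
long execute(const instruction *code) {
    long stack[STACK_MAX];
    size_t sp = 0;
    for (const instruction *ip = code; ; ip++) {
        switch (ip->op) {
        case OP_PUSH:
            assert(sp < STACK_MAX);
            stack[sp++] = ip->operand;
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_RETURN:
            assert(sp >= 1);
            return stack[sp - 1];
        }
    }
}

/* Opcode sequence for the expression (2 + 3) * 4. */
long demo_program(void) {
    static const instruction code[] = {
        {OP_PUSH, 2}, {OP_PUSH, 3}, {OP_ADD, 0},
        {OP_PUSH, 4}, {OP_MUL, 0}, {OP_RETURN, 0}
    };
    return execute(code);
}
```

Every instruction pays for a fetch and a branch in the switch before doing any useful work, which is why compiling the sequence to machine code looks attractive on paper.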
They introduced type inference (TypeInf) after the opcode stage, then generated bytecodes with the JIT, and then executed them.
This produced exciting results in the benchmark (test program): with the JIT, performance was 8 times that of PHP 5.5. However, when they applied this optimization to a real project, WordPress (an open source blog project), there was almost no visible performance improvement, which was a puzzling result.
They therefore used a profiling tool under Linux to analyze where CPU time went during program execution.
CPU time distribution for 100 executions of WordPress (from PPT):
Annotations:
21% of CPU time is spent on memory management.
12% of CPU time is spent on hash table operations, mainly inserting, deleting, updating and looking up PHP array elements.
30% of CPU time is spent in built-in functions, such as strlen.
25% of CPU time is spent in the VM (Zend Engine).
After analysis, two conclusions were obtained:
(1) If the JIT-generated bytecodes are too large, the CPU cache hit rate drops (CPU cache misses)
PHP 5.5 code has no explicit type definitions, so types can only be determined by type inference. Wherever the type of a variable can be inferred, the branch code for other types is removed and machine code that executes directly is generated. However, type inference cannot infer every type: in WordPress, less than 30% of the type information could be inferred, so the amount of branch code that could be removed was limited. As a result, the machine code (bytecodes) generated directly by the JIT was too large, which ultimately caused a large drop in the CPU cache hit rate (CPU cache misses).
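The effect of type inference on branch code can be illustrated with a small C sketch. The names value, add_generic and add_long_only are hypothetical: add_generic stands for the code that must be kept when types are unknown, and add_long_only for what could be emitted when both operands are known to be integers.

```c
#include <assert.h>

typedef enum { T_LONG, T_DOUBLE } value_type;

/* A dynamically typed value: a type tag plus the payload. */
typedef struct {
    value_type type;
    union { long lval; double dval; } v;
} value;

value make_long(long n)     { value r; r.type = T_LONG;   r.v.lval = n; return r; }
value make_double(double d) { value r; r.type = T_DOUBLE; r.v.dval = d; return r; }

static double as_double(value x) {
    return x.type == T_LONG ? (double)x.v.lval : x.v.dval;
}

/* Types unknown: every call has to test the tags and keep all branches. */
value add_generic(value a, value b) {
    if (a.type == T_LONG && b.type == T_LONG) {
        return make_long(a.v.lval + b.v.lval);
    }
    return make_double(as_double(a) + as_double(b));
}

/* Types inferred as long: the tag checks and the double branch disappear. */
long add_long_only(long a, long b) {
    return a + b;
}
```

When only a small share of the types can be inferred, most call sites still need the generic version with all its branches, so the generated code stays large.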
A CPU cache hit works as follows: while the CPU reads and executes instructions, if the data it needs is not in the level 1 cache (L1), it has to look further down in the level 2 cache (L2) and level 3 cache (L3), and finally in main memory. The gap in read time between memory and the CPU cache can reach a factor of 100. So if the bytecodes are too large and there are too many instructions to execute, the multilevel cache cannot hold all the data, and some instructions have to stay in main memory.
The size of each level of CPU cache is limited; the following is the cache configuration of the Intel i7 920:
A drop in the CPU cache hit rate therefore leads to a serious increase in execution time, which offsets the performance gained from the JIT.
A JIT can reduce the overhead of the VM, and through instruction optimization it can also indirectly reduce the overhead of memory management, because it reduces the number of memory allocations. However, in the real WordPress project only 25% of CPU time is spent in the VM; the main problems and bottlenecks are not in the VM. For this reason the JIT optimization was not included in the PHP7 feature set. It may well be implemented in a later version, which is something to look forward to.
(2) The improvement of JIT performance depends on the actual bottleneck of the project.
The JIT gives a large improvement in the benchmark because the amount of code is relatively small, so the generated bytecodes are also small, and the main overhead is in the VM. In the real WordPress project there is no obvious improvement because WordPress has far more code than the benchmark: although the JIT reduces the VM overhead, the oversized bytecodes lower the CPU cache hit rate and add memory overhead, so in the end there is no gain.
Different types of projects have different distributions of CPU cost and will produce different results; performance tests detached from real projects are not very representative.
2. Changes in Zval
All of PHP's variable types are actually stored in a zval, which, like the sea that takes in a hundred rivers, can hold anything. In essence it is a struct implemented in C. For those who write PHP, it can be roughly understood as something like an array.
The PHP5 zval occupies 24 bytes of memory (from PPT):
The PHP7 zval occupies 16 bytes of memory (from PPT):
The zval shrinks from 24 bytes to 16 bytes. To explain why, a little C background is needed for those unfamiliar with C. A struct and a union differ: each member variable of a struct occupies its own memory space, whereas the member variables of a union share one memory space (modifying one member modifies the shared space, so the values recorded in the other members are lost). So although the PHP7 zval appears to have many more member variables, the memory it actually occupies has decreased.
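The struct versus union difference described above can be checked with a short C sketch. The type names value_as_struct and value_as_union are made up for illustration and are not PHP's definitions; the sizes in the comments assume a typical platform where pointers are at most 8 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Same three members, first as a struct, then as a union. */
typedef struct { int64_t lval; double dval; void *ptr; } value_as_struct;
typedef union  { int64_t lval; double dval; void *ptr; } value_as_union;

/* In a union every member starts at the same address. */
int members_share_address(void) {
    value_as_union u;
    return (void *)&u.lval == (void *)&u.dval
        && (void *)&u.dval == (void *)&u.ptr;
}

/* Writing one member destroys what another member recorded. */
int writing_one_member_overwrites_another(void) {
    value_as_union u;
    u.lval = 42;
    u.dval = 1.5;          /* reuses the same bytes as lval */
    return u.lval != 42;
}

/* The struct needs room for all members, the union only for the largest. */
void check_sizes(void) {
    assert(sizeof(value_as_union) == 8);
    assert(sizeof(value_as_struct) > sizeof(value_as_union));
}
```

This is why adding more alternatives to a union does not make the zval bigger: only the largest member determines the size.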
Another significant change is that some simple types no longer use references.
Zval structure diagram (from ppt):
The zval in the figure consists of two 64-bit parts (1 byte = 8 bits). If the variable type is long or boolean, whose length does not exceed 64 bits, the value is stored directly in value and no reference to a separate structure is needed. When the variable type needs more than 64 bits, as with array, object, and string, value stores a pointer to the address of the real storage structure.
For simple variable types, zval storage becomes very simple and efficient.
Types that do not need a reference (stored directly in the zval): NULL, Boolean, Long, Double
Types that need a reference (the zval points to a separate structure): String, Array, Object, Resource, Reference
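The two-times-64-bit layout can be sketched in C as follows. This is a simplified model under stated assumptions, not the real zval definition: sketch_zval, sketch_value, the SK_* tags and the helper functions are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* First 64 bits: either the value itself or a pointer to it. */
typedef union {
    int64_t     lval;   /* long and boolean fit here directly */
    double      dval;   /* double fits here directly          */
    const void *ptr;    /* string, array, object: stored elsewhere */
} sketch_value;

/* Second 64 bits: type tag plus spare space. */
typedef struct {
    sketch_value value;
    uint32_t     type_info;
    uint32_t     extra;
} sketch_zval;

enum { SK_NULL, SK_BOOL, SK_LONG, SK_DOUBLE, SK_STRING };

sketch_zval make_long(int64_t n) {
    sketch_zval z = { .type_info = SK_LONG, .extra = 0 };
    z.value.lval = n;
    return z;
}

/* Only the pointer is stored; the characters live outside the zval. */
sketch_zval make_string(const char *s) {
    sketch_zval z = { .type_info = SK_STRING, .extra = 0 };
    z.value.ptr = s;
    return z;
}

/* Simple types are complete inside the 16 bytes, with no pointer to follow. */
int is_stored_inline(const sketch_zval *z) {
    return z->type_info <= SK_DOUBLE;
}
```

Copying or reading a long in this layout touches only the 16 bytes of the zval itself, which is what makes simple types cheap.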
3. Internal type zend_string
zend_string is the structure that actually stores a string. The content itself is stored in val (of type char), and val is declared as a char array of length 1 (a convenient placeholder member).
Making the last member of the struct a char array rather than a char* is a small optimization that reduces CPU cache misses.
With a char array, when malloc allocates memory for the structure above, the header and the characters are allocated in the same region, usually with a length of sizeof(_zend_string) plus the space actually needed for the characters. With a char*, that position stores only a pointer, and the real characters live in a separate region of memory.
Comparison of memory allocation with char[1] and char*:
In terms of logic the two are not very different and the effect is much the same. But once these memory blocks are loaded into the CPU, the difference is large. In the first case the memory is one contiguous allocation, so the CPU can usually fetch it all together (it will be in the same cache level). In the second case there are two separate blocks of memory: when the CPU reads the first block, the second is quite likely not in the same cache level, so the CPU has to search L2 (the level 2 cache) and below, or even main memory, to find it. This causes a CPU cache miss, and the time difference between the two cases can reach a factor of 100.
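The single-allocation layout can be sketched in C. This sketch swaps in the C99 flexible array member (char val[]), which is the standard spelling of the same trailing-array idea as char[1]; sketch_string and sketch_string_new are hypothetical names and not the zend_string API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Header and characters live in one block of memory. */
typedef struct {
    uint32_t refcount;
    size_t   len;
    char     val[];   /* flexible array member: characters follow the header */
} sketch_string;

/* One malloc covers the header, the characters and the trailing NUL. */
sketch_string *sketch_string_new(const char *src) {
    size_t len = strlen(src);
    sketch_string *s = malloc(sizeof(sketch_string) + len + 1);
    if (s == NULL) {
        return NULL;
    }
    s->refcount = 1;
    s->len = len;
    memcpy(s->val, src, len + 1);
    return s;
}
```

With a char* member the same constructor would need a second malloc for the characters, and reading the string would mean following a pointer to a different region of memory.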
In addition, when a string is copied, zend_string can be assigned by reference, which avoids a memory copy.
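Copying by reference usually means reference counting, which can be sketched as follows. This is an assumption-laden illustration: shared_string and its three functions are hypothetical names, and real zend_string handling has more cases than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint32_t refcount;   /* how many owners share this string */
    size_t   len;
    char     val[];
} shared_string;

shared_string *shared_string_new(const char *src) {
    size_t len = strlen(src);
    shared_string *s = malloc(sizeof(shared_string) + len + 1);
    if (s == NULL) {
        return NULL;
    }
    s->refcount = 1;
    s->len = len;
    memcpy(s->val, src, len + 1);
    return s;
}

/* "Copy" without copying: bump the counter and hand back the same block. */
shared_string *shared_string_copy(shared_string *s) {
    s->refcount++;
    return s;
}

/* Free the memory only when the last owner lets go. */
void shared_string_release(shared_string *s) {
    if (--s->refcount == 0) {
        free(s);
    }
}
```

A copy costs one increment regardless of the string length, where a byte-by-byte copy would cost an allocation plus a memcpy.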
6. Changes to the PHP array (HashTable and Zend Array)
When writing PHP programs, the most frequently used type is the array, and PHP5 arrays are implemented with a HashTable. Roughly speaking, it is a HashTable that also maintains a doubly linked list: it supports hash-mapped access to elements by array key, and it also supports traversing the elements with foreach by walking the doubly linked list.
PHP5 's Hashtable (from PPT):
The diagram looks very complicated, with pointers jumping in all directions. When we access an element by key, it sometimes takes three pointer jumps to find the right content. Most importantly, the array elements are stored in scattered memory regions. As before, when the CPU reads them, they are very likely not in the same cache level, so the CPU has to look in lower cache levels or even in main memory; the CPU cache hit rate drops and more time is spent.
PHP7 Zend Array (from ppt):
The new array structure is very concise and refreshing to look at. Its most important feature is that all of the array elements and the hash mapping table are joined together and allocated in the same block of memory. Traversing an array of simple integer values is very efficient: the array elements (Buckets) are allocated contiguously in one block of memory, and the zval of each element stores the integer inside itself, so there is no outward pointer to follow and all the data is in the current memory region. Most importantly, this avoids CPU cache misses (drops in the CPU cache hit rate).
Zend Array Changes:
(1) The value of an array element is a zval by default.
(2) The size of the HashTable drops from 72 to 56 bytes, a reduction of 22%.
(3) The size of a Bucket drops from 72 to 32 bytes, a reduction of about 56%.
(4) The memory for the Buckets of the array elements is allocated together.
(5) The key of an array element (Bucket.key) points to a zend_string.
(6) The value of an array element is embedded in the Bucket.
(7) Fewer CPU cache misses.
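The idea of keeping the hash index and the buckets in one block, with values embedded in the buckets, can be sketched in C. This is a much simplified model with a fixed capacity, integer keys only and no deletion; small_array, bucket and the function names are hypothetical and do not reproduce the real zend_array layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SLOTS 8                 /* fixed capacity, a power of two */
#define EMPTY UINT32_MAX

typedef struct {
    uint64_t key;
    int64_t  value;             /* value embedded in the bucket */
    uint32_t next;              /* next bucket index in the same hash chain */
} bucket;

/* Hash index and buckets sit in one struct, so one malloc covers both. */
typedef struct {
    uint32_t used;
    uint32_t hash[SLOTS];       /* slot -> index of first bucket, or EMPTY */
    bucket   buckets[SLOTS];    /* contiguous, kept in insertion order */
} small_array;

small_array *small_array_new(void) {
    small_array *a = malloc(sizeof *a);
    if (a == NULL) return NULL;
    a->used = 0;
    for (uint32_t i = 0; i < SLOTS; i++) a->hash[i] = EMPTY;
    return a;
}

/* Returns 0 on success, -1 when the fixed capacity is exhausted. */
int small_array_set(small_array *a, uint64_t key, int64_t value) {
    uint32_t slot = (uint32_t)(key & (SLOTS - 1));
    for (uint32_t i = a->hash[slot]; i != EMPTY; i = a->buckets[i].next) {
        if (a->buckets[i].key == key) { a->buckets[i].value = value; return 0; }
    }
    if (a->used == SLOTS) return -1;
    bucket *b = &a->buckets[a->used];
    b->key = key;
    b->value = value;
    b->next = a->hash[slot];    /* chain through indexes, not pointers */
    a->hash[slot] = a->used++;
    return 0;
}

const int64_t *small_array_get(const small_array *a, uint64_t key) {
    uint32_t slot = (uint32_t)(key & (SLOTS - 1));
    for (uint32_t i = a->hash[slot]; i != EMPTY; i = a->buckets[i].next) {
        if (a->buckets[i].key == key) return &a->buckets[i].value;
    }
    return NULL;
}

/* foreach-style traversal: a linear walk over contiguous memory. */
int64_t small_array_sum(const small_array *a) {
    int64_t total = 0;
    for (uint32_t i = 0; i < a->used; i++) total += a->buckets[i].value;
    return total;
}
```

Lookup follows small integer indexes inside the same block, and traversal in insertion order is a plain array scan, which is the cache-friendly behavior the list above attributes to the new structure.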
7. Function call mechanism (function calling convention)
PHP7 improves the function call mechanism: by optimizing how parameters are passed, it eliminates some instructions and improves execution efficiency.
PHP5 function call mechanism (from PPT):
In the VM stack, the parameters handled by the SEND_VAL and RECV instructions are the same; PHP7 optimizes the function call mechanism at a low level by reducing this duplication.
PHP7 function call mechanism (from PPT):
8. Let the compiler do some of the work ahead of time through macro definitions and inline functions (inline)
C macro definitions are expanded in the preprocessing phase (part of compilation), so that part of the work is done ahead of time. They can provide function-like behavior without the cost of a function call (pushing to and popping from the stack), so they are relatively efficient. Inline functions are similar: at compile time, calls to the function are replaced with the function body, so when the program actually runs there is no function call overhead at that point.
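A minimal C sketch of the two techniques follows. SQUARE_MACRO, square_inline and the counting helper are names invented for this illustration. It also shows one practical difference: a macro pastes its argument text, so an argument with side effects is evaluated once per use, while an inline function evaluates it once.

```c
#include <assert.h>

/* Expanded textually by the preprocessor before compilation. */
#define SQUARE_MACRO(x) ((x) * (x))

/* A real function that the compiler may substitute at the call site. */
static inline int square_inline(int x) {
    return x * x;
}

/* Helper that counts how many times it was evaluated. */
static int evaluations = 0;

static int next_value(void) {
    evaluations++;
    return 3;
}

/* The macro expands to ((next_value()) * (next_value())): two calls. */
int macro_evaluations(void) {
    evaluations = 0;
    int r = SQUARE_MACRO(next_value());
    assert(r == 9);
    return evaluations;
}

/* The inline function receives an already evaluated argument: one call. */
int inline_evaluations(void) {
    evaluations = 0;
    int r = square_inline(next_value());
    assert(r == 9);
    return evaluations;
}
```

Both forms avoid a run-time call in optimized builds; the inline function keeps normal argument semantics and type checking, while the macro is pure text substitution.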
PHP7 has done a lot of optimization in this area, moving much of the work that used to be done at run time into the compile phase. One example is parameter type checking (parameter parsing): because it involves fixed character constants, it can be completed at compile time, which improves later execution efficiency.
For example, in the figure, the way of handling passed parameter types on the left is optimized into the macro form on the right.
Third, summary
Brother Bird's PPT gives a set of comparative figures: executing WordPress 100 times takes about 7 billion CPU instructions under PHP 5.6 but only about 2.5 billion under PHP7, a reduction of 64.2%. This is a striking figure.
What impressed me most in Brother Bird's talk was the attention to detail: many small optimizations, accumulated bit by bit, finally add up to an amazing result. A tall mountain is not built in a day, and I think the same principle applies here.
There is no doubt that PHP7 achieves a leap in performance. If it can be applied to our PHP web systems, we may be able to support higher-volume services with fewer machines. I am full of anticipation for the official release of PHP7.
References & citations:
Brother Bird's (Xinchen's) talk PPT, http://www.laruence.com/
Official PHP community, http://php.net/
Thanks:
Thanks to Brother Bird (Xinchen) for his help and support.
Original: http://hansionxu.blog.163.com/blog/static/24169810920158704014772
PHP7 Innovation and performance optimization