1. Origin
When it comes to PHP, many people intuitively feel that it is a flexible scripting language with rich libraries, simple to use, secure, and well suited to web development, but slow. Is PHP performance really as bad as everyone feels? This article examines that question. It analyzes PHP performance in depth from several angles (source code, application scenarios, benchmark performance, and comparative analysis) and uses real performance data to identify the key factors that affect the performance of a PHP module.
2. Analysis of PHP Performance Based on Principles
The principle-level analysis of PHP performance covers memory management, variables, functions, the operating mechanism, and the network model.
2.1 Memory Management
Like nginx, PHP manages memory internally with a memory pool and introduces the concept of a memory pool lifecycle. All memory-related operations of PHP scripts and extensions go through this pool. Large and small allocations are managed with different implementations and optimizations. For details, see http://www.laruence.com/2011/11/09/2277.html. Over the lifecycle of allocation and recovery, PHP uses one initial allocation, dynamic expansion, and a mark-based recovery mechanism, and it re-marks (resets) the memory pool directly after each request.
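One way to watch the pool from userland is to compare what the script has been given with what the engine has reserved from the operating system. The sketch below is only an illustration and assumes the default Zend allocator is active. `pool_usage_demo` is a hypothetical helper name. `memory_get_usage(false)` reports the bytes currently handed out to the script, while `memory_get_usage(true)` reports the bytes the allocator has obtained from the system.

```php
<?php
// Hypothetical helper: allocate $bytes, free them, and sample the allocator each time.
function pool_usage_demo($bytes)
{
    $sample = function () {
        return [
            'used' => memory_get_usage(false), // bytes currently handed out to the script
            'real' => memory_get_usage(true),  // bytes the allocator holds from the OS
        ];
    };

    $before = $sample();
    $buffer = str_repeat('x', $bytes);  // one large allocation served by the pool
    $during = $sample();
    unset($buffer);                     // released back to the allocator
    $after = $sample();

    return ['before' => $before, 'during' => $during, 'after' => $after];
}

$r = pool_usage_demo(10 * 1024 * 1024);
printf("used: %d -> %d -> %d bytes\n", $r['before']['used'], $r['during']['used'], $r['after']['used']);
printf("real: %d -> %d -> %d bytes\n", $r['before']['real'], $r['during']['real'], $r['after']['real']);
```

The "used" figure should rise by at least the size of the buffer and fall again after `unset`. How the "real" figure moves depends on the PHP version and allocator settings.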
2.2 Variables
PHP is a weakly typed language. Internally, every PHP variable corresponds to a single type, zval, which is defined as follows:
Figure 1. PHP variables
PHP has done a lot of optimization work on variables, such as reference counting and copy-on-write. These mechanisms keep memory usage low and reduce the number of memory copies (see http://blog.xiuwz.com/2011/11/09/php-using-internal-zval ). For arrays, PHP uses an efficient hashtable internally.
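Copy-on-write can be observed indirectly from a script. The following is a minimal sketch, assuming the default Zend allocator so that `memory_get_usage()` reflects engine allocations. `cow_memory_deltas` is a hypothetical helper name. Assigning a large array to a second variable should cost almost nothing, while the first write to the copy triggers the real duplication.

```php
<?php
// Hypothetical helper: measure memory growth after an assignment and after the first write.
function cow_memory_deltas($n = 100000)
{
    $a = range(1, $n);               // one large array
    $base = memory_get_usage();

    $b = $a;                         // assignment: both names share the same value
    $afterAssign = memory_get_usage();

    $b[] = 0;                        // first write: the engine separates (copies) $b
    $afterWrite = memory_get_usage();

    return [
        'assign' => $afterAssign - $base,        // growth caused by the assignment
        'write'  => $afterWrite - $afterAssign,  // growth caused by the first write
        'a_size' => count($a),                   // $a is untouched by the write to $b
        'b_size' => count($b),
    ];
}

print_r(cow_memory_deltas());
```

The exact byte counts vary between PHP versions, so only the relationship between the two deltas is meaningful.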
2.3 Functions
In PHP, every function is converted into an internal function pointer. Take an extension function as an example:
ZEND_FUNCTION(my_function); // similar to function my_function() {}
After macro expansion, it becomes an ordinary C function:
void zif_my_function(INTERNAL_FUNCTION_PARAMETERS);
void zif_my_function(
    int ht,
    zval *return_value,
    zval *this_ptr,
    int return_value_used,
    zend_executor_globals *executor_globals
);
From this perspective, a PHP function also corresponds to a function pointer internally.
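From the PHP side, the split between functions implemented in C and functions compiled from script can be observed with the Reflection extension. This is a small sketch rather than anything from the original tests. `describe_function` is a hypothetical helper name, and `my_function` here is an ordinary userland function chosen only to mirror the example name above.

```php
<?php
// A userland function, compiled from script into opcodes.
function my_function()
{
    return 'hi';
}

// Hypothetical helper: report whether a function is built in (C) or user-defined.
function describe_function($name)
{
    $ref = new ReflectionFunction($name);
    return $ref->isInternal() ? 'internal' : 'user';
}

echo describe_function('strlen'), "\n";       // implemented in C inside the engine
echo describe_function('my_function'), "\n";  // defined in this script

// Either kind can be called through its name, which is resolved at run time.
$name = 'my_function';
echo $name(), "\n";
```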
2.4 Operating Mechanism
When PHP performance comes up, many people say that "C/C++ is compiled, Java is semi-compiled, and PHP is interpreted". In other words, PHP parses the code dynamically before executing it, so from this perspective PHP performance must be poor.
It is true that producing output from a PHP script is a process of dynamic parsing followed by execution. The PHP script running mechanism is shown in the following figure:
Figure 2. PHP running mechanism
PHP execution is divided into three stages:
● Parse. The syntax analysis stage.
● Compile. Compilation generates the opcode intermediate code.
● Execute. The opcodes are run dynamically to produce output.
So PHP itself also has a compilation step. In standard production environments this is routinely exploited through opcode cache tools such as APC, eAccelerator, and XCache. With an opcode cache, a PHP script is compiled once and run many times. In this respect, PHP is very similar to Java's semi-compiled mechanism.
Therefore, from the perspective of the operating mechanism, PHP runs much like Java: both generate intermediate code first and then run it on a virtual machine.
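The first of the three stages can be glimpsed from userland. The sketch below assumes the tokenizer extension is enabled (it is in default builds) and uses `token_get_all()` to list the tokens the scanner produces before compilation. `token_names` is a hypothetical helper name, and the one-line script being scanned is only an example.

```php
<?php
// Hypothetical helper: return the scanner's view of a piece of PHP source.
function token_names($source)
{
    $names = [];
    foreach (token_get_all($source) as $token) {
        // Multi-character tokens are arrays [id, text, line]; single characters are plain strings.
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}

print_r(token_names('<?php $var = "Hello";'));
```

Opcodes themselves are not exposed by a standard userland function, so this shows only the input to the compile stage.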
2.5 Dynamic Operation
The analysis above shows that PHP has done a lot of work on memory management, variables, functions, and the operating mechanism. In principle, therefore, PHP should not have performance problems, and its performance should at least be close to that of Java.
But why do many people still feel that PHP is slow? In compute-heavy comparisons in particular, PHP is consistently found to be relatively inefficient (http://shootout.alioth.debian.org/u32/php.php ). This is where the performance cost of PHP's dynamic nature comes in. Because PHP is executed dynamically, all variables, functions, object calls, and scopes are resolved in the execution phase. This fundamentally determines something that is hard to change about PHP performance: variables and functions that a language such as C/C++ resolves during static compilation must be resolved during execution in PHP. It also means that PHP intermediate code cannot run directly and must run on the Zend engine.
Any discussion of how PHP variables are implemented has to mention one more thing: the hashtable. The hashtable can be called one of the souls of PHP. It is used everywhere in PHP; the variable symbol table and the function symbol table, among others, are built on it.
Take a PHP variable as an example of PHP's dynamic behavior. Consider the following code:
<?php
$var = "Hello, blog.xiuwz.com";
?>
Executing this code adds an entry to the variable symbol table (a hashtable).
When the variable is used, it is looked up in the symbol table by name (that is, each variable access involves a hash lookup).
Similarly, function calls go through a function symbol table (also a hashtable).
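The name-based lookups described here are visible from a script. The sketch below is an illustration only, with `lookup_demo` as a hypothetical helper name: `get_defined_vars()` returns the current scope's variables as a name-to-value array, a variable variable resolves a name at run time, and a function can be located and called through a string holding its name.

```php
<?php
// Hypothetical helper showing run-time lookup of variables and functions by name.
function lookup_demo()
{
    $var = "Hello, blog.xiuwz.com";

    // Snapshot of the local symbol table at this point: only $var exists so far.
    $symbols = get_defined_vars();

    // Variable variable: the name is looked up when the statement runs.
    $name = 'var';
    $viaName = $$name;

    // Function lookup by name at call time.
    $fn = 'strtoupper';
    $called = function_exists($fn) ? $fn('php') : null;

    return [array_keys($symbols), $viaName, $called];
}

print_r(lookup_demo());
```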
In fact, dynamic variable lookup can also be seen in the PHP operating mechanism. PHP code is interpreted and compiled as follows:
Figure 3. PHP running instance
As the figure shows, compiling PHP code produces a class symbol table, a function symbol table, and opcodes. During execution, the Zend engine looks up and processes the corresponding symbol table entries based on each opcode.
To some extent this problem is hard to solve, because it follows from the dynamic nature of the PHP language. Even so, many people in China and abroad are looking for solutions that would let PHP be fundamentally optimized. A typical example is Facebook's HipHop (https://github.com/facebook/hiphop-php ).
However, all such compilation-based optimizations sacrifice PHP's dynamic runtime features. Some compromises on dynamic features can be made in a particular compiler, but full compatibility is hard to achieve.
2.6 Network Model
Today the most common and preferred way to run PHP is FastCGI (PHP-FPM). The network model of PHP-FPM is similar to that of nginx: a multi-process model with one master and multiple workers. PHP-FPM is based on the epoll model in libevent. From the network model perspective, this approach shows no performance difference from other network models.
2.7 Conclusion
According to the analysis above, PHP shows no obvious performance weakness in memory management, variables, functions, the operating mechanism, or the network model. However, its dynamic runtime nature means that every variable lookup and function call costs more CPU and memory than in compiled languages. The subsequent benchmark performance and comparative analysis quantify this overhead.
It also follows that PHP is not suitable for some scenarios: heavy computation, operations on large data volumes, and applications with strict memory requirements. If such functionality is needed, implement it as an extension and expose hook functions for PHP to call. This reduces the overhead of variable and function lookups inside the computation.
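A rough way to see the gap between script-level computation and code implemented inside the engine is to time the same work done both ways. The sketch below is a stand-in for the extension approach, not the author's benchmark: the built-in `array_sum()` plays the role of the C implementation, while `sum_userland` and `time_call` are hypothetical helper names. The timings depend entirely on the machine and PHP build.

```php
<?php
// Sum computed in script: every iteration runs through the opcode executor.
function sum_userland(array $xs)
{
    $total = 0;
    foreach ($xs as $x) {
        $total += $x;
    }
    return $total;
}

// Hypothetical helper: run a callable once and return [result, elapsed seconds].
function time_call($fn, array $xs)
{
    $start = microtime(true);
    $result = call_user_func($fn, $xs);
    return [$result, microtime(true) - $start];
}

$data = range(1, 200000);
list($a, $ta) = time_call('sum_userland', $data);
list($b, $tb) = time_call('array_sum', $data);  // same loop, implemented in C
printf("userland: %d in %.5fs\nbuilt-in: %d in %.5fs\n", $a, $ta, $b, $tb);
```

A single run like this is noisy; repeat the measurement several times before drawing conclusions.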
3. Benchmark Performance
Standard data on PHP benchmark performance is lacking. Most people have only a rough impression, and some believe that PHP tops out at a fixed QPS ceiling. There are also no authoritative figures on framework performance or on how much a framework affects overall performance.
The purpose of this section is to provide reference benchmark figures and an intuitive feel for the data.
The benchmarks cover the following aspects:
1. Bare PHP performance. The script performs only basic functions.
2. Bare framework performance. Only the simplest route dispatch is performed, using only the core functions.
3. Benchmark performance of a standard module, that is, a module with the complete set of service module functions.
3.1 Environment Description
Test environment:
uname -a
Linux db-forum-test17.db01.baidu.com 2.6.9_5-7-0-0 #1 SMP Wed Aug 12 17:35:51 CST 2009 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux AS release 4 (Nahant Update 3)
8 Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
Software:
nginx:
nginx version: nginx/0.8.54 built by gcc 3.4.5 20051201 (Red Hat 3.4.5-2)
PHP5 (using PHP-FPM):
PHP 5.2.8 (cli) (built: Mar 6 2011 17:16:18)
Copyright (c) 1997-2008 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2008 Zend Technologies
with eAccelerator v0.9.5.3, Copyright (c) 2004-2006 eAccelerator, by eAccelerator
bingo2:
PHP framework.
Other notes:
Deployment method of the target machine: script.
The stress-testing machine and the target machine are deployed on separate hosts.
3.2 Bare PHP Performance
The simplest PHP script:
<?php
require_once './actions/indexaction.php';
$objAction = new IndexAction();
$objAction->init();
$objAction->execute();
?>
The code in actions/indexaction.php is as follows:
<?php
class IndexAction
{
    // No-op initializer, assumed here so that the init() call in the entry script resolves.
    public function init()
    {
    }

    public function execute()
    {
        echo 'hello, world!';
    }
}
?>
The results measured with the stress-testing tool are as follows:
3.3 Bare PHP Framework Performance
For comparison with 3.2, the same functionality is implemented on the bingo2 framework. The code is as follows:
<?php
require_once 'bingo/controller/front.php';
$objFrontController = Bingo_Controller_Front::getInstance(array(
    'actionDir' => './actions',
));
$objFrontController->dispatch();
?>
The stress test results are as follows:
The test results show that although the framework has some overhead, its impact on overall performance is very small.
3.4 Benchmark Performance of the Standard PHP Module
A standard PHP module is one that provides the basic functions every PHP module needs:
● Route dispatch.
● Autoloading.
● Log initialization and notice log printing, so that every UI request produces a standard log entry.
● Error handling.
● Time correction.
● Automatic timing of each stage.
● Encoding detection and encoding conversion.
● Parsing and loading of standard configuration files.
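Of the functions listed above, per-stage timing is the one the later analysis relies on. The following is a minimal sketch of how such timing might be collected; it is not the bingo2 implementation. The `StageTimer` class, its method names, and the `stage=1.23ms` log format are all assumptions made for illustration.

```php
<?php
// Hypothetical per-request timer: accumulates elapsed milliseconds per named stage.
class StageTimer
{
    private $starts = [];
    private $elapsed = [];

    public function start($stage)
    {
        $this->starts[$stage] = microtime(true);
    }

    public function stop($stage)
    {
        if (!isset($this->starts[$stage])) {
            throw new LogicException("stage '$stage' was never started");
        }
        $ms = (microtime(true) - $this->starts[$stage]) * 1000;
        $previous = isset($this->elapsed[$stage]) ? $this->elapsed[$stage] : 0.0;
        $this->elapsed[$stage] = $previous + $ms;  // repeated stages add up
        unset($this->starts[$stage]);
    }

    public function elapsed()
    {
        return $this->elapsed;
    }

    // One notice-log fragment, e.g. "socket=1.08ms render=0.42ms" (assumed format).
    public function toLogLine()
    {
        $parts = [];
        foreach ($this->elapsed as $stage => $ms) {
            $parts[] = sprintf('%s=%.2fms', $stage, $ms);
        }
        return implode(' ', $parts);
    }
}

$timer = new StageTimer();
$timer->start('socket');
usleep(1000);            // stands in for a backend call
$timer->stop('socket');
echo $timer->toLogLine(), "\n";
```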
The standard test PHP module, test, was generated with the bingo2 code generation tool.
The test results are as follows:
3.5 Conclusion
The test data shows that PHP performance is quite acceptable. Benchmark performance can reach several thousand or even more than ten thousand QPS. As for why most PHP modules perform poorly in practice, the right response is to find the bottleneck of the system rather than simply to say "OK, PHP is no good, let's switch to C". (The next chapter uses examples to show that a C implementation has no particular advantage.)
The following conclusions can be drawn from the benchmark data:
1. PHP itself performs well. With simple functionality it reaches 5,000 QPS at 50% CPU idle, and the limit can exceed 10,000 QPS.
2. The PHP framework has a very limited impact on performance. Once there is a certain amount of business logic and data interaction, it can be ignored.
3. A standard PHP module has a benchmark performance of 2,000 QPS at 80% CPU idle.
4. Performance Comparison and Analysis of PHP and C
Very often, when a PHP module turns out to perform poorly, the reaction is "OK, let's rewrite it in C". In our company, business logic modules written in C/C++ are everywhere, and a few years ago almost everything was written in C. Writing them was genuinely painful: they were hard to debug and far from agile.
This section compares the performance of business logic written in C with that of a business logic module written in PHP, and lets real data speak.
4.1 Prerequisites
Why state a premise at all? In the ideal case, a function implemented in PHP cannot outperform the same function written in ideal C. Keep this premise firmly in mind.
Then why compare them? Because in reality, how many C programs are written well enough to stay fully high-performance under frequent modification? And in real applications, is C really faster than PHP? No definitive data has demonstrated it.
Therefore, the comparison in this chapter is based on actual conditions and lets real data speak.
4.2 A Real Business Module: PHP Module vs. C Module
4.2.1 Introduction to the Business Module
This is a real case. The traffic of this business module runs into the billions. Its architecture is as follows:
Figure 4. Business module architecture
The business module provides simple functions. Upstream is the web server, and downstream are the individual data modules. Data is exchanged over sockets. The main working model of the business module is: respond to web server requests, read the corresponding data from each backend data module based on the request, generate the final HTML page from that data, and return it to the web server.
For ease of reference below, CUI denotes the module implemented in C, and PHPUI denotes the module implemented in PHP.
4.2.2 C/C++ Module Performance Data
The module was restructured on a new C/C++ framework. At the time of the restructuring, it connected to between 5 and 7 backend data modules.
The final test data for the C/C++ module is divided into the following parts:
1. Performance comparison test.
This test measures performance with real data at the current online load, so only one load level is tested:
Pressure: 210 QPS
CPU (idle): 84.18%
2. Peak performance test 1.
Test model: CUI connects to only one core data module, and all other data modules are disabled.
3. Peak performance test 2.
Test model: CUI connects to one core data module and three other backend data modules. The remaining data modules are not connected.
The performance data from these tests is as follows:
4.2.3 Performance Test Data of the PHP Implementation
By 2011, the CUI built in 2009 had reached the point where the code was practically unmaintainable. By then the peak performance of CUI was below 600 QPS (mainly because, as the project grew, the number of backend data modules had increased to 14). We therefore decided to rewrite the entire module in PHP, which produced the final PHPUI module.
The performance test results fall into two groups:
1. PHPUI connects to one core module. The test data is as follows:
Figure 5. PHPUI performance test result 1
2. PHPUI connects to all 14 backend modules. The test data is as follows:
Figure 6. PHPUI performance test result 2
4.2.4 Data Comparison Conclusion
Because the business logic and test methods of PHPUI and CUI are not exactly the same, only data points that are directly comparable are extracted. The comparison data is as follows:
The comparison data shows that in real business projects, the performance of PHPUI is not inferior to that of CUI. This is not verified by a single module alone. In our department, many modules have been migrated from C/C++ to PHP. Judging by the migration results, there has been no qualitative drop in performance, and for most modules the performance figures after migration are very close to those before.
This raises the question of why. It breaks down into two questions:
1. Why isn't PHPUI performance much worse than CUI in real business projects?
2. Why is benchmark PHP performance so high, while PHPUI reaches a far lower QPS at 80% CPU idle?
In fact, both questions come down to one cause: in real business projects, performance is determined less by the language used than by the business-related parts, such as the number of socket interactions, string processing, and the size of network packets.
OK. The next step is to find the key factors that affect performance.
4.2.5 Key Factors Affecting PHP Module Performance
The previous analysis shows that the key factor affecting the performance of a front-end PHP module is not the language itself (whether it is PHP, Java, or C matters little). So what are the key factors for a PHP business module? CPU consumption is one of the main measures of a project's performance, and our system prints a series of timing logs. Analyzing the distribution of time consumption per request in those logs reveals the key points.
In our system, the time consumption recorded in the logs covers the following aspects:
1. Total request time.
2. Time spent in the key functions of a request, including the time of every socket interaction.
3. Template rendering, which is also a significant consumer of time.
The previous analysis already suggested that socket interaction and string processing are among the key factors, and the data can verify this. Extracting a fixed number of log entries from a module and analyzing them together gives the following data:
This shows that in a business module, socket data interaction has the greatest impact, followed by heavy string processing. Broken down further, the factors are: the number of socket interactions, the size of socket packets, the response time of socket interactions, and string processing.
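The kind of log analysis described here can be sketched in a few lines. The log format below (`total=...ms socket=...ms render=...ms`), the function name `stage_shares`, and the sample values are all hypothetical. The real module's log layout is not given in this article, so this only illustrates the aggregation step.

```php
<?php
// Hypothetical analysis: sum each stage across log lines and express it as a share of total time.
function stage_shares(array $lines)
{
    $sums = [];
    foreach ($lines as $line) {
        // Collect every "name=123.4ms" pair on the line.
        if (preg_match_all('/(\w+)=([\d.]+)ms/', $line, $matches, PREG_SET_ORDER)) {
            foreach ($matches as $pair) {
                $stage = $pair[1];
                $previous = isset($sums[$stage]) ? $sums[$stage] : 0.0;
                $sums[$stage] = $previous + (float) $pair[2];
            }
        }
    }
    if (empty($sums['total'])) {
        return [];  // nothing to normalize against
    }
    $shares = [];
    foreach ($sums as $stage => $ms) {
        if ($stage !== 'total') {
            $shares[$stage] = round(100 * $ms / $sums['total'], 1);
        }
    }
    arsort($shares);  // largest consumer first
    return $shares;
}

// Made-up sample lines, only to show the shape of the input.
$sample = [
    'total=100ms socket=60ms render=30ms',
    'total=100ms socket=80ms render=10ms',
];
print_r(stage_shares($sample));
```

With real logs, the same aggregation shows at a glance which stage dominates the request time.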
4.2.6 Conclusion
From the analysis above, we can conclude that in a front-end business module, the PHP language itself does not become the performance bottleneck. The key factors that affect performance are:
● The number of network interactions.
● The size of the data exchanged over the network, including the overhead of packing and unpacking it.
● The response time of network interactions.
● Heavy string processing.
5. Final Conclusion
From the analysis in the three chapters above, we can draw the following conclusions:
1. From the perspective of implementation principles, PHP is a semi-compiled language, and a great deal of optimization has gone into every part of it, so it has no obvious performance problems. However, as a dynamic language, PHP must run on the Zend engine virtual machine and pays extra overhead for variable lookup, function calls, and scope switching.
2. From the benchmark figures, PHP itself does not consume significant resources, and single-host QPS can easily exceed 10,000. The PHP framework itself does not have a critical impact on the performance of a business system.
3. In real application scenarios, modules implemented in C are not much more efficient than modules implemented in PHP. Most of the performance overhead lies in network data interaction and string processing, and the slight difference in language performance does not become the bottleneck.
It can therefore be concluded that most business systems written in C are candidates for migration to PHP: development becomes faster, and performance is not affected.
Finally, for a detailed analysis of the key factors affecting PHP performance and a function-level benchmark comparison of PHP and C, please see the follow-up article "In-depth discussion of PHP performance issues".
6. References
http://yanbin.org/
https://wiki.php.net/internals/zend_mm
http://blog.xiuwz.com/2011/11/09/php-using-internal-zval/
http://developers.facebook.com/blog/post/358/
https://github.com/facebook/hiphop-php