1. Origin
Many people's intuition about PHP is that it is a flexible scripting language with rich libraries, easy to use, safe, and well suited to web development, but slow. Is PHP's performance really as bad as everyone feels? This article takes up that question. It analyzes PHP performance in depth from several angles (source code, application scenarios, benchmark performance, and comparative analysis) and lets real data speak.
2. Analyzing PHP performance from first principles
To analyze PHP performance from first principles, we look mainly at four aspects: memory management, variables, functions, and the operating mechanism.
2.1 Memory Management
Like Nginx, PHP manages memory internally through a memory pool and ties the pool to a life cycle. The pool handles every memory operation made by PHP scripts and extensions, with different implementations and optimizations for large and small allocations. For details, see https://wiki.php.net/internals/zend_mm. Over the allocate-and-recycle life cycle, PHP combines an initial allocation, dynamic expansion, and mark-based recycling, and it resets the whole memory pool at the end of every request.
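The pool life cycle described above (allocate once, grow on demand, reset after each request) can be sketched in C as a chunked arena. This is only an illustration under simplifying assumptions: every name here (request_pool, pool_alloc, pool_reset, and so on) is hypothetical, and the real Zend memory manager is considerably more elaborate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the Zend memory manager API. */
typedef union pool_chunk {
    struct {
        union pool_chunk *next; /* older chunk, or NULL for the first one */
        size_t capacity;        /* payload bytes in this chunk */
        size_t used;            /* payload bytes already handed out */
    } h;
    max_align_t align_;         /* keeps the payload suitably aligned */
} pool_chunk;

typedef struct {
    pool_chunk *head;           /* newest chunk; allocations come from here */
    size_t chunk_size;          /* default payload size of a new chunk */
} request_pool;

static pool_chunk *chunk_new(size_t capacity, pool_chunk *next) {
    pool_chunk *c = malloc(sizeof(pool_chunk) + capacity);
    if (c != NULL) {
        c->h.next = next;
        c->h.capacity = capacity;
        c->h.used = 0;
    }
    return c;
}

/* Initial allocation: reserve one chunk up front. Returns 0 on success. */
int pool_init(request_pool *p, size_t chunk_size) {
    assert(p != NULL);
    p->chunk_size = chunk_size;
    p->head = chunk_new(chunk_size, NULL);
    return p->head != NULL ? 0 : -1;
}

/* Bump-pointer allocation; adds a chunk when the current one is full. */
void *pool_alloc(request_pool *p, size_t size) {
    const size_t a = _Alignof(max_align_t);
    assert(p != NULL && p->head != NULL);
    size = (size + a - 1) / a * a;              /* round up to alignment */
    if (p->head->h.capacity - p->head->h.used < size) {
        size_t cap = size > p->chunk_size ? size : p->chunk_size;
        pool_chunk *c = chunk_new(cap, p->head); /* dynamic expansion */
        if (c == NULL) return NULL;
        p->head = c;
    }
    void *ptr = (unsigned char *)(p->head + 1) + p->head->h.used;
    p->head->h.used += size;
    return ptr;
}

/* End of request: free the extra chunks and mark the first one empty. */
void pool_reset(request_pool *p) {
    assert(p != NULL && p->head != NULL);
    while (p->head->h.next != NULL) {
        pool_chunk *newer = p->head;
        p->head = newer->h.next;
        free(newer);
    }
    p->head->h.used = 0;
}

/* Process shutdown: release everything. */
void pool_destroy(request_pool *p) {
    pool_reset(p);
    free(p->head);
    p->head = NULL;
}
```

In this sketch individual allocations are never freed one by one; a request handler would call pool_alloc as often as it likes and pool_reset once when the request ends, which is what makes the per-request reset cheap.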
2.2 Variables
As we all know, PHP is a weakly typed language, so inside PHP every variable is represented by a single type, zval, which is defined as follows:
Figure 1 PHP variables
PHP has done a lot of optimization work on variables, such as reference counting and copy-on-write. These mechanisms keep memory usage down and reduce the number of memory copies (see http://blog.xiuwz.com/2011/11/09/php-using-internal-zval/). Arrays are implemented internally with an efficient hashtable.
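To make reference counting and copy-on-write concrete, here is a minimal C sketch. It assumes a hypothetical string-only value type; the real zval carries more types and fields, and the function names below are invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified value; not the real zval layout. */
typedef struct {
    char *str;     /* owned string payload */
    int refcount;  /* number of variables currently sharing this value */
} value;

value *value_new(const char *s) {
    value *v = malloc(sizeof *v);
    if (v == NULL) return NULL;
    v->str = malloc(strlen(s) + 1);
    if (v->str == NULL) { free(v); return NULL; }
    strcpy(v->str, s);
    v->refcount = 1;
    return v;
}

/* Like "$b = $a": both variables share one value, nothing is copied. */
value *value_share(value *v) {
    assert(v != NULL);
    v->refcount++;
    return v;
}

/* A variable goes away: free the value when the last reference is dropped. */
void value_release(value *v) {
    assert(v != NULL && v->refcount > 0);
    if (--v->refcount == 0) {
        free(v->str);
        free(v);
    }
}

/* Before a write: if the value is shared, give this variable its own copy.
 * slot points at the variable's value pointer. Returns 0 on success. */
int value_separate(value **slot) {
    value *v = *slot;
    if (v->refcount == 1) return 0;   /* sole owner, write in place */
    value *copy = value_new(v->str);
    if (copy == NULL) return -1;
    v->refcount--;                    /* detach from the shared value */
    *slot = copy;
    return 0;
}

/* A write through one variable; the copy happens only here, on write. */
int value_set_first_char(value **slot, char c) {
    if (value_separate(slot) != 0) return -1;
    if ((*slot)->str[0] != '\0') (*slot)->str[0] = c;
    return 0;
}
```

Assignment costs one counter increment, and the string is duplicated only when one of the sharing variables is actually modified.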
2.3 Functions
Inside PHP, every PHP function is converted into an internal function pointer. Take an extension function as an example:
ZEND_FUNCTION(my_function); // comparable to function my_function() {}
After macro expansion, it becomes the function
void zif_my_function(INTERNAL_FUNCTION_PARAMETERS);
that is, with the parameter macro expanded as well:
void zif_my_function(
    int ht,
    zval * return_value,
    zval * this_ptr,
    int return_value_used,
    zend_executor_globals * executor_globals
);
Seen this way, a PHP function also corresponds internally to a C function pointer.
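The pattern above (a macro that stamps out C functions with one fixed parameter list, which are then reached through function pointers) can be sketched as follows. SKETCH_FUNCTION and SKETCH_PARAMS are hypothetical stand-ins, not the Zend macros, and the lookup is a linear scan for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for INTERNAL_FUNCTION_PARAMETERS and ZEND_FUNCTION. */
#define SKETCH_PARAMS int argc, const int *argv, int *return_value
#define SKETCH_FUNCTION(name) void sk_##name(SKETCH_PARAMS)

/* Every script-visible function has this one C signature. */
typedef void (*sketch_handler)(SKETCH_PARAMS);

SKETCH_FUNCTION(add) {        /* expands to: void sk_add(int argc, ...) */
    int sum = 0;
    for (int i = 0; i < argc; i++) sum += argv[i];
    *return_value = sum;
}

SKETCH_FUNCTION(count) {      /* expands to: void sk_count(int argc, ...) */
    (void)argv;
    *return_value = argc;
}

typedef struct {
    const char *name;         /* name as seen from the script */
    sketch_handler handler;   /* C function that implements it */
} function_entry;

static const function_entry function_table[] = {
    { "add",   sk_add },
    { "count", sk_count },
};

/* Resolve a script-level function name to its C function pointer. */
sketch_handler find_function(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < sizeof function_table / sizeof function_table[0]; i++)
        if (strcmp(function_table[i].name, name) == 0)
            return function_table[i].handler;
    return NULL;              /* undefined function */
}
```

Because all handlers share one signature, the caller can invoke any of them through the same pointer type without knowing which function it is.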
2.4 Operating Mechanism
When talking about PHP performance, many people say "C/C++ is compiled, Java is semi-compiled, and PHP is interpreted." In other words, PHP parses the code dynamically before running it, so from this perspective PHP performance must be very poor.
It is true that running a PHP script to produce output means parsing dynamically and then executing. Specifically, the running mechanism of a PHP script is shown below:
Figure 2 PHP operating mechanism
A PHP run is divided into three phases:
Parse. The syntax analysis stage.
Compile. Compiles the script and outputs opcodes, an intermediate code.
Execute. Runs the opcodes dynamically and produces the output.
So PHP also has a compilation step internally. Building on this, a large number of opcode cache tools have appeared, such as APC, eAccelerator (eacc), and XCache, and they are essentially standard in production environments. With an opcode cache, a PHP script is compiled once and run many times. In this respect PHP is very similar to Java's semi-compiled mechanism.
Therefore, from the perspective of the operating mechanism, PHP works much like Java: both first generate intermediate code and then run it on their own virtual machine.
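The "compile once, run many times" effect of an opcode cache can be illustrated with a toy C sketch. It assumes a made-up mini-language (postfix expressions over single digits) and a one-slot cache keyed by the script text; none of this reflects how APC or the Zend Engine are actually implemented.

```c
#include <assert.h>
#include <string.h>

enum { OP_PUSH, OP_ADD, OP_MUL };
typedef struct { int op; int operand; } opcode;

#define MAX_OPS 32
typedef struct {
    char source[64];      /* cache key: the script text */
    opcode ops[MAX_OPS];  /* compiled intermediate code */
    int count;            /* number of opcodes, -1 while the slot is empty */
} compiled_script;

int compile_calls = 0;    /* how many times compilation really ran */

/* "Parse + compile": turn a postfix expression of single digits, '+' and
 * '*' into opcodes. Returns the opcode count, or -1 on bad input. */
int compile(const char *src, opcode *out) {
    int n = 0;
    compile_calls++;
    for (; *src != '\0'; src++) {
        if (*src == ' ') continue;
        if (n == MAX_OPS) return -1;
        if (*src >= '0' && *src <= '9') out[n++] = (opcode){ OP_PUSH, *src - '0' };
        else if (*src == '+') out[n++] = (opcode){ OP_ADD, 0 };
        else if (*src == '*') out[n++] = (opcode){ OP_MUL, 0 };
        else return -1;
    }
    return n;
}

/* "Execute": a tiny stack machine. Returns 0 on success, -1 on error. */
int execute(const opcode *ops, int count, int *result) {
    int stack[MAX_OPS], sp = 0;
    for (int i = 0; i < count; i++) {
        if (ops[i].op == OP_PUSH) { stack[sp++] = ops[i].operand; continue; }
        if (sp < 2) return -1;
        int b = stack[--sp], a = stack[--sp];
        stack[sp++] = (ops[i].op == OP_ADD) ? a + b : a * b;
    }
    if (sp != 1) return -1;
    *result = stack[0];
    return 0;
}

/* One-slot opcode cache: recompile only when the script text changes. */
int run_script(const char *src, int *result) {
    static compiled_script cache = { "", {{0, 0}}, -1 };
    assert(src != NULL && result != NULL);
    if (cache.count < 0 || strcmp(cache.source, src) != 0) {
        if (strlen(src) >= sizeof cache.source) return -1;
        cache.count = compile(src, cache.ops);
        if (cache.count < 0) return -1;
        strcpy(cache.source, src);
    }
    return execute(cache.ops, cache.count, result);
}
```

Calling run_script twice with the same text increments compile_calls only once; the second call goes straight to the execute phase.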
2.5 Dynamic Execution
Judging from the analysis above, PHP has done a lot of work on memory management, variables, functions, and the operating mechanism, so in principle PHP should have no performance problem, and its performance should be at least close to Java's.
At this point we have to talk about the performance cost of PHP being a dynamic language. Because PHP runs dynamically, all variables, functions, object calls, scopes, and so on are only resolved in the execution phase. This fundamentally fixes some aspects of PHP performance that are hard to change: variables and functions that C/C++ can resolve at static compile time must be resolved at run time in PHP, which also means that PHP's intermediate code cannot run directly and has to run on the Zend Engine.
When it comes to how PHP variables are implemented, one more thing must be mentioned: the hashtable. The hashtable can be called one of the souls of PHP and is used widely inside it; the variable symbol table, the function symbol table, and others are all built on hashtables.
Take PHP variables as an example of this dynamic behavior. Consider the code:
<?php
$var = "hello, blog.xiuwz.com";
?>
Executing this code adds one entry to the variable symbol table (which is a hashtable).
When the variable is used later, it is looked up in that table; in other words, every variable access involves a hash lookup.
Similarly, function calls go through a function symbol table (also a hashtable).
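The run-time lookup described above can be sketched in C as a small chained hash table keyed by variable name. This is an illustration under stated assumptions: the names (symtab, symtab_set, symtab_get) are hypothetical, values are plain strings, the bucket count is fixed, and a simple "times 33" string hash is used.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SYMTAB_BUCKETS 64      /* fixed size for brevity; real tables grow */

typedef struct symbol {
    char *name;                /* owned copy of the variable name */
    const char *value;         /* borrowed; string-only for brevity */
    struct symbol *next;       /* collision chain */
} symbol;

typedef struct { symbol *buckets[SYMTAB_BUCKETS]; } symtab;

/* Simple "times 33" string hash. */
static unsigned long hash_name(const char *s) {
    unsigned long h = 5381;
    while (*s != '\0') h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Like "$name = value;": insert or overwrite. Returns 0 on success. */
int symtab_set(symtab *t, const char *name, const char *value) {
    assert(t != NULL && name != NULL);
    symbol **slot = &t->buckets[hash_name(name) % SYMTAB_BUCKETS];
    for (symbol *s = *slot; s != NULL; s = s->next)
        if (strcmp(s->name, name) == 0) { s->value = value; return 0; }
    symbol *s = malloc(sizeof *s);
    if (s == NULL) return -1;
    s->name = malloc(strlen(name) + 1);
    if (s->name == NULL) { free(s); return -1; }
    strcpy(s->name, name);
    s->value = value;
    s->next = *slot;
    *slot = s;
    return 0;
}

/* Reading $name: a hash computation plus a chain walk, paid at run time.
 * Returns NULL when the variable is undefined. */
const char *symtab_get(const symtab *t, const char *name) {
    assert(t != NULL && name != NULL);
    const symbol *s = t->buckets[hash_name(name) % SYMTAB_BUCKETS];
    for (; s != NULL; s = s->next)
        if (strcmp(s->name, name) == 0) return s->value;
    return NULL;
}

/* Drop every entry, for example when a scope ends. */
void symtab_clear(symtab *t) {
    assert(t != NULL);
    for (size_t i = 0; i < SYMTAB_BUCKETS; i++) {
        symbol *s = t->buckets[i];
        while (s != NULL) {
            symbol *next = s->next;
            free(s->name);
            free(s);
            s = next;
        }
        t->buckets[i] = NULL;
    }
}
```

A C compiler would turn the same variable access into a fixed stack or register location, which is the overhead gap this section is pointing at.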
In fact, some of the variable search characteristics of dynamic operation can also be seen in the operating mechanism of PHP. The flow of PHP code after interpretation and compilation is as follows:
Figure 3 PHP running example
As the figure shows, compiling PHP code produces a class symbol table, a function symbol table, and the opcodes. During execution, the Zend Engine looks up the corresponding symbol table and acts on it according to each opcode.
To some extent this problem is hard to solve, because it follows from the dynamic nature of the PHP language. Even so, many people inside and outside China are looking for solutions, because solving it would allow PHP to be optimized fundamentally and thoroughly. A typical example is Facebook's HipHop (https://github.com/facebook/hiphop-php).
2.6 Conclusion
Judging from the analysis above, PHP shows no obvious performance gap in basic memory management, variables, functions, or its operating mechanism. However, because PHP runs dynamically, every variable lookup, function call, and similar operation costs more CPU than in compiled languages, plus extra memory for the hash lookups. How large this overhead actually is can be determined from the benchmark and comparative analysis that follow.
From this we can also roughly see the scenarios PHP is not suited for: heavy computation, operations on large amounts of data, and applications with strict memory requirements. If you need such functionality, it is better to implement it as an extension and expose functions for PHP code to call. This reduces the overhead of variables, function calls, and so on inside the computation.
3. Baseline performance
There is currently no standard data on PHP's benchmark performance. Most engineers have only an intuitive impression; some people think 800 QPS is the limit of PHP. There are also very few authoritative numbers on the performance of frameworks and on how much a framework affects performance.
The purpose of this chapter is to give a benchmark reference performance indicator and give everyone an intuitive understanding through the data.
Specific benchmark performance includes the following aspects:
1. Bare PHP performance. A script that performs only a basic function.
2. Bare framework performance. The framework does only the simplest route dispatch, passing through only its core functions.
3. Benchmark performance of a standard module, meaning the baseline performance of a module with the complete set of service functions.
3.1 Environment Description
Test environment:
uname -a: Linux db-forum-test17.db01.baidu.com 2.6.9_5-7-0-0 #1 SMP Wed Aug 12 17:35:51 CST 2009 x86_64 x86_64 x86_64 GNU/Linux
Software:
Nginx: nginx version: nginx/0.8.54, built by gcc 3.4.5 20051201 (Red Hat 3.4.5-2)
PHP framework.
Other notes:
Deployment on the target machine: nginx -> php-fpm -> PHP script.
The load-generating machine and the target machine are deployed separately.
3.2 Bare PHP Performance
The simplest PHP script:
<?php
require_once './actions/indexAction.php';
$objAction = new indexAction();
$objAction->init();
$objAction->execute();
?>
The code in actions/indexAction.php is as follows:
<?php
class indexAction
{
    // Assumed no-op: the entry script calls init(), but the original listing does not show it.
    public function init()
    {
    }

    public function execute()
    {
        echo 'hello, world!';
    }
}
?>
The load-test results are as follows:
3.3 Bare PHP Framework Performance
For comparison with 3.2, the same functionality is implemented on the bingo2 framework. The code is as follows:
<?php
require_once 'Bingo/Controller/Front.php';
$objFrontController = Bingo_Controller_Front::getInstance(array(
    'actionDir' => './actions',
));
$objFrontController->dispatch();
The load-test results are as follows:
The test results show that although the framework adds some overhead, its impact on overall performance is very small.
3.4 Benchmark Performance of a Standard PHP Module
A standard PHP module here means one that has the basic functions every PHP module must provide:
Route dispatch.
Autoloading.
Log initialization & notice log printing. Every UI request produces a standard log entry.
Error handling.
Time correction.
Automatic timing of each processing stage.
Encoding detection & encoding conversion.
Parsing and loading of standard configuration files.
A standard test PHP module, test, was generated with bingo2's automatic code generation tool.
The test results are as follows:
3.5 Conclusion
Judging from the test data, the performance of PHP itself is acceptable. Baseline performance can reach several thousand or even 10,000 QPS. As for why most PHP modules perform poorly, the right response is to find the system's actual bottleneck rather than simply saying "OK, PHP is no good, let's switch to C." (The next chapter compares some examples; handling them in C may not bring any special advantage.)
From the benchmark data, the following specific conclusions can be drawn:
1. PHP itself performs very well. It can reach 5000 QPS for simple functionality, and the limit can exceed 10,000 QPS.
2. The PHP framework itself has very limited impact on performance. Once there is real business logic and data interaction, it can almost be ignored.
3. A standard PHP module can reach a benchmark performance of 2000 QPS (with CPU idle at 80%).
4. Comparative analysis
Often, when people find that a PHP module performs poorly, they just say "OK, let's rewrite it in C". Within the company, business logic modules written in C/C++ can be seen everywhere; a few years ago almost all of them were written in C. Writing them was truly painful: debugging was difficult, and agility was out of the question.