
PHP kernel decryption series: execution process of zend_execute

WBOY
2016-08-08 09:22:34


The function with which the interpreter engine ultimately executes ops is zend_execute. zend_execute is actually a function pointer: when the engine is initialized, it points to execute by default. execute is defined in {PHPSRC}/Zend/zend_vm_execute.h:
    ZEND_API void execute(zend_op_array *op_array TSRMLS_DC)
    {
        zend_execute_data *execute_data;
        zend_bool nested = 0;
        zend_bool original_in_execution = EG(in_execution);

        if (EG(exception)) {
            return;
        }

        EG(in_execution) = 1;

    zend_vm_enter:
        /* Initialize execute_data */
        execute_data = (zend_execute_data *)zend_vm_stack_alloc(
            ZEND_MM_ALIGNED_SIZE(sizeof(zend_execute_data)) +
            ZEND_MM_ALIGNED_SIZE(sizeof(zval**) * op_array->last_var * (EG(active_symbol_table) ? 1 : 2)) +
            ZEND_MM_ALIGNED_SIZE(sizeof(temp_variable)) * op_array->T TSRMLS_CC);

        EX(CVs) = (zval***)((char*)execute_data + ZEND_MM_ALIGNED_SIZE(sizeof(zend_execute_data)));
        memset(EX(CVs), 0, sizeof(zval**) * op_array->last_var);
        EX(Ts) = (temp_variable *)(((char*)EX(CVs)) + ZEND_MM_ALIGNED_SIZE(sizeof(zval**) * op_array->last_var * (EG(active_symbol_table) ? 1 : 2)));
        EX(fbc) = NULL;
        EX(called_scope) = NULL;
        EX(object) = NULL;
        EX(old_error_reporting) = NULL;
        EX(op_array) = op_array;
        EX(symbol_table) = EG(active_symbol_table);
        EX(prev_execute_data) = EG(current_execute_data);
        EG(current_execute_data) = execute_data;
        EX(nested) = nested;
        nested = 1;

        if (op_array->start_op) {
            ZEND_VM_SET_OPCODE(op_array->start_op);
        } else {
            ZEND_VM_SET_OPCODE(op_array->opcodes);
        }

        if (op_array->this_var != -1 && EG(This)) {
            Z_ADDREF_P(EG(This)); /* For $this pointer */
            if (!EG(active_symbol_table)) {
                EX(CVs)[op_array->this_var] = (zval**)EX(CVs) + (op_array->last_var + op_array->this_var);
                *EX(CVs)[op_array->this_var] = EG(This);
            } else {
                if (zend_hash_add(EG(active_symbol_table), "this", sizeof("this"), &EG(This), sizeof(zval *), (void**)&EX(CVs)[op_array->this_var])==FAILURE) {
                    Z_DELREF_P(EG(This));
                }
            }
        }

        EG(opline_ptr) = &EX(opline);

        EX(function_state).function = (zend_function *) op_array;
        EX(function_state).arguments = NULL;

        while (1) {
            int ret;
    #ifdef ZEND_WIN32
            if (EG(timed_out)) {
                zend_timeout(0);
            }
    #endif

            if ((ret = EX(opline)->handler(execute_data TSRMLS_CC)) > 0) {
                switch (ret) {
                    case 1:
                        EG(in_execution) = original_in_execution;
                        return;
                    case 2:
                        op_array = EG(active_op_array);
                        goto zend_vm_enter;
                    case 3:
                        execute_data = EG(current_execute_data);
                    default:
                        break;
                }
            }
        }
        zend_error_noreturn(E_ERROR, "Arrived at end of main loop which shouldn't happen");
    }
The function takes a single parameter, op_array, which is a pointer to a zend_op_array. The op_array is produced by the compilation step, so the zend_op_array type needs to be introduced first.
Reference source: http://www.lai18.com/content/425167.html
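The function-pointer indirection described at the start of this article can be illustrated with a minimal sketch. Every name here (toy_op_array, toy_execute, toy_execute_default, toy_execute_traced, toy_run) is hypothetical and only mimics the shape of zend_execute pointing to execute; none of it is engine code.

```c
#include <stdio.h>

/* Hypothetical stand-in for zend_op_array; it carries only a name. */
typedef struct toy_op_array {
    const char *name;
} toy_op_array;

static int default_calls = 0;
static int traced_calls = 0;

/* Plays the role of execute(): the default implementation. */
void toy_execute_default(toy_op_array *op_array)
{
    default_calls++;
    printf("executing %s\n", op_array->name);
}

/* Plays the role of zend_execute: a pointer that starts at the default. */
void (*toy_execute)(toy_op_array *op_array) = toy_execute_default;

/* A replacement that does extra work, then delegates to the default. */
void toy_execute_traced(toy_op_array *op_array)
{
    traced_calls++;
    printf("about to execute %s\n", op_array->name);
    toy_execute_default(op_array);
}

/* Callers always go through the pointer, never the function directly. */
void toy_run(const char *name)
{
    toy_op_array oa = { name };
    toy_execute(&oa);
}
```

Calling toy_run("main") goes through toy_execute_default until toy_execute is reassigned to toy_execute_traced; after that the same call site reaches the replacement. Because callers only ever use the pointer, the target can be swapped without touching them.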

zend_op_array introduction

This type is defined in {PHPSRC}/Zend/zend_compile.h:
    struct _zend_op_array {
        /* Common elements */
        zend_uchar type;
        char *function_name;
        zend_class_entry *scope;
        zend_uint fn_flags;
        union _zend_function *prototype;
        zend_uint num_args;
        zend_uint required_num_args;
        zend_arg_info *arg_info;
        zend_bool pass_rest_by_reference;
        unsigned char return_reference;
        /* END of common elements */

        zend_bool done_pass_two;

        zend_uint *refcount;

        zend_op *opcodes;
        zend_uint last, size;

        zend_compiled_variable *vars;
        int last_var, size_var;

        zend_uint T;

        zend_brk_cont_element *brk_cont_array;
        int last_brk_cont;
        int current_brk_cont;

        zend_try_catch_element *try_catch_array;
        int last_try_catch;

        /* static variables support */
        HashTable *static_variables;

        zend_op *start_op;
        int backpatch_count;

        zend_uint this_var;

        char *filename;
        zend_uint line_start;
        zend_uint line_end;
        char *doc_comment;
        zend_uint doc_comment_len;
        zend_uint early_binding; /* the linked list of delayed declarations */

        void *reserved[ZEND_MAX_RESERVED_RESOURCES];
    };

    typedef struct _zend_op_array zend_op_array;
This structure is fairly complex; for now only the most basic fields are introduced.
1.type
The type of the op_array. Note first that although compiling a piece of PHP code returns a single zend_op_array pointer, more than one zend_op_array structure may actually be generated. Fields such as function_name and num_args suggest that zend_op_array is related to functions, and indeed every user-defined function and every user-defined class method is a zend_op_array. These structures are stored in specific places during compilation. User-defined functions, for example, are saved in GLOBAL_FUNCTION_TABLE, the global function symbol table, where a function body can be looked up by function name.
So what is the zend_op_array pointer that compilation returns? It is the entry point for execution: the outermost op_array, made up of the global code that is not inside any function body. Global code, user-defined functions and user-defined methods nevertheless all share the same type value, 2. The macros defining the possible values of type are:
    #define ZEND_INTERNAL_FUNCTION              1
    #define ZEND_USER_FUNCTION                  2
    #define ZEND_OVERLOADED_FUNCTION            3
    #define ZEND_EVAL_CODE                      4
    #define ZEND_OVERLOADED_FUNCTION_TEMPORARY  5
You can see that the global code, user functions, and user methods all correspond to ZEND_USER_FUNCTION, which is also the most common type. ZEND_EVAL_CODE corresponds to the PHP code in the eval function. So we can imagine that the PHP code in the eval function parameter will also be compiled into a separate zend_op_array.
2.function_name
If the op_array was generated by compiling a user-defined function or method, this field holds the name of that function. For global code or code passed to eval, this field is empty.
3.opcodes
This field has type zend_op *, so it is an array of zend_op. The array holds the ops generated by this compilation. If zend_op is unfamiliar, see the earlier article in this series that introduces OPcode. This is the most important field: zend_execute ultimately executes the ops stored here.
With a basic understanding of the op_array parameter, we can now step into execute.
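To make the three fields above concrete, here is a minimal sketch of a cut-down op_array and a function table looked up by name. All toy_ names are hypothetical. The real engine stores functions in a HashTable, whereas this sketch uses a small fixed array with a linear search purely to stay short.

```c
#include <stddef.h>
#include <string.h>

#define TOY_USER_FUNCTION 2   /* mirrors ZEND_USER_FUNCTION */
#define TOY_EVAL_CODE     4   /* mirrors ZEND_EVAL_CODE */
#define TOY_TABLE_MAX     8

typedef struct toy_op {
    int opcode;
} toy_op;

/* Only the three fields discussed in the text, plus an op count. */
typedef struct toy_op_array {
    unsigned char type;
    const char   *function_name;  /* NULL for global or eval code */
    toy_op       *opcodes;
    unsigned int  last;           /* number of ops in this sketch */
} toy_op_array;

typedef struct toy_function_table {
    toy_op_array *entries[TOY_TABLE_MAX];
    size_t        count;
} toy_function_table;

/* Store a named op_array; unnamed (global/eval) code is rejected. */
int toy_register(toy_function_table *table, toy_op_array *fn)
{
    if (table->count >= TOY_TABLE_MAX || fn->function_name == NULL) {
        return -1;
    }
    table->entries[table->count++] = fn;
    return 0;
}

/* Find a function body by name, or return NULL if it is not defined. */
toy_op_array *toy_lookup(const toy_function_table *table, const char *name)
{
    for (size_t i = 0; i < table->count; i++) {
        if (strcmp(table->entries[i]->function_name, name) == 0) {
            return table->entries[i];
        }
    }
    return NULL;
}
```

In this sketch the op_array for global code never enters the table. It is simply the pointer handed back to the caller, while the op_array of a function such as test is only reachable through toy_lookup.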

Detailed explanation of the execution process

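execute begins with a few basic variable declarations. Among them, zend_execute_data *execute_data; is the run-time data structure. After some initialization it is passed as the argument to each op's handler function, and an op may change its contents at any point during execution.
The zend_vm_enter label is the entry point of the virtual machine. When an op involves a function call, execution may jump back to this label to run the function body.
The zend_vm_stack_alloc call directly below the label allocates the space for execute_data.
The assignments from EX(CVs) through EX(nested) initialize execute_data and save the current context. The context must be saved because, when a function call is entered, some of the current run-time data has to be kept and then restored once the call finishes. This is comparable to process scheduling in an operating system: when a process is switched out its registers and other context are saved, and when it is switched back in they are restored so that it can continue.
The if (op_array->this_var != -1 && EG(This)) block adds the $this variable to the current symbol table. This is only needed when a method is called on an object.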
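The allocation step packs the frame header, the compiled-variable slots and the temporaries into one block and then derives the CVs and Ts pointers by offset arithmetic. A minimal sketch of that layout follows. The toy_ names are hypothetical, malloc stands in for zend_vm_stack_alloc, an 8-byte rounding macro stands in for ZEND_MM_ALIGNED_SIZE, and the doubling of the CV area when there is no symbol table is left out.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Round a size up to a multiple of 8 (stand-in for ZEND_MM_ALIGNED_SIZE). */
#define TOY_ALIGNED_SIZE(n) (((size_t)(n) + 7u) & ~(size_t)7u)

typedef struct toy_frame {
    int   **CVs;     /* slots for compiled variables */
    double *Ts;      /* slots for temporaries */
    size_t  num_cvs;
    size_t  num_ts;
} toy_frame;

/* Allocate header + CV slots + temporaries as a single block. */
toy_frame *toy_frame_alloc(size_t num_cvs, size_t num_ts)
{
    size_t header  = TOY_ALIGNED_SIZE(sizeof(toy_frame));
    size_t cv_area = TOY_ALIGNED_SIZE(sizeof(int *) * num_cvs);
    size_t t_area  = TOY_ALIGNED_SIZE(sizeof(double) * num_ts);

    char *block = malloc(header + cv_area + t_area);
    if (block == NULL) {
        return NULL;
    }

    toy_frame *frame = (toy_frame *)block;
    frame->CVs     = (int **)(block + header);            /* right after the header */
    frame->Ts      = (double *)(block + header + cv_area); /* right after the CVs */
    frame->num_cvs = num_cvs;
    frame->num_ts  = num_ts;

    /* As in execute(), the CV slots start out zeroed. */
    memset(frame->CVs, 0, sizeof(int *) * num_cvs);
    return frame;
}
```

A single free(frame) releases the header and both arrays together, which is the appeal of the single-block layout: one allocation and one release per function entry.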
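The while (1) loop is where the opcodes in op_array start to execute. Each iteration calls the handler of the op currently being executed:
    EX(opline)->handler(execute_data TSRMLS_CC)
If the handler's return value is not greater than 0 (the normal case is 0), the loop simply continues. If it is greater than 0, a switch is entered:
When the return value is 1: execute returns, and execution is over.
When the return value is 2: op_array is reassigned and control jumps to zend_vm_enter. This usually happens for a function call or for code run through eval; the op_array of that function is then executed in a new context.
When the return value is 3: execute_data is reset to EG(current_execute_data) and the loop continues. By this point EX(opline) has already moved forward by one op (possibly more), so it points to a new opline, which is executed next.
For any other value: the default branch only breaks out of the switch, so the loop continues; handlers are not expected to return such values. To end execution, a handler should use return, that is, return 1.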
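The handler-driven loop can be reduced to a few lines. The sketch below is a hypothetical miniature (toy_op, toy_execute_data, toy_add_handler, toy_return_handler and toy_execute are all invented names) that keeps only two of the return codes: 0 to continue and 1 to stop.

```c
#include <stddef.h>

typedef struct toy_execute_data toy_execute_data;
typedef int (*toy_handler)(toy_execute_data *ex);

typedef struct toy_op {
    toy_handler handler;   /* function that executes this op */
    int         operand;
} toy_op;

struct toy_execute_data {
    const toy_op *opline;  /* op currently being executed */
    int           acc;     /* accumulator, purely for the demo */
};

/* Adds the operand, advances to the next op, returns 0: loop continues. */
int toy_add_handler(toy_execute_data *ex)
{
    ex->acc += ex->opline->operand;
    ex->opline++;
    return 0;
}

/* Returns 1, which tells the loop to stop. */
int toy_return_handler(toy_execute_data *ex)
{
    (void)ex;
    return 1;
}

/* Runs handlers until one of them returns 1; yields the accumulator. */
int toy_execute(const toy_op *opcodes)
{
    toy_execute_data ex = { opcodes, 0 };

    while (1) {
        int ret = ex.opline->handler(&ex);
        if (ret > 0) {
            switch (ret) {
                case 1:
                    return ex.acc;
                default:
                    break; /* leaves the switch only; the loop keeps going */
            }
        }
    }
}
```

A program of two toy_add_handler ops with operands 2 and 40 followed by a toy_return_handler op makes toy_execute return 42. The loop itself never inspects an opcode number: all behaviour lives in the handlers, and the loop reacts only to their return codes.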
The specific values that op handlers return are all defined as macros, for example those defined in {PHPSRC}/Zend/zend_execute.c:
    #define ZEND_VM_NEXT_OPCODE() \
        CHECK_SYMBOL_TABLES() \
        EX(opline)++; \
        ZEND_VM_CONTINUE()

    #define ZEND_VM_SET_OPCODE(new_op) \
        CHECK_SYMBOL_TABLES() \
        EX(opline) = new_op

    #define ZEND_VM_JMP(new_op) \
        CHECK_SYMBOL_TABLES() \
        if (EXPECTED(!EG(exception))) { \
            EX(opline) = new_op; \
        } \
        ZEND_VM_CONTINUE()

    #define ZEND_VM_INC_OPCODE() \
        EX(opline)++
and those defined in {PHPSRC}/Zend/zend_vm_execute.h:
    #define ZEND_VM_CONTINUE() return 0
    #define ZEND_VM_RETURN()   return 1
    #define ZEND_VM_ENTER()    return 2
    #define ZEND_VM_LEAVE()    return 3
    #define ZEND_VM_DISPATCH(opcode, opline) return zend_vm_get_opcode_handler(opcode, opline)(ZEND_OPCODE_HANDLER_ARGS_PASSTHRU);
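A brief description of what each one does:
ZEND_VM_NEXT_OPCODE(): moves to the next op and returns 0, so the switch is not entered and the loop continues (this is the most commonly used one)
ZEND_VM_SET_OPCODE(new_op): sets the current opline to new_op
ZEND_VM_JMP(new_op): sets the current opline to new_op and returns 0, so the switch is not entered and the loop continues
ZEND_VM_INC_OPCODE(): only moves to the next op, without returning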
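The way these macros combine "move the opline" with "return a code to the loop" can be tried out in a small sketch. The TOY_ macros and toy_ functions below are hypothetical analogues, not the engine's definitions; the CHECK_SYMBOL_TABLES() call and the exception check in ZEND_VM_JMP are omitted. The four-op program counts a value down to zero using a conditional jump and an unconditional jump.

```c
#include <stddef.h>

typedef struct toy_execute_data toy_execute_data;
typedef int (*toy_handler)(toy_execute_data *ex);

typedef struct toy_op {
    toy_handler handler;
    size_t      target;     /* index of the op to jump to, if any */
} toy_op;

struct toy_execute_data {
    const toy_op *opcodes;    /* start of the op array */
    const toy_op *opline;     /* op currently being executed */
    unsigned      counter;
    unsigned      dispatched; /* how many handlers have run */
};

#define TOY_EX(element)      (ex->element)
#define TOY_VM_CONTINUE()    return 0
#define TOY_VM_RETURN()      return 1
#define TOY_VM_NEXT_OPCODE() do { TOY_EX(opline)++; TOY_VM_CONTINUE(); } while (0)
#define TOY_VM_JMP(new_op)   do { TOY_EX(opline) = (new_op); TOY_VM_CONTINUE(); } while (0)

/* Jump to the target op when the counter is zero, else fall through. */
int toy_jmpz(toy_execute_data *ex)
{
    if (TOY_EX(counter) == 0) {
        TOY_VM_JMP(TOY_EX(opcodes) + TOY_EX(opline)->target);
    }
    TOY_VM_NEXT_OPCODE();
}

/* Decrement the counter and move on. */
int toy_dec(toy_execute_data *ex)
{
    TOY_EX(counter)--;
    TOY_VM_NEXT_OPCODE();
}

/* Unconditional jump. */
int toy_jmp(toy_execute_data *ex)
{
    TOY_VM_JMP(TOY_EX(opcodes) + TOY_EX(opline)->target);
}

/* Stop the loop. */
int toy_ret(toy_execute_data *ex)
{
    (void)ex;
    TOY_VM_RETURN();
}

/* Count n down to zero; returns how many handlers were dispatched. */
unsigned toy_count_down(unsigned n)
{
    static const toy_op program[] = {
        { toy_jmpz, 3 },  /* 0: if counter == 0 goto 3 */
        { toy_dec,  0 },  /* 1: counter-- */
        { toy_jmp,  0 },  /* 2: goto 0 */
        { toy_ret,  0 },  /* 3: return */
    };
    toy_execute_data data = { program, program, n, 0 };
    toy_execute_data *ex = &data;

    while (1) {
        ex->dispatched++;
        if (ex->opline->handler(ex) > 0) {
            return ex->dispatched;
        }
    }
}
```

Each pass through the loop body of the toy program costs three dispatches, plus a final jump test and the return, so toy_count_down(3) dispatches 11 handlers and toy_count_down(0) dispatches 2.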

Switching the execution environment

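As mentioned earlier, user-defined functions, class methods and eval code are each compiled into a separate op_array. A function call therefore has to switch from the execution environment of the calling op_array to that of the new function's op_array. This section walks through the whole switch using a call to a user-defined function.
The data structures that describe an execution environment need to be understood first. There are two main ones:
1. The executor globals structure
The relevant definitions are in {PHPSRC}/Zend/zend_globals_macros.h:
    /* Executor */
    #ifdef ZTS
    # define EG(v) TSRMG(executor_globals_id, zend_executor_globals *, v)
    #else
    # define EG(v) (executor_globals.v)
    extern ZEND_API zend_executor_globals executor_globals;
    #endif
This is a conditional compilation: ZTS means that thread safety is enabled. To keep things simple, the non-thread-safe mode is used here, in which the executor globals are the variable executor_globals, of type zend_executor_globals. zend_executor_globals is defined in {PHPSRC}/Zend/zend_globals.h. It is a large structure holding the various variables needed throughout execution, and the same global is shared no matter which op_array is running. Some of its members change during execution: active_op_array, the op_array currently being executed, and active_symbol_table, the active symbol table, may change from one op_array to another, and the This pointer changes with the object context.
The EG macro is also defined to access the fields of this variable; it wraps the difference between the thread-safe and non-thread-safe modes.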
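The idea of an accessor macro that hides the thread-safe and non-thread-safe variants can be sketched as follows. TOY_ZTS, TG, toy_executor_globals and the other names are hypothetical. The thread-safe branch uses a C11 _Thread_local variable as a stand-in, which is an assumption made for brevity and not how TSRM actually resolves executor_globals_id.

```c
#include <stddef.h>

/* A cut-down stand-in for zend_executor_globals. */
typedef struct toy_executor_globals {
    const char *active_op_array;  /* name only, for illustration */
    int         in_execution;
} toy_executor_globals;

#ifdef TOY_ZTS
/* Thread-safe build: every thread reaches its own copy via a lookup. */
static toy_executor_globals *toy_globals_for_current_thread(void)
{
    static _Thread_local toy_executor_globals per_thread;
    return &per_thread;
}
# define TG(v) (toy_globals_for_current_thread()->v)
#else
/* Non-thread-safe build: one process-wide instance. */
toy_executor_globals toy_globals;
# define TG(v) (toy_globals.v)
#endif

/* Code written against TG() compiles unchanged in both modes. */
void toy_enter(const char *op_array_name)
{
    TG(active_op_array) = op_array_name;
    TG(in_execution) = 1;
}
```

toy_enter never mentions toy_globals directly, so switching between the two builds only requires defining or not defining TOY_ZTS at compile time.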
2. The execution data of each op_array
Every op_array has its own run-time data. At the start of execute, right below the zend_vm_enter label, a local variable execute_data is initialized, so every switch to a new op_array creates a new execute_data for it. The variable is a pointer to zend_execute_data, which is defined in {PHPSRC}/Zend/zend_compile.h:
    struct _zend_execute_data {
        struct _zend_op *opline;
        zend_function_state function_state;
        zend_function *fbc; /* Function Being Called */
        zend_class_entry *called_scope;
        zend_op_array *op_array;
        zval *object;
        union _temp_variable *Ts;
        zval ***CVs;
        HashTable *symbol_table;
        struct _zend_execute_data *prev_execute_data;
        zval *old_error_reporting;
        zend_bool nested;
        zval **original_return_value;
        zend_class_entry *current_scope;
        zend_class_entry *current_called_scope;
        zval *current_this;
        zval *current_object;
        struct _zend_op *call_opline;
    };
Its fields can be accessed with the EX macro:
    #define EX(element) execute_data->element
Only two of the fields are described here:
opline: the op currently being executed.
prev_execute_data: when the op_array environment is switched, this field holds the execute_data of the op_array that was running before the switch. It is a very important field, because it links the execute_data of every op_array into a singly linked list in call order. Whenever an op_array finishes and the caller's op_array has to be restored, the caller's execution data is obtained from the prev_execute_data field of the current execute_data.
The current_execute_data field of executor_globals points to the execute_data of the op_array that is currently executing.
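The linked list formed by prev_execute_data, with current_execute_data as its head, can be shown in isolation. This is a minimal sketch with hypothetical names (toy_execute_data, toy_current_execute_data, toy_push, toy_pop, toy_depth); it models only the linking, not the rest of the frame.

```c
#include <stddef.h>

typedef struct toy_execute_data {
    const char              *op_array_name;     /* which op_array this frame runs */
    struct toy_execute_data *prev_execute_data; /* frame of the caller */
} toy_execute_data;

/* Plays the role of EG(current_execute_data): head of the list. */
static toy_execute_data *toy_current_execute_data = NULL;

/* Entering an op_array: link the new frame in front of the current one. */
void toy_push(toy_execute_data *ex, const char *name)
{
    ex->op_array_name = name;
    ex->prev_execute_data = toy_current_execute_data;
    toy_current_execute_data = ex;
}

/* Leaving an op_array: the caller's frame becomes current again. */
void toy_pop(void)
{
    if (toy_current_execute_data != NULL) {
        toy_current_execute_data = toy_current_execute_data->prev_execute_data;
    }
}

/* Walk the list from the current frame back to the outermost one. */
size_t toy_depth(void)
{
    size_t depth = 0;
    for (toy_execute_data *ex = toy_current_execute_data; ex != NULL; ex = ex->prev_execute_data) {
        depth++;
    }
    return depth;
}
```

Pushing a frame for the global code and then one for test gives a depth of 2, with the test frame's prev_execute_data pointing at the global frame. Walking the list in this way is also the natural basis for producing a call trace.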
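Before the walkthrough proper, a brief outline of how a user-defined function is called is needed. The details are left to a dedicated chapter on functions; in short:
When a function such as test() is called, the function body is first looked up in the global function symbol table under the name test. If it cannot be found, an error reports that the function is not defined. Once the body of test is found, its op_array is obtained and control jumps to the zend_vm_enter label in execute, which enters the execution environment of test.
The environment switch is illustrated with a short example: a script whose global code calls one user-defined function, test.

The code is deliberately simple so that the principle is easy to follow; readers can extend the reasoning to more complex code. Compiling it produces two op_arrays: one for the global code and one for the test function. The global code enters the execution environment of test through a function call; when test finishes, control returns to the global code, and then the script ends.
The process is described in several stages below, from which the method of switching execution environments becomes clear.
1. execute is entered and starts executing op_array. This op_array is the one for the global code; call it op_array1.
execute first creates an execute_data for op_array1, called execute_data1 here, and then initializes it. The important assignments are:
    EX(op_array) = op_array;
    EX(prev_execute_data) = EG(current_execute_data);
The first sets the op_array field to the op_array being executed, that is, op_array1 of the global code. The second stores the execution data that the executor globals currently point to in the prev_execute_data field of execute_data1. Because this is the first op_array to be executed, prev_execute_data is in fact NULL. After that, current_execute_data in the executor globals is set to execute_data1 and the current op of execute_data1 is set, so execution of the ops can begin.
2. When op_array1 reaches the call to test, the body of test is first found in the global function symbol table and saved in the function_state field of execute_data1. The op_array of test, called op_array2 here, is then taken from the function body and assigned to EG(active_op_array):
    EG(active_op_array) = &EX(function_state).function->op_array;
The active op_array field of the executor globals now points to the op_array of test. ZEND_VM_ENTER(); is then called, which first returns to the switch in execute and matches the following case:
    case 2:
        op_array = EG(active_op_array);
        goto zend_vm_enter;
EG(active_op_array) was just set to op_array2 of test, so inside execute the op_array variable now points to op_array2, and control jumps to zend_vm_enter.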
3. After the jump to zend_vm_enter, the steps are much the same as in stage 1. An execute_data is created for op_array2 of test, called execute_data2 here. The difference from stage 1 lies in EX(prev_execute_data) = EG(current_execute_data);. At this point current_execute_data is execute_data1, the execution data of the global code, so execute_data1 is saved in the prev_execute_data field of execute_data2. Then EG(current_execute_data) = execute_data; makes current_execute_data equal to execute_data2, the execution data of test. The environment switch is now complete and test starts to execute.
4. When test finishes, execution returns to the environment that was active before the call, which is that of the global code. The key operation at this stage is EG(current_execute_data) = EX(prev_execute_data);. In stage 3, EX(prev_execute_data) was set to execute_data1 of the global code, so the current execution data becomes that of the global code again. The execution environment of test has thus been switched back to the global code, and the switching process is complete. Deeper call chains work on the same principle; the singly linked list of execute_data simply becomes longer.
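The four stages can be tied together in one miniature interpreter. The sketch below is hypothetical throughout (every toy_ and g_ name is invented) and simplifies heavily: frames come from malloc, the call op carries a direct pointer to the callee instead of looking it up by name, and the return handler advances the caller's opline itself. It keeps the parts the walkthrough relies on: the goto back to the entry label on return code 2, and the restore through prev_execute_data on return code 3.

```c
#include <stdlib.h>
#include <string.h>

typedef struct toy_execute_data toy_execute_data;
typedef struct toy_op_array toy_op_array;
typedef int (*toy_handler)(toy_execute_data *ex);

typedef struct toy_op {
    toy_handler         handler;
    const toy_op_array *callee;  /* used by the call op only */
} toy_op;

struct toy_op_array {
    const char   *name;
    const toy_op *opcodes;
};

struct toy_execute_data {
    const toy_op       *opline;
    const toy_op_array *op_array;
    toy_execute_data   *prev_execute_data;
    int                 nested;
};

/* Stand-ins for EG(active_op_array) and EG(current_execute_data). */
static const toy_op_array *g_active_op_array;
static toy_execute_data   *g_current_execute_data;
static char                g_trace[128];

static void toy_trace(const char *word)
{
    strncat(g_trace, word, sizeof g_trace - strlen(g_trace) - 1);
    strncat(g_trace, " ", sizeof g_trace - strlen(g_trace) - 1);
}

/* Record which op_array is running, then move to the next op. */
int toy_mark(toy_execute_data *ex)
{
    toy_trace(ex->op_array->name);
    ex->opline++;
    return 0;
}

/* Stage 2: publish the callee's op_array and ask the loop to enter it. */
int toy_call(toy_execute_data *ex)
{
    g_active_op_array = ex->opline->callee;
    return 2;
}

/* Stage 4: restore the caller's frame, or stop if this is the outermost one. */
int toy_return(toy_execute_data *ex)
{
    int nested = ex->nested;
    g_current_execute_data = ex->prev_execute_data;
    free(ex);
    if (!nested) {
        return 1;
    }
    g_current_execute_data->opline++;  /* step the caller past its call op */
    return 3;
}

void toy_execute(const toy_op_array *op_array)
{
    toy_execute_data *ex;
    int nested = 0;

vm_enter:
    /* Stages 1 and 3: build a frame and link it to the previous one. */
    ex = malloc(sizeof *ex);
    if (ex == NULL) {
        abort();
    }
    ex->op_array = op_array;
    ex->opline = op_array->opcodes;
    ex->prev_execute_data = g_current_execute_data;
    g_current_execute_data = ex;
    ex->nested = nested;
    nested = 1;

    while (1) {
        int ret = ex->opline->handler(ex);
        if (ret > 0) {
            switch (ret) {
                case 1:
                    return;
                case 2:
                    op_array = g_active_op_array;
                    goto vm_enter;
                case 3:
                    ex = g_current_execute_data;
                    /* fall through */
                default:
                    break;
            }
        }
    }
}

/* Global code that marks, calls test, marks again; test marks once. */
const char *toy_demo(void)
{
    static const toy_op test_ops[] = { { toy_mark, NULL }, { toy_return, NULL } };
    static const toy_op_array test_fn = { "test", test_ops };
    static const toy_op main_ops[] = {
        { toy_mark, NULL }, { toy_call, &test_fn }, { toy_mark, NULL }, { toy_return, NULL }
    };
    static const toy_op_array main_code = { "main", main_ops };

    g_trace[0] = '\0';
    g_current_execute_data = NULL;
    toy_execute(&main_code);
    return g_trace;
}
```

toy_demo() returns the string "main test main " (with a trailing space), showing that control entered test and came back to the global code, and g_current_execute_data is NULL again once the outermost frame has returned. Note that the call does not recurse into toy_execute: the same C stack frame is reused, and only the chain of toy_execute_data frames grows.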

Extended reading

The list of topics in this article is as follows:

PHP kernel exploration: starting from the SAPI interface
PHP kernel exploration: the beginning and end of a request
PHP kernel exploration: the life cycle of a request
PHP kernel exploration: single-process SAPI life cycle
PHP Kernel Exploration: SAPI Lifecycle for Multi-Processes/Threads
PHP Kernel Exploration: Zend Engine
PHP Kernel Exploration: SAPI Revisited
PHP Kernel Exploration: Introduction to Apache Modules
PHP Kernel Exploration: Support for PHP via mod_php5
PHP Kernel Exploration : Apache Runtime with Hook Functions
PHP Kernel Exploration: Embedded PHP
PHP Kernel Exploration: FastCGI for PHP
PHP Kernel Exploration: How to Execute PHP Scripts
PHP Kernel Exploration: Execution Details of PHP Scripts
PHP Kernel Exploration: OpCode
PHP kernel exploration: opcode in PHP
PHP kernel exploration: the execution process of the interpreter
PHP kernel exploration: variable overview
PHP kernel exploration: variable storage and type
PHP kernel exploration: hash table in PHP
PHP kernel exploration: Understanding hash tables in Zend
PHP Kernel Exploration: PHP Hash Algorithm Design
PHP Kernel Exploration: Translating a HashTables article
PHP Kernel Exploration: What is a hash collision attack?
PHP kernel exploration: implementation of constants
PHP kernel exploration: storage of variables
PHP kernel exploration: types of variables
PHP kernel exploration: value operations of variables
PHP kernel exploration: creation of variables
PHP kernel exploration: predefined variables
PHP kernel exploration: Retrieval of variables
PHP kernel exploration: Type conversion of variables
PHP kernel exploration: Implementation of weakly typed variables
PHP kernel exploration: Implementation of static variables
PHP kernel exploration: Variable type hints
PHP kernel exploration: Variables Life cycle
PHP kernel exploration: variable assignment and destruction
PHP kernel exploration: variable scope
PHP kernel exploration: weird variable names
PHP kernel exploration: variable value and type storage
PHP kernel exploration: global variable Global
PHP kernel Exploration: Conversion of variable types
PHP kernel exploration: The beginning of memory management
PHP kernel exploration: Zend memory manager
PHP kernel exploration: PHP's memory management
PHP kernel exploration: memory application and destruction
PHP kernel exploration: reference counting and Copy-on-write
PHP kernel exploration: the garbage collection mechanism of PHP5.3
PHP kernel exploration: cache in memory management
PHP kernel exploration: copy-on-write COW mechanism
PHP kernel exploration: arrays and linked lists
PHP kernel exploration: using the hash table API
PHP kernel exploration: array operations
PHP kernel exploration: array source code analysis
PHP kernel exploration: classification of functions
PHP kernel exploration: internal structure of functions
PHP kernel exploration: function structure conversion
PHP kernel exploration: the process of defining functions
PHP kernel exploration: function parameters
PHP kernel exploration: zend_parse_parameters function
PHP kernel exploration: function return value
PHP kernel exploration: formal parameter return value
PHP kernel exploration: function call and execution
PHP kernel exploration: reference and Function execution
PHP core exploration: anonymous functions and closures
PHP core exploration: object-oriented beginning
PHP core exploration: structure and implementation of classes
PHP core exploration: member variables of a class
PHP core exploration: member methods of a class
PHP Kernel Exploration: The prototype of a class zend_class_entry
PHP Kernel Exploration: Definition of a class
PHP Kernel Exploration: Access Control
PHP Kernel Exploration: Inheritance, Polymorphism and Abstract Classes
PHP Kernel Exploration: Magic Functions and Lazy Binding
PHP Kernel Exploration: Reserved classes and special classes
PHP core exploration: objects
PHP core exploration: creating object instances
PHP core exploration: object attribute reading and writing
PHP core exploration: namespaces
PHP core exploration: defining interfaces
PHP core exploration: inheritance and implementation Interface
PHP kernel exploration: resource resource type
PHP kernel exploration: Zend virtual machine
PHP kernel exploration: lexical analysis of virtual machine
PHP kernel exploration: syntax analysis of virtual machine
PHP kernel exploration: execution of intermediate code opcode
PHP kernel Exploration: Encryption and decryption of code
PHP kernel exploration: The specific execution process of zend_execute
PHP kernel exploration: Reference and counting rules of variables
PHP kernel exploration: Description of the new garbage collection mechanism
