PHP kernel layer parsing deserialization vulnerability

藏色散人
2019-04-02 12:00:00

Preface

While learning PHP, I found some of its behaviors hard to understand, such as %00 (null byte) truncation, MD5 flaws, and bypassing __wakeup during deserialization. I did not want to stop at a surface-level understanding; I wanted to see how the PHP core actually handles these cases.

This article uses CVE-2016-7124 (bypassing the magic method __wakeup), a deserialization vulnerability that shows up often in CTFs, as an example of how to debug the PHP kernel. It covers setting up a kernel source debugging environment, analyzing the serialization and deserialization source code, and finally analyzing the vulnerability itself.

1. Thoughts triggered by an example

Let's start with a small example I wrote.

[Figure: example PHP code with a filtering __wakeup and a file write in __destruct]

The example relies on PHP's magic methods, so let's introduce them first. The official documentation describes several commonly used ones:

[Figure: excerpt from the official documentation on magic methods]

In brief: when a class is instantiated, __construct is called, and when the instance is destroyed, __destruct is called.

When an object is serialized with serialize, its __sleep method is called automatically. When a string is converted back into an object with unserialize, its __wakeup method is called. If these magic methods exist, they are called automatically; there is no need to call them explicitly.

Now back to the example code. Its __destruct method performs a sensitive operation: it writes a file. By constructing a malicious serialized string and having it deserialized, we may be able to turn this into a code execution vulnerability.

However, once we construct the string and try to use it, we find that the __wakeup method performs filtering, which blocks our payload, because deserialization calls __wakeup first.

This is where CVE-2016-7124 (bypassing the magic method __wakeup) comes in: it lets us skip the __wakeup method that is normally called automatically during deserialization, so the sensitive file write still takes place.

The code above is only a simple example of my own, but similar situations are common in practice. What intrigued me is the bypass itself: how does PHP's internal processing affect the upper-level code logic to produce such a strange bug? To answer that question, I will debug the PHP kernel dynamically.

CVE-2016-7124 affects PHP 5 versions before 5.6.25 and PHP 7 versions before 7.0.10. We will therefore compile two versions: 7.3.0, which is not affected, and 5.6.10, which is vulnerable, and compare them to understand the differences.

2. Setting up the PHP source code debugging environment

PHP is written in C. Because my environment is Windows 10, this section describes the setup on Windows. We need the following:

PHP source code
PHP SDK toolkit, used to build PHP
An IDE for debugging

The source code can be downloaded from GitHub at https://github.com/php/php-src; select the version you need.

The PHP SDK toolkit can be downloaded from https://github.com/Microsoft/php-sdk-binary-tools. The toolkit from this address only supports VC14 and VC15. Toolkits for VC11, VC12 and other toolchains used by older PHP versions can be found at https://windows.php.net/downloads/. Before using the PHP SDK, make sure that Visual Studio is installed together with the matching version of the Windows SDK component.

The rest of this article uses PHP 7.3.0 and 5.6.10. Building these two versions is described below; other versions are built in a similar way.

2.1 Compile Windows PHP 7.3.0

The host environment is Windows 10 x64, and the PHP SDK comes from the GitHub link above. The SDK directory contains four batch files; double-click phpsdk-vc15-x64.

[Figure: the batch files in the PHP SDK directory]

In the shell that opens, enter phpsdk_buildtree php7. A php7 folder appears in the same directory, and the shell's working directory changes as well.

[Figures: the new php7 folder and the shell after running phpsdk_buildtree]

Place the extracted source code under \php7\vc15\x64. In the shell, enter this folder and run phpsdk_deps --update --branch master to download and update the dependencies.

When that finishes, enter the source directory and double-click buildconf.bat. It generates two files, configure.bat and configure.js. Run configure --disable-all --enable-cli --enable-debug --enable-phar in the shell to set the build options. For other options, run configure --help.

[Figure: output of the configure command]

As the output suggests, run nmake to compile.

[Figure: nmake build output]

When compilation finishes, the executable is in the php7\vc15\x64\php-src\x64\Debug_TS folder. Run php -v there to check the build.

[Figure: output of php -v]

2.2 Compile Windows PHP 5.6.10

The procedure is the same as for 7.3.0, with one difference: PHP 5.6 is built with VC11, so you need Visual Studio 2012, and the PHP SDK from GitHub cannot be used. Download the VC11 PHP SDK and the related dependencies from https://windows.php.net/downloads/ instead. Everything else is the same as above and is not repeated here.

[Figure: the PHP 5.6.10 build]

2.3 Debugging configuration

With the PHP interpreter compiled, we can debug it directly in VS Code.

After installing VS Code, install the C/C++ debugging extension.

[Figure: the C/C++ extension in VS Code]

Then open the source directory and click Debug -> Open Configuration; the launch.json file opens.

[Figure: launch.json with the three parameters to configure]

After configuring the three parameters shown in the figure above, write PHP code in 1.php in the current directory, set breakpoints in the PHP source, and start debugging.

The debugging environment is now set up.

3. PHP deserialization source code analysis

PHP deserialization usually involves two functions that come in pairs, serialize and unserialize (although they do not have to be used together), and two magic methods, __sleep() and __wakeup(). Put simply, serialization converts an object into a string that can be stored, for example in a file, and deserialization does the opposite: it reads the string back and instantiates the object.

Using the debugging environment set up above, we will step through the code to see what PHP (version 7.3.0) actually does during serialization and deserialization.

3.1 serialize source code analysis

Let's first write a simple demo that does not contain the __sleep magic method:

[Figure: demo class without __sleep]

Next, search the source code for the serialize function; it is located in var.c. Set a breakpoint just below the function header and start debugging.

[Figure: the serialize function in var.c]

After some preparation work, execution enters the serialization routine. We step into php_var_serialize.

[Figure: the php_var_serialize function]

We then step into php_var_serialize_intern, which is the main processing function and is also in var.c. Because the function is long, only the key parts are shown here.

[Figure: excerpt of php_var_serialize_intern]

The whole function is one switch statement. The macro Z_TYPE_P (which expands to struc->u1.v.type) reads the type of the struc variable to determine what is being serialized, and execution then enters the corresponding case. The figure below shows the type definitions.

[Figure: type constant definitions, with the value 8 highlighted in a red box]
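The type dispatch can be sketched in a few lines of C. This is a simplified illustration, not PHP's actual code: demo_zval, DEMO_Z_TYPE_P and demo_branch_for are hypothetical names. Only the u1.v.type layout and the value 8 for IS_OBJECT come from the text; the other tag values are assumed to match PHP 7's zend_types.h.

```c
#include <stdint.h>

/* Type tags. IS_OBJECT == 8 is from the article; the other values are
 * assumed to match PHP 7's zend_types.h. */
enum {
    DEMO_IS_LONG   = 4,
    DEMO_IS_STRING = 6,
    DEMO_IS_ARRAY  = 7,
    DEMO_IS_OBJECT = 8
};

/* Hypothetical, heavily trimmed stand-in for a zval. */
typedef struct {
    union {
        long lval;
        const char *str;
    } value;
    union {
        struct { uint8_t type; } v;
        uint32_t type_info;
    } u1;
} demo_zval;

/* Same shape as Z_TYPE_P: read the type tag through a pointer. */
#define DEMO_Z_TYPE_P(zv) ((zv)->u1.v.type)

/* Returns the name of the case branch a value would be routed to. */
const char *demo_branch_for(const demo_zval *struc)
{
    switch (DEMO_Z_TYPE_P(struc)) {
    case DEMO_IS_LONG:   return "IS_LONG";
    case DEMO_IS_STRING: return "IS_STRING";
    case DEMO_IS_ARRAY:  return "IS_ARRAY";
    case DEMO_IS_OBJECT: return "IS_OBJECT";
    default:             return "other";
    }
}
```

A value whose tag is 8 is routed to the IS_OBJECT branch, which is the case the demo object hits.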

The value 8 in the red box above is IS_OBJECT, so what is being serialized here is an object, and execution enters the corresponding case branch:

[Figure: the IS_OBJECT case branch]

The figure above shows where the __sleep magic method is called. Our demo does not define it, so execution does not enter this branch. Different branches mean different processing flows; we will look at the flow with __sleep later.

[Figure: continuation of the IS_OBJECT branch]

Because none of the conditions in the IS_OBJECT branch are hit and the case has no break statement, execution falls through into the IS_ARRAY branch. There, the class name is extracted from struc, its length is calculated and both are written to buf, and the members of the class that need to be serialized are extracted and stored in a hash array.

[Figure: the IS_ARRAY branch]

Next, php_var_serialize_intern is called recursively to walk the whole hash array: each variable name and value is extracted, formatted, and appended to buf. When the process finishes, the complete string is stored in buf's flexible array.
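The fall-through and the recursion are easier to see in a small model. The following C sketch is only an illustration under stated assumptions: demo_val, demo_buf and demo_serialize are hypothetical names, the buffer has a fixed size instead of growing, member names are always strings, and only integers, strings, arrays and objects are handled.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

typedef enum { DEMO_LONG, DEMO_STRING, DEMO_ARRAY, DEMO_OBJECT } demo_type;

typedef struct demo_val demo_val;
struct demo_val {
    demo_type type;
    long lval;               /* DEMO_LONG */
    const char *str;         /* DEMO_STRING text, or class name for DEMO_OBJECT */
    size_t count;            /* number of members (array/object) */
    const char **keys;       /* member names */
    const demo_val *vals;    /* member values */
};

/* Fixed-size stand-in for the growable output buffer. */
typedef struct { char data[256]; size_t len; } demo_buf;

/* Append formatted text, truncating silently if the buffer is full. */
static void demo_append(demo_buf *b, const char *fmt, ...)
{
    va_list ap;
    size_t room = sizeof b->data - b->len;   /* includes the NUL byte */
    int n;

    va_start(ap, fmt);
    n = vsnprintf(b->data + b->len, room, fmt, ap);
    va_end(ap);
    if (n > 0)
        b->len += ((size_t)n < room) ? (size_t)n : room - 1;
}

void demo_serialize(demo_buf *b, const demo_val *v)
{
    size_t i;

    switch (v->type) {
    case DEMO_LONG:
        demo_append(b, "i:%ld;", v->lval);
        break;
    case DEMO_STRING:
        demo_append(b, "s:%zu:\"%s\";", strlen(v->str), v->str);
        break;
    case DEMO_OBJECT:
        /* The __sleep check would sit here. With no __sleep nothing
         * matches, and because there is no break we fall into the
         * array case below. */
        /* fall through */
    case DEMO_ARRAY:
        if (v->type == DEMO_OBJECT)
            demo_append(b, "O:%zu:\"%s\":%zu:{",
                        strlen(v->str), v->str, v->count);
        else
            demo_append(b, "a:%zu:{", v->count);
        for (i = 0; i < v->count; i++) {
            demo_append(b, "s:%zu:\"%s\";", strlen(v->keys[i]), v->keys[i]);
            demo_serialize(b, &v->vals[i]);   /* recurse into the member */
        }
        demo_append(b, "}");
        break;
    }
}
```

For a hypothetical object of class Test with a single integer member a set to 5, the buffer ends up holding O:4:"Test":1:{s:1:"a";i:5;}, which has the same layout as the string PHP produces for such an object.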

[Figure: contents of buf in the debugger, highlighted in a red box]

The red box in the figure above matches the final result. Now let's modify the demo slightly and add a __sleep magic method. According to the official documentation, __sleep must return an array. We also call a class member function inside it so that we can observe its behavior.

[Figure: demo class with __sleep calling the member function show]

The earlier part of the flow is exactly the same and is not repeated here. We start from the branch point.

[Figure: the __sleep branch in php_var_serialize_intern]

We step directly into php_var_serialize_call_sleep.

[Figure: the php_var_serialize_call_sleep function]

From here we step into call_user_function. According to its macro definition, it actually calls _call_user_function_ex, which only does some copying, so no screenshot is shown. Execution then reaches the call to zend_call_function.

[Figure: the call to zend_call_function]

In practice, __sleep usually contains user code of its own. In zend_call_function, PHP pushes the operations to be performed onto the stack of its own zend_vm engine, where they are later executed one by one (that is, the corresponding opcodes are executed).

[Figure: zend_call_function setting up the call]

Execution hits this branch, and we step into zend_execute_ex.

[Figure: the zend_execute_ex function]

As we can see, the core of ZEND_VM is a while(1) loop that keeps executing the operations on the ZEND_VM stack. In the red box in the figure above, the engine dispatches to the corresponding handler function using the ZEND_FASTCALL calling convention.
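The shape of such a dispatch loop can be shown with a toy interpreter. This C sketch is an assumption-laden miniature, not the Zend engine: demo_op, demo_execute_data, the two handlers and demo_execute are all hypothetical, and the only thing it shares with the real code is the idea of a while(1) loop that calls one handler per operation until a handler asks to stop.

```c
typedef struct demo_execute_data demo_execute_data;

/* A handler performs one operation; non-zero means "stop executing". */
typedef int (*demo_handler)(demo_execute_data *ex);

typedef struct {
    demo_handler handler;
    long operand;
} demo_op;

struct demo_execute_data {
    const demo_op *opline;   /* the operation to run next */
    long result;
};

/* Add the operand to the running result, then move to the next op. */
int demo_op_add(demo_execute_data *ex)
{
    ex->result += ex->opline->operand;
    ex->opline++;
    return 0;
}

/* Signal the loop to stop. */
int demo_op_return(demo_execute_data *ex)
{
    (void)ex;
    return -1;
}

/* The dispatch loop: call the current op's handler until one stops us. */
long demo_execute(const demo_op *ops)
{
    demo_execute_data ex;

    ex.opline = ops;
    ex.result = 0;
    while (1) {
        if (ex.opline->handler(&ex) != 0)
            return ex.result;
    }
}
```

Running a three-operation program (add 2, add 40, return) through demo_execute yields 42. Each handler advances opline itself, which is why the loop body contains nothing but the indirect call.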

[Figures: handler dispatch in ZEND_VM]

Because __sleep calls the member function show, the engine first locates show, and the operations inside show are then pushed onto the ZEND_VM stack for another round of execution, until everything has been processed. We will not follow this any further.

[Figure: the return value retval in the debugger]

Remember the output parameter retval from earlier? It holds the return value of __sleep. The figure above shows x, the first element of the returned array. You can also inspect it directly in the variables view.

After this long detour, both paths arrive at the same place. Once the operations in __sleep have been processed, php_var_serialize_class serializes the class name and recursively serializes the members listed in the return value of __sleep. The result is stored in buf, and the serialization process is complete.

3.1.1 Summary of serialize process

To summarize the serialization process:

Without the magic method: serialize the class name -> recursively serialize the remaining structure.

With the magic method: call __sleep -> execute its PHP code in the ZEND_VM engine -> return the array of members to serialize -> serialize the class name -> recursively serialize the members named in the return value of __sleep.
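The two paths in this summary can be modeled with a callback that plays the role of __sleep. The C sketch below is a rough illustration only: demo_obj, demo_sleep_fn, demo_sleep_only_x and demo_serialize_object are hypothetical names, members are limited to integers, at most 8 names are handled, and a name returned by the callback that does not exist on the object is treated as an error, which is a simplification made for this sketch.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

typedef struct { const char *name; long value; } demo_prop;

typedef struct demo_obj demo_obj;

/* Stand-in for a user-defined __sleep: writes the names of the members
 * to keep into `names` and returns how many there are. */
typedef size_t (*demo_sleep_fn)(const demo_obj *self,
                                const char **names, size_t cap);

struct demo_obj {
    const char *class_name;
    const demo_prop *props;
    size_t prop_count;
    demo_sleep_fn sleep;      /* NULL when the class has no __sleep */
};

/* Example __sleep: keep only the member named "x". */
size_t demo_sleep_only_x(const demo_obj *self, const char **names, size_t cap)
{
    (void)self;
    if (cap < 1) return 0;
    names[0] = "x";
    return 1;
}

/* Append formatted text at out+len; returns the new length. */
static size_t demo_append(char *out, size_t cap, size_t len,
                          const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(out + len, cap - len, fmt, ap);
    va_end(ap);
    if (n <= 0) return len;
    return ((size_t)n < cap - len) ? len + (size_t)n : cap - 1;
}

/* Returns 1 on success, 0 if a requested member does not exist. */
int demo_serialize_object(const demo_obj *obj, char *out, size_t cap)
{
    const char *names[8];
    size_t n, i, j, len = 0;

    if (obj->sleep) {
        n = obj->sleep(obj, names, 8);            /* "call __sleep" */
        if (n > 8) n = 8;
    } else {
        n = obj->prop_count < 8 ? obj->prop_count : 8;
        for (i = 0; i < n; i++) names[i] = obj->props[i].name;
    }

    /* Class name first, then each selected member. */
    len = demo_append(out, cap, len, "O:%zu:\"%s\":%zu:{",
                      strlen(obj->class_name), obj->class_name, n);
    for (i = 0; i < n; i++) {
        for (j = 0; j < obj->prop_count; j++)
            if (strcmp(obj->props[j].name, names[i]) == 0) break;
        if (j == obj->prop_count) return 0;
        len = demo_append(out, cap, len, "s:%zu:\"%s\";i:%ld;",
                          strlen(names[i]), names[i], obj->props[j].value);
    }
    demo_append(out, cap, len, "}");
    return 1;
}
```

With members x and y and no callback, both members are written. With demo_sleep_only_x installed, the member count drops to 1 and only x appears in the output.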

3.2 unserialize source code analysis

Having covered serialize, we now look at unserialize, again starting from the simplest demo, one without magic methods.

[Figure: unserialize demo without magic methods]

The approach is the same as above. The source of unserialize is also in var.c.

[Figures: the unserialize function in var.c]

The figure above involves a feature introduced in PHP 7: filtered deserialization. The allowed_classes option restricts which classes may be instantiated, to guard against injection of unexpected objects. An object of a class that is filtered out is converted into a __PHP_Incomplete_Class object, which cannot be used directly. This does not affect the deserialization flow and is not discussed further. We step into php_var_unserialize.

[Figure: the php_var_unserialize function]

We then step into php_var_unserialize_internal.

[Figure: the php_var_unserialize_internal function]

This function parses the string and jumps to the processing code for each element. In the figure above, the first letter O is parsed, which means the data is to be deserialized into an object.

[Figure: parsing of the object name and the class table lookup]

The class name is parsed first, and a table lookup confirms that the class exists. Let's continue.

[Figure: creation and initialization of the new object]

After these operations, a new object has been created from the class name and initialized, but the deserialization is not yet complete. We step into object_common2.

Here we see the check for the magic method, but the call itself is not made here. We continue into process_nested_data.

[Figures: the process_nested_data function]

This function uses a while loop to parse the remaining nested data. It contains two calls to php_var_unserialize_internal: the first parses a member name, and the second parses the value that belongs to that name. When process_nested_data returns, the string has been parsed, the main work of deserialization is done, and the process is nearly finished.
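The key-then-value loop can be sketched as a tiny parser. This C code is a hedged illustration rather than PHP's implementation: all demo_ names are hypothetical, keys are limited to s:<len>:"<text>"; entries shorter than 32 bytes, and values are limited to i:<number>; entries.

```c
#include <stdlib.h>
#include <string.h>

typedef struct { char key[32]; long value; } demo_prop;

/* Parse s:<len>:"<bytes>"; at *p. On success copy the bytes to out,
 * advance *p past the entry and return 1; otherwise return 0. */
static int demo_parse_string(const char **p, char *out, size_t cap)
{
    const char *s = *p;
    const char *body;
    char *end;
    unsigned long len;

    if (s[0] != 's' || s[1] != ':') return 0;
    len = strtoul(s + 2, &end, 10);
    if (end == s + 2 || end[0] != ':' || end[1] != '"') return 0;
    body = end + 2;
    if (len >= cap || strlen(body) < len + 2) return 0;
    if (body[len] != '"' || body[len + 1] != ';') return 0;
    memcpy(out, body, len);
    out[len] = '\0';
    *p = body + len + 2;
    return 1;
}

/* Parse i:<number>; at *p, with the same return convention. */
static int demo_parse_long(const char **p, long *out)
{
    const char *s = *p;
    char *end;
    long v;

    if (s[0] != 'i' || s[1] != ':') return 0;
    v = strtol(s + 2, &end, 10);
    if (end == s + 2 || *end != ';') return 0;
    *out = v;
    *p = end + 1;
    return 1;
}

/* Repeat "parse a name, then parse its value" `elements` times.
 * Returns 0 as soon as either step fails; *assigned tells how many
 * members were stored before that point. */
int demo_process_nested_data(const char **p, long elements,
                             demo_prop *props, size_t cap, size_t *assigned)
{
    *assigned = 0;
    while (elements-- > 0) {
        char key[32];
        long value;

        if (!demo_parse_string(p, key, sizeof key)) return 0;   /* name  */
        if (!demo_parse_long(p, &value)) return 0;              /* value */
        if (*assigned == cap) return 0;
        strcpy(props[*assigned].key, key);
        props[*assigned].value = value;
        (*assigned)++;
    }
    return 1;
}
```

Note that the loop is driven by the declared element count, not by the data that is actually present. If the count says 2 but the input holds only one pair, the first pair is stored and the second iteration fails.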


[Figure: cleanup at the end of the unserialize PHP_FUNCTION]

We return layer by layer to the original PHP_FUNCTION. There is some finishing work, the allocated memory is released, and deserialization is complete. __wakeup is not called here. To find out when __wakeup is called, we modify the demo.

[Figure: demo class with __wakeup]

A new round of debugging begins. After deserialization has finished, the call we are looking for appears in PHP_VAR_UNSERIALIZE_DESTROY, where memory is released.

[Figure: the __wakeup call during PHP_VAR_UNSERIALIZE_DESTROY]

Remember the VAR_WAKEUP_FLAG that was set when __wakeup was detected during deserialization? When the var_dtor_hash array is traversed and this flag is encountered, __wakeup is finally called. The call is made in exactly the same way as the __sleep call described earlier, so it is not repeated here. At this point the whole deserialization process is complete.

3.2.1 Summary of the unserialize process

As shown above, unlike serialization, the deserialization flow does not branch depending on whether a magic method is present. The unserialize process is as follows:

Get the string to deserialize -> dispatch on the type -> look up the corresponding class in the class table -> read the number of elements from the string -> create a new instance -> iteratively parse the rest of the string -> check whether the magic method __wakeup exists and set the flag -> release memory and check for the flag -> make the call.
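The last three steps, flag first and call later, can be sketched as follows. This C model is an assumption-based miniature: demo_var_hash, demo_entry, demo_track_object and demo_var_destroy are hypothetical names, DEMO_VAR_WAKEUP_FLAG only echoes the name of the real flag (its value here is arbitrary), and "calling __wakeup" is reduced to incrementing a counter.

```c
#include <stddef.h>

/* Hypothetical value; only the name echoes the flag in the article. */
#define DEMO_VAR_WAKEUP_FLAG 1

typedef struct {
    const char *class_name;
    int extra;            /* DEMO_VAR_WAKEUP_FLAG while __wakeup is pending */
    int wakeup_calls;     /* how many times the stand-in __wakeup ran */
} demo_entry;

typedef struct {
    demo_entry entries[8];
    size_t used;
} demo_var_hash;

/* Called while parsing: remember the object and only flag the pending
 * __wakeup instead of calling it right away. */
demo_entry *demo_track_object(demo_var_hash *h, const char *class_name,
                              int has_wakeup)
{
    demo_entry *e;

    if (h->used == 8) return NULL;
    e = &h->entries[h->used++];
    e->class_name = class_name;
    e->extra = has_wakeup ? DEMO_VAR_WAKEUP_FLAG : 0;
    e->wakeup_calls = 0;
    return e;
}

/* Called once parsing is over: walk the entries and run every pending
 * __wakeup. Returns the number of calls made. */
size_t demo_var_destroy(demo_var_hash *h)
{
    size_t i, called = 0;

    for (i = 0; i < h->used; i++) {
        if (h->entries[i].extra == DEMO_VAR_WAKEUP_FLAG) {
            h->entries[i].wakeup_calls++;   /* stand-in for __wakeup */
            h->entries[i].extra = 0;
            called++;
        }
    }
    return called;
}
```

Tracking one object with __wakeup and one without, then calling demo_var_destroy, runs the stand-in exactly once, and only at cleanup time rather than during parsing.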

4. PHP Deserialization Vulnerability

With this source-level background, let's explore CVE-2016-7124 (bypassing the __wakeup magic method).

As noted earlier, the vulnerability only affects certain versions, so we use the other PHP build compiled above (5.6.10) to reproduce and debug it.

First we reproduce the vulnerability:

[Figure: reproduction of the vulnerability with the TEST class]

The TEST class contains only one member, $a. If we change the value in the serialized string that gives the number of members, the vulnerability is triggered, and the class's __wakeup magic method is not called.

While triggering the vulnerability I also noticed something interesting: this is not the only way to trigger it.

[Figure: four payloads that trigger the vulnerability]

Deserializing any of the four payloads in the figure above triggers the vulnerability, but there are some minor differences between them. Let's modify the code slightly:

[Figure: modified code showing member assignment for each payload]

As the figure above shows, the vulnerability is triggered whenever an error occurs while the members of the class are being parsed from the serialized string. However, tampering with the content of a member (for example, the string length or the variable type, as in the figure) causes the assignment of that member to fail. Only changing the number of members (to a value larger than the actual number) guarantees that the members are still assigned successfully.

Let's look at the problem in the debugger.

Based on the analysis of the deserialization source code in section 3, we suspect that the problem lies in the final stage, where the member variables are parsed. We go straight to dynamic debugging:

[Figure: the unserialize function in PHP 5.6.10]

Compared with the 7.3.0 source, this version has no filter parameter, and after so many versions of iteration the older code looks relatively simple. The overall logic has not changed, however. We step into php_var_unserialize; the parts that are the same are not repeated, and we go straight to the part that differs, the object_common2 function, which handles the member variables of the class.

[Figure: object_common2 in PHP 5.6.10]

object_common2 does two main things: it calls process_nested_data to iteratively parse the members of the class, and it calls the __wakeup magic method. When process_nested_data fails to parse the data, object_common2 returns 0 immediately, so the __wakeup call that follows never gets a chance to run.

This explains why more than one payload triggers the vulnerability.

[Figure: the while loop in process_nested_data]

When only the number of members is changed, the while loop completes one full iteration, so the member variable of our class is assigned before parsing fails. When the content of a member is tampered with, the call to php_var_unserialize fails, and zval_dtor and FREE_ZVAL are then called to release the current key (the variable), so the member is never assigned.
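The control flow behind the bypass can be condensed into a few lines. The C sketch below models only the behavior described in the text for the vulnerable version; it is not the PHP 5.6.10 source. All demo_ names are hypothetical, parsing is reduced to a well_formed flag per member, and "calling __wakeup" is reduced to setting a field.

```c
#include <stddef.h>

typedef struct {
    const char *name;
    long value;
    int well_formed;      /* 0 models a tampered length or type */
} demo_member;

typedef struct {
    size_t assigned;      /* members successfully assigned */
    int wakeup_called;
} demo_object;

/* Parse `declared` members although only `available` are really there. */
static int demo_process_nested_data(demo_object *obj, const demo_member *m,
                                    size_t available, size_t declared)
{
    size_t i;

    for (i = 0; i < declared; i++) {
        if (i >= available || !m[i].well_formed)
            return 0;             /* parse failure */
        obj->assigned++;          /* member is assigned to the object */
    }
    return 1;
}

/* Control flow of the vulnerable version as described in the text. */
int demo_object_common2(demo_object *obj, const demo_member *m,
                        size_t available, size_t declared, int has_wakeup)
{
    if (!demo_process_nested_data(obj, m, available, declared))
        return 0;                 /* early return: __wakeup is skipped */
    if (has_wakeup)
        obj->wakeup_called = 1;   /* stand-in for calling __wakeup */
    return 1;
}
```

Three cases show the difference between the payloads. With one well-formed member and a declared count of 1, the member is assigned and the stand-in __wakeup runs. With the declared count inflated to 2, the member is still assigned but the function returns 0 before __wakeup. With a malformed member, nothing is assigned and __wakeup is skipped as well.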

In PHP 7.3.0, by contrast, __wakeup is not called at this point. The object is only flagged, and the actual call is moved to the point where the data is released. As far as I can tell from the 7.3.0 source, the failure path of object_common2 also marks the object's destructor as already called, so a partially parsed object no longer reaches __destruct. This avoids the bypass. The vulnerability therefore appears to be the result of a logic flaw.

Statement:
This article is reproduced from freebuf.com. If there is any infringement, please contact admin@php.cn to have it removed.