PHP 8 new features JIT understanding

Guanhui
2020-05-29 09:24:40

PHP 8's JIT (Just In Time) compiler is integrated into PHP as part of the Opcache extension. Its purpose is to translate certain opcodes directly into CPU instructions at runtime.

This means that with the JIT, the Zend VM does not need to interpret those opcodes; they are executed directly as CPU-level instructions.

PHP 8’s JIT

The impact of the PHP 8 Just In Time (JIT) compiler is unquestionable. But so far, I've found that very little is known about what JIT is supposed to do.

After a lot of research, and after giving up more than once, I decided to read the PHP source code myself. Combining my limited knowledge of C with all the bits and pieces of information I had gathered so far, I came up with this article, which I hope will help you understand PHP's JIT better.

To put it simply: When the JIT works as expected, your code is not executed through the Zend VM, but directly as a set of CPU-level instructions.

That's the whole idea.

But to understand it better, we need to look at how PHP works internally. It is not very complicated, but it does need some introduction.

How is PHP code executed?

As we all know, PHP is an interpreted language. But what does that actually mean?

Every time PHP code (a command line script or a web application) is executed, it must go through a PHP interpreter. The most commonly used ones are PHP-FPM and the CLI interpreter.

The interpreter's job is simple: receive PHP code, interpret it, and return the result.

Interpreted languages generally follow this process. Some languages may drop a few steps, but the general idea is the same. In PHP, the process is as follows:

1. The PHP code is read and turned into a set of keywords called Tokens. This lets the interpreter know which pieces of code were written in which part of the program. This step is called Lexing or Tokenizing.
2. With the Tokens in hand, the PHP interpreter tries to parse them. The process called Parsing produces an abstract syntax tree (AST): a set of nodes describing which operations to perform. For example, "echo 1 + 1" really means "print the result of 1 + 1", or more precisely "print an operation, and that operation is 1 + 1".
3. With the AST, operations and their precedence are much easier to understand. Turning the tree into something that can be executed requires an intermediate representation (IR), which in PHP we call Opcodes. The process of converting the AST into Opcodes is called compilation.
4. With the Opcodes ready, here comes the fun part: executing the code! PHP has an engine called the Zend VM, which receives a sequence of Opcodes and executes them. After all Opcodes have been executed, the Zend VM terminates the program.
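The first step above can be observed from PHP itself. The following is a minimal sketch, assuming the tokenizer extension is available (it is enabled by default); describe_tokens() is a hypothetical helper name introduced only for this illustration. It runs the Lexing step on the "echo 1 + 1" example and lists the resulting Tokens.

```php
<?php
// Hypothetical helper: returns one [name, text] pair per token found in $source.
function describe_tokens(string $source): array
{
    $rows = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array tokens have the shape [token id, text, line number].
            [$id, $text] = $token;
            $rows[] = [token_name($id), $text];
        } else {
            // Single-character tokens such as "+" or ";" come back as plain strings.
            $rows[] = ['(char)', $token];
        }
    }
    return $rows;
}

// Lex the example from the text and print one token per line.
foreach (describe_tokens('<?php echo 1 + 1;') as [$name, $text]) {
    printf("%-14s %s\n", $name, json_encode($text));
}
```

The later steps (AST and Opcodes) are not exposed by a built-in userland function in the same way, so this sketch stops at the Tokens.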

The following picture may make this clearer:

[Figure: a simplified overview of the PHP interpretation process.]

As you may have noticed, this raises a question: if the PHP code has not changed, does the whole process still have to run every time the code is executed?

In the end, all we really care about are the Opcodes, right? Right! This is why the Opcache extension exists.

Opcache Extension

The Opcache extension comes with PHP and there is usually no need to disable it. When using PHP it is best to turn on Opcache.

It adds a shared in-memory cache layer for Opcodes. Its job is to take the Opcodes freshly generated from the AST and cache them, so that later executions can skip the Lexing/Tokenizing and Parsing steps.

This is a process diagram including the Opcache extension:

[Figure: PHP's interpretation process with Opcache. If the file has already been parsed, PHP fetches its cached Opcodes instead of parsing it again.]

This neatly skips the Lexing/Tokenizing, Parsing and Compiling steps.
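Whether a given file is currently served from the Opcode cache can be checked at runtime. This is a small sketch, assuming the Opcache extension is loaded; is_script_cached() is a hypothetical wrapper written for this example, while opcache_is_script_cached() and opcache_get_status() are the extension's own functions.

```php
<?php
// Hypothetical wrapper: reports whether Opcache holds compiled Opcodes for $path.
function is_script_cached(string $path): bool
{
    // The underlying function only exists when the Opcache extension is loaded.
    if (!function_exists('opcache_is_script_cached')) {
        return false;
    }
    return opcache_is_script_cached($path);
}

// opcache_get_status() may return false when Opcache is not active,
// for example on the CLI without opcache.enable_cli=1.
$status  = function_exists('opcache_get_status') ? opcache_get_status(false) : false;
$enabled = is_array($status) && !empty($status['opcache_enabled']);

printf("Opcache enabled:  %s\n", $enabled ? 'yes' : 'no');
printf("This file cached: %s\n", is_script_cached(__FILE__) ? 'yes' : 'no');
```

Under PHP-FPM, the second line would typically switch to "yes" from the second request onward, since the first request is the one that compiles and stores the Opcodes.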

Side note: this is where the awesome Preloading feature of PHP 7.4 shines! It allows you to tell PHP-FPM to parse your code base, convert it into Opcodes and cache them before anything is executed.

You may be wondering where the JIT fits into this interpretation process. That is what the rest of this article explains.

What is the effect of Just In Time compilation?

After listening to Zeev's episode about PHP and JIT on PHP Internals News, I figured out what the JIT actually does.

While the Opcache extension makes it faster to obtain Opcodes so they can go straight to the Zend VM, the JIT lets them run without using the Zend VM at all.

The Zend VM is a program written in C that acts as a layer between Opcodes and the CPU. The JIT generates compiled code directly at runtime, so PHP can skip the Zend VM and be executed directly by the CPU. In theory, performance will be better.

This may sound strange, because compiling to machine code requires a specific implementation for each CPU architecture. In practice, though, it is quite feasible.

PHP's JIT uses a library called DynASM (Dynamic Assembler), which maps a set of CPU instructions written in one specific format into assembly code for many different CPU types. Therefore, the compiler only needs to use DynASM to convert Opcodes into machine code for a specific architecture.

However, there is a problem that has troubled me for a long time.

If preloading can parse PHP code into Opcodes before execution, and DynASM can compile Opcodes into machine code (Just In Time compilation), why don't we simply compile PHP up front, using Ahead of Time compilation?

One of the reasons, which I learned from Zeev's episode, is that PHP is a weakly typed language. This means PHP often doesn't know the type of a variable until the Zend VM tries to execute a certain opcode.

You can look at the zend_value union type to see that many of its members are pointers to different kinds of values. Whenever the Zend VM needs to get a value out of a zend_value, it uses a macro like ZSTR_VAL to reach the string that the union points to.

For example, consider the Zend VM handler for "less than or equal to" (<=) expressions. It contains a large number of if/else branches just to work out the types of its operands.
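The reason the handler needs so many branches can be seen from userland. This is an illustrative sketch only; less_or_equal() is a hypothetical function made up for this example. The function body is the same single comparison every time, yet the operand types differ on each call and are only known when the call actually runs.

```php
<?php
// One comparison in the source, but the operand types are not known
// until each call is executed.
function less_or_equal($a, $b): bool
{
    return $a <= $b;
}

$calls = [
    [1, 2],            // int and int
    [1.5, 2],          // float and int
    ['abc', 'abd'],    // two non-numeric strings
    ['10', 9],         // numeric string and int
    [[1, 2], [1, 3]],  // two arrays
];

foreach ($calls as [$a, $b]) {
    printf(
        "%-8s <= %-8s => %s\n",
        gettype($a),
        gettype($b),
        var_export(less_or_equal($a, $b), true)
    );
}
```

Each of these calls takes a different path through the comparison logic, which is exactly the kind of decision an ahead-of-time compiler could not make without knowing the types in advance.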

Duplicating this type inference logic in machine code is not feasible, and could even make things slower.

Compiling everything once the types have been evaluated is not a good option either, because compiling to machine code is a CPU-intensive task. So compiling everything at runtime is also a bad idea.

So how does the Just In Time compiler behave?

We now know that types cannot be inferred well enough to compile ahead of time. We also know that compiling at runtime is computationally expensive. So how can the JIT benefit PHP?

To strike a balance, PHP's JIT tries to compile only the Opcodes it considers worthwhile. To do this, it profiles the Opcodes being executed by the Zend VM and checks which ones make sense to compile (depending on your configuration).

When an Opcode has been compiled, execution is handed to the compiled code instead of to the Zend VM. It looks like this:

[Figure: PHP's interpretation process with the JIT. Opcodes that have been compiled are not executed by the Zend VM.]

So, inside the Opcache extension, there are a couple of checks that determine whether a given Opcode should be compiled. If so, the compiler uses DynASM to convert the Opcode into machine code and then executes that machine code.

Interestingly, since the current implementation has a limit in megabytes on the amount of compiled code (also configurable), code execution must be able to switch seamlessly between JIT-compiled and interpreted code.
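The configuration mentioned above can be inspected from a running script. The sketch below assumes PHP 8 with Opcache loaded; opcache.jit and opcache.jit_buffer_size are the ini settings that select the JIT mode and the size limit for compiled code. The key names read from the "jit" section of the status array are stated as I understand them, so every access is guarded.

```php
<?php
// Collect the JIT-related ini settings. ini_get() returns false for settings
// that do not exist (Opcache not loaded, or a PHP version older than 8.0).
$settings = [];
foreach (['opcache.enable', 'opcache.enable_cli', 'opcache.jit', 'opcache.jit_buffer_size'] as $name) {
    $value = ini_get($name);
    $settings[$name] = $value === false ? '(not available)' : $value;
}

foreach ($settings as $name => $value) {
    printf("%-26s = %s\n", $name, $value);
}

// When available, the status array has a "jit" section describing buffer usage.
$status = function_exists('opcache_get_status') ? opcache_get_status(false) : false;
if (is_array($status) && isset($status['jit'])) {
    $jit = $status['jit'];
    printf(
        "JIT enabled: %s, buffer free: %d of %d bytes\n",
        !empty($jit['enabled']) ? 'yes' : 'no',
        $jit['buffer_free'] ?? 0,
        $jit['buffer_size'] ?? 0
    );
}
```

A buffer size of 0 means no memory is reserved for compiled code, so everything keeps running through the Zend VM.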

By the way, this talk by Benoit Jacquemont on the JIT in PHP helped me a lot in understanding this whole thing.

I'm still not sure when exactly the compilation step takes place, but for now I don't think I really want to know.

So your performance gains probably won't be huge

I hope it is clear now why most PHP applications won't see big performance gains from using a just-in-time compiler. This is why Zeev recommends profiling your application and experimenting with different JIT configurations as the best approach.

If you're using PHP-FPM, compiled Opcodes are normally shared across multiple requests, but that is still not a game changer.

This is because the JIT optimizes computationally intensive operations, and most PHP applications today are more I/O bound than anything else. If you're going to access the disk or the network anyway, it doesn't matter much whether the processing is compiled or not. The timing will be very similar.

Unless...

You are doing something that is not I/O bound, like image processing or machine learning. Anything that doesn't touch I/O will benefit from a JIT compiler.
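To experiment with this, a purely CPU-bound script is a convenient starting point. The following is a toy workload, not a benchmark from the original article; count_primes() is a hypothetical function chosen only because it performs no I/O inside the timed section.

```php
<?php
// CPU-bound toy workload: count the primes below $limit by trial division.
function count_primes(int $limit): int
{
    $count = 0;
    for ($n = 2; $n < $limit; $n++) {
        $isPrime = true;
        for ($d = 2; $d * $d <= $n; $d++) {
            if ($n % $d === 0) {
                $isPrime = false;
                break;
            }
        }
        if ($isPrime) {
            $count++;
        }
    }
    return $count;
}

$start     = hrtime(true);                      // monotonic clock, in nanoseconds
$primes    = count_primes(200000);
$elapsedMs = (hrtime(true) - $start) / 1e6;     // convert to milliseconds

printf("%d primes found in %.1f ms\n", $primes, $elapsedMs);
```

Running the script once with opcache.enable_cli=1 and a non-zero opcache.jit_buffer_size, and once with opcache.jit_buffer_size=0, gives a rough idea of what the JIT does for this kind of code on your machine. The numbers depend heavily on the hardware and configuration, so treat them as a starting point for your own profiling.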

This is also why people now say we are getting closer to writing native functions in PHP instead of in C. If such functions get compiled anyway, the overhead becomes negligible.

Statement:
This article is reproduced from juejin.im. If there is any infringement, please contact admin@php.cn to have it removed.