
Parsing PHP8 underlying kernel source code - array (3)

藏色散人 (repost)
2021-06-10 15:00:22, 2986 views

This article is part 3 of "Parsing PHP8 underlying kernel source code - array". I hope it will be useful as a reference.

Related articles: "Parsing PHP8 underlying kernel source code - array (1)", "Parsing PHP8 underlying kernel source code - array (2)", "Parsing PHP8 underlying kernel source code - array (4)"

The previous parts analyzed the basic structure of arrays in PHP and how their indexes are put together.

Arrays rely on two structures, _Bucket and _zend_array.

O(1) lookup complexity is achieved through the hash function.

But there is an index array in front of the buckets, and I fell into a lot of pitfalls while trying to understand it.

The picture below shows the bucket structure of array c, where $c = array('x'=>1,'y'=>2,'z'=>3,'a'=>0);

[Figure: bucket structure of array $c]

As mentioned above, for a packed_array the index array always has size 2 and plays no role.

This is because in a packed array the key is simply NULL, so no hash value needs to be calculated. The index array only exists to locate a bucket quickly from its h value, and a packed array has no use for that.

[Figure: the buckets of $a = array(1,2,3), which should be a packed array]

typedef struct _Bucket {
    zval         val;  // the element's value (recall that a zval is only 16 bytes)
    zend_ulong   h;    // the key's h value
    zend_string *key;  // the key itself; used when the array is a hash_array
} Bucket;

Don't let val distract you at this point. In a packed array the h value equals the element's position in the array (positions start from 0). For example, $b = array(1=>'a',3=>'b',5=>'c'); mentioned above is also a packed_array and has the following structure:

[Figure: structure of packed array $b]

Because array b does not define element 0, that slot is invalid. The content of $b[1] is 'a'; in the picture I simply marked it val=a(zval). Strictly speaking, the 16-byte zval refers to a zend_string, which in turn uses the gc structure covered earlier. The PHP kernel source is full of this kind of nesting, so you keep reviewing old material while learning the new.
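The packed case can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: `PackedSlot`, `packed_build_b` and `packed_find` are hypothetical names, the `defined` flag stands in for the engine's "invalid slot" marker, and a single `char` stands in for the zval.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a Bucket; NOT the engine's real layout. */
typedef struct {
    int      defined; /* 0 marks a hole such as $b[0] */
    char     val;     /* stand-in for the 16-byte zval */
    uint64_t h;       /* in a packed array, h equals the position */
} PackedSlot;

/* Lay out $b = array(1=>'a', 3=>'b', 5=>'c') in a caller-provided
 * buffer of at least 6 slots; returns the number of positions in use. */
size_t packed_build_b(PackedSlot *slots)
{
    static const char vals[6] = { 0, 'a', 0, 'b', 0, 'c' };
    for (size_t i = 0; i < 6; i++) {
        slots[i].defined = (vals[i] != 0);
        slots[i].val = vals[i];
        slots[i].h = i;
    }
    return 6;
}

/* Integer-key lookup: no hashing, just a bounds check and a hole check. */
const PackedSlot *packed_find(const PackedSlot *slots, size_t used, uint64_t key)
{
    if (key >= used || !slots[key].defined) {
        return NULL;
    }
    return &slots[key];
}
```

With this layout, looking up key 1 returns the slot holding 'a', while key 0 falls on a hole and returns NULL. No index table is consulted at any point.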

Now back to $c = array('x'=>1,'y'=>2,'z'=>3,'a'=>0);

Its structure is as follows:

[Figure: structure of hash array $c]

The h values here are very large. They are the hash values calculated from the key with the time33 algorithm. Why is it called a hash value? As I understand it, h is the value calculated through time33, and the hash table is then built from it.
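For reference, the basic time33 scheme (also known as DJBX33A) looks like the sketch below. The function name `time33` is my own, and this is the textbook form only; the engine's own implementation differs in details, so do not expect identical h values.

```c
#include <stddef.h>
#include <stdint.h>

/* time33 (DJBX33A): start from 5381, then fold in each byte
 * as hash = hash * 33 + byte. Overflow simply wraps around. */
uint64_t time33(const char *key, size_t len)
{
    uint64_t hash = 5381;
    for (size_t i = 0; i < len; i++) {
        hash = hash * 33 + (unsigned char)key[i];
    }
    return hash;
}
```

Even a one-character key such as "x" already produces a six-digit number, which is why the h values in the figure look so large.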


A hash table consists mainly of two parts: an array that stores the elements, and a hash function. A simple hash function can use the remainder method. For example, if the size of the hash table is 8, initializing the table allocates space for 8 elements. The key's hash code is then taken modulo 8, and the remainder is the index of the element in the array. In this way a key is mapped to a specific location in the storage array.


But implementing arrays directly in this way has a problem: an element's position in the storage array is decided by its hash, so the elements are unordered.
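The remainder method and the ordering problem it causes can be sketched as follows. `Cell`, `slot_for`, `direct_put` and `direct_walk` are hypothetical names chosen for this illustration, and collisions are deliberately left unhandled.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8u

typedef struct {
    int      used;
    uint64_t h;
    int      value;
} Cell;

/* Remainder method: map a hash code to a slot in [0, TABLE_SIZE). */
uint32_t slot_for(uint64_t h)
{
    return (uint32_t)(h % TABLE_SIZE);
}

/* Store the value directly in the slot chosen by the hash.
 * Returns the slot, or -1 if it is already occupied. */
int direct_put(Cell *table, uint64_t h, int value)
{
    uint32_t slot = slot_for(h);
    if (table[slot].used) {
        return -1;
    }
    table[slot].used = 1;
    table[slot].h = h;
    table[slot].value = value;
    return (int)slot;
}

/* Walk the slots front to back and copy out the stored values. */
size_t direct_walk(const Cell *table, int *out)
{
    size_t n = 0;
    for (uint32_t i = 0; i < TABLE_SIZE; i++) {
        if (table[i].used) {
            out[n++] = table[i].value;
        }
    }
    return n;
}
```

Inserting hash code 13 and then hash code 2 places them in slots 5 and 2, so walking the table yields the second value before the first. The insertion order is lost.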

Arrays in PHP are ordered, so PHP adds an index table between the hash function and the element array. The index table is itself an array, the same size as the array in which the elements are stored, but its entries are always integers: each one holds a subscript into the element array. Elements are appended to the element array in insertion order, and the position where each element lands is stored in the index-table slot chosen by the hash function.


The first step is to calculate the slot from the hash, here 4, and find slot -4 in the index table (the index table sits in front of the buckets, so its slots are addressed with negative offsets). Because this is the first element, at position 0, slot -4 of the index table is set to 0, and element 0 of the actual storage array is set to the assigned zval.
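The insert step just described can be sketched like this. `MiniTable`, `mini_slot` and `mini_add` are hypothetical names, an `int` stands in for the zval, and collisions are rejected rather than handled. The OR with a mask is meant to mirror, as far as I understand it, how the engine combines h with nTableMask to get a negative slot number.

```c
#include <stdint.h>

#define TABLE_SIZE  8u
#define TABLE_MASK  ((uint32_t)-TABLE_SIZE)   /* 0xFFFFFFF8 */
#define INVALID_IDX ((uint32_t)-1)

typedef struct {
    uint64_t h;
    int      val;   /* stand-in for the zval */
} MiniBucket;

typedef struct {
    uint32_t   slots[TABLE_SIZE];    /* index table, addressed from its end */
    MiniBucket buckets[TABLE_SIZE];  /* elements, in insertion order */
    uint32_t   used;
} MiniTable;

void mini_init(MiniTable *t)
{
    for (uint32_t i = 0; i < TABLE_SIZE; i++) {
        t->slots[i] = INVALID_IDX;
    }
    t->used = 0;
}

/* OR-ing with the mask keeps the low 3 bits of h and sets all higher
 * bits, giving -8..-1 when read as a signed (two's complement) value. */
int32_t mini_slot(uint64_t h)
{
    return (int32_t)((uint32_t)h | TABLE_MASK);
}

/* Append the element, then record its position in the index table.
 * Returns the position, or -1 if the table is full or the slot is taken. */
int mini_add(MiniTable *t, uint64_t h, int val)
{
    uint32_t *slots_end = t->slots + TABLE_SIZE;
    int32_t off = mini_slot(h);
    if (t->used == TABLE_SIZE || slots_end[off] != INVALID_IDX) {
        return -1;
    }
    t->buckets[t->used].h = h;
    t->buckets[t->used].val = val;
    slots_end[off] = t->used;
    return (int)t->used++;
}
```

Adding an element whose h is 4 to an empty table gives slot -4, writes 0 into that slot, and stores the element at position 0 of the bucket array, matching the walk-through above.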

The keys of different elements in the hash table may end up with the same calculated hash value. They then point to the same slot in the index table, and a hash conflict occurs. Because a slot in the index table can only store one element, PHP resolves hash conflicts with the zipper method (chaining), which links the conflicting values into a linked list. You can refer to the picture below, from "PHP7 Kernel Analysis" by Qin Peng.

[Figure: hash conflict chain, from "PHP7 Kernel Analysis" by Qin Peng]
In the normal case val.u2.next is -1, its initial value. Once a hash conflict occurs, the value here instead points to the position in the array of the element that occupied the slot before the conflict.
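A minimal sketch of this chaining, assuming a plain `next` field in place of val.u2.next and a simple remainder in place of the engine's slot calculation. `ChainTable`, `chain_add` and `chain_find` are hypothetical names, and duplicate keys are not checked.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE  8u
#define INVALID_IDX ((uint32_t)-1)

typedef struct {
    uint64_t h;
    int      val;
    uint32_t next;  /* previous occupant of the same slot, or INVALID_IDX */
} ChainBucket;

typedef struct {
    uint32_t    slots[TABLE_SIZE];
    ChainBucket buckets[TABLE_SIZE];
    uint32_t    used;
} ChainTable;

void chain_init(ChainTable *t)
{
    for (uint32_t i = 0; i < TABLE_SIZE; i++) {
        t->slots[i] = INVALID_IDX;
    }
    t->used = 0;
}

/* Append the element; its next takes over whatever the slot held
 * before (-1 when there is no conflict), then the slot points at it. */
int chain_add(ChainTable *t, uint64_t h, int val)
{
    if (t->used == TABLE_SIZE) {
        return -1;
    }
    uint32_t slot = (uint32_t)(h % TABLE_SIZE);
    uint32_t pos = t->used++;
    t->buckets[pos].h = h;
    t->buckets[pos].val = val;
    t->buckets[pos].next = t->slots[slot];
    t->slots[slot] = pos;
    return (int)pos;
}

/* Follow the chain from the slot until h matches or the chain ends. */
const ChainBucket *chain_find(const ChainTable *t, uint64_t h)
{
    uint32_t pos = t->slots[h % TABLE_SIZE];
    while (pos != INVALID_IDX) {
        if (t->buckets[pos].h == h) {
            return &t->buckets[pos];
        }
        pos = t->buckets[pos].next;
    }
    return NULL;
}
```

Hash codes 3 and 11 both land in slot 3. After both are added, the slot points at position 1, whose next is 0, and position 0 keeps next at -1, so both elements stay reachable while the bucket array preserves insertion order.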

This article was published on the PHP Chinese website with the consent of the original author, PHP Cui Xuefeng. Original address: https://zhuanlan.zhihu.com/p/360952022

