
Parsing PHP8 underlying kernel source code - array (2)

藏色散人 (reposted)
2021-06-10 14:50:13

This article is part 2 of the series "Analysis of PHP8 underlying kernel source code - array". It may be a useful reference for readers studying how PHP arrays are implemented.

Recommended related articles: "Analysis of PHP8 underlying kernel source code - array (1)", "Analysis of PHP8 underlying kernel source code - array (3)", "Analysis of PHP8 underlying kernel source code - array (4)"

zend_array comes in two kinds in PHP:

1. packed array
2. hash array

The previous article added comments for every member of zend_array.

The member order in the source code differs slightly from the order I used there; I think my order is easier to follow.

// The code as it appears in the source
typedef struct _zend_array HashTable;
struct _zend_array {
    zend_refcounted_h gc;
    union {
        struct {
            ZEND_ENDIAN_LOHI_4(
                zend_uchar    flags,
                zend_uchar    _unused,
                zend_uchar    nIteratorsCount,
                zend_uchar    _unused2)
        } v;
        uint32_t flags;
    } u;
    uint32_t          nTableMask;
    Bucket           *arData;
    uint32_t          nNumUsed;
    uint32_t          nNumOfElements;
    uint32_t          nTableSize;
    uint32_t          nInternalPointer;
    zend_long         nNextFreeElement;
    dtor_func_t       pDestructor;
};
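// The same struct after I reordered the members
struct _zend_array {
    zend_refcounted_h gc;
    // gc occupies 8 bytes and holds the reference count and the type information.
    union {
        struct {
            ZEND_ENDIAN_LOHI_4(
                zend_uchar    flags,
                // flags: an 8-bit unsigned char (maximum value 255) that marks
                // properties of the HashTable. PHP 8 defines 6 flag values.
                zend_uchar    _unused,
                zend_uchar    nIteratorsCount,
                // Iterator count. A foreach statement creates an iterator in the
                // executor globals (EG). The iterator holds the HashTable being
                // traversed and the cursor position. nIteratorsCount records how
                // many iterators are currently iterating over this HashTable.
                zend_uchar    _unused2)
        } v;
        // Note: in Chen Lei's book, struct v also contains u.v.nApplyCount
        // and u.v.consistency; the PHP 8 source differs here.
        uint32_t flags;
    } u;
    // u is a union that occupies 4 bytes. It can be read either as one
    // uint32_t flags or as struct v, which consists of 4 unsigned chars.
    // The ZEND_ENDIAN_LOHI_4 macro only handles big-endian versus
    // little-endian platforms and can be ignored here.
    Bucket           *arData;
    // Pointer to the storage units of the HashTable: the buckets that hold
    // each key, its value and auxiliary information.
    uint32_t          nTableSize;
    // Size of the HashTable: the size of the bucket array that arData points
    // to, that is, the total number of buckets. The value is always a power
    // of two (2^n); the minimum is 8 and the maximum on 64-bit systems is
    // 0x80000000 (2 to the 31st power).
    uint32_t          nNumUsed;
    // Number of used buckets, counting both valid and invalid buckets.
    uint32_t          nNumOfElements;
    // Number of valid buckets. Always less than or equal to nNumUsed.
    uint32_t          nTableMask;
    // Index mask. Its value is generally -nTableSize.
    uint32_t          nInternalPointer;
    // Default global cursor, used by reset/key/current/next/prev and the
    // related macros and operations.
    zend_long         nNextFreeElement;
    // Integer key that the next appended element will receive.
    // For example, after $a[] = 1 on an empty array, nNextFreeElement is 1.
    dtor_func_t       pDestructor;
    // Pointer to a function: typedef void (*dtor_func_t)(zval *pDest);
    // pDest is a pointer to the zval stored in the bucket.
    // When a bucket element is updated or deleted, this function is called
    // on the bucket's value. If the value is a reference-counted type, its
    // reference count is decremented by 1, which may in turn trigger gc.
};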

[Figure: member-variable diagram of zend_array, generated with the Understand tool]

[Figure: zend_array structure members, fully expanded]

As the diagram shows, the core types are zval, zend_string, zend_refcounted_h and Bucket, nested layer upon layer.

The Bucket stores the essential data of the array:

typedef struct _Bucket {
    zval              val;   // the element's value (recall that a zval is only 16 bytes)
    zend_ulong        h;     // the h value of the key
    zend_string      *key;   // the key itself; used when the array is a hash_array
} Bucket;

Whether the array is a packed_array or a hash_array, its elements are ultimately stored in Buckets.

When all keys are integers and the keys increase in insertion order, the array is a packed_array.

Characteristics of a packed array

  1. No index array is needed
  2. No key is stored
  3. For an element written without a key, h equals the element's position in the bucket array, counting from 0
  4. For an element written as a key-value pair, h equals the key itself

Items 3 and 4 mean that when a PHP array is written without keys, the default keys are assigned in order starting from 0.

$a =array(1,2,3);  // packed array
$b =array(1=>'a',3=>'b',5=>'c'); //packed array

An index array is placed in memory immediately before the bucket array.

For a packed array the index array always has size 2, because it is not used.

The zend_array corresponding to $a contains the following:

[Figure: $a's zend_array]

nTableSize: the size of the bucket array pointed to by arData, that is, the total number of buckets (the total capacity of the array).

nNumUsed: the number of used buckets, counting both valid and invalid buckets.

nNumOfElements: the number of valid buckets.

So nNumOfElements <= nNumUsed <= nTableSize.

nTableMask: the index mask. Because a packed array does not use the index array, it is always -2.

nNextFreeElement: the integer key that the next appended element will receive.

A packed array takes advantage of the contiguity of the bucket array; it is an optimization for arrays that have only integer keys. Because the index array is no longer needed, (nTableSize - 2) * sizeof(uint32_t) bytes of memory are saved. In addition, because an element is accessed by indexing the bucket array directly, performance also improves.

If the conditions for a packed array are not met, PHP represents the array as a hash_array.

Any array whose keys are not integers is represented as a hash_array.

$c =array('x'=>1,'y'=>2,'z'=>3,'a'=>0);

The $c above is represented as a hash_array.

Its buckets are as follows:

[Figure: $c buckets]

Its zend_array is as follows:

[Figure: $c zend_array]

nTableSize: the size of the bucket array pointed to by arData, that is, the total number of buckets. Here it is 8.

nNumUsed: the number of used buckets, counting both valid and invalid buckets. Here it is 4.

nNumOfElements: the number of valid buckets. Here it is 4.

So nNumOfElements <= nNumUsed <= nTableSize.

nTableMask: the index mask. Here it is -8.

nNextFreeElement: the integer key that the next appended element will receive. In a hash_array it stays 0 as long as no integer key has been used.

This article was published on the PHP Chinese website with the consent of the original author, PHP Cui Xuefeng. Original address: https://zhuanlan.zhihu.com/p/358354087
