Detailed explanation of redis compressed linked list ziplist source code-JS Tutorial-php.cn

The compressed list (ziplist) is a list composed of a series of specially encoded memory blocks. It plays a very important role in Redis data storage optimization. This article mainly shares with you the ziplist, a data structure that is used a lot in redis. It is no exaggeration to say that this data structure is ubiquitous in redis. In addition to linked lists, many other data structures also use it for transition, such as the SortedSet mentioned in the previous article. Not much to say below, let’s take a look at the detailed introduction.

1. Introduction to compressed linked list ziplist data structure

First, look at the structure of ziplist as a whole, as shown below:

Detailed explanation of redis compressed linked list ziplist source code

Compressed linked list ziplist structure diagram

It can be seen that there are many fields and different byte sizes, but this is the essence of compressed linked lists. Let's summarize them one by one.

Field	Meaning
zlbytes	This field is the first field of the compressed linked list, is an unsigned integer, and occupies 4 bytes. Used to represent the number of bytes occupied by the entire compressed linked list (including itself).
zltail	Unsigned integer, occupies 4 bytes. Used to store the offset from the head of the compressed linked list to the last entry (not the tail element zlend), used in scenarios where you can quickly jump to the end of the linked list.
zllen	Unsigned integer type, occupying 2 bytes. Used to store the total number of entries contained in the compressed linked list.
zlend	Special entry used to indicate the end of the compressed linked list. Occupies one byte and the value is always 255.

It is summarized as the head and tail of the ziplist, and the most important entry field is summarized below.

Generally speaking, an entry consists of three fields: prevlen, encoding, and entry-data. However, when the entry is a small integer, the entry-data field will be omitted based on the encoding. The following is a summary:

The first is the field prevlen: it represents the length of the previous entry. There are two encoding methods.

When the length is less than 255 bytes, it is stored in one byte.
When the length is greater than or equal to 255, five bytes are used for storage, and the first byte will be set to 255, indicating that the length of the previous entry is represented by the following four bytes.

Then there is the field encoding: it will use different encoding methods depending on the content of the current element, as follows:

1. If the element content is a string, the encoding values are:

00xx xxxx: Starting with 00 indicates that the length of the string is represented by 6 bits.

01xx xxxx |

1000 0000 | xxxx xxxx | xxxx xxxx | xxxx xxxx |

2. If the element content is a number, the encoding values are:

1100 0000: Indicates that the number occupies the next 2 bytes.

1101 0000: Indicates that the number occupies the next 4 bytes.

1110 0000: Indicates that the number occupies the next 8 bytes.

1111 0000: Indicates that the number occupies the next 3 bytes.

1111 1110: Indicates that the number occupies the next byte.

1111 1111: Indicates the last element in the compressed linked list (special encoding).

1111 xxxx : Indicates that only the last 4 digits are used to represent integers from 0 to 12. Since 0000, 1110 and 1111 are already occupied, that is to say, the four digits of xxxx here can only represent 0001 to 1101. When converted to decimal, they are the numbers 1 to 13. However, redis stipulates that it is used to represent 0~12, so when encountering this encoding, we need to take out the last four digits and subtract 1 to get the correct value.

Finally, there is the field entry-data: if the value of the element is a string, the value of the element itself is saved. If the value of the element is a very small number (0~12 according to the above encoding rules), there is no such field.

The encoding of compressed linked lists is very complicated, but this is also the essence of this data structure. Let’s take a look at an example:

Note: This example is

//由元素2，5组成的压缩链表
[0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [ff]
 |  |  | | | |
 zlbytes zltail entries "2" "5" end
//字符串"Hello World"编码后的内容
[02] [0b] [48 65 6c 6c 6f 20 57 6f 72 6c 64]

Copy after login

mentioned in the redis source code The above is a compressed linked list consisting of two elements, 2 and 5, expressed in hexadecimal.

First of all, the first four bytes represent the number of bytes occupied by the entire ziplist. Because redis uses little-endian storage, 15 bytes are represented as 0f 00 00 00.
The next 4 bytes represent the offset of the last element, which is the offset from the ziplist header (zlbytes) to the last element (note: not the tail node). It is also stored in little endian, so it is expressed as 0c 00 00 00.
Next is the zllen field consisting of two bytes that represents the number of elements. In our example, there are two elements, plus little-endian storage, so it is expressed as 02 00.
Next is the element itself. First, a variable-length word end represents the length of the previous element. 2 is the first element. The size of the previous element is 0, so it occupies a byte of 00. According to the encoding rules we mentioned above, elements 2 and 5 belong to numbers between 0 and 12, and you need to use 1111 Encoded in xxxx format, 2 is converted into binary as 0010, plus 1 is added to 0011 (the reason for adding 1 has been explained above), and the combined element 2 is represented as 00 f3. Similarly, element 5 is represented as 02 f6.
The last element is the tail element, using the unoccupied code 1111 1111, which is 255.

Next, we insert a string element Hello at the end of this compressed linked list World, first look at how to encode the string. According to the agreed encoding rules, first use bytes to represent the length of the previous element. The previous element here is 5, which occupies a total of two bytes, so first use a word The section represents the length of the previous element, 02. Next is the encoding of the string. The length of the string we want to add is 11 (spaces are also counted). When converted to binary, it is 1011. According to the encoding rules of the string, use 0000 1011 means that when converted to hexadecimal, it is 0b. Finally, adding the ASCII code of our string itself, we get the encoding of the string: [02] [0b] [48 65 6c 6c 6f 20 57 6f 72 6c 64].

At this time, the entire compressed linked list becomes:

[0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [02 0b 48 65 6c 6c 6f 20 57 6f 72 6c 64] [ff]
 |  |  | | |   |   |
 zlbytes zltail entries "2" "5"   "Hello World"  end

Copy after login

2. Compression linked list ziplist command source code analysis

After understanding the above encoding rules, let's take a look at some of the operation source codes of the compressed linked list ziplist. This article summarizes the basic principles of compressed linked lists through the four operations of creating a compressed linked list, inserting elements, deleting elements, and searching for elements.

The first is to create:

//定义由zlbytes，zltail跟zllen组成的压缩链表的头大小
#define ZIPLIST_HEADER_SIZE (sizeof(uint32_t)*2+sizeof(uint16_t))
//创建一个压缩链表，并且返回指向该链表的指针
unsigned char *ziplistNew(void) {
 //这里之所以+1是因为尾元素占用一个字节，这也是一个压缩链表最小尺寸
 unsigned int bytes = ZIPLIST_HEADER_SIZE+1;
 //分配内存
 unsigned char *zl = zmalloc(bytes);
 //设置链表大小
 ZIPLIST_BYTES(zl) = intrev32ifbe(bytes);
 //设置最后一个元素的偏移量
 ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(ZIPLIST_HEADER_SIZE);
 //设置元素个数
 ZIPLIST_LENGTH(zl) = 0;
 //设置尾元素（上面只是申请空间）
 zl[bytes-1] = ZIP_END;
 return zl;
}

Copy after login

The logic of creating a compressed linked list is very simple, which is to apply for a fixed space containing the head and tail nodes, and then initialize the linked list context.

与创建相比，添加元素的源码就非常冗长了，为了便于理解，在看源码之前我们先自己梳理一下添加元素的逻辑。

首先我们要找到指定插入位置的前一个元素的大小，因为该属性是新元素的组成部分之一。
然后我们要对当前元素进行编码来获得相应的encoding字段跟实际元素值的字段。
新插入元素的后继元素的prevlen字段要更新，因为它前面的元素已经改变。这里可能引起级联更新（删除元素也有该问题），原因就是prevlen字段大小是可变的。

上面三步是核心步骤，其余的还有更新尾节点偏移量，修改链表元素个数等操作。当然，由于压缩链表是基于数组实现的，因此在插入或删除元素的时候内存拷贝也是必不可少的。

总结好上面的步骤以后，我们开始一步一步分析源码，比较长，慢慢看：

//四个参数依次是：压缩链表，插入位置（新元素插入p元素后面），元素值，元素长度
unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
 //这里是保存当前链表的长度
 size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen;
 unsigned int prevlensize, prevlen = 0;
 size_t offset;
 int nextdiff = 0;
 unsigned char encoding = 0;
 long long value = 123456789;
 zlentry tail;

 //1. 这段逻辑目的就是获取前置元素的长度
 if (p[0] != ZIP_END) {
 //如果插入位置的元素不是尾元素,则获取该元素的长度
 //这里为了后面使用方便进行了拆分，prevlensize保存encoding字段的长度，prevlen保存元素本身的长度
 ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
 } else {
 //如果插入位置的元素是尾元素，那么需要把新元素插入链表尾端
 //获取到链表最后一个元素（注：最后一个元素不等于尾元素）
 unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);

 if (ptail[0] != ZIP_END) {
  //如果最后一个元素不是尾元素，则该元素为新元素的前置元素，获取该元素长度
  prevlen = zipRawEntryLength(ptail);
 }
 //否则说明链表还没有任何元素，即新元素的前置元素长度为0
 }

 //2. 对新元素进行编码，获取新元素的总大小
 if (zipTryEncoding(s,slen,&value,&encoding)) {
 //如果是数字，则按数字进行编码
 reqlen = zipIntSize(encoding);
 } else {
 //元素长度即为字符串长度
 reqlen = slen;
 }
 //新元素总长度为值的长度加上前置元素跟encoding元素的长度
 reqlen += zipStorePrevEntryLength(NULL,prevlen);
 reqlen += zipStoreEntryEncoding(NULL,encoding,slen);

 //如果插入的位置不是链表尾，则要对新元素的后续元素的prevlen字段进行判断
 //根据上面的编码规则，该字段可能需要扩容
 int forcelarge = 0;
 nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;
 if (nextdiff == -4 && reqlen <p>
	分析完插入元素的逻辑，长舒一口气，真的很长，细节也很多。</p><p>
	接下来在再看下删除元素的过程，与添加相比，删除相对要简单一些，清空当前元素以后，需要把后继元素一个一个拷贝上来（这也是数组跟链表两个数据结构的差别），然后注意是否需要级联更新，上代码：</p><pre class="brush:php;toolbar:false">//参数依次为：压缩链表，删除元素的其实位置，删除元素的个数
unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) {
 unsigned int i, totlen, deleted = 0;
 size_t offset;
 int nextdiff = 0;
 zlentry first, tail;
 //读取p指向的元素保存在first中
 zipEntry(p, &first);
 for (i = 0; p[0] != ZIP_END && i  0) {
  if (p[0] != ZIP_END) {
   //判断元素大小是否有改变
   nextdiff = zipPrevLenByteDiff(p,first.prevrawlen);
   //修改删除元素之后的元素的prevlen字段
   p -= nextdiff;
   zipStorePrevEntryLength(p,first.prevrawlen);
   //更新末尾元素的偏移量
   ZIPLIST_TAIL_OFFSET(zl) =intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))-totlen);
   //当删除元素的后继元素不止有一个时，新的末尾元素偏移量需要加上nextdiff
   zipEntry(p, &tail);
   if (p[tail.headersize+tail.len] != ZIP_END) {
    ZIPLIST_TAIL_OFFSET(zl) =
     intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
   }
   //把后面剩余的元素移动至前面
   memmove(first.p,p,
    intrev32ifbe(ZIPLIST_BYTES(zl))-(p-zl)-1);
  } else {
   //直接删除到链表末尾，因此不需要内存拷贝，只需修改最后一个元素的偏移量
   ZIPLIST_TAIL_OFFSET(zl) =
    intrev32ifbe((first.p-zl)-first.prevrawlen);
  }
  //resize数组大小
  offset = first.p-zl;
  zl = ziplistResize(zl, intrev32ifbe(ZIPLIST_BYTES(zl))-totlen+nextdiff);
  //修改链表元素个数
  ZIPLIST_INCR_LENGTH(zl,-deleted);
  p = zl+offset;
  //nextdiff != 0表示元素大小发生变化，需要进行级联更新
  if (nextdiff != 0)
   zl = __ziplistCascadeUpdate(zl,p);
 }
 return zl;
}

Copy after login

最后我们看下元素的查找操作：

//参数依次为：压缩链表，要查找元素的值，要查找元素的长度，每次比较之间跳过的元素个数
unsigned char *ziplistFind(unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip) {
 int skipcnt = 0;
 unsigned char vencoding = 0;
 long long vll = 0;
 //只要还没到尾元素就不断循环
 while (p[0] != ZIP_END) {
  unsigned int prevlensize, encoding, lensize, len;
  unsigned char *q;
  //查询链表当前元素的prevlen字段
  ZIP_DECODE_PREVLENSIZE(p, prevlensize);
  //查询链表当前元素的encoding字段
  ZIP_DECODE_LENGTH(p + prevlensize, encoding, lensize, len);
  q = p + prevlensize + lensize;
  //已到达需要比较的元素位置
  if (skipcnt == 0) {
   //如果链表中的当前元素时字符串
   if (ZIP_IS_STR(encoding)) {
    //跟要查找的字符串进行比较
    if (len == vlen && memcmp(q, vstr, vlen) == 0) {
     //匹配成功，则要查找元素的指针
     return p;
    }
   } else {
    //如果当前元素为数字且vencoding为0
    if (vencoding == 0) {
     //尝试对要查找的值进行数字编码
     if (!zipTryEncoding(vstr, vlen, &vll, &vencoding)) {
      //如果编码失败，则说明要查找的元素根本不是数字
      //然后把vencoding设置为最大值，起一个标记作用
      //也就是说后面就不用尝试把要查找的值编码成数字了
      vencoding = UCHAR_MAX;
     }
     assert(vencoding);
    }
    //如果vencoding != UCHAR_MAX，则说明要查找的元素成功编码为数字
    if (vencoding != UCHAR_MAX) {
     //按数字取出当前链表中的元素
     long long ll = zipLoadInteger(q, encoding);
     if (ll == vll) {
      //如果两个数字相等，则返回元素指针
      return p;
     }
    }
   }
   //重置需要跳过的元素个数
   skipcnt = skip;
  } else {
   //跳过元素
   skipcnt--;
  }
  //遍历下个元素
  p = q + len;
 }
 //遍历完整个链表，没有找到元素
 return NULL;
}

Copy after login

到这里就把压缩链表的创建，添加，删除，查找四个基本操作原理总结完了。

三、压缩链表ziplist数据结构总结

压缩链表ziplist在redis中的应用非常广泛，它算是redis中最具特色的数据结构了。该数据结构的精髓其实就在于文章第一部分总结的编码规则，先理清楚这部分内容，后面的源码可以简单看下加深理解。

不得不说源码部分着实有点冗长，确实需要耐心，我自己在读的过程中也很头大。如果对源码感兴趣的话，建议按我的方法，先自己梳理某个操作（例如上面提到的插入元素）需要做哪些事情，然后再看代码可能会更好理解一些。