An in-depth introduction to the underlying data structure of redis-Redis-php.cn

An in-depth introduction to the underlying data structure of redis

1. Overview

I believe that all students who have used Redis know very well that Redis is based on key-value pairs. A distributed storage system that is similar to Memcached, but is a high-performance key-value database that is better than Memcached.

is described in "Redis Design and Implementation":

Every key-value pair (key-value) in the Redis database is Composed of objects:

The database key is always a string object;

The database value can be a string object, a list object (list), One of the five types of objects: hash, set, and sort set objects.

Why do we say that Redis is better than Memcached? Because the emergence of Redis has enriched the key-value storage in memcached. In some cases, it can play a very good supplementary role to the relational database, and these data All types support push/pop, add/remove, intersection, union, difference, and richer operations, and these operations are all atomic.

What we are discussing today is not the data type of value in Redis, but their specific implementation-the underlying data type.

The underlying data structure of Redis has the following data types:

1, simple dynamic string

2, linked list

3, dictionary

4, Jump table

5, Integer set

6, Compressed list

7, Object

2, Simple dynamic String (simple dynamic string) SDS

2.1 Overview

Redis is an open source key-value database written in ANSI C language. We may subjectively think that Redis The string is represented by the traditional string in C language, but it is not the case. Redis does not directly use the traditional string representation in C language. Instead, it builds its own called simple dynamic string (simple dynamic string SDS). Abstract type, and use SDS as the default string representation of Redis:

redis>SET msg "hello world"
OK

Copy after login

Set a new key-value pair with key= msg, value = hello world. Their underlying data structure will be:

The key (key) is a string object, and the underlying implementation of the object is an SDS that stores the string "msg";

The value (value) is also a string object, and the underlying implementation of the object It is an SDS that stores the string "hello world"

From the above example, we can intuitively see what kind of data type the string we create when we usually use redis. In addition to being used to save strings, SDS is also used as the AOF buffer in the buffer AOF module.

2.2 Definition of SDS

The structure of dynamic string defined in Redis:

/*  
 * 保存字符串对象的结构  
 */  
struct sdshdr {  
      
    // buf 中已占用空间的长度  
    int len;  
  
    // buf 中剩余可用空间的长度  
    int free;  
  
    // 数据空间  
    char buf[];  
};

Copy after login

An in-depth introduction to the underlying data structure of redis

1, len Variable, used to record the length of space that has been used in buf (here it is pointed out that the length of Redis is 5)

2. Free variable, used to record the free space in buf (the space is allocated for the first time, generally there is no free space, When modifying the string, there will be remaining space)

3. buf character array, used to record our string (recording Redis)

2.3 SDS and C The difference between strings

Traditional C string uses a string array of length N 1 to represent a string of length N. In this way, it is necessary to obtain the string length, string expansion and other operations. time inefficiency. C language uses this simple string representation method, which cannot meet Redis's requirements for string security, efficiency and functionality

2.3.1 Get the string length (SDS O(1)/C String O(n))

Traditional C string uses a string array of length N 1 to represent a string of length N, so in order to obtain the length of a C string, it must be traversed the entire string.

Different from C strings, the data structure of SDS has variables specifically used to store the length of the string. We can directly know the length of the string by getting the value of the len attribute.

An in-depth introduction to the underlying data structure of redis

2.3.2 Prevent buffer overflow

C string does not record the length of the string. In addition to the high complexity of acquisition, it can also easily lead to buffering. Area overflow.

Assume that the program has two strings s1 and s2 that are adjacent to each other in the memory. Among them, s1 saves the string "redis" and s2 saves the string "MongoDb":

An in-depth introduction to the underlying data structure of redis

If we now modify the content of s1 to redis cluster, but forget to allocate enough space for s1 again, the following problems will occur:

An in-depth introduction to the underlying data structure of redis

We can see that the original content in s2 has been occupied by the content of S1, and s2 is now a cluster instead of "Mongodb".

Redis 中SDS 的空间分配策略完全杜绝了发生缓冲区溢出的可能性：

当我们需要对一个SDS 进行修改的时候，redis 会在执行拼接操作之前，预先检查给定SDS 空间是否足够，如果不够，会先拓展SDS 的空间，然后再执行拼接操作

An in-depth introduction to the underlying data structure of redis

2.3.3 减少修改字符串时带来的内存重分配次数　　　

C语言字符串在进行字符串的扩充和收缩的时候，都会面临着内存空间的重新分配问题。

1. 字符串拼接会产生字符串的内存空间的扩充，在拼接的过程中，原来的字符串的大小很可能小于拼接后的字符串的大小，那么这样的话，就会导致一旦忘记申请分配空间，就会导致内存的溢出。

2. 字符串在进行收缩的时候，内存空间会相应的收缩，而如果在进行字符串的切割的时候，没有对内存的空间进行一个重新分配，那么这部分多出来的空间就成为了内存泄露。

举个例子：我们需要对下面的SDS进行拓展，则需要进行空间的拓展，这时候redis 会将SDS的长度修改为13字节，并且将未使用空间同样修改为1字节

An in-depth introduction to the underlying data structure of redis

因为在上一次修改字符串的时候已经拓展了空间，再次进行修改字符串的时候会发现空间足够使用，因此无须进行空间拓展

An in-depth introduction to the underlying data structure of redis

通过这种预分配策略，SDS将连续增长N次字符串所需的内存重分配次数从必定N次降低为最多N次

2.3.4 惰性空间释放

我们在观察SDS 的结构的时候可以看到里面的free 属性，是用于记录空余空间的。我们除了在拓展字符串的时候会使用到free 来进行记录空余空间以外，在对字符串进行收缩的时候，我们也可以使用free 属性来进行记录剩余空间，这样做的好处就是避免下次对字符串进行再次修改的时候，需要对字符串的空间进行拓展。

然而，我们并不是说不能释放SDS 中空余的空间，SDS 提供了相应的API，让我们可以在有需要的时候，自行释放SDS 的空余空间。

通过惰性空间释放，SDS 避免了缩短字符串时所需的内存重分配操作，并未将来可能有的增长操作提供了优化

2.3.5 二进制安全

C 字符串中的字符必须符合某种编码，并且除了字符串的末尾之外，字符串里面不能包含空字符，否则最先被程序读入的空字符将被误认为是字符串结尾，这些限制使得C字符串只能保存文本数据，而不能保存想图片，音频，视频，压缩文件这样的二进制数据。

但是在Redis中，不是靠空字符来判断字符串的结束的，而是通过len这个属性。那么，即便是中间出现了空字符对于SDS来说，读取该字符仍然是可以的。

例如：

An in-depth introduction to the underlying data structure of redis

2.3.6 兼容部分C字符串函数

虽然SDS 的API 都是二进制安全的，但他们一样遵循C字符串以空字符串结尾的惯例。

2.3.7 总结

An in-depth introduction to the underlying data structure of redis

3、链表

3.1 概述

链表提供了高效的节点重排能力，以及顺序性的节点访问方式，并且可以通过增删节点来灵活地调整链表的长度。

链表在Redis 中的应用非常广泛，比如列表键的底层实现之一就是链表。当一个列表键包含了数量较多的元素，又或者列表中包含的元素都是比较长的字符串时，Redis 就会使用链表作为列表键的底层实现。　

3.2 链表的数据结构

每个链表节点使用一个 listNode结构表示（adlist.h/listNode）：

typedef struct listNode{
      struct listNode *prev;
      struct listNode * next;
      void * value;  
}

Copy after login

多个链表节点组成的双端链表：

1An in-depth introduction to the underlying data structure of redis

我们可以通过直接操作list 来操作链表会更加方便：

typedef struct list{
    //表头节点
    listNode  * head;
    //表尾节点
    listNode  * tail;
    //链表长度
    unsigned long len;
    //节点值复制函数
    void *(*dup) (void *ptr);
    //节点值释放函数
    void (*free) (void *ptr);
    //节点值对比函数
    int (*match)(void *ptr, void *key);
}

Copy after login

list 组成的结构图：

1An in-depth introduction to the underlying data structure of redis

3.3 链表的特性

双端：链表节点带有prev 和next 指针，获取某个节点的前置节点和后置节点的时间复杂度都是O（N）

无环：表头节点的 prev 指针和表尾节点的next 都指向NULL，对立案表的访问时以NULL为截止

表头和表尾：因为链表带有head指针和tail 指针，程序获取链表头结点和尾节点的时间复杂度为O(1)

长度计数器：链表中存有记录链表长度的属性 len

多态：链表节点使用 void* 指针来保存节点值，并且可以通过list 结构的dup 、 free、 match三个属性为节点值设置类型特定函数。

4、字典

4.1 概述

字典，又称为符号表（symbol table）、关联数组（associative array）或映射（map），是一种用于保存键值对的抽象数据结构。　

在字典中，一个键（key）可以和一个值（value）进行关联，字典中的每个键都是独一无二的。在C语言中，并没有这种数据结构，但是Redis 中构建了自己的字典实现。

举个简单的例子：

redis > SET msg "hello world"
OK

Copy after login

创建这样的键值对（“msg”，“hello world”）在数据库中就是以字典的形式存储

4.2 字典的定义

4.2.1 哈希表

Redis 字典所使用的哈希表由 dict.h/dictht 结构定义：

typedef struct dictht {
   //哈希表数组
   dictEntry **table;
   //哈希表大小
   unsigned long size;

   //哈希表大小掩码，用于计算索引值
   unsigned long sizemask;
   //该哈希表已有节点的数量
   unsigned long used;
}

Copy after login

一个空的字典的结构图如下：

1An in-depth introduction to the underlying data structure of redis

我们可以看到，在结构中存有指向dictEntry 数组的指针，而我们用来存储数据的空间既是dictEntry

4.2.2 哈希表节点（ dictEntry ）

dictEntry 结构定义：

typeof struct dictEntry{
   //键
   void *key;
   //值
   union{
      void *val;
      uint64_tu64;
      int64_ts64;
   }
   struct dictEntry *next;

}

Copy after login

在数据结构中，我们清楚key 是唯一的，但是我们存入里面的key 并不是直接的字符串，而是一个hash 值，通过hash 算法，将字符串转换成对应的hash 值，然后在dictEntry 中找到对应的位置。

这时候我们会发现一个问题，如果出现hash 值相同的情况怎么办？Redis 采用了链地址法：

1An in-depth introduction to the underlying data structure of redis

当k1 和k0 的hash 值相同时，将k1中的next 指向k0 想成一个链表。

4.2.3 字典

typedef struct dict {
    // 类型特定函数
    dictType *type;
    // 私有数据
    void *privedata;
    // 哈希表
    dictht  ht[2];
    // rehash 索引
    in trehashidx;

}

Copy after login

type 属性和privdata 属性是针对不同类型的键值对，为创建多态字典而设置的。

ht 属性是一个包含两个项（两个哈希表）的数组

普通状态下的字典：

An in-depth introduction to the underlying data structure of redis

4.3 解决哈希冲突

在上述分析哈希节点的时候我们有讲到：在插入一条新的数据时，会进行哈希值的计算，如果出现了hash值相同的情况，Redis 中采用了连地址法（separate chaining）来解决键冲突。

每个哈希表节点都有一个next 指针，多个哈希表节点可以使用next 构成一个单向链表，被分配到同一个索引上的多个节点可以使用这个单向链表连接起来解决hash值冲突的问题。

举个例子：

现在哈希表中有以下的数据：k0 和k1

An in-depth introduction to the underlying data structure of redis

我们现在要插入k2，通过hash 算法计算到k2 的hash 值为2，即我们需要将k2 插入到dictEntry[2]中：

An in-depth introduction to the underlying data structure of redis

在插入后我们可以看到，dictEntry指向了k2，k2的next 指向了k1，从而完成了一次插入操作（这里选择表头插入是因为哈希表节点中没有记录链表尾节点位置）

4.4 Rehash

随着对哈希表的不断操作，哈希表保存的键值对会逐渐的发生改变，为了让哈希表的负载因子维持在一个合理的范围之内，我们需要对哈希表的大小进行相应的扩展或者压缩，这时候，我们可以通过 rehash（重新散列）操作来完成。

4.4.1 目前的哈希表状态：

我们可以看到，哈希表中的每个节点都已经使用到了，这时候我们需要对哈希表进行拓展。

An in-depth introduction to the underlying data structure of redis

4.4.2 为哈希表分配空间

哈希表空间分配规则：

如果执行的是拓展操作，那么ht[1] 的大小为第一个大于等于ht[0] 的2的n次幂

如果执行的是收缩操作，那么ht[1] 的大小为第一个大于等于ht[0] 的2的n次幂

因此这里我们为ht[1] 分配空间为8，

An in-depth introduction to the underlying data structure of redis

4.4.3 数据转移

Transfer the data in ht[0] to ht[1]. During the transfer process, the hash value of the data in the hash table node needs to be recalculated.

Results after data transfer ：

An in-depth introduction to the underlying data structure of redis

4.4.4 Release ht[0]

Release ht[0], and then set ht[1] to ht[0], Finally, allocate a blank hash table to ht[1]:

An in-depth introduction to the underlying data structure of redis

4.4.5 Progressive rehash

As we mentioned above, during expansion or compression When , you can directly rehash all key-value pairs into ht[1], because the amount of data is relatively small. In the actual development process, this rehash operation is not completed once and centrally, but is completed multiple times and progressively.

Detailed steps of progressive rehash:

1. Allocate space for ht[1], so that the dictionary holds two hash tables ht[0] and ht[1] at the same time

2. Maintain an index counter variable rehashidx at what time, and set its value to 0, indicating the start of rehash

3. During rehash, each time a CRUD operation is performed on the dictionary In addition to performing the specified operations, the program will also rehash the data in ht[0] to the ht[1] table, and add one to rehashidx

4. When all data in ht[0] is transferred When entering ht[1], set rehashidx to -1, indicating the end of rehash

The advantage of using progressive rehash is that it adopts a divide-and-conquer approach and avoids the huge amount of calculation caused by centralized rehash.

The above is the detailed content of An in-depth introduction to the underlying data structure of redis. For more information, please follow other related articles on the PHP Chinese website!