Home  >  Article  >  Database  >  A brief discussion on the dictionary, hash algorithm and ReHash principle in Redis

A brief discussion on the dictionary, hash algorithm and ReHash principle in Redis

青灯夜游
青灯夜游forward
2021-11-05 10:31:562506browse

This article will introduce you to the dictionary, hash algorithm and ReHash principle in Redis. I hope it will be helpful to you!

A brief discussion on the dictionary, hash algorithm and ReHash principle in Redis

Dictionaries in Redis are widely used to implement various functions of Redis, including databases and hash keys.

The underlying implementation of the dictionary is a hash table. Each dictionary has two hash tables, one is used normally and the other is used when rehash is used to expand the space. [Related recommendations: Redis video tutorial]

Dictionary structure definition

typedef struct dict {
	
	// 类型特定函数
	dictType *type;
	
	// 私有数据
	void *privdata;
	
	// 哈希表,两个元素
	dictht ht[2]
	
	// rehash时记录的索引下标,当没有rehash时,值为-1
	int rehashidx;

} dict;

==When rehash is performed, rehashidx migrates each index The entry data will be 1;==

Among them, the structure of the hash table dicttht is defined as:

typedef struct dictht {
	
	// 哈希表数组
	dictEntry **table;
	
	// 哈希表大小
	unsigned long size;
	
	// 哈希表大小掩码,用于计算索引值
	unsigned long sizenask;
	
	// 该哈希表已有节点的数量
	unsigned long uesd;

} dictht;

Among them, table is an array, and each element of the array points to a pointer of type dictEntry. The dictEntry type stores a key-value pair.

It can also be seen here that the nodes of the hash table are connected by a linked list to solve the hash conflict problem, which is the chain address method.

Hash conflict and hash algorithm

In order to achieve fast access from key to value, Redis uses a hash table to save all key-value pairs. Keycorresponds to the Key set by Redis, and valuecorresponds not to the value itself, but to the pointer pointing to the specific value. The biggest advantage of using a hash table is that you can quickly find key-value pairs with O(1) time complexity. But since it is a hash table, there will inevitably be a hash conflict problem.

Hash conflict means that when the hash value of two keys and the hash bucket are calculated, they happen to fall on the same hash bucket.

The way Redis solves hash conflicts is to use chain hashing, that is, zipper method. When multiple elements point to the same hash bucket, a linked list is used to save the corresponding data in the same hash bucket, and they are connected in turn using pointers.

Hash algorithm

When a new key-value pair is added to the dictionary, the program needs to first calculate the hash value and index based on the key-value pair. value, and then place the hash table node containing the new key-value pair at the specified index of the hash table array based on the index value.

reHash process

There is a load factor in the hash table to control the number of key-value pairs saved in the hash table. This requires a rehash operation to complete. Among them, the calculation formula of the load factor is:

// 负载因子 = 哈希表已保存的节点数量 / 哈希表大小
load_factor = ht[0].used / ht[0].size

The conditions for hash table expansion and contraction are as follows:

  • The server is not currently executing the BGSAVE command or the BGREWRITEAOF command, and the hash table The load factor is greater than or equal to 1;
  • The server is currently executing the BGSAVE command or the BGREWRITEAOF command, and the load factor of the hash table is greater than or equal to 5;

One of the above conditions is met, The rehash process will be executed.

If the server is executing BGSAVE or BGREWRITEAOF, Redis will create a child process of the current server process

The rehash process is roughly divided into three steps:

  • Allocate larger space to hash table 2, for example twice the current size of hash table 1;

  • Remap the data in hash table 1 and copy it to the hash In Table 2;

  • Release the space of hash table 1;

The size of the allocated space in the first step is determined by the current rehash Determined by the operation type and the number of key-value pairs in the current hash table.

  • When an expansion operation is performed, the allocated space size is the first 2^n value that is greater than or equal to (the number of key-value pairs in the hash table * 2);

    Assume that the current number of key-value pairs is 4, then 4 * 2 = 8, because 8 is exactly equal to 2^3, which is exactly equal to the first value equal to 2^n, so the expansion space is 8;

  • If a shrink operation is performed, the allocated space size is the first 2^n value that is greater than or equal to (the number of key-value pairs in the hash table);

Progressive reHash

When there are a large number of hash tables, if all the data is copied at once, it is likely to affect the server. Therefore, Redis performs rehash in multiple times, which is progressive rehash.

To put it simply, in the second step, Redis still processes client requests normally. When processing a request, it starts from the first index position in hash table 1 and moves this index along the way. All entries elements at the position are copied to hash table 2; when the next request is made, the entries at the next index position are copied.

This cleverly allocates the cost of a large number of copies at one time to the process of processing multiple requests, avoiding time-consuming operations and ensuring fast access to data.

Hash table operations during rehash

When performing a progressive rehash operation, dictionary deletion, search, update and other operations will be performed in two hash tables. For example, if you want to find a key in the dictionary, you will first query it in the original table. If it cannot find it, you will query it in the new table.

And the dictionary addition operation will only be saved in the new table.

For more programming-related knowledge, please visit: Introduction to Programming! !

The above is the detailed content of A brief discussion on the dictionary, hash algorithm and ReHash principle in Redis. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:juejin.cn. If there is any infringement, please contact admin@php.cn delete