This article covers Redis, focusing on consistent hashing and hash slots. When the cluster expands or a node is lost, a naive key-to-node mapping forces large-scale data migration; consistent hashing and hash slots are two ways to avoid that problem. I hope it is helpful to everyone.
Recommended learning: Redis learning tutorial
Suppose we have x cache servers and need to decide which one stores a given key. The simplest scheme is key % x. But if the cluster expands or a node is lost, the mapping becomes key % (x ± y), and almost every key now maps to a different server, which means massive data migration. Consistent hashing and hash slots avoid this problem.
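The migration cost of the naive modulo scheme is easy to measure. The sketch below (key names and server counts are made up for illustration) maps 10,000 keys with key % 3, then with key % 4 after adding one server, and counts how many keys change servers:

```python
import zlib

def server_for(key: str, n_servers: int) -> int:
    # Naive placement: deterministic hash of the key modulo the server count.
    # (zlib.crc32 is used because Python's built-in hash() is salted per process.)
    return zlib.crc32(key.encode()) % n_servers

keys = [f"user:{i}" for i in range(10_000)]
before = {k: server_for(k, 3) for k in keys}   # 3 cache servers
after = {k: server_for(k, 4) for k in keys}    # one server added: key % 4
moved = sum(before[k] != after[k] for k in keys)
print(f"{moved / len(keys):.0%} of keys moved after adding one server")
```

On average roughly three quarters of the keys land on a different server after the change, which is exactly the migration problem described above.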
Ordinary hashing takes the key modulo the number of servers; consistent hashing takes it modulo a fixed number (2^32) that does not change with the number of servers. First, we hash each server's IP address (or another unique identifier) to get a value: that value is the server's position on the hash ring. Then we hash the key of the object to be stored to get its position on the ring. If a server sits exactly at that position, the object is stored there; otherwise we walk clockwise to the next position, and the next, until we find a server that can store it.
Using a common hash algorithm, the key is hashed into a space of 2^32 values, i.e. the range 0 to 2^32 - 1. Imagine this number line bent around to bite its own tail, forming a closed ring.
Now that the ring is in place, we put the servers on it: hash each server's IP address or another unique identifier, and the result is that server's position on the ring.
When we need to store a piece of data, we first compute the hash of its key and take the remainder modulo 2^32. If a server sits at that position on the ring, the data goes there directly; if not, we search clockwise for the next server.
So data1 ends up on redis1 and data2 on redis2. Reads follow the same process: compute the hash of the key, then locate the server that stores it by the same rule.
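The ring lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not Redis's implementation: MD5 is just one convenient way to hash onto the 0..2^32-1 space, and the node/key names are the ones used in this article. Which server a given key actually lands on depends on the hash values:

```python
import bisect
import hashlib

def ring_hash(key: str) -> int:
    # Map any string onto the 0 .. 2**32 - 1 ring (MD5 is one arbitrary choice).
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2 ** 32)

class HashRing:
    def __init__(self, nodes):
        # Each server's ring position comes from hashing its identifier (e.g. IP).
        self.ring = sorted((ring_hash(n), n) for n in nodes)

    def get(self, key: str) -> str:
        # First server clockwise from the key's position, wrapping past 2**32 - 1.
        i = bisect.bisect_left(self.ring, (ring_hash(key),))
        return self.ring[i % len(self.ring)][1]

ring = HashRing(["redis1", "redis2", "redis3"])
print(ring.get("data1"), ring.get("data2"))
```

The lookup is O(log n) thanks to the sorted ring plus binary search, and the same `get` is used for both writes and reads, which is why reads find the data where writes put it.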
If a Redis node goes down, the data on the other nodes is unaffected; only the data that lived on the failed node is remapped, moving to the next node clockwise on the ring.
If a new server redisNeo is added to the environment, it is mapped onto the ring by the same hash algorithm, and by the clockwise migration rule only the data whose hash falls between redis2 and redisNeo migrates to redisNeo (in the figure below, redisNeo sits next to redis2); all other objects keep their original storage locations. Looking at node addition and removal together, the consistent hashing algorithm preserves monotonicity while minimizing data migration. This makes it well suited to distributed clusters: it avoids mass data migration and reduces pressure on the servers.
So after redisNeo joins, data3 goes to redisNeo.
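The claim that adding a node only migrates the keys falling into its new arc can be checked directly. In the sketch below (same illustrative MD5-based ring as in the text's example, with the article's server names), every key whose owner changes after redisNeo joins must now map to redisNeo, and no other key moves:

```python
import bisect
import hashlib

def ring_hash(key: str) -> int:
    # Map any string onto the 0 .. 2**32 - 1 ring.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2 ** 32)

def locate(ring, key):
    # First node clockwise from the key's position, wrapping around.
    i = bisect.bisect_left(ring, (ring_hash(key),))
    return ring[i % len(ring)][1]

keys = [f"user:{i}" for i in range(10_000)]
ring3 = sorted((ring_hash(n), n) for n in ["redis1", "redis2", "redis3"])
ring4 = sorted(ring3 + [(ring_hash("redisNeo"), "redisNeo")])

moved = [k for k in keys if locate(ring3, k) != locate(ring4, k)]
print(f"{len(moved) / len(keys):.1%} of keys moved, all to redisNeo")
```

Contrast this with the modulo scheme earlier, where adding one server reshuffled most of the keyspace: here only the keys stolen by the new node migrate, and every one of them migrates to that node.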
At this point consistent hashing is essentially complete, but one problem remains: balance. As the figure below shows, when there are relatively few server nodes, a large amount of data inevitably piles up on one node. For example, with only two nodes, one at position 1 and the other at position 10, the load is obviously lopsided: only keys whose hash falls in [2, 10] go to node 10, and everything else goes to node 1.

To solve this data skew, the consistent hashing algorithm introduces virtual nodes: each physical node is hashed multiple times, and a virtual node is placed at each resulting position on the ring. Concretely, decide how many virtual nodes each physical node should have, then append a number to its IP or hostname before hashing. The data-location algorithm stays the same; the only addition is a final mapping from virtual node back to physical node.
Hash slots are the scheme used by Redis Cluster. Redis Cluster does not use consistent hashing; instead it shards data across hash slots for both storage and reads. There are 2^14 = 16384 slots in total, and every master node owns a range of them, such as 0~1000; slots can be migrated between masters. A master's slave nodes are not assigned slots and have read permission only. Note, however, that in practice Redis Cluster sends both reads and writes to the master node; it is not a read-from-slave, write-to-master split. When a cluster is first created, the 16384 slots are distributed evenly among the masters.
Compared with consistent hashing, hash slots must be reallocated manually when scaling out or in, and before removing a master node you must hand its slave nodes and its hash slots over to other masters. A key's slot is determined by CRC16(key) mod 16384.
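The slot computation can be sketched in Python. Redis uses the CRC16-CCITT (XModem) variant: polynomial 0x1021, initial value 0, no bit reflection, whose standard check value for the input "123456789" is 0x31C3:

```python
def crc16(data: bytes) -> int:
    # CRC16-CCITT (XModem): poly 0x1021, initial value 0, no reflection.
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    # Redis Cluster: slot = CRC16(key) mod 16384.
    return crc16(key.encode()) % 16384

print(hex(crc16(b"123456789")))   # standard XModem check value: 0x31c3
print(key_slot("user:1000"))      # some slot in 0 .. 16383
```

One detail omitted here: when a key contains a hash tag such as `{user}:1000`, the real Redis Cluster hashes only the substring inside the braces, so related keys can be forced into the same slot.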
The above is the detailed content of Redis cache learning: consistent hashing and hash slots. For more information, please follow other related articles on the PHP Chinese website!