Redis partition implementation

Partitioning is the process of splitting data across multiple Redis instances, so that each instance holds only a subset of the keys. This article introduces how Redis partitioning is implemented.


Why do we need to partition? Generally speaking, Redis partitioning brings two main benefits:

1. Performance improvement. The network I/O capacity and computing resources of a single Redis machine are limited. Distributing requests across multiple machines makes full use of their combined computing power and network bandwidth, which improves the overall throughput of the Redis service.

2. Horizontal expansion of storage. Even if the throughput of a single Redis machine meets the application's needs, the amount of stored data keeps growing, and a single machine is limited by its own storage capacity. Spreading the data across multiple machines allows the Redis service to scale horizontally.

In general, partitioning means we are no longer limited by the hardware resources of a single computer. Not enough storage? Not enough computing resources? Not enough bandwidth? We can solve all of these problems by adding more machines.

Redis Partition Basics

There are many concrete partitioning strategies in practice. For example, suppose we have a set of four Redis instances: R0, R1, R2, and R3. We also have a batch of keys representing users, such as user:1, user:2, and so on, where the number after "user:" is the user's ID. What we need to do is store these keys across the four Redis instances.

How do we do it? The simplest way is range partitioning, so let's look at that first.

Range partitioning

Range partitioning maps all keys in a given range to the same Redis instance. Assuming the data set is still the user data mentioned above, the approach is as follows:

We can map users with IDs from 0 to 10000 to instance R0, users with IDs from 10001 to 20000 to instance R1, and so on.

This method is simple and works well in practice, but it still has problems:

1. We need a table that stores the mapping between user ID ranges and Redis instances. For example, user IDs 0-10000 map to instance R0...

2. We not only need to maintain this table, we also need one such table for each object type. For example, we are currently storing user information; if we also store order information, another mapping table has to be created.

3. The keys of the data we want to store may not be divisible into ranges. For example, if our keys are UUIDs, range partitioning is difficult to use.
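As a rough illustration of the mapping table described above, here is a minimal Python sketch. The first two rows of the table follow the ranges given in the text; the rows for R2 and R3 and the function names are assumptions made only for this example.

```python
import bisect

# Hypothetical mapping table: (inclusive upper bound of the ID range, instance).
# The first two rows follow the article; the last two are assumed ("and so on").
USER_RANGES = [
    (10000, "R0"),  # IDs 0..10000
    (20000, "R1"),  # IDs 10001..20000
    (30000, "R2"),
    (40000, "R3"),
]
UPPER_BOUNDS = [upper for upper, _ in USER_RANGES]


def instance_for_user(user_id: int) -> str:
    """Return the instance whose ID range contains user_id."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    # Index of the first range whose upper bound is >= user_id.
    idx = bisect.bisect_left(UPPER_BOUNDS, user_id)
    if idx == len(USER_RANGES):
        raise KeyError(f"no range covers user ID {user_id}")
    return USER_RANGES[idx][1]


def range_partition(key: str) -> str:
    """Route a key of the form 'user:<id>' to an instance."""
    prefix, _, raw_id = key.partition(":")
    if prefix != "user" or not raw_id.isdigit():
        # Keys such as UUIDs cannot be placed in an ID range (problem 3).
        raise ValueError(f"cannot range-partition key {key!r}")
    return instance_for_user(int(raw_id))
```

With this table, range_partition("user:10000") returns "R0" and range_partition("user:10001") returns "R1". A second object type, such as orders, would need its own table and its own routing function, which is exactly the maintenance cost listed in problem 2.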

Hash partitioning

An obvious advantage of hash partitioning over range partitioning is that it works with any form of key; the key does not need to have the form object_name:<id>, as range partitioning requires. The partitioning method is also very simple and can be expressed by a formula:

id=hash(key)%N


Here id is the number of the Redis instance and N is the number of instances. As a first step, the formula uses a hash function (such as crc32) to turn the key into a number. Following the example above, the first key we want to process is user:1; suppose the result of hash(user:1) is 93024922.

Then we take the hash result modulo N. With four instances, the modulo yields a value between 0 and 3, which can be mapped to one of our Redis instances. Since 93024922 % 4 is 2, we know that user:1 will be stored on R2.
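The formula can be sketched in a few lines of Python. This is only an illustration: it uses zlib.crc32 from the standard library as the hash function, and the INSTANCES list and the hash_partition name are assumptions for this example. The real CRC32 of a key will generally differ from the illustrative number 93024922 used in the text.

```python
import zlib

INSTANCES = ["R0", "R1", "R2", "R3"]  # hypothetical instance names


def hash_partition(key: str, instances=INSTANCES) -> str:
    """Pick an instance using id = hash(key) % N with CRC32 as the hash."""
    instance_id = zlib.crc32(key.encode("utf-8")) % len(instances)
    return instances[instance_id]


# The key format does not matter: a UUID works as well as "user:1".
print(hash_partition("user:1"))
print(hash_partition("0b0e6c2a-5f3a-4c7e-9d41-2f6f5a1c9e10"))

# The arithmetic from the text, using the illustrative hash value.
print(93024922 % 4)  # 2, i.e. instance R2
```

The same key always maps to the same instance as long as N stays the same. Changing N changes the result of the modulo for most keys, which is why adding or removing instances is a problem for this scheme.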

Different partition implementations

Partitioning can be implemented in different parts of the Redis software stack. Let's take a look at the options:

Client implementation

Client-side implementation means that the Redis client itself decides which Redis instance a key will be stored on, as shown in the figure below:

[Figure: Redis partition implementation (client-side partitioning)]

Proxy implementation

Proxy implementation means that the client sends its requests to a proxy server. The proxy server speaks the Redis protocol, so it can sit between the client and the Redis servers. The proxy forwards each client request to the correct Redis instance according to the configured partitioning schema, and returns the reply to the client.

The schematic diagram of proxy-based Redis partitioning is as follows:

[Figure: Redis partition implementation (proxy-based partitioning)]

Query routing

Query routing is the Redis partitioning method implemented by Redis Cluster:

[Figure: Redis partition implementation (query routing)]

With query routing, we can send a query to any Redis instance at random, and that instance makes sure the request reaches the correct Redis instance. Redis Cluster implements a hybrid form of query routing with the help of the client: instead of forwarding the request itself, the instance redirects the client to the correct instance.
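The hybrid idea can be illustrated with a toy in-memory simulation in Python. This is not the Redis Cluster protocol: the Node class, the REDIRECT reply, and the CRC32-modulo ownership rule are all assumptions made for this sketch (the real cluster uses hash slots and MOVED replies). It only shows the flow of asking any node and then following a redirect.

```python
import random
import zlib


class Node:
    """A toy node that stores only the keys it owns."""

    def __init__(self, node_id: int, node_count: int):
        self.node_id = node_id
        self.node_count = node_count
        self.data = {}

    def handle_set(self, key: str, value: str):
        owner = zlib.crc32(key.encode("utf-8")) % self.node_count
        if owner != self.node_id:
            # Not our key: tell the client where to go instead of forwarding.
            return ("REDIRECT", owner)
        self.data[key] = value
        return ("OK", self.node_id)


def client_set(nodes, key: str, value: str) -> int:
    """Send a SET to a random node and follow at most one redirect."""
    node = random.choice(nodes)
    status, node_id = node.handle_set(key, value)
    if status == "REDIRECT":
        status, node_id = nodes[node_id].handle_set(key, value)
    if status != "OK":
        raise RuntimeError("redirect did not lead to the owning node")
    return node_id  # id of the node that stored the key


nodes = [Node(i, 4) for i in range(4)]
owner = client_set(nodes, "user:1", "alice")
print(f"user:1 stored on node {owner}")
```

Whichever node the client picks first, the key ends up on exactly one node, and the client learns where it lives, so it could contact that node directly next time.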

Disadvantages of Redis partition

Although partitioning looks good so far, it has some serious shortcomings that prevent certain Redis features from working well in a partitioned environment. Let's take a look:

1. Multi-key operations are generally not supported, because the keys we want to operate on in a batch may be mapped to different Redis instances.

2. Multi-key Redis transactions are not supported.

3. The minimum granularity of partitioning is the key, so a large data set stored under a single key cannot be spread across different instances.

4. With partitioning, data handling becomes more complex. For example, we have to deal with multiple RDB/AOF files, and a backup requires gathering the files distributed across different instances.

5. Adding and removing machines is very complex. For example, Redis Cluster supports mostly transparent rebalancing at runtime when machines are added or removed, but client-side and proxy-based partitioning do not support this feature.
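To make the first shortcoming concrete, the following Python sketch groups a batch of keys by the instance they hash to. It assumes the hash partitioning formula from earlier (CRC32 modulo the number of instances); group_keys_by_instance is a hypothetical helper name. Whenever the result contains more than one group, a single multi-key command such as MGET cannot be sent to one instance and has to be split into one command per group.

```python
import zlib
from collections import defaultdict


def group_keys_by_instance(keys, instance_count=4):
    """Group keys by instance id, where id = crc32(key) % instance_count."""
    groups = defaultdict(list)
    for key in keys:
        instance_id = zlib.crc32(key.encode("utf-8")) % instance_count
        groups[instance_id].append(key)
    return dict(groups)


batch = [f"user:{i}" for i in range(1, 11)]
for instance_id, keys in sorted(group_keys_by_instance(batch).items()):
    # One multi-key command per instance instead of one for the whole batch.
    print(f"R{instance_id}: MGET {' '.join(keys)}")
```

Splitting a batch this way works for independent reads, but it does not make the batch atomic, which is why multi-key transactions remain a problem.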

Persistent storage or caching

Conceptually, data partitioning is the same whether Redis is used as a persistent data store or as a cache. However, using it as a persistent data store comes with an important limitation.

When Redis is used as persistent storage, each key must always map to the same Redis instance. When Redis is used as a cache, a key can be mapped to another instance if its original instance is unavailable.

Consistent hashing implementations can usually map a key to another instance when the instance it was mapped to becomes unavailable. Similarly, when a machine is added, part of the keys are remapped to the new machine. The two points to understand are:

1. If Redis is used as a cache, adding or removing machines is easy with consistent hashing.

2. If Redis is used as (persistent) storage, a fixed key-to-instance mapping is required, so we can no longer add or remove machines freely. Otherwise, we need a system that can rebalance data when machines are added or removed, which Redis Cluster currently supports.
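The behaviour of consistent hashing described above can be sketched with a small hash ring in Python. This is a minimal illustration and not the implementation of any particular Redis client: the HashRing class, the use of SHA-256 for ring positions, and the choice of 100 virtual points per instance are assumptions for this example.

```python
import bisect
import hashlib


class HashRing:
    """A minimal consistent-hash ring with virtual points per node."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas
        self._ring = []  # sorted list of (point, node)
        for node in nodes:
            self.add(node)

    @staticmethod
    def _point(label: str) -> int:
        digest = hashlib.sha256(label.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node: str) -> None:
        for i in range(self.replicas):
            bisect.insort(self._ring, (self._point(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def node_for(self, key: str) -> str:
        """Return the first node clockwise from the key's position."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._ring, (self._point(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["R0", "R1", "R2", "R3"])
keys = [f"user:{i}" for i in range(1000)]
before = {key: ring.node_for(key) for key in keys}
ring.add("R4")
moved = [key for key in keys if ring.node_for(key) != before[key]]
print(f"{len(moved)} of {len(keys)} keys moved, all of them to R4")
```

Only the keys that now fall to the new instance change their mapping; every other key stays where it was. For a cache this just means a few misses, but for persistent storage the moved keys would leave their data behind on the old instance, which is why rebalancing is needed in that case.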

Pre-Sharding

From the above, we know that partitioning has a problem: unless Redis is used only as a cache, adding or removing machines is very troublesome.

In practice, however, Redis capacity requirements change frequently. For example, I may need 10 Redis machines today and 50 tomorrow.

Given that Redis is a very lightweight service (each instance occupies only about 1 MB of memory), a simple solution to this problem is:

We can start many Redis instances from the very beginning, even if we have only one physical machine. We pick a fixed number of instances, such as 32 or 64, as our working cluster. When one physical machine no longer has enough storage, we move half of the instances to a second physical machine, and so on. This way the number of Redis instances in the cluster stays the same while the number of machines grows.

How do we move a Redis instance? When we need to move a Redis instance to a separate machine, we can follow these steps:

1. Start a new Redis instance on the new physical machine.

2. Configure the new instance as a slave of the instance to be moved.

3. Stop the clients.

4. Update the clients' configuration with the new IP address of the moved instance.

5. Send the SLAVEOF NO ONE command to the slave.

6. Restart the clients with the new IP address.

7. Shut down the old Redis instance, which is no longer in use.
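The replication-related steps above map to a handful of redis-cli invocations. The Python helper below only builds those command lines as strings so they can be reviewed before being run by hand; the function name and the example addresses are hypothetical, and the client-side steps (3, 4 and 6) depend on the application and are not covered.

```python
def migration_commands(old_host, old_port, new_host, new_port):
    """Return redis-cli commands for steps 2, 5 and 7, in order."""
    new = f"redis-cli -h {new_host} -p {new_port}"
    old = f"redis-cli -h {old_host} -p {old_port}"
    return [
        # Step 2: make the new instance replicate the instance being moved.
        f"{new} SLAVEOF {old_host} {old_port}",
        # Step 5: promote the new instance once the clients are stopped.
        f"{new} SLAVEOF NO ONE",
        # Step 7: shut down the old instance.
        f"{old} SHUTDOWN",
    ]


# Example addresses are placeholders, not values from the article.
for command in migration_commands("10.0.0.1", 6379, "10.0.0.2", 6379):
    print(command)
```

Before running the second command, it is worth checking that the new instance has finished synchronizing with the old one; otherwise data written shortly before the switch could be lost.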
