
20 questions that must be mastered in Redis, come and collect them! !

青灯夜游
2021-10-19 10:31:24

This article will share with you 20 Redis issues that you must know and master. I hope it will be helpful to you. Come and collect it!


What is Redis?

Redis (Remote Dictionary Server) is a high-performance, non-relational key-value database written in C. Unlike traditional databases, Redis keeps its data in memory, so reads and writes are very fast, which is why it is widely used for caching. Redis can also persist data to disk to reduce the risk of data loss, and each individual Redis command executes atomically.

What are the advantages of Redis?

  • Based on memory operation, memory reading and writing speed is fast.

  • Commands are executed by a single thread, which avoids thread-switching overhead and contention between threads. "Single-threaded" here means that one thread handles all network requests and command execution; the Redis process as a whole does more than that. For example, RDB persistence forks a child process to write the snapshot.

  • Supports multiple data types, including String, Hash, List, Set, ZSet, etc.

  • Support persistence. Redis supports two persistence mechanisms, RDB and AOF. The persistence function can effectively avoid data loss problems.

  • Supports transactions. Every single Redis command is atomic, and Redis can also queue several commands with MULTI/EXEC and execute them as one uninterrupted batch.

  • Support master-slave replication. The master node will automatically synchronize data to the slave node, allowing read and write separation.

Why is Redis so fast?

  • Memory-based: Redis keeps its data in memory, so reads and writes incur no disk I/O overhead.
  • Single-threaded implementation (before Redis 6.0): Redis processes requests in one thread, avoiding the overhead of thread switching and lock contention between threads. Redis 6.0 added optional I/O threads, but commands are still executed by one thread.
  • I/O multiplexing: a single thread monitors many socket descriptors and turns ready connections into events, so it does not waste time blocking on network I/O.
  • Efficient data structures: the underlying implementation of each data type is optimized for speed.

Why does Redis choose single thread?

  • Avoids excessive context-switching overhead: command execution always runs in one thread, so there is no switching between threads.
  • Avoids synchronization overhead: a multi-threaded model would have to synchronize access to shared data, and the locks or other mechanisms this requires add overhead when operating on data, increase program complexity, and reduce performance.
  • Simple to implement and easy to maintain: with multiple threads, every underlying data structure would have to be designed for thread safety, which would make the Redis implementation considerably more complicated.

What are the application scenarios of Redis?

  • Cache hotspot data to relieve the pressure on the database.

  • Using the atomic auto-increment operation of Redis, you can realize the function of counter, such as counting the number of user likes, the number of user visits, etc.

  • Simple message queue: Redis's publish/subscribe mode or a List can serve as a simple message queue for asynchronous operations.

  • Rate limiter: limits how often a user can call a given interface. In a flash sale, for example, this prevents rapid repeated clicks from putting unnecessary pressure on the system.

  • Friend relationships: Set commands such as intersection, union, and difference can implement features like mutual friends and shared interests.

What is the difference between Memcached and Redis?

  • Redis only uses single core, while Memcached can use multiple cores.

  • MemCached has a single data structure and is only used to cache data, while Redis supports multiple data types.

  • MemCached does not support data persistence, and the data will disappear after restarting. Redis supports data persistence.

  • Redis provides master-slave replication and cluster deployment, which enable highly available services. Memcached has no native cluster mode and relies on the client to shard data across servers.

  • Both are very fast. Redis is often described as the faster of the two, but the difference depends on the workload.

  • Redis uses a single-threaded I/O multiplexing model, while Memcached uses a multi-threaded non-blocking I/O model.

What are the data types of Redis?

Basic data types:

1. String: the most commonly used data type. A String value can hold text, a number, or binary data, up to a maximum of 512 MB.

2. Hash: a collection of field-value pairs stored under one key.

3. Set: an unordered collection with no duplicate members. Set supports operations such as intersection and union, which makes features like mutual friends and shared follows easy to implement.

4. List: an ordered collection that allows duplicates, implemented on top of a doubly linked list.

5. Sorted Set (ZSet): an ordered Set in which each member carries a score that determines its position. Suitable for leaderboards and weighted message queues.

Special data types:

1. Bitmap: can be thought of as an array of bits, where each element stores only 0 or 1. The array index is called the offset. The length of a Bitmap does not depend on how many elements are in the collection; it depends on the largest offset used, that is, the upper bound of the cardinality.

2. HyperLogLog: an algorithm for cardinality estimation. Its advantage is that even when the number or volume of input elements is very large, the space needed to compute the cardinality stays fixed and small. A typical use is counting unique visitors.

3. Geospatial: stores geographic locations and runs queries against them. Typical uses are positioning and finding nearby people.
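
To make the Bitmap description concrete, here is a minimal sketch that emulates a bitmap in memory with java.util.BitSet instead of talking to Redis. The class name BitmapSketch and its methods are hypothetical; the comments only indicate which Redis command each method is comparable to.

```java
import java.util.BitSet;

// In-memory stand-in for a Redis bitmap; the offset could be a user ID, for example.
class BitmapSketch {
    private final BitSet bits = new BitSet();

    // Comparable to: SETBIT key offset 1
    void mark(int offset) {
        bits.set(offset);
    }

    // Comparable to: GETBIT key offset
    boolean isMarked(int offset) {
        return bits.get(offset);
    }

    // Comparable to: BITCOUNT key
    int count() {
        return bits.cardinality();
    }
}
```

Marking the same offset twice leaves the count unchanged, which is what makes a bitmap suitable for things like daily sign-in tracking.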

Redis Transaction

The principle of transaction is to send several commands within a transaction scope to Redis, and then let Redis execute these commands in sequence.

Transaction life cycle:

  • Use MULTI to open a transaction;

  • After the transaction is opened, each command is placed in a queue instead of being executed immediately;

  • The EXEC command commits the transaction and runs the queued commands.


A command that fails when the transaction executes does not stop the other commands from running, and nothing is rolled back, so atomicity is not guaranteed:

first:0>MULTI
"OK"
first:0>set a 1
"QUEUED"
first:0>set b 2 3 4
"QUEUED"
first:0>set c 6
"QUEUED"
first:0>EXEC
1) "OK"
2) "ERR syntax error"
3) "OK"

WATCH command

The WATCH command monitors one or more keys. If any watched key is modified before EXEC, the transaction is not executed (similar to optimistic locking). After EXEC runs, monitoring is cancelled automatically.

first:0>watch name
"OK"
first:0>set name 1
"OK"
first:0>MULTI
"OK"
first:0>set name 2
"QUEUED"
first:0>set gender 1
"QUEUED"
first:0>EXEC
(nil)
first:0>get gender
(nil)

In the example above:

  1. watch name starts monitoring the key name
  2. The value of name is modified
  3. Transaction a is opened
  4. The values of name and gender are set in transaction a
  5. The EXEC command submits the transaction
  6. get gender shows that the key does not exist, that is, transaction a was not executed

Use UNWATCH to cancel the monitoring set up by the WATCH command; all watched keys are released.

Persistence mechanism

Persistence means writing in-memory data to disk to prevent data loss when the service goes down.

Redis supports two persistence methods: RDB and AOF. RDB periodically saves the in-memory data to disk according to specified rules, while AOF logs each write command after it is executed. The two are generally used together.

RDB method

RDB is Redis's default persistence scheme. When an RDB snapshot is taken, the in-memory data is written to disk and a dump.rdb file is generated in the configured directory. On restart, Redis loads dump.rdb to restore the data.

BGSAVE is the mainstream way to trigger RDB persistence. The execution process is as follows:


  • The BGSAVE command is executed.
  • The Redis parent process checks whether a child process is already running. If one exists, BGSAVE returns immediately.
  • The parent process performs a fork to create a child process. The parent is blocked while the fork is in progress.
  • After the fork completes, the parent continues to receive and process client requests, while the child writes the in-memory data to a temporary file on disk.
  • When the child has written all the data, it replaces the old RDB file with the temporary file.

When Redis starts, it reads the RDB snapshot file and loads the data from disk into memory. With RDB persistence, if Redis exits abnormally, any changes made since the last snapshot are lost.

How to trigger RDB persistence:

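  1. Manual trigger: the user runs the SAVE or BGSAVE command. SAVE blocks all client requests while it takes the snapshot, so avoid it in production. BGSAVE takes the snapshot asynchronously in the background while the server keeps responding to clients, so BGSAVE is the recommended choice when a snapshot has to be triggered manually.

  2. Passive trigger

    • Automatic snapshots based on configured rules, for example save 100 10: take a snapshot if at least 10 keys were modified within 100 seconds.
    • When a slave node performs a full resynchronization, the master node automatically runs BGSAVE to generate an RDB file and sends it to the slave.
    • By default, when the shutdown command is executed and AOF persistence is not enabled, an RDB snapshot is saved automatically before the server exits.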

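Advantages

  • Loading an RDB file restores data far faster than replaying an AOF file.

  • Persistence is done by a separate child process, so the main process performs no disk I/O for it, which preserves Redis's performance.

Disadvantages

  • RDB cannot persist data in real time. Every BGSAVE has to fork a child process, which is a heavyweight operation, so running it frequently is expensive.

  • RDB files use a specific binary format, and the format has changed across Redis versions, so an older Redis version may be unable to read an RDB file written by a newer one.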

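AOF method

AOF (append-only file) persistence records every write command in a separate log; when Redis restarts, it re-executes the commands in the AOF file to restore the data. AOF mainly addresses the real-time aspect of persistence and is the mainstream approach to Redis persistence.

AOF persistence is disabled by default and can be enabled with the appendonly parameter: appendonly yes. Once it is enabled, every write command Redis executes is appended to the aof_buf buffer, and the buffer is synchronized to disk according to the configured policy.

By default the operating system flushes its buffers to disk about every 30 seconds. To avoid losing buffered data, Redis can explicitly ask the system to sync the data to disk after writing to the AOF file. The appendfsync parameter controls when this happens.

appendfsync always //sync on every write to the AOF file; safest but slowest, not recommended
appendfsync everysec  //sync once per second; balances performance and safety, recommended
appendfsync no //let the operating system decide when to sync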

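The AOF persistence flow is as follows:

  • Every write command is appended to the AOF buffer.

  • The AOF buffer is synchronized to disk according to the configured policy.

  • As the AOF file grows, it has to be rewritten periodically to reduce its size. An AOF rewrite converts the data currently held by the Redis process into write commands and writes them to a new AOF file.

  • When the Redis server restarts, it can load the AOF file to restore the data.

Advantages

  • AOF protects better against data loss. With fsync configured to run once per second, at most about 1 second of data is lost if the Redis process crashes.

  • AOF files are written in append-only mode, so there is no disk seek overhead and write performance is very high.

Disadvantages

  • For the same data set, the AOF file is larger than the RDB snapshot.

  • Data recovery is slower.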

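Master-slave replication

Redis replication keeps data synchronized across multiple Redis instances. The master accepts both reads and writes, and when its data changes, the changes are automatically synchronized to the slaves. A slave is generally read-only and receives the data sent by the master. A master can have multiple slaves, but a slave can have only one master.

//start a Redis instance as the master
redis-server  
//start another instance as a slave
redis-server --port 6380 --slaveof  127.0.0.1 6379   
//or turn a running instance into a slave by sending it this command
slaveof 127.0.0.1 6379
//stop replicating from the master and promote this instance to master
SLAVEOF NO ONE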

How does master-slave replication work?

  • When a slave node starts, it sends a PSYNC command to the master node;

  • If this is the slave's first connection to the master, a full resynchronization is triggered. The master starts a background child process that generates an RDB snapshot file;

  • Meanwhile, the master buffers in memory all write commands newly received from clients. Once the RDB file is complete, the master sends it to the slave, which first writes it to local disk and then loads it from disk into memory;

  • The master then sends the buffered write commands to the slave, and the slave applies them to catch up;

  • If the connection between the slave and the master is lost because of a network failure, the slave reconnects automatically. After reconnecting, the master sends the slave only the data it missed.

Sentinel

Master-slave replication on its own cannot fail over automatically and therefore cannot provide high availability. Sentinel mode solves these problems: the Sentinel mechanism can switch between master and slave nodes automatically.

A client first connects to Sentinel, which tells it the address of the Redis master node; the client then connects to Redis and carries out its operations. When the master goes down, Sentinel detects the failure, elects a well-performing slave as the new master, and then notifies the other slaves through publish/subscribe so that they switch to the new master.


Working principle

  • Each Sentinel sends a PING command once per second to the Master, Slave, and other Sentinel instances it knows about.
  • If the time since an instance's last valid reply to PING exceeds the configured value, that Sentinel marks the instance as subjectively offline.
  • If a Master is marked as subjectively offline, every Sentinel monitoring that Master checks once per second whether the Master really has entered the subjectively offline state.
  • When enough Sentinels (at least the number specified in the configuration file) confirm within the specified time that the Master is subjectively offline, the Master is marked as objectively offline. If not enough Sentinels agree, the objectively offline status is lifted. If the Master again returns a valid reply to a Sentinel's PING, its subjectively offline status is removed.
  • The Sentinel nodes elect a leader Sentinel to carry out the failover.
  • The leader Sentinel chooses a well-performing slave node as the new master and then notifies the other slave nodes to update their master information.
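
The step from "subjectively offline" to "objectively offline" is a simple vote count. The sketch below illustrates only that rule; QuorumSketch and objectivelyDown are hypothetical names, and the real Sentinel protocol involves timeouts and message exchange that are not modeled here.

```java
// Illustrates the quorum rule only: each array element is one Sentinel's
// subjective verdict about the same master (true = "I think it is down").
class QuorumSketch {
    static boolean objectivelyDown(boolean[] subjectiveDownVotes, int quorum) {
        int votes = 0;
        for (boolean vote : subjectiveDownVotes) {
            if (vote) {
                votes++;
            }
        }
        // Objectively offline once at least `quorum` Sentinels agree.
        return votes >= quorum;
    }
}
```

With three Sentinels and a quorum of 2, a single Sentinel losing contact with the master is not enough to trigger a failover.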

Redis cluster

Sentinel mode solves the problem that master-slave replication cannot fail over automatically and so cannot achieve high availability, but the write capacity and storage capacity of the master node are still limited by the configuration of a single machine. Cluster mode implements distributed storage for Redis: each node stores different content, which removes that single-machine limit.

A Redis Cluster needs at least 6 nodes (3 masters and 3 slaves). The master nodes serve reads and writes; the slave nodes act as backups, do not serve requests, and are used only for failover.

Redis Cluster uses virtual slot partitioning. Every key is mapped by a hash function to an integer slot in the range 0 to 16383, and each node is responsible for a subset of the slots and the key-value data mapped to them.


How are hash slots mapped to Redis instances?

  • Compute the CRC16 checksum of the key of the key-value pair

  • Take the result modulo 16384; the value obtained is the hash slot for the key

  • Locate the corresponding instance based on the slot information
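
The first two steps can be sketched in a few lines. This is a minimal illustration that assumes the CRC16 variant is XMODEM (polynomial 0x1021, initial value 0) and deliberately ignores hash tags ({...}), which a real cluster also takes into account. SlotSketch is a hypothetical class name.

```java
import java.nio.charset.StandardCharsets;

class SlotSketch {
    // CRC16, XMODEM variant, computed bit by bit for clarity rather than speed.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    // Slot = CRC16(key) mod 16384 (hash tags are not handled in this sketch).
    static int slot(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }
}
```

Because the slot depends only on the key, any client or node can compute it locally and then look up which instance currently owns that slot.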

Advantages:

  • Decentralized architecture with support for dynamic capacity expansion;
  • Data is distributed across multiple nodes by slot, and the distribution can be adjusted dynamically;
  • High availability. The cluster remains available when some nodes are down, and cluster mode provides automatic failover. Nodes exchange status information through the gossip protocol and use a voting mechanism to promote a Slave to Master.

Disadvantages:

  • Batch operations (such as pipelines spanning keys on different nodes) have limited support.
  • Data is replicated asynchronously, so strong consistency is not guaranteed.
  • Transaction support is limited: a transaction can involve multiple keys only if they are on the same node; it cannot be used when the keys are spread across different nodes.
  • The key is the smallest unit of data partitioning, so a single large key-value object such as a hash or list cannot be spread across nodes.
  • Multiple database spaces are not supported. Standalone Redis supports 16 databases by default, but cluster mode can use only one database space (db 0).

Deletion strategy for expired keys?

1. Passive deletion (lazy). When a key is accessed and found to have expired, it is deleted.

2. Active deletion (periodic). Redis cleans up keys at regular intervals. Each pass walks through all DBs in turn, randomly samples 20 keys from the DB, and deletes those that have expired. If at least 5 of the sampled keys had expired, it keeps cleaning the same DB; otherwise it moves on to the next DB.

3. Cleanup when memory runs short. Redis has a maximum memory limit that can be set through the maxmemory parameter. When memory usage exceeds this limit, memory must be freed, and it is freed according to the configured eviction policy.
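
Lazy deletion is the easiest of the three to picture in code. The following is an in-memory sketch, not Redis's implementation: LazyExpirySketch is a hypothetical class, and the current time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

class LazyExpirySketch {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expireAt = new HashMap<>(); // key -> absolute deadline in ms

    void set(String key, String value, long nowMs, long ttlMs) {
        values.put(key, value);
        expireAt.put(key, nowMs + ttlMs);
    }

    // Lazy deletion: the expiry check happens only when the key is read.
    String get(String key, long nowMs) {
        Long deadline = expireAt.get(key);
        if (deadline != null && nowMs >= deadline) {
            values.remove(key);
            expireAt.remove(key);
            return null;
        }
        return values.get(key);
    }

    int size() {
        return values.size();
    }
}
```

An expired key that is never read again stays in memory under this scheme, which is why lazy deletion is combined with periodic cleanup and memory-limit eviction.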

What are the memory elimination strategies?

When the memory of Redis exceeds the maximum allowed memory, Redis will trigger the memory elimination strategy and delete some infrequently used data to ensure the normal operation of the Redis server.

Before Redis 4.0, six eviction policies were provided:

  • volatile-lru: uses the LRU (Least Recently Used) algorithm to evict keys that have an expiration time set
  • allkeys-lru: when memory cannot hold newly written data, evicts the least recently used keys from the whole data set
  • volatile-ttl: among keys that have an expiration time set, evicts those closest to expiring
  • volatile-random: evicts randomly chosen keys from those that have an expiration time set
  • allkeys-random: evicts randomly chosen keys from the whole data set
  • noeviction: never evicts data; when memory cannot hold newly written data, new write operations return an error

Redis 4.0 added two more:

  • volatile-lfu: uses the LFU (Least Frequently Used) algorithm to evict the least frequently used keys among those that have an expiration time set.
  • allkeys-lfu: when memory cannot hold newly written data, evicts the least frequently used keys from the whole data set.

The eviction policy can be changed in the configuration file. The configuration item is maxmemory-policy, and the default is noeviction.
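
As an illustration of the LRU idea behind allkeys-lru, here is a small exact LRU cache built on java.util.LinkedHashMap. This is only a sketch of the concept: Redis itself approximates LRU by sampling keys rather than keeping a strict access-ordered list, and LruSketch is a hypothetical class name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruSketch(int capacity) {
        super(16, 0.75f, true); // true = keep entries in access order, least recent first
        this.capacity = capacity;
    }

    // Called after every insertion; returning true evicts the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

With a capacity of 2, inserting a and b, reading a, and then inserting c evicts b, because b is the entry that was used least recently.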

How to ensure data consistency during double writing between cache and database?

1. Delete the cache first and then update the database.

When performing an update operation, delete the cache first and then update the database. When subsequent requests read again, the data will be read from the database. After reading, the new data is updated to the cache.

Existing problems: After deleting the cached data and before updating the database, if there is a new read request during this time period, the old data will be read from the database and rewritten to the cache, causing inconsistency again. And all subsequent reads are old data.

2. Update the database first and then delete the cache.

When performing an update, update MySQL first and delete the cache once the update succeeds. A subsequent read request then loads the new data from the database and writes it back to the cache.

Existing problem: between the MySQL update and the cache deletion, read requests still get the old cached data. Once the cache entry is deleted, however, reads become consistent again, so the impact is relatively small.
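
The "update the database first, then delete the cache" flow can be sketched with two in-memory maps. This is a single-threaded illustration under the assumption that one map stands in for MySQL and the other for Redis; CacheAsideSketch and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

class CacheAsideSketch {
    final Map<String, String> db = new HashMap<>();    // stands in for MySQL
    final Map<String, String> cache = new HashMap<>(); // stands in for Redis

    // Write path: update the database first, then invalidate the cache entry.
    void update(String key, String value) {
        db.put(key, value);
        cache.remove(key);
    }

    // Read path: serve from the cache, falling back to the database on a miss.
    String read(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = db.get(key);
            if (value != null) {
                cache.put(key, value); // write the fresh value back to the cache
            }
        }
        return value;
    }
}
```

The short inconsistency window described above sits between the two statements in update: a read that lands there still sees the old cached value.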

3. Asynchronous update cache

After the database update completes, the cache is not touched directly. Instead, the operation is wrapped in a message and put on a message queue; a consumer then reads the messages and updates the data in Redis. The message queue preserves the order of operations, which keeps the data in the cache system correct.

Cache penetration, cache avalanche, cache breakdown

Cache penetration

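Cache penetration means querying data that does not exist. Because the cache is only written after a miss is resolved, data that cannot be found in the DB is never written to the cache. Every request for that non-existent data therefore goes to the DB, which defeats the purpose of caching. Under heavy traffic, the DB may go down.

  • Cache empty values: store a placeholder for keys that have no data, so that repeated requests for them are answered by the cache and do not reach the database.

  • Use a Bloom filter: hash all data that could possibly exist into a sufficiently large bitmap. Queries for non-existent data are intercepted by the bitmap, which spares the DB the query load.

How a Bloom filter works: when an element is added to the set, K hash functions map it to K positions in a bit array, and those positions are set to 1. On a query, the element is mapped through the same hash functions to K positions. If any of those positions is 0, the element is definitely not in the set and the query returns immediately; if all are 1, the element probably exists, and the query goes on to Redis and the database.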
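
The add and query steps can be sketched as follows. This is a toy Bloom filter for illustration: BloomSketch is a hypothetical class, and deriving the K positions from String.hashCode() by double hashing is an assumption made for brevity, not a recommendation for production use.

```java
import java.util.BitSet;

class BloomSketch {
    private final BitSet bits;
    private final int size;
    private final int hashes; // the K in the description above

    BloomSketch(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // Derive the i-th position from two base hashes (double hashing).
    private int position(String item, int i) {
        int h1 = item.hashCode();
        int h2 = (h1 >>> 16) | 1; // force an odd step so positions vary with i
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String item) {
        for (int i = 0; i < hashes; i++) {
            bits.set(position(item, i));
        }
    }

    // false = definitely absent; true = possibly present (false positives can occur).
    boolean mightContain(String item) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(position(item, i))) {
                return false;
            }
        }
        return true;
    }
}
```

A request is forwarded to Redis and the database only when mightContain returns true; a false result can be answered immediately.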

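Cache avalanche

A cache avalanche occurs when cache entries are given the same expiration time, so they all expire at the same moment; every request is then forwarded to the DB, which collapses under the sudden load.

Solution: add a random value to the original expiration time so that expirations are spread out.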
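
A minimal sketch of that fix, assuming TTLs are expressed in seconds. TtlJitterSketch and jitteredTtl are hypothetical names, and the jitter range is a parameter the caller would choose.

```java
import java.util.Random;

class TtlJitterSketch {
    // Returns baseSeconds plus a random offset in [0, maxJitterSeconds].
    static int jitteredTtl(int baseSeconds, int maxJitterSeconds, Random random) {
        return baseSeconds + random.nextInt(maxJitterSeconds + 1);
    }
}
```

For example, jitteredTtl(600, 60, new Random()) yields a TTL between 600 and 660 seconds, so entries cached at the same moment no longer expire together.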

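Cache breakdown

Cache breakdown: a large number of requests query the same key at the moment that key expires, so all of those requests fall through to the database. Cache breakdown concerns a key that existed in the cache but has expired, whereas cache penetration concerns a key that does not exist at all.

Solution: use a distributed lock. The thread handling the first request acquires the lock, queries the data, and populates the cache. Other threads fail to acquire the lock, wait 50 ms, and then read from the cache again, which keeps the bulk of the requests away from the database.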

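// Sketch only: assumes `redis` is a Jedis client, `db` is a data-access object,
// and `systemId` and `expireSecs` are fields of the enclosing class.
import redis.clients.jedis.params.SetParams;

public String get(String key) throws InterruptedException {
    String value = redis.get(key);
    if (value != null) {
        return value;
    }
    // Cache miss: the cached value has expired
    String lockKey = systemId + ":" + key;
    // SET ... NX PX 30000: take the lock only if it is free, with a 30 s timeout
    if ("OK".equals(redis.set(lockKey, "1", SetParams.setParams().nx().px(30000)))) {
        try {
            value = db.get(key);
            if (value != null) {
                redis.setex(key, expireSecs, value);
            }
        } finally {
            redis.del(lockKey);
        }
        return value;
    }
    // Another thread is loading the value from the database and writing it back;
    // wait briefly, then retry the cache
    Thread.sleep(50);
    return get(key);
}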

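What does pipeline do?

A Redis client command goes through four stages: sending the command, queuing, execution, and returning the result. With a pipeline, requests are sent in a batch and results are returned in a batch, which is faster than executing the commands one at a time.

A pipeline should not contain too many commands: a very large payload increases the time the client waits and may congest the network. A large set of commands can be split into several smaller pipelines.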
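
Splitting a large command list into smaller pipelines is plain list chunking. The sketch below shows only that step; PipelineBatchSketch and split are hypothetical names, and actually sending each batch through a client's pipeline API is left out.

```java
import java.util.ArrayList;
import java.util.List;

class PipelineBatchSketch {
    // Splits commands into consecutive batches of at most batchSize entries.
    static <T> List<List<T>> split(List<T> commands, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < commands.size(); i += batchSize) {
            int end = Math.min(i + batchSize, commands.size());
            batches.add(new ArrayList<>(commands.subList(i, end)));
        }
        return batches;
    }
}
```

Each returned batch would then be sent as one pipeline, which bounds both the payload size and the time the client waits for a single round trip.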

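Native batch commands (mset and mget) compared with pipeline:

  • Native batch commands are atomic; a pipeline is not. If a pipeline aborts partway through, the commands that already succeeded are not rolled back

  • A native batch command is a single command, whereas a pipeline can carry multiple different commands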

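Lua scripts

Redis uses Lua scripts to build commands that execute atomically: while a Lua script is running, no other script or Redis command is executed, so a combination of commands behaves as one atomic operation.

There are two ways to run a Lua script in Redis: eval and evalsha. The eval command evaluates a Lua script using the built-in Lua interpreter.

//The first argument is the Lua script, the second is the number of key-name arguments, and the rest are the key names followed by additional arguments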
> eval "return {KEYS[1],KEYS[2],ARGV[1],ARGV[2]}" 2 key1 key2 first second
1) "key1"
2) "key2"
3) "first"
4) "second"

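What Lua scripts are for

1. A Lua script executes atomically in Redis; no other command is interleaved while it runs.

2. A Lua script packages multiple commands into one request, which effectively reduces network overhead.

Application scenario

Example: limiting how often an interface can be called.

Redis holds a key-value pair that counts calls to the interface: the key is the interface name and the value is the number of calls. On each call, the following happens:

  • An AOP interceptor catches the request and counts it: for every incoming request, the interface's call count is incremented by 1 and stored in Redis.
  • On the first request, count is set to 1 and an expiration time is set. Because the combination of set() and expire() is not atomic, a Lua script is used to make it atomic and avoid problems under concurrent access.
  • If the maximum number of calls is exceeded within the given time window, an exception is thrown.
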
private String buildLuaScript() {
    return "local c" +
        "\nc = redis.call('get',KEYS[1])" +
        "\nif c and tonumber(c) > tonumber(ARGV[1]) then" +
        "\nreturn c;" +
        "\nend" +
        "\nc = redis.call('incr',KEYS[1])" +
        "\nif tonumber(c) == 1 then" +
        "\nredis.call('expire',KEYS[1],ARGV[2])" +
        "\nend" +
        "\nreturn c;";
}

String luaScript = buildLuaScript();
RedisScript<Number> redisScript = new DefaultRedisScript<>(luaScript, Number.class);
Number count = redisTemplate.execute(redisScript, keys, limit.count(), limit.period());

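PS: this way of rate limiting an interface is fairly simple and has quite a few problems, so it is generally not used in practice. The token bucket and leaky bucket algorithms are more common choices for interface rate limiting.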
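
For comparison, a token bucket can be sketched in a few lines. This is a single-process, in-memory illustration, not a distributed limiter: TokenBucketSketch is a hypothetical class, and the current time is passed in explicitly so that the refill logic is easy to trace.

```java
class TokenBucketSketch {
    private final long capacity;      // maximum number of tokens the bucket holds
    private final double refillPerMs; // tokens added per millisecond
    private double tokens;
    private long lastMs;

    TokenBucketSketch(long capacity, double refillPerSecond, long nowMs) {
        this.capacity = capacity;
        this.refillPerMs = refillPerSecond / 1000.0;
        this.tokens = capacity; // start with a full bucket
        this.lastMs = nowMs;
    }

    // Refill according to the elapsed time, then take one token if one is available.
    synchronized boolean tryAcquire(long nowMs) {
        tokens = Math.min(capacity, tokens + (nowMs - lastMs) * refillPerMs);
        lastMs = nowMs;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

Unlike the fixed-window counter above, the bucket allows short bursts up to its capacity while holding the long-run rate at the refill rate.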


Statement:
This article is reproduced from juejin.cn.