Almost every Java interview includes questions about caching. The basic ones ask what the "80/20 rule" is and what "hot data" and "cold data" are. The harder ones ask about cache avalanche, cache penetration, cache preheating, cache updates, cache degradation, and so on. These seemingly obscure concepts all relate to the cache server. Commonly used cache servers include Redis and Memcached; the author currently uses only Redis.
If no interviewer has ever asked you "Why is Redis said to be single-threaded, and why is Redis so fast?", then consider yourself lucky to be reading this article! And if you happen to be a discerning interviewer, you can put this question to the candidate sitting across from you, who is anxiously waiting, to test his mastery.
Okay, let's get to the point! We will first discuss what Redis is and why it is so fast, and then discuss why Redis is single-threaded.
1. Introduction to Redis
Redis is an open source, in-memory data structure store. It can be used as a database, a cache, and message middleware.
It supports many types of data structures: strings, hashes, lists, sets, sorted sets (ZSet) with range queries, bitmaps, HyperLogLogs, and geospatial indexes with radius queries. The five most common types are String, List, Set, Hash, and ZSet.
Redis has built-in replication, Lua scripting, LRU eviction, transactions, and different levels of on-disk persistence, and it provides high availability through Redis Sentinel and automatic partitioning through Redis Cluster.
Redis also provides persistence options that let users save their data to disk. Depending on your needs, the data set can be dumped to disk at intervals (a snapshot), or every executed write command can be appended to a log file on disk (AOF, append-only file). You can also turn persistence off and use Redis purely as an efficient networked cache.
Redis does not use tables, and it does not predefine or force users to relate the different pieces of data stored in it.
Databases can be divided by storage method into hard disk databases and in-memory databases. Redis keeps its data in memory, so reads and writes are not limited by disk I/O speed, which makes it extremely fast.
[Figure (1): working mode of a hard disk database]
[Figure (2): working mode of an in-memory database]
After reading the description above, you should have some idea how to answer common Redis interview questions such as: What is Redis? What are its common data structure types? How does Redis persist data?
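To make the two persistence styles described above more concrete, here is a minimal sketch in Python. It is a toy model and not how Redis is implemented: the class name TinyStore, its methods, and the use of JSON as the on-disk format are all assumptions made for illustration, and the "files" are kept in memory to keep the example short.

```python
import json


class TinyStore:
    """Toy in-memory KV store (hypothetical, NOT Redis's real implementation)."""

    def __init__(self):
        self.data = {}
        self.aof = []  # stands in for the append-only file

    def set(self, key, value):
        self.data[key] = value
        # AOF style: append every executed write command to the log
        self.aof.append(json.dumps(["SET", key, value]))

    def snapshot(self):
        # Snapshot style: dump the whole data set as it is right now
        return json.dumps(self.data, sort_keys=True)

    @classmethod
    def from_snapshot(cls, dumped):
        # Recovery from a snapshot: load the dumped data set in one step
        store = cls()
        store.data = json.loads(dumped)
        return store

    @classmethod
    def from_aof(cls, log_lines):
        # Recovery from the log: replay the write commands in order
        store = cls()
        for line in log_lines:
            op, key, value = json.loads(line)
            if op == "SET":
                store.data[key] = value
        return store


store = TinyStore()
store.set("a", "1")
store.set("b", "2")
store.set("a", "3")
print(store.snapshot())   # one compact dump of the current state
print(len(store.aof))     # one log entry per write command
```

Both recovery paths rebuild the same data. The snapshot is compact but only as fresh as the last dump, while the log records every write and grows with the number of commands.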
2. How fast is Redis?
Redis is a memory-based KV database that uses a single-process, single-thread model and is written in C. According to official figures it can reach 100,000+ QPS (queries per second).
That is no worse than Memcached, another memory-based KV database, which uses a single process with multiple threads!
In the official benchmark chart, the horizontal axis is the number of connections and the vertical axis is QPS. The chart conveys an order of magnitude, and I hope you can describe it correctly in an interview. Don't give an answer that is off by orders of magnitude!
3. Why is Redis so fast?
1. It is completely memory-based. Most requests are pure memory operations, which are very fast. The data is stored in memory in a structure similar to a HashMap, whose advantage is that lookups and updates take O(1) time;
2. The data structures are simple, and so are the operations on them. Redis's data structures are purpose-built;
3. It uses a single thread, which avoids unnecessary context switches and race conditions. No CPU is spent switching between processes or threads, there are no locks to acquire or release, and no performance is lost to possible deadlocks;
4. It uses a multi-channel I/O multiplexing model with non-blocking I/O;
5. Its underlying model differs from other systems: the low-level implementation and the application protocol used to talk to clients are different. Redis built its own VM mechanism, because going through general-purpose system calls would waste time on moving data and making requests;
The above points are relatively easy to understand. Below we will briefly discuss the multi-channel I/O multiplexing model:
(1) Multi-channel I/O multiplexing model
The multi-channel I/O multiplexing model uses select, poll, or epoll to monitor I/O events on many streams at the same time. When all streams are idle, the current thread blocks. When one or more streams have an I/O event, the thread wakes up and the program polls the streams (epoll only returns the streams that actually raised an event) and processes the ready streams one by one. This avoids a lot of useless work.
Here "multi-channel" refers to multiple network connections, and "multiplexing" (reuse) refers to reusing the same thread.
Multi-channel I/O multiplexing lets a single thread handle many connection requests efficiently (minimizing the time spent on network I/O), and Redis operates on data in memory very quickly, so in-memory operations do not become a performance bottleneck. These points are the main reasons for Redis's high throughput.
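The idea of one thread watching many connections and touching only the ready ones can be sketched with Python's standard selectors module, which picks epoll, kqueue, or select depending on the operating system. This is only an illustration of the technique and not Redis's event loop; the function names serve_ready_streams and demo, and the use of local socket pairs in place of real client connections, are assumptions made for the example.

```python
import selectors
import socket


def serve_ready_streams(connections, timeout=1.0):
    """Watch many sockets with one thread and read only from the ready ones."""
    sel = selectors.DefaultSelector()  # epoll/kqueue/select, chosen by the OS
    for name, sock in connections.items():
        sock.setblocking(False)  # non-blocking I/O
        sel.register(sock, selectors.EVENT_READ, data=name)

    handled = {}
    # select() blocks until at least one stream is readable (or the timeout
    # expires) and returns only the streams that actually have an event.
    for key, _events in sel.select(timeout):
        handled[key.data] = key.fileobj.recv(1024)
    sel.close()
    return handled


def demo():
    # Three "connections"; only "a" and "c" send a request.
    pairs = {name: socket.socketpair() for name in ("a", "b", "c")}
    pairs["a"][0].sendall(b"PING")
    pairs["c"][0].sendall(b"GET k")
    handled = serve_ready_streams({n: p[1] for n, p in pairs.items()})
    for client_side, server_side in pairs.values():
        client_side.close()
        server_side.close()
    return handled


print(demo())
```

The idle connection "b" never appears in the result: the single thread spends no time on it, which is the "useless work" that multiplexing avoids.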
4. So why is Redis single-threaded?
First, understand that all of the analysis above was only setting the scene: Redis is fast! The official FAQ states that because Redis operates in memory, the CPU is not its bottleneck. The bottleneck is most likely the size of the machine's memory or the network bandwidth. Since a single thread is easy to implement and the CPU will not become the bottleneck, adopting a single-threaded design is the logical choice (after all, multi-threading brings a lot of trouble!).
You may want to cry at this point! You expected some profound technical reason why a single thread makes Redis so fast, and instead you got an official answer that sounds like a brush-off! Still, we can now clearly explain why Redis is fast, and precisely because it is already fast with a single thread, there is no need for multi-threading!
A single thread cannot take advantage of multiple CPU cores, but we can make up for that by running multiple Redis instances on one machine!
Warning 1: The "single thread" we keep emphasizing means only that one thread handles network requests. A running Redis server definitely has more than one thread, so be clear about this! For example, persistence is carried out in a child process or a child thread (which one is left for the reader to investigate). As another example, I looked up the Redis process on a test server and then listed the threads under that process:
The "-T" option of the ps command means "Show threads, possibly with SPID column". The "SPID" column shows the thread ID, and the "CMD" column shows the thread name.
Warning 2: The last paragraph of the FAQ states that multi-threading is supported starting from Redis 4.0, but only for certain operations! So whether Redis will still be single-threaded in future versions is something readers need to verify for themselves!
5. Notes
1. We know that Redis uses a "single-threaded, multiplexed I/O model" to deliver a high-performance in-memory data service. This mechanism avoids locks, but it also reduces Redis's concurrency when time-consuming commands such as SUNION are executed.
Because there is a single thread, only one operation is in progress at any time. A time-consuming command therefore reduces concurrency for both reads and writes. A single thread can also use only one CPU core, so you can start multiple instances on the same multi-core server and arrange them as master-master or master-slave. Time-consuming read commands can then be run entirely on the slave.
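The effect of one slow command on everything queued behind it can be shown with a small simulation. This is a sketch under simplifying assumptions: the function name finish_times is hypothetical, the costs are made-up time units rather than measurements, and the key names are invented.

```python
def finish_times(commands):
    """Simulate one thread running commands strictly in arrival order.

    `commands` is a list of (name, cost) pairs, with cost in made-up time units.
    Returns a list of (name, finish_time) pairs.
    """
    clock = 0
    finished = []
    for name, cost in commands:
        clock += cost  # nothing else can run while this command executes
        finished.append((name, clock))
    return finished


queue = [("GET a", 1), ("SUNION big1 big2", 50), ("GET b", 1), ("SET c", 1)]
for name, done_at in finish_times(queue):
    print(f"{name:18} finishes at t={done_at}")
```

In this model the cheap GET b and SET c finish at t=52 and t=53 even though each costs one unit, because they wait behind the expensive SUNION. Moving the slow read to a separate instance removes that wait from the main queue.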
Redis.conf items that need to be changed for each additional instance:
# add the port number to the pidfile name
pidfile /var/run/redis/redis_6377.pid
# the port must be changed
port 6377
# add the port number to the logfile name as well
logfile /var/log/redis/redis_6377.log
# add the port number to the RDB file name as well
dbfilename dump_6377.rdb
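When several instances share one machine, these four settings follow the same pattern for every port, so they can be generated instead of edited by hand. The helper below is a hypothetical convenience and not part of Redis; the name instance_overrides and the second port 6378 are assumptions, while the directives and paths are the ones listed above.

```python
def instance_overrides(port):
    """Return the redis.conf lines that must differ for the instance on `port`."""
    return [
        f"pidfile /var/run/redis/redis_{port}.pid",
        f"port {port}",
        f"logfile /var/log/redis/redis_{port}.log",
        f"dbfilename dump_{port}.rdb",
    ]


for port in (6377, 6378):
    print("\n".join(instance_overrides(port)))
    print()
```

Each instance then gets its own copy of the base configuration with these lines replaced, so that no two instances write to the same pid, log, or RDB file.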
2. "We cannot leave load balancing to the operating system, because we know our own programs better. We can assign CPU cores to them by hand, so that they do not hog the CPU and our key processes are not crowded together with a bunch of other processes."
The CPU is an important factor. Because of the single-threaded model, Redis favors a fast CPU with a large cache over many cores.
On multi-core servers, Redis performance also depends on the NUMA configuration and on which processor each process is bound to. The most visible effect is that redis-benchmark results vary from run to run, because the client and server processes land on CPU cores at random. To get accurate results, use a process placement tool (on Linux, taskset). The most effective setup is to put the client and the server on two different cores of the same CPU so that they share the L3 cache.
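As a rough sketch of the pinning described above, the snippet below builds the taskset command lines for a server and a benchmark client. It only constructs and prints the commands and does not run them. The helper name pinned is hypothetical, and the choice of cores 0 and 1 assumes that those two cores sit on the same physical CPU and share an L3 cache, which depends on the machine.

```python
import shlex


def pinned(core, command):
    """Prefix `command` (a list of argv strings) with taskset, pinning it to `core`."""
    return ["taskset", "-c", str(core)] + list(command)


# Assumption: cores 0 and 1 belong to the same CPU on this machine.
server = pinned(0, ["redis-server", "--port", "6377"])
client = pinned(1, ["redis-benchmark", "-p", "6377"])

print(shlex.join(server))
print(shlex.join(client))
```

On a real machine the core layout should be checked first (for example with lscpu) before choosing the core numbers.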
6. Expansion
Here are a few more models you should know. I wish you a successful interview!
1. Single-process multi-thread model: MySQL, Memcached, Oracle (Windows version);
2. Multi-process model: Oracle (Linux version);
3. Nginx has two types of processes: the Master process (the management process) and the Worker process (the process that does the actual work). There are two startup methods:
(1) Single-process startup: the system has only one process, which plays the role of both the Master process and the Worker process.
(2) Multi-process startup: the system has exactly one Master process and at least one Worker process.
(3) The Master process mainly performs global initialization and manages the Workers; event processing is done in the Workers.