Redis memory model (detailed explanation)

Redis is currently one of the most popular in-memory databases. By reading and writing data in memory, it greatly improves read and write speed, and it has become an indispensable part of building high-concurrency websites.

When we use Redis, we come into contact with its five object types (string, hash, list, set, and sorted set). This rich set of types is a major advantage of Redis over Memcached and similar systems. Beyond knowing the usage and characteristics of the five object types, understanding the Redis memory model is of great help when using Redis, for example:

1. Estimate the memory usage of Redis. Memory is still relatively expensive and cannot be used without restraint. Reasonably estimating Redis memory usage from the requirements and choosing an appropriate machine configuration saves cost while still meeting the requirements.

2. Optimize memory usage. Understanding the Redis memory model lets you choose more appropriate data types and encodings and make better use of Redis memory.

3. Analyze and solve problems. When problems such as blocking or excessive memory usage occur in Redis, understanding the memory model helps locate the cause quickly so that the problem can be analyzed and solved.

This article introduces the Redis memory model (taking Redis 3.0 as an example), including the memory Redis occupies and how to query it, how the different object types are encoded in memory, the memory allocator (jemalloc), simple dynamic strings (SDS), redisObject, and so on; on this basis it then introduces several applications of the Redis memory model.

1. Redis memory statistics

As the saying goes, to do a good job you must first sharpen your tools. Before explaining Redis memory, we first explain how to measure the memory usage of Redis.

After a client connects to the server through redis-cli (unless otherwise stated, the client is always redis-cli in what follows), memory usage can be checked with the info command:

info memory

Here, the info command can display a great deal of information about the Redis server, including basic server information, CPU, memory, persistence, and client connections; memory is an argument indicating that only memory-related information should be displayed.

Some of the more important fields in the returned result are as follows:

(1) used_memory: The total amount of memory allocated by the Redis allocator (in bytes), including any virtual memory in use (i.e. swap); the Redis allocator is introduced later. used_memory_human shows the same value in a more readable form.

(2) used_memory_rss: The memory that the Redis process occupies from the operating system (in bytes), consistent with the value reported by the top and ps commands. Besides the memory allocated by the allocator, used_memory_rss also includes the memory the process itself needs to run, memory fragmentation, and so on, but does not include virtual memory.

Thus used_memory is the amount seen from the perspective of Redis, while used_memory_rss is the amount seen from the perspective of the operating system. The two differ because, on the one hand, memory fragmentation and the memory needed to run the Redis process can make the former smaller than the latter; on the other hand, the existence of virtual memory can make the former larger than the latter.

In real applications the amount of data in Redis is relatively large, so the memory needed to run the process is much smaller than the Redis data and the memory fragmentation. The ratio of used_memory_rss to used_memory has therefore become the measure of the Redis memory fragmentation rate; this parameter is mem_fragmentation_ratio.

(3) mem_fragmentation_ratio: The memory fragmentation ratio, equal to used_memory_rss / used_memory.
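To make the ratio concrete, here is a minimal Java sketch that parses `key:value` lines in the style of the `info memory` output and computes used_memory_rss / used_memory. The class name `FragmentationRatio`, its methods, and the sample numbers are hypothetical and chosen only for illustration; they are not taken from a real server.

```java
import java.util.HashMap;
import java.util.Map;

class FragmentationRatio {

    // Parse "key:value" lines; section headers such as "# Memory" and blank lines are skipped.
    static Map<String, String> parseInfo(String info) {
        Map<String, String> fields = new HashMap<>();
        for (String line : info.split("\r?\n")) {
            int colon = line.indexOf(':');
            if (line.startsWith("#") || colon < 0) {
                continue;
            }
            fields.put(line.substring(0, colon), line.substring(colon + 1).trim());
        }
        return fields;
    }

    // mem_fragmentation_ratio = used_memory_rss / used_memory
    static double ratio(long usedMemoryRss, long usedMemory) {
        return (double) usedMemoryRss / usedMemory;
    }

    public static void main(String[] args) {
        // Hypothetical sample text, not real server output.
        String sample = "# Memory\r\nused_memory:1000000\r\nused_memory_rss:1030000\r\n";
        Map<String, String> fields = parseInfo(sample);
        long rss = Long.parseLong(fields.get("used_memory_rss"));
        long used = Long.parseLong(fields.get("used_memory"));
        System.out.println(ratio(rss, used));
    }
}
```

With real data, the string passed to `parseInfo` would be whatever the client library returns for `info memory`.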

mem_fragmentation_ratio is generally greater than 1, and the larger the value, the greater the proportion of memory fragmentation. A value below 1 means that used_memory exceeds used_memory_rss, which, as explained above, indicates that Redis is using virtual memory.

Generally speaking, a mem_fragmentation_ratio of around 1.03 is relatively healthy (for jemalloc). On an instance that does not yet hold any data the value can be very large, because the memory needed to run the Redis process itself makes used_memory_rss much larger than used_memory.

(4) mem_allocator: The memory allocator used by Redis, specified at compile time; it can be libc, jemalloc, or tcmalloc, and the default is jemalloc.

2. Redis memory division

Redis is an in-memory database, and what it stores in memory is mainly data (key-value pairs); from the previous description we know that, besides data, other parts of Redis also take up memory.

The memory usage of Redis can be divided mainly into the following parts:

1. Data

As a database, data is the most important part; the memory occupied by this part is counted in used_memory.

Redis stores data as key-value pairs, and the values (objects) come in 5 types: strings, hashes, lists, sets, and sorted sets. These 5 types are what Redis exposes externally. Internally, each type may have 2 or more encoding implementations. In addition, when Redis stores an object it does not simply place the data in memory but wraps it in various structures such as redisObject and SDS. The details of data storage in Redis are the focus of a later section.

2. The memory required to run the process itself

The Redis main process itself needs memory to run, for code, the constant pool, and so on. This part is about a few megabytes and, in most production environments, is negligible compared with the memory occupied by Redis data. It is not allocated by jemalloc, so it is not counted in used_memory.

Supplementary note: besides the main process, child processes created by Redis also occupy memory, such as the child processes created when Redis performs AOF or RDB rewriting. This memory does not belong to the Redis process and is not counted in used_memory or used_memory_rss.

3. Buffer memory

Buffer memory includes the client buffers, the replication backlog buffer, the AOF buffer, and so on. The client buffers hold the input and output buffers of client connections; the replication backlog buffer is used for partial replication; the AOF buffer holds the most recent write commands during an AOF rewrite. You do not need to know the details of these buffers before understanding the corresponding features. This memory is allocated by jemalloc, so it is counted in used_memory.

4. Memory fragmentation

Memory fragmentation is produced as Redis allocates and reclaims physical memory. For example, if data is modified frequently and the sizes of the data items differ greatly, the space that Redis frees may not be returned to physical memory, yet Redis cannot use it effectively; this is memory fragmentation. Memory fragmentation is not counted in used_memory.

How much fragmentation arises depends on the operations performed on the data, the characteristics of the data, and so on; it also depends on the memory allocator used: a well-designed allocator keeps fragmentation to a minimum. jemalloc, discussed later, does a good job of controlling memory fragmentation.

If memory fragmentation on a Redis server is already large, it can be reduced by a safe restart: after restarting, Redis reads the data back from the backup file, rearranges it in memory, and selects an appropriate memory unit for each piece of data, which reduces fragmentation.

3. Details of Redis data storage

1. Overview

The details of Redis data storage involve the memory allocator (such as jemalloc), simple dynamic strings (SDS), the 5 object types and their internal encodings, and redisObject. Before going into specifics, we first explain how these concepts relate to each other.

The following figure is the data model involved when executing set hello world.

Image source: https://searchdatabase.techtarget.com.cn/7-20218/

(1) dictEntry: Redis is a key-value database, so there is a dictEntry for each key-value pair, which stores pointers to the key and the value; next points to the next dictEntry and is unrelated to this key-value pair.

(2) Key: As can be seen in the upper right corner of the figure, the key ("hello") is not stored directly as a string but in an SDS structure.

(3) redisObject: The value ("world") is neither stored directly as a string nor stored directly in an SDS like the key; it is stored in a redisObject. In fact, a value of any of the five types is stored through a redisObject: the type field of the redisObject indicates the type of the value object, and the ptr field points to the address where the object is located. Note, however, that although a string object is wrapped in a redisObject, the string itself is still stored in an SDS.

In fact, besides the type and ptr fields, redisObject has other fields not shown in the figure, such as the field that specifies the object's internal encoding; these are introduced in detail later.

(4) jemalloc: Whether it is a dictEntry object, a redisObject, or an SDS object, a memory allocator (such as jemalloc) is needed to allocate the memory that stores it. Taking dictEntry as an example, it consists of 3 pointers and occupies 24 bytes on a 64-bit machine, and jemalloc allocates a 32-byte memory unit for it.

The following introduces jemalloc, redisObject, SDS, object types and internal encoding respectively.

2. jemalloc

Redis will specify a memory allocator when compiling; the memory allocator can be libc, jemalloc or tcmalloc, and the default is jemalloc.

jemalloc, the default memory allocator of Redis, does a relatively good job of reducing memory fragmentation. On 64-bit systems, jemalloc divides the memory space into three ranges, small, large, and huge; each range is divided into many memory block units of fixed sizes; when Redis stores data, it selects the memory block whose size fits best.
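The "best fit" selection can be sketched as a rounding function. The sketch below is illustrative only and rests on one assumption: for requests of up to 128 bytes on a 64-bit system, the small units are 8 bytes and then the multiples of 16 (16, 32, 48, ..., 128). That assumption is consistent with the sizes quoted in this article (24 bytes stored in a 32-byte unit, for example); larger requests are deliberately left out. The class and method names are hypothetical.

```java
class JemallocSmallClass {

    // Assumption: units for 1..128 bytes are 8, then multiples of 16 up to 128.
    static int roundUp(int requested) {
        if (requested <= 0 || requested > 128) {
            throw new IllegalArgumentException("this sketch covers 1..128 bytes only");
        }
        if (requested <= 8) {
            return 8;
        }
        // Round up to the next multiple of 16.
        return (requested + 15) / 16 * 16;
    }

    public static void main(String[] args) {
        System.out.println(roundUp(24)); // a 24-byte dictEntry is stored in a 32-byte unit
        System.out.println(roundUp(16)); // an exact fit stays at 16
    }
}
```

The difference between the requested size and the unit size is space that is allocated but not used by the data.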

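[Figure: the memory units that jemalloc divides memory into]

3. redisObject

As mentioned above, a Redis object of any type is stored through a redisObject. Its definition is as follows (Redis 3.0):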

typedef struct redisObject {
  unsigned type:4;
  unsigned encoding:4;
  unsigned lru:REDIS_LRU_BITS; /* lru time (relative to server.lruclock) */
  int refcount;
  void *ptr;
} robj;



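The type field indicates the type of the object and occupies 4 bits. It currently takes one of the values REDIS_STRING (string), REDIS_LIST (list), REDIS_HASH (hash), REDIS_SET (set), and REDIS_ZSET (sorted set).

The encoding that an object uses can be viewed with the object encoding command.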




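By comparing the lru time with the current time, the idle time of an object can be calculated; the object idletime command displays this idle time (in seconds). A special property of the object idletime command is that it does not change the object's lru value.

Besides being printed by object idletime, the lru value is also related to memory reclamation in Redis: if the maxmemory option is enabled and the eviction policy is volatile-lru or allkeys-lru, then when Redis memory usage exceeds the value specified by maxmemory, Redis preferentially releases the objects with the longest idle time.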
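The idea of releasing the object with the longest idle time can be sketched as follows. This is a deliberately simplified illustration: it scans every key, whereas Redis approximates LRU by sampling a small number of keys, and the class `IdleEviction`, the map of last-access times, and the method name are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class IdleEviction {

    // Returns the key whose idle time (now - last access) is largest, or null if the map is empty.
    static String pickVictim(Map<String, Long> lastAccessSeconds, long nowSeconds) {
        String victim = null;
        long maxIdle = -1;
        for (Map.Entry<String, Long> entry : lastAccessSeconds.entrySet()) {
            long idle = nowSeconds - entry.getValue();
            if (idle > maxIdle) {
                maxIdle = idle;
                victim = entry.getKey();
            }
        }
        return victim;
    }

    public static void main(String[] args) {
        Map<String, Long> lastAccess = new LinkedHashMap<>();
        lastAccess.put("a", 100L);
        lastAccess.put("b", 40L);  // idle for 80 seconds at time 120
        lastAccess.put("c", 90L);
        System.out.println(pickVictim(lastAccess, 120L)); // prints "b"
    }
}
```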

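The reference count of a shared object can be viewed with the object refcount command; the results of running it also confirm that only integers from 0 to 9999 are treated as shared objects.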
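As an illustration of how shared objects and reference counts fit together, the following Java sketch keeps one pre-built object per integer from 0 to 9999 and increments its reference count each time it is handed out; other values get a fresh object. The class `SharedIntegers`, the nested `IntObject`, and the initial reference count of 1 are assumptions made for this sketch, not the Redis implementation.

```java
class SharedIntegers {

    static final int SHARED_LIMIT = 10000; // integers 0..9999 are shared, as stated in the text

    static final class IntObject {
        final long value;
        int refcount;

        IntObject(long value, int refcount) {
            this.value = value;
            this.refcount = refcount;
        }
    }

    private final IntObject[] shared = new IntObject[SHARED_LIMIT];

    SharedIntegers() {
        for (int i = 0; i < SHARED_LIMIT; i++) {
            shared[i] = new IntObject(i, 1); // assumed initial count: held by the pool itself
        }
    }

    // Returns the shared object for 0..9999 (incrementing its refcount), otherwise a new object.
    IntObject acquire(long value) {
        if (value >= 0 && value < SHARED_LIMIT) {
            IntObject object = shared[(int) value];
            object.refcount++;
            return object;
        }
        return new IntObject(value, 1);
    }

    public static void main(String[] args) {
        SharedIntegers pool = new SharedIntegers();
        System.out.println(pool.acquire(100) == pool.acquire(100));     // true: same shared object
        System.out.println(pool.acquire(10000) == pool.acquire(10000)); // false: not shared
    }
}
```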


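The ptr pointer points to the actual data. In the earlier example, set hello world, ptr points to the SDS containing the string world.

4. SDS

Redis does not directly use C strings (character arrays terminated by the null character '\0') as its default string representation; it uses SDS instead. SDS is short for Simple Dynamic String.

The SDS structure is as follows: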



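struct sdshdr {
    int len;     /* number of bytes used in buf */
    int free;    /* number of bytes unused in buf */
    char buf[];  /* byte array that stores the string */
};

From the structure of SDS it can be seen that the length of the buf array = free + len + 1 (the 1 is for the null character that terminates the string); therefore the space occupied by one SDS structure is: length of free + length of len + length of the buf array = 4 + 4 + free + len + 1 = free + len + 9.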
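The size formula can be checked with a few lines of Java. The class and method names are hypothetical; the constants follow the sdshdr layout shown above (two 4-byte int fields and one terminating null byte).

```java
class SdsSize {

    static final int INT_FIELD_BYTES = 4; // size of each of the len and free fields

    // Length of the buf array: used bytes + unused bytes + 1 terminating null byte.
    static int bufLength(int len, int free) {
        return len + free + 1;
    }

    // Total space of one SDS structure: free + len + 9.
    static int totalSize(int len, int free) {
        return INT_FIELD_BYTES + INT_FIELD_BYTES + bufLength(len, free);
    }

    public static void main(String[] args) {
        System.out.println(totalSize(5, 0)); // "world": 5 + 9 = 14
        System.out.println(totalSize(7, 0)); // a 7-byte string: 7 + 9 = 16
    }
}
```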



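Compared with C strings, SDS has the following characteristics:

  • Getting the string length: O(1) for SDS, O(n) for C strings.
  • Buffer overflow: with the C string API, if a string grows (for example through strcat) and you forget to reallocate memory, a buffer overflow easily results; because SDS records the length, its API automatically reallocates memory when an operation could overflow the buffer, which eliminates buffer overflows.
  • Memory reallocation when modifying a string: for C strings, modifying a string requires reallocating memory (free, then allocate), because without reallocation a longer string overflows the buffer and a shorter string leaks memory. SDS records len and free, which decouples the string length from the length of the underlying array and allows optimization: the space pre-allocation strategy (allocating more memory than is actually needed) greatly reduces the probability of reallocation when a string grows, and the lazy space release strategy greatly reduces the probability of reallocation when a string shrinks.
  • Storing binary data: SDS can, C strings cannot. C strings use the null character as the end-of-string marker, and some binary content (such as images) may contain null characters, so C strings cannot store it correctly; SDS uses the length len to mark the end of the string, so it does not have this problem.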
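The binary-safety point can be illustrated with a small Java sketch that contrasts a strlen-style scan with a stored length. It only models the idea; `BinarySafety`, `cStringLength`, and the nested `Sds` class are hypothetical names, not Redis code.

```java
import java.util.Arrays;

class BinarySafety {

    // C-string style: the length is found by scanning to the first null byte, which is O(n).
    static int cStringLength(byte[] buf) {
        for (int i = 0; i < buf.length; i++) {
            if (buf[i] == 0) {
                return i;
            }
        }
        return buf.length;
    }

    // SDS style: the length is recorded next to the buffer, so reading it is O(1).
    static final class Sds {
        final int len;
        final byte[] buf; // len bytes of data followed by one terminating null byte

        Sds(byte[] data) {
            this.len = data.length;
            this.buf = Arrays.copyOf(data, data.length + 1); // the extra byte is zero
        }

        int length() {
            return len;
        }
    }

    public static void main(String[] args) {
        byte[] binary = {'a', 0, 'b'}; // data with an embedded null byte
        System.out.println(cStringLength(binary));     // 1: the scan stops at the embedded null
        System.out.println(new Sds(binary).length());  // 3: the recorded length is unaffected
    }
}
```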



When storing objects, Redis always uses SDS instead of C strings. For example, in the command set hello world, both hello and world are stored as SDS. In the command sadd myset member1 member2 member3, both the key ("myset") and the elements of the set ("member1", "member2", and "member3") are stored as SDS. Besides storing objects, SDS is also used for various buffers.

4. Redis object types and internal encoding

[Figure: Redis object types and their internal encodings]

Picture source: "Redis Design and Implementation"

Regarding the conversion of Redis internal encodings, the following rule holds: encoding conversion is completed when Redis writes data, and the conversion is irreversible; an encoding can only be converted from a small-memory encoding to a large-memory encoding.

1. String

(1) Overview

String is the most basic type, because all keys are strings, and the elements of the other, more complex types are also strings.

The string length cannot exceed 512MB.

(2) Internal encoding

There are three internal encodings for the string type. Their application scenarios are as follows:

  • int: an 8-byte long integer. When the string value is an integer, the value is represented as a long.
  • embstr: a string of at most 39 bytes. Both embstr and raw use redisObject and sds to store data. The difference is that embstr allocates memory only once (so the redisObject and the sds are contiguous), while raw allocates memory twice (once for the redisObject and once for the sds). Compared with raw, embstr therefore needs one fewer allocation on creation and one fewer release on deletion, and all of the object's data is contiguous, which makes it convenient to look up. The drawback of embstr is equally clear: if the string grows and memory must be reallocated, the whole redisObject and sds must be reallocated together, so embstr is implemented as read-only in Redis.
  • raw: a string longer than 39 bytes.

[Figure: examples of the three string encodings]

The boundary between embstr and raw is 39 bytes. This is because a redisObject is 16 bytes long and an sds is 9 + the string length; so when the string length is 39, the embstr object is exactly 16 + 9 + 39 = 64 bytes long, which fits exactly in a 64-byte memory unit allocated by jemalloc.
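The selection rule and the origin of the 39-byte boundary can be sketched in Java as follows. This is a minimal illustration under stated assumptions: `StringEncoding` and `choose` are hypothetical names, the integer test simply checks that the text round-trips through a Java long, and the sketch ignores the later conversion rules for modified values.

```java
import java.nio.charset.StandardCharsets;

class StringEncoding {

    static final int ALLOC_UNIT = 64;        // the 64-byte jemalloc unit mentioned in the text
    static final int REDIS_OBJECT_SIZE = 16; // size of redisObject
    static final int SDS_OVERHEAD = 9;       // len (4) + free (4) + terminating null (1)

    // Longest string that still fits with its redisObject in one 64-byte unit: 64 - 16 - 9 = 39.
    static final int EMBSTR_MAX = ALLOC_UNIT - REDIS_OBJECT_SIZE - SDS_OVERHEAD;

    // Picks the encoding for a newly written value.
    static String choose(String value) {
        if (isLong(value)) {
            return "int";
        }
        int bytes = value.getBytes(StandardCharsets.UTF_8).length;
        return bytes <= EMBSTR_MAX ? "embstr" : "raw";
    }

    // Assumption of this sketch: a value counts as an integer only if it is the
    // canonical decimal form of a long (so "007" and "+5" do not qualify).
    private static boolean isLong(String value) {
        try {
            return Long.toString(Long.parseLong(value)).equals(value);
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(EMBSTR_MAX);      // 39
        System.out.println(choose("123"));   // int
        System.out.println(choose("hello")); // embstr
    }
}
```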

(3) Encoding conversion

When int data is no longer an integer, or its size exceeds the range of long, it is automatically converted to raw.

As for embstr, since its implementation is read-only, modifying an embstr object first converts it to raw and then applies the modification. Therefore, once an embstr object is modified, the resulting object is always raw, regardless of whether it reaches 39 bytes.

2. List

(1) Overview

A list stores multiple ordered strings, each of which is called an element; a list can store up to 2^32-1 elements. Lists in Redis support insertion and popping at both ends and can retrieve the element at a specified position (or a range of elements), so they can serve as arrays, queues, stacks, and so on.

(2) Internal encoding

The internal encoding of the list can be a compressed list (ziplist) or a double-ended linked list (linkedlist).

Double-ended linked list: it consists of one list structure and multiple listNode structures.

[Figure: typical structure of a double-ended linked list]

Picture source: "Redis Design and Implementation"

As the structure shows, the double-ended linked list saves both a head pointer and a tail pointer, and each node has pointers to the previous and next nodes; the list saves its own length; dup, free, and match set type-specific functions for node values, so the linked list can store values of different types. Each node in the list points to a redisObject whose type is string.

Compressed list: the compressed list was developed by Redis to save memory. It is a sequential data structure made up of a series of specially encoded contiguous memory blocks (rather than, as in a double-ended linked list, nodes that point to one another); its structure is fairly complicated and is omitted here. Compared with a double-ended linked list, a compressed list saves memory, but modification, insertion, and deletion are more expensive. A compressed list is therefore suitable when the number of nodes is small, while a double-ended linked list is the better choice when the number of nodes is large.

Compressed lists are used to implement not only lists but also hashes and sorted sets; they are very widely used.

(3) Encoding conversion

A compressed list is used only when both of the following conditions hold: the list has fewer than 512 elements, and every string object in the list is shorter than 64 bytes. If either condition is not met, a double-ended linked list is used. The encoding can only be converted from compressed list to double-ended linked list, never in the reverse direction.
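The two conditions and the one-way conversion can be modelled with a small Java class. This is a sketch, not Redis code: `ListEncoding` and its methods are hypothetical, and the boundaries (fewer than 512 elements, each shorter than 64 bytes) follow the wording of the text; the exact boundary behavior of a real server is governed by its configuration thresholds.

```java
class ListEncoding {

    static final int MAX_ENTRIES = 512;    // ziplist only while the list has fewer elements than this
    static final int MAX_VALUE_BYTES = 64; // and only while every element is shorter than this

    private String encoding = "ziplist";
    private int size = 0;

    // Adds an element and upgrades the encoding if either condition is violated.
    void push(byte[] element) {
        size++;
        if (size >= MAX_ENTRIES || element.length >= MAX_VALUE_BYTES) {
            encoding = "linkedlist";
        }
    }

    // Removes an element; the encoding is never converted back.
    void pop() {
        if (size > 0) {
            size--;
        }
    }

    String encoding() {
        return encoding;
    }

    public static void main(String[] args) {
        ListEncoding list = new ListEncoding();
        list.push(new byte[10]);
        System.out.println(list.encoding()); // ziplist
        list.push(new byte[64]);
        System.out.println(list.encoding()); // linkedlist
        list.pop();
        System.out.println(list.encoding()); // still linkedlist
    }
}
```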

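[Figure: characteristics of list encoding conversion]

3. Hash

When a hash uses the hash table encoding, the underlying dictionary is built from the dictEntry, dictht, and dict structures. The dictEntry structure is as follows: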

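typedef struct dictEntry {
    void *key;
    void *val;                /* simplified here; see the description of val below */
    struct dictEntry *next;
} dictEntry;

  • key: the key of the key-value pair;
  • val: the value of the key-value pair, implemented as a union; it may hold a pointer to the value, a 64-bit integer, or an unsigned 64-bit integer;
  • next: points to the next dictEntry and is used to resolve hash collisions.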

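typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

  • table is a pointer to the bucket array;
  • size records the size of the hash table, i.e. the number of buckets;
  • used records the number of dictEntry nodes in use;
  • sizemask is always size - 1; together with the hash value it determines where a key is stored in table.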
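How sizemask and the hash value together pick a bucket can be shown with a short Java sketch. It relies on the table size being a power of two, so that size - 1 has all of its low bits set and a bitwise AND selects a valid index; the class and method names are hypothetical.

```java
class BucketIndex {

    // sizemask is always size - 1.
    static long sizemask(long size) {
        return size - 1;
    }

    // Bucket index = hash & sizemask; valid when size is a power of two.
    static int index(long hash, long size) {
        return (int) (hash & sizemask(size));
    }

    public static void main(String[] args) {
        System.out.println(index(13, 8)); // 13 = 0b1101, mask 0b0111, index 5
        System.out.println(index(8, 8));  // 8 = 0b1000, mask 0b0111, index 0
    }
}
```

Keys whose hashes map to the same index end up in the same bucket and are chained through the next pointer of dictEntry.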




typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    int rehashidx;
} dict;

4. Set

The intset (integer set) structure, one of the internal encodings of sets, is as follows:

typedef struct intset {
    uint32_t encoding;
    uint32_t length;
    int8_t contents[];
} intset;

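5. Application examples

As an example of estimating memory usage, take 90,000 key-value pairs in which every key and every value is a 7-byte non-integer string. Each pair needs a dictEntry (24 bytes, stored in a 32-byte unit), an SDS for the key (7 + 9 = 16 bytes), a redisObject (16 bytes), and an SDS for the value (16 bytes), 80 bytes in total. The bucket array has 131072 entries of 8 bytes each, since 131072 is the smallest power of 2 that is not less than 90,000.

Therefore, the memory occupied by these 90,000 key-value pairs can be estimated as 90000*80 + 131072*8 = 8248576 bytes.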
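The arithmetic behind this estimate can be reproduced with a short Java sketch. It is an illustration under the assumptions used in this article: a 24-byte dictEntry, a 16-byte redisObject, an SDS of string length + 9 bytes, 8-byte bucket pointers, a bucket array sized to the next power of two, and allocations of up to 128 bytes rounded up to 8 or a multiple of 16. The class `MemoryEstimate` and its methods are hypothetical names.

```java
class MemoryEstimate {

    static final int DICT_ENTRY_BYTES = 24;   // three 8-byte pointers
    static final int REDIS_OBJECT_BYTES = 16;
    static final int SDS_OVERHEAD = 9;        // len + free + terminating null byte
    static final int POINTER_BYTES = 8;       // one bucket slot

    // Assumed allocator rounding for requests of 1..128 bytes.
    static int allocSize(int requested) {
        if (requested <= 0 || requested > 128) {
            throw new IllegalArgumentException("this sketch covers 1..128 bytes only");
        }
        return requested <= 8 ? 8 : (requested + 15) / 16 * 16;
    }

    // Smallest power of two that is not less than n.
    static long nextPowerOfTwo(long n) {
        long power = 1;
        while (power < n) {
            power <<= 1;
        }
        return power;
    }

    // Bytes per key-value pair: dictEntry + key SDS + redisObject + value SDS.
    static long perPair(int keyLength, int valueLength) {
        return allocSize(DICT_ENTRY_BYTES)
                + allocSize(keyLength + SDS_OVERHEAD)
                + allocSize(REDIS_OBJECT_BYTES)
                + allocSize(valueLength + SDS_OVERHEAD);
    }

    static long estimate(long pairs, int keyLength, int valueLength) {
        return pairs * perPair(keyLength, valueLength) + nextPowerOfTwo(pairs) * POINTER_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(perPair(7, 7));         // 80
        System.out.println(estimate(90000, 7, 7)); // 8248576
    }
}
```

The same method can be called with other key and value lengths to see how crossing an allocation-unit boundary changes the total.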


import redis.clients.jedis.Jedis;

public class RedisTest {

  public static Jedis jedis = new Jedis("localhost", 6379);

  public static void main(String[] args) throws Exception {
    long m1 = Long.parseLong(getMemory());
    insertData();
    long m2 = Long.parseLong(getMemory());
    System.out.println(m2 - m1); // growth of used_memory in bytes
  }

  // Insert 90,000 key-value pairs.
  public static void insertData() {
    for (int i = 10000; i < 100000; i++) {
      jedis.set("aa" + i, "aa" + i); // key and value are both 7 bytes long and are not integers
    }
  }

  // Return the used_memory value (in bytes) from the output of info memory.
  public static String getMemory() {
    String memoryAllLine = jedis.info("memory");
    String usedMemoryLine = memoryAllLine.split("\r\n")[1]; // line 0 is the "# Memory" header
    String memory = usedMemoryLine.substring(usedMemoryLine.indexOf(':') + 1);
    return memory;
  }
}


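For comparison, if the key and value lengths are increased from 7 bytes to 8 bytes, the corresponding SDS becomes 17 bytes and jemalloc allocates 32 bytes for it, so the number of bytes occupied per dictEntry rises from 80 to 112. The estimated memory occupied by these 90,000 key-value pairs is then 90000*112 + 131072*8 = 11128576.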


public static void insertData() {
  for (int i = 10000; i < 100000; i++) {
    jedis.set("aaa" + i, "aaa" + i); // key and value are both 8 bytes long and are not integers
  }
}

The memory fragmentation ratio is an important parameter and is of great significance for optimizing Redis memory.





