Detailed explanation of virtual memory management

PHP中文网 | Original | 2017-06-20 11:23:26 | 3371 views

Modern operating systems generally use virtual memory management, which requires support from the MMU (Memory Management Unit) in the processor. We first introduce the concepts of PA and VA.

1. PA (Physical Address)

If the processor has no MMU, or has one that is not enabled, the memory address issued by the CPU execution unit is sent directly to the chip's pins and received by the memory chip (hereinafter called physical memory, to distinguish it from virtual memory). Such an address is called a physical address (hereinafter PA), as shown in the figure below.

Physical Address

2. VA (Virtual Address)

If the processor has the MMU enabled, the memory address issued by the CPU execution unit is intercepted by the MMU. The address that travels from the CPU to the MMU is called a virtual address (hereinafter VA). The MMU translates it into another address and sends that to the external address pins of the CPU chip. In other words, the VA is mapped to a PA, as shown in the figure below.

Virtual address

On a 32-bit processor, the internal address bus connected to the CPU execution unit is 32 bits wide (only 4 address lines are drawn schematically in the figure), while the external address bus after MMU translation is not necessarily 32 bits wide. In other words, the virtual address space and the physical address space are independent. The virtual address space of a 32-bit processor is 4GB, while the physical address space can be larger or smaller than 4GB.

The MMU maps VA to PA in units of pages. The page size on a 32-bit processor is usually 4KB. For example, through a single mapping entry the MMU can map the VA page 0xb7001000~0xb7001fff to the PA page 0x2000~0x2fff. If the CPU execution unit accesses the virtual address 0xb7001008, the physical address actually accessed is 0x2008. Pages in physical memory are called physical pages or page frames. Which page of virtual memory maps to which page frame of physical memory is described by the page table, which is itself stored in physical memory. The MMU looks up the page table to determine which PA a VA maps to.

3. Process address space

Process address space

On a 32-bit x86 platform the virtual address space is 0x0000 0000~0xffff ffff. Generally speaking, the first 3GB (0x0000 0000~0xbfff ffff) is user space, and the last 1GB (0xc000 0000~0xffff ffff) is kernel space.

Text Segment and Data Segment

Text Segment: includes the .text, .rodata, and .plt sections, among others. It is loaded into memory from /bin/bash, and its access permission is r-x.
Data Segment: includes the .data and .bss sections, among others. It is also loaded into memory from /bin/bash, and its access permission is rw-.

Heap and stack

Heap: the heap is the region of spare memory from which the malloc function allocates dynamically. As memory is allocated, the heap can grow toward higher addresses. The upper limit of the heap is called the break. To grow the heap, the break must be raised so that new virtual memory pages are mapped to physical memory. This is done through the brk system call, which malloc itself calls to request memory from the kernel.
Stack: the stack is a dedicated memory area. Its high-address part holds the process's environment variables and command-line arguments, and its low-address part holds function stack frames. The stack grows toward lower addresses, but it clearly has less room to grow than the heap: it is common for real applications to allocate large amounts of memory dynamically, but very rare for function calls to nest dozens of levels deep with many local variables at every level.

If you do not pay attention to memory allocation when writing a program, the following problems may occur in the heap and stack:

Memory leak: if a function requests heap space with malloc and stores the returned address only in a pointer variable on the stack, then when the function ends its local variables, including that pointer, are released. The heap space can no longer be found and therefore can never be freed. Over time this causes a memory leak.
Stack overflow: if you put too much data on the stack (such as large structures and arrays), you may cause a stack overflow, and the program will terminate. To avoid this, use malloc to allocate such variables on the heap.
Wild pointers and segmentation faults: if the space a pointer points to has been freed and the program then accesses it through that pointer, a segmentation fault may result. The pointer has become a wild pointer. Set such pointers to NULL as soon as the memory is freed.

4. The role of virtual memory management

Virtual memory management can control access rights to physical memory. Physical memory itself does not restrict access, and any address can be read and written. The operating system, however, requires different pages to have different access rights. This is achieved by combining CPU modes with the memory protection mechanism of the MMU.
The main function of virtual memory management is to give each process an independent address space. An independent address space means that the same VA in different processes is mapped by the MMU to different PAs, so that no address accessed in one process can reach the data of another process. This ensures that an illegal memory access caused by wrong instructions or malicious code in any process cannot accidentally overwrite the data of other processes or affect their operation, which keeps the whole system stable. In addition, each process believes it owns the entire virtual address space exclusively, so the linker and loader are easier to implement because they need not consider whether the address ranges of different processes conflict.

The process address space is independent

The VA to PA mapping also makes allocating and releasing memory more convenient: several blocks of memory with discontinuous physical addresses can be mapped into one block with continuous virtual addresses. For example, suppose you want to use malloc to allocate a large block, and there is enough free physical memory but not enough continuous free memory. In this case, multiple discontinuous physical pages can be allocated and mapped to a continuous virtual address range.

Discontinuous PA can be mapped to continuous VA

If a system runs many processes at the same time, the sum of the memory allocated to all processes may exceed the physical memory actually available. Virtual memory management lets every process still run normally in this case, because each process is only allocated virtual memory pages. The data of these pages can be mapped to physical pages, or it can be saved temporarily to disk without occupying a physical page. The place on disk that temporarily stores virtual memory pages may be a disk partition or a disk file, and is called a swap device. When physical memory runs short, the data in some infrequently used physical pages is saved temporarily to the swap device. Those physical pages are then considered free and can be reallocated to processes. This is called swapping out (page out). If a process needs a page that was swapped out, the page is loaded from the swap device back into physical memory. This is called swapping in (page in). Swapping out and swapping in are collectively called paging, so:
$\text{total memory that can be allocated in the system} = \text{size of physical memory} + \text{size of the swap device}$

As shown below, the first picture shows swapping out: the data in the physical page is saved to disk, the address mapping is removed, and the physical page is released. The second picture shows swapping in: a free physical page is allocated, the page temporarily stored on disk is loaded back into memory, and the address mapping is established.

Paging (swapping out and swapping in)

5. malloc and free

The C standard library function malloc dynamically allocates memory in the heap space. Underneath, it requests memory from the operating system through the brk system call. When dynamically allocated memory is no longer needed it can be released with free, or more precisely returned to malloc, so that it can be handed out again by a later call to malloc.

    #include <stdlib.h>

    void *malloc(size_t size);  // Returns the start address of the allocated block on success, NULL on error
    void free(void *ptr);

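The size argument of malloc is the number of bytes to allocate. If allocation fails (possibly because system memory is exhausted), malloc returns NULL. Because malloc does not know what type of data the caller will store in the block, it returns the generic pointer void *, which the program can convert to another pointer type before accessing the memory. malloc guarantees that the returned address satisfies the system's alignment requirements. For example, on a 32-bit platform the returned pointer is aligned to at least a 4-byte boundary, so that it can be converted to a pointer of any type and used.

When dynamically allocated memory is no longer needed it can be released with free. The argument passed to free is exactly the block start address returned earlier by malloc.

Example

For example: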

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct {
        int number;
        char *msg;
    } unit_t;

    int main(void)
    {
        unit_t *p = malloc(sizeof(unit_t));
        if (p == NULL) {
            printf("out of memory\n");
            exit(1);
        }
        p->number = 3;
        p->msg = malloc(20);
        if (p->msg == NULL) {   /* the second allocation can fail too */
            printf("out of memory\n");
            free(p);
            exit(1);
        }
        strcpy(p->msg, "Hello world!");
        printf("number: %d\nmsg: %s\n", p->number, p->msg);
        free(p->msg);           /* free the member first, while p is still valid */
        free(p);
        p = NULL;               /* do not leave a wild pointer behind */
        return 0;
    }

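Notes

In the statement unit_t *p = malloc(sizeof(unit_t));, the right-hand side has type void * and the left-hand side has type unit_t *. The compiler performs an implicit conversion; as discussed earlier, void * and any other pointer type can be implicitly converted to each other.
Although running out of memory is an uncommon error, programs should be written properly: check whether malloc succeeded after every call. Most of the system functions covered later return distinct values for success and failure, and every call to a system function should be checked.
After free(p);, the memory p points to has been returned, but the value of p has not changed, because the interface of free gives it no way to change p. The memory p now points to no longer belongs to the user. In other words, p has become a wild pointer. To avoid wild pointers, set p = NULL; by hand after free(p);.
Call free(p->msg) first and then free(p). If free(p) came first, p would be a wild pointer and p->msg could no longer be used to access memory.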

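6. Memory leaks

If a program runs for months or years (for example, a network server) and calls malloc inside a loop or recursion, every malloc must be paired with a free: each allocation needs exactly one release. Otherwise every iteration allocates memory without releasing it, and system memory is slowly exhausted. This kind of error is called a memory leak. In addition, the pointer returned by malloc must be kept safe, because the block can only be released by passing that pointer to free. If the pointer is lost, there is no way to free the block, and that also causes a memory leak. For example: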

    void foo(void)
    {
        char *p = malloc(10);
        ...
    }

When foo returns, the storage of the local variable p is released, so the address it held is lost and these 10 bytes can never be freed. Memory leak bugs are hard to find because, unlike out-of-bounds accesses, they do not cause visible runtime errors. A small leak does not affect the correct operation of the program, but a large one causes a shortage of system memory and frequent paging, which slows down not only the current process but the entire system.

There are some special cases regarding malloc and free. Calling malloc(0) is legal; depending on the implementation it returns either NULL or a non-NULL pointer (glibc returns a non-NULL pointer). Either result can be passed to free, but the memory must not be accessed through it. free(NULL) is also legal and does nothing, but freeing a wild pointer is illegal. For example, if malloc returns a pointer p and free(p); is then called twice in a row, the second call is undefined behavior and typically produces a runtime error.
