Process environment--process management-Linux Operation and Maintenance-php.cn

This article covers the basics of process control: how the main function is called when a program is executed, how command-line arguments are passed to the program, what the typical memory layout looks like, how additional memory is allocated, how a process uses environment variables, and how a process terminates.

Startup and termination of the process

When the kernel executes a C program (through one of the exec functions), a special startup routine is called before main. This startup routine takes the command-line arguments and the environment variable values from the kernel and then calls main.

Process termination situations

5 normal termination situations:

(1) Returning from main;
(2) Calling exit;
(3) Calling _exit or _Exit;
(4) The last thread calling pthread_exit;
(5) The last thread returning from its start routine;

3 abnormal termination situations

(1) Calling abort;
(2) Receiving a signal;
(3) The last thread responding to a cancellation request;

Process startup and termination diagram

atexit function

ISO C requires that a process be able to register at least 32 functions that are called automatically by exit. These functions are called termination handlers (exit handlers), are registered with atexit, and do the cleanup work before the process ends. exit uses the registrations recorded by atexit to decide which functions to call, and it calls them in the reverse order of their registration.

exit function

This function is defined by ISO C. It first calls the registered termination handlers and then closes (and flushes) all open standard I/O streams. Note that ISO C does not deal with file descriptors, multiple processes (parent and child), or job control, so the ISO C definition of exit is incomplete for a UNIX system.

_exit and _Exit functions
ISO C defines _Exit (and POSIX.1 defines _exit) to give a process a way to terminate without running termination handlers or signal handlers. Whether the standard I/O streams are flushed depends on the implementation. On UNIX systems, _exit and _Exit do not flush them.

Exit status of the exit, _exit, and _Exit functions

No matter how a process ends, the same code in the kernel is eventually executed (as the process startup and termination diagram shows). This code closes all open file descriptors for the process and releases the memory it was using.

When a process terminates, its exit status tells the parent process how it ended. The parent calls wait or waitpid to clean up after the child: it obtains the child's termination information and lets the kernel release the resources the child still holds. If the parent does not fetch the exit status, the terminated child remains a zombie process. Conversely, if the parent terminates before the child, the child becomes an orphan process and is adopted by process 1 (the init process). The general procedure is as follows:

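(1) When a process terminates, the kernel goes through all active processes one by one;
(2) It finds the ones that are children of the terminating process;
(3) It changes the parent process ID of each of those children to 1;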

wait and waitpid functions

When a process terminates, normally or abnormally, the kernel sends the SIGCHLD signal to its parent. Because the termination of a child is an asynchronous event, this signal is an asynchronous notification as well. By default the parent ignores this signal. Alternatively, the parent can provide a signal handler that cleans up after the child. wait and waitpid are often called from such a handler, although the parent can also call them directly.

The difference between wait and waitpid functions is as follows:

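(1) wait blocks the caller until a child process terminates (it returns for the first child that does);
(2) waitpid takes arguments that let the caller avoid blocking, or choose which child process to wait for;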

The caller here is the parent process.

Environment table and environment variables

Environment table structure diagram

Each program receives an environment table.
Like the argument list, the environment table is an array of character pointers, terminated by a NULL pointer.
The global variable environ is called the environment pointer.
The array of pointers is called the environment table.
The string pointed to by each pointer is called an environment string; by convention it has the form name=value.

Environment variable

The UNIX kernel never looks at the environment strings; their interpretation is entirely up to each application.
Environment variables are usually set in a shell startup file to control the behavior of the shell.
Modifying or adding an environment variable affects only the environment of the current process and of any child processes it creates afterwards (children created earlier are not affected). It cannot affect the environment of the parent process.

The functions related to environment variables are as follows:

#include <stdlib.h>
char *getenv(const char *name);
      Returns: pointer to the value associated with name, NULL if not found

int putenv(char *str);
      Returns: 0 if OK, nonzero on error

int setenv(const char *name, const char *value, int rewrite);
int unsetenv(const char *name);
      Both return: 0 if OK, -1 on error

How these functions modify the environment table

The environment table and the environment strings are usually stored at the top of the process's address space (the high addresses), above the stack. This area cannot grow upward, because it is already at the top of the address space, and it cannot grow downward, because the stack frames lie directly below it. Values are therefore modified as follows:

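(1) Modifying an existing variable

1) If the new value is no longer than the old value, it is simply copied over the old string.
2) If the new value is longer than the old value, malloc is called to obtain new space on the heap, the new string is copied there, and the address of this area is written to the corresponding entry in the environment table.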

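(2) Adding a new variable

1) The first time a variable is added, malloc is called to obtain space for a new environment table, the old table is copied into it, a pointer to the new environment string is appended, a NULL pointer is stored at the end, and finally environ is set to point to the new table.
2) After step 1), further variables are added by calling realloc to enlarge the table each time.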

Note: Environment variables modified in this way are valid only while the current program is running. When the program ends, the system reclaims the corresponding memory and the modifications are lost.

Supplementary explanation of memory storage structure

Memory management structure diagram

Uninitialized data segment (bss, "block started by symbol"): before the program starts executing, the kernel initializes the data in this segment to 0 or to null pointers;
Stack: each time a function is called, its return address and information about the caller's environment (such as the values of some machine registers) are saved on the stack;
Shared libraries: only a single copy of the library routines needs to be kept, in a memory area that all processes can reference;

Memory allocation functions

#include <stdlib.h>
void *malloc(size_t size);
void *calloc(size_t nobj, size_t size);
void *realloc(void *ptr, size_t newsize);
      All three return: non-null pointer if OK, NULL on error

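malloc: the initial value of the allocated memory is indeterminate; it is usually implemented on top of the sbrk system call;
calloc: the allocated memory is initialized to 0;
realloc: increases or decreases the size of a previously allocated area. When the size increases, the contents of the old area may be moved to another area that is large enough to provide the extra space at the end, and the initial value of the added space is indeterminate (for example, when growing a dynamic array);

Note: these allocation functions usually allocate slightly more space than was requested, because record-keeping information is stored just before and after the allocated area. When using the memory, never access anything outside the area you requested, since overwriting this information has unpredictable consequences.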

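Jumping between functions

In C, a goto statement cannot jump across functions. The need to jump out of deeply nested function calls arises mostly in error handling, where the setjmp and longjmp functions are very useful.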

#include <setjmp.h>
int setjmp(jmp_buf env);
      Returns: 0 if called directly, nonzero if returning from a call to longjmp
void longjmp(jmp_buf env, int val);

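The variable rollback problem: automatic variables and register variables may be rolled back after a longjmp. Declare a variable volatile to prevent this. (When a value is assigned to a variable, it is normally stored in memory first and later fetched by the CPU into one of its registers. When the compiler optimizes, frequently used variables may be kept in registers without going through memory.)

Why register variables can be rolled back

When setjmp is called, it saves the current stack pointer, together with other register state, in env. If global, static, volatile, or automatic variables are modified between the calls to setjmp and longjmp, they are not rolled back to the values they had when setjmp was called (except for automatic variables that the compiler has optimized into registers). The values of these variables live in the corresponding memory segments, and after the stack pointer is restored the program still accesses the same memory locations.

For register variables the situation is different. The register contents that setjmp saved in env are restored by longjmp. A variable that lives in a register and was modified between setjmp and longjmp therefore gets back the register contents from the time of the setjmp call, and the modified value is lost. This is the rollback. ISO C only says that the values of such non-volatile automatic variables are indeterminate after longjmp.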
