Linux operations - What information does top show on a Linux system?

Question

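I would like to know: does the CPU usage figure cover both cores together? How much counts as normal? And what load average value is generally considered normal?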

高洛峰 · Answer

I'm a newbie myself, but I'll venture an answer:

Q: Does the CPU usage figure cover both cores together?

Reference materials: Beginner’s Guide: Detailed explanation of Linux Top commands

It contains this sentence: "Cpu(s): this line shows overall CPU information."

In my screenshot, the %CPU of the first chrome process is 108.9 because I had just opened more than 30 windows. So the figure is overall CPU information, that is, it spans multiple cores.

How much is normal?

That depends on the program. Different programs have different CPU requirements.

If I have made any mistakes, please correct me.

天蓬老师 · Answer

Search Google first and ask questions afterwards; you will get more out of it that way.

伊谢尔伦 · Answer

I personally think Linux novices should first understand what each item in top's output means.
When you have a question, don't reach for a search engine straight away; read man top first.

top - 16:12:56 up 1 day, 22 min,  4 users,  load average: 0.02, 0.04, 0.05
Tasks: 158 total,   1 running, 156 sleeping,   0 stopped,   1 zombie
%Cpu(s):  0.7 us,  0.3 sy,  0.0 ni, 98.8 id,  0.1 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   1017912 total,   895892 used,   122020 free,    15312 buffers
KiB Swap:  1045500 total,    19608 used,  1025892 free.   230012 cached Mem

 PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
5761 eechen    20   0   32144   1548   1076 R  6.2  0.2   0:00.01 top

16:12:56 up 1 day, 22 min,  4 users,  load average: 0.02, 0.04, 0.05

This line is equivalent to the output of the uptime command.
16:12:56 is the current time (date).
up 1 day, 22 min means that the system has been running for 1 day and 22 minutes (uptime -p).
4 users is the number of users currently logged in to the system (w, who).
load average is the system load: the three values are the averages over the last 1, 5, and 15 minutes.
Tasks: 158 total is the number of processes in the system (equal to the output of ps -ef|wc -l minus 2). Press capital H to switch to thread mode.
running counts running processes, sleeping counts sleeping processes, stopped counts suspended processes, and zombie counts zombie processes, which have ended but have not yet been removed from the process table.
total indicates total memory, used indicates used memory, and free indicates free memory. Press E to switch units.
buffers (Buffer Cache) represents the memory occupied by the read and write buffer of the block device, cached (Page Cache) represents the memory occupied by the file system cache.
In short: buffers is the block device buffer, and cached is the file system cache.
A large cached value means that many files are in the cache. If frequently accessed files stay cached, disk read I/O will be very low.
A block device is a device whose data is accessed in units of blocks, such as optical discs, hard disks, floppy disks, and tapes. The block size is 512, 1024, or 4096 bytes.
Block devices can be accessed directly through block device special files. To improve data transfer efficiency, the block device driver uses block buffering internally.
Swap is the swap space, which is used when the physical memory (RAM) is full.
If the system needs more memory and physical memory is full, inactive pages in memory are moved to swap space.
Although swap space can help machines with a small amount of memory, it should not be regarded as a replacement for memory.
Swap space is located on the hard drive and is slower to access than physical memory.
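
As a rough worked example of the Mem and Swap lines above, the sketch below estimates how much memory applications actually hold by subtracting buffers and cached from used. It assumes the older top layout shown here, in which used includes buffers and cache; the function name app_used_kib is made up for illustration.

```python
def app_used_kib(used, buffers, cached):
    """Estimate memory held by applications, in KiB.

    Assumes `used` includes buffers and page cache, as in the older
    top layout shown in the sample output.
    """
    return used - buffers - cached


# Figures copied from the sample Mem/Swap lines.
total, used, free, buffers, cached = 1017912, 895892, 122020, 15312, 230012

app_used = app_used_kib(used, buffers, cached)
print(app_used, "KiB held by applications")
print(round(100 * app_used / total, 1), "% of total memory")
```

Seen this way, a machine that looks almost full in the used column may still have plenty of memory that the kernel can reclaim from the caches.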

Understanding of load average:
Load average refers to the average number of processes (or threads) in the task_running or task_uninterruptible state.
A process (or thread) in the task_running state may be using the CPU or queuing to use the CPU.
A process (or thread) in the task_uninterruptible state may be waiting for I/O, such as disk I/O. At such times, the percentage of CPU time spent in I/O wait, iowait (wa), may be relatively high.

sudo strace -p `pidof top` shows that top reads a lot of information from /proc.
See man proc for the description of /proc/loadavg:
man proc | col -b > proc.txt
Contents of /proc/loadavg:
0.22 0.13 0.14 2/374 5306

0.22 0.13 0.14 are the average numbers of tasks that were running (task_running) or waiting for I/O (task_uninterruptible) over the past 1, 5, and 15 minutes.
The 2 in 2/374 is the number of currently running threads, and 374 is the number of kernel scheduling entities (processes/threads) that currently exist in the system.
5306 is the PID of the process most recently created by the system.
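
The field layout just described can be sketched as a small parser. This is a minimal illustration, not how top itself reads the file; the names LoadAvg and parse_loadavg are hypothetical.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    load1: float    # 1-minute load average
    load5: float    # 5-minute load average
    load15: float   # 15-minute load average
    runnable: int   # scheduling entities currently runnable
    total: int      # scheduling entities that currently exist
    last_pid: int   # PID of the most recently created process


def parse_loadavg(text):
    """Parse one line in /proc/loadavg format."""
    one, five, fifteen, entities, last_pid = text.split()
    runnable, total = entities.split("/")
    return LoadAvg(float(one), float(five), float(fifteen),
                   int(runnable), int(total), int(last_pid))


# The sample line quoted above.
print(parse_loadavg("0.22 0.13 0.14 2/374 5306"))
```

On a Linux machine the same function could be fed the live file, for example parse_loadavg(open("/proc/loadavg").read()).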

Another example:

load average: 31.09, 29.87, 29.92

means that over the past 1, 5, and 15 minutes an average of about 30 tasks (here, presumably 30 Java threads) were using the CPU or waiting in the CPU run queue.

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 20248 root      20   0  0.227t 0.012t  18748 S  3090  5.2  29812:58 java

The %CPU of the Java process reaches 3090%, which means that the process is using about 31 CPU cores.
This agrees with the load average above: roughly 30 Java threads are keeping roughly 30 CPU cores busy.
Press H (case sensitive) to switch to thread mode. Because a thread can only use one core at most, the CPU usage displayed in thread mode will not exceed 100%.
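
The arithmetic behind that reading of %CPU is a simple division by 100. A tiny sketch, with a made-up function name:

```python
def cores_in_use(cpu_percent):
    """Convert a Tasks-mode %CPU value into a number of fully busy cores."""
    return cpu_percent / 100


# The java process in the sample output shows 3090 in the %CPU column.
print(cores_in_use(3090))         # 30.9
print(round(cores_in_use(3090)))  # 31
```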

When both the CPU and the disk are overwhelmed, there is no point in starting more processes. It only adds CPU context switches and disk I/O waits, so the cost outweighs the gain.
A high system load is usually caused by too many processes and too much I/O.
On a single-core machine, a load average below 1 means the system has idle capacity, and above 1 means tasks are starting to queue. On a multi-core machine, compare the load average with the number of cores instead.
As a rule of thumb, keep the number of tasks (processes) on a Linux server below 200 if possible, and preferably never above 300.
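
One common refinement of the "compare with 1" rule, which the answer does not spell out, is to divide the load average by the number of CPU cores first. The sketch below does that; the 32-core figure for the Java server is an assumption made only for this example, and load_per_core is a hypothetical name.

```python
import os


def load_per_core(load, cores):
    """Return the load average divided by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load / cores


# Assumed: the busy Java server from the example has 32 cores.
print(load_per_core(31.09, 32))

# The same check for the local machine, where the platform supports it.
try:
    local_load = os.getloadavg()[0]
    print(load_per_core(local_load, os.cpu_count() or 1))
except (AttributeError, OSError):
    print("load average not available on this platform")
```

A result near or above 1 suggests that, on average, every core had work queued for it during that interval.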

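us, user   : time running un-niced user processes (percentage of CPU time used by user-space processes)
sy, system : time running kernel processes (percentage of CPU time used by kernel processes)
ni, nice   : time running niced user processes (percentage of CPU time used by user-space processes whose priority has been changed)
id, idle   : time spent in the kernel idle handler (percentage of idle CPU time; 100% means the system is completely idle)
wa, iowait : time waiting for I/O completion (percentage of CPU time spent waiting for I/O)
hi : time spent servicing hardware interrupts (percentage of CPU time used by hardware interrupts)
si : time spent servicing software interrupts (percentage of CPU time used by software interrupts)
st : time stolen from this vm by the hypervisor (time the virtualization hypervisor took away from the current VM)
If st is high, your VPS provider's CPU resources are limited and you lost out to other tenants. Most likely the provider has oversold its capacity.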
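
To make the field labels concrete, here is a minimal sketch that splits a %Cpu(s) line into a dictionary keyed by the two-letter labels. It assumes the layout of the sample output above with "." as the decimal separator; parse_cpu_line is a hypothetical helper, not part of top.

```python
import re


def parse_cpu_line(line):
    """Parse top's '%Cpu(s):' summary line into {label: percent}."""
    # Each field looks like '98.8 id': a number, whitespace, a two-letter label.
    fields = re.findall(r"([\d.]+)\s+([a-z]{2})\b", line)
    return {label: float(value) for value, label in fields}


cpu = parse_cpu_line(
    "%Cpu(s):  0.7 us,  0.3 sy,  0.0 ni, 98.8 id,  0.1 wa,  0.0 hi,  0.0 si,  0.0 st"
)
print(cpu["id"], cpu["wa"], cpu["st"])
```

With the values in a dictionary it becomes easy to watch a single field, such as wa for I/O pressure or st for steal time on a VPS.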

Press F to select the columns to be displayed and view the meaning of each column. The default columns are:

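PID     = Process Id
USER    = Effective User Name
PR      = Priority. The higher the PR and NI values, the "nicer" the process, i.e. the less it competes for resources, e.g. PR 20 and NI 0. Also, PR = NI + 20.
NI      = Nice Value. Negative values mean high priority and positive values mean low priority; for example, kworker has NI -20 and PR 0.
VIRT    = Virtual Image (KiB)
RES     = Resident Size (KiB). Resident memory; press e to switch units (capital E switches the units of the summary area).
SHR     = Shared Memory (KiB)
S       = Process Status
%CPU    = CPU Usage. On a quad-core processor, full load is 400% in Tasks mode and 100% in Threads mode (press H to switch), because a thread can use at most one core. Press Shift+P to sort by CPU usage.
%MEM    = Memory Usage (RES). Full load is 100%. Press Shift+M to sort by RES memory.
TIME+   = CPU Time, hundredths. Total CPU time used by the process; for example, 2:32.45 means 2 minutes 32.45 seconds.
COMMAND = Command Name/Line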
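
Two of the columns above lend themselves to a small worked example: converting a TIME+ value to hundredths of a second, and the PR = NI + 20 relation. This is a sketch under stated assumptions: it handles only the minutes:seconds form seen in this thread, the relation applies to ordinary (non-real-time) tasks, and both function names are made up.

```python
def time_plus_to_hundredths(text):
    """Convert a TIME+ value such as '2:32.45' to hundredths of a second.

    Handles only the 'minutes:seconds[.hundredths]' form.
    """
    minutes, seconds = text.split(":")
    whole, _, frac = seconds.partition(".")
    hundredths = int(frac.ljust(2, "0")[:2]) if frac else 0
    return (int(minutes) * 60 + int(whole)) * 100 + hundredths


def pr_from_nice(nice):
    """Return the PR value for an ordinary task, using PR = NI + 20."""
    return nice + 20


print(time_plus_to_hundredths("2:32.45"))  # 15245, i.e. 152.45 seconds
print(pr_from_nice(0), pr_from_nice(-20))  # 20 0
```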

Press F to enter the field management window, and press A to switch the display mode. Press space to select the columns to be displayed. Press S to sort by the selected column. Use the right arrow key to select a column and move it to change the order.
After making changes, press Shift+W to save the settings to the ~/.toprc file.

Press Shift+M in top to sort by memory, press E to switch memory units, and press Shift+W to save the settings.
Then execute top -n1 -b to see the information of all processes sorted by memory.
Or use sort after ps:

ps aux | sort -k4nr | head -n5

Press C in top or use the -c parameter to see the absolute path and startup parameters of the process, and you can get information similar to that provided by ps -ef and ps aux.

View the process path: top -p `pidof firefox` -c -n1
View the process's threads: top -p `pidof firefox` -H -n1

Linux Process Status:
http://blog.csdn.net/tianlesoftware/article/details/6457487

R (task_running): executable state
S (task_interruptible): interruptible sleep state
D (task_uninterruptible): uninterruptible sleep state
T (task_stopped or task_traced): stopped state or traced state
Z (task_dead - exit_zombie): exit state, the process has become a zombie process
X (task_dead - exit_dead): exit state, the process is about to be destroyed
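
As an illustration of how the state letters relate to the counts in top's Tasks line, the sketch below tallies a list of state letters by name. The sample list is invented, the names are informal labels rather than top's exact wording, and letters not in the list above (newer kernels have a few more) are counted as "other".

```python
from collections import Counter

STATE_NAMES = {
    "R": "running",
    "S": "sleeping",
    "D": "uninterruptible sleep",
    "T": "stopped or traced",
    "Z": "zombie",
    "X": "dead",
}


def count_states(letters):
    """Count processes per state, given one state letter per process."""
    counts = Counter(STATE_NAMES.get(letter, "other") for letter in letters)
    return dict(counts)


# Hypothetical sample: one state letter per process.
print(count_states(["S", "S", "R", "Z", "S", "D"]))
```

The letters could be taken, for example, from the S column of top's batch output.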

Running Process:
Only processes in this state may run on the CPU.
There may be multiple processes in the executable state at the same time. The task_struct structures (process control blocks) of these processes are put into the executable queue of the corresponding CPU (a process can appear in the executable queue of at most one CPU).
The task of the process scheduler is to select a process from the executable queue of each CPU to run on that CPU.
Many operating system textbooks define processes that are executing on the CPU as the RUNNING state, and processes that are executable but have not yet been scheduled for execution as the READY state. These two states are unified into the TASK_RUNNING state under Linux.

Sleeping process:
A process in this state is suspended because it is waiting for a certain event to occur (such as waiting for a socket connection, waiting for a semaphore).
The task_struct structure of these processes is put into the waiting queue of the corresponding event. When these events occur (triggered by external interrupts or triggered by other processes), one or more processes in the corresponding waiting queue will be awakened.
Through the ps command, we will see that under normal circumstances, the vast majority of processes in the process list are in the task_interruptible state (unless the load on the machine is very high).
After all, there are only one or two CPUs but dozens or hundreds of processes. If most of them were not sleeping, how could the CPU keep up?

Stopped process:
Send a SIGSTOP signal to a process and it enters the task_stopped state in response, unless the process is in the task_uninterruptible state and does not respond to signals.
Like SIGKILL, SIGSTOP cannot be overridden. A user process is not allowed to install its own handler for it through the signal family of system calls.
Sending a SIGCONT signal to the process restores it from the task_stopped state to the task_running state.
When a process is being traced, it is in the special task_traced state. "Being traced" means that the process is paused and waiting for the process that is tracing it to operate on it.
For example, if you set a breakpoint on a traced process in gdb, the process is in the task_traced state when it stops at the breakpoint. At other times, the traced process is in one of the states mentioned earlier.
For the process itself, the task_stopped and task_traced states are very similar: both mean that the process is suspended.
The task_traced state is equivalent to an additional layer of protection on top of task_stopped. A process in the task_traced state cannot be woken by a SIGCONT signal.
The debugged process returns to the task_running state only when the debugging process performs an operation such as ptrace_cont or ptrace_detach through the ptrace system call (the operation is specified by an argument to ptrace), or when the debugging process exits.
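
The stop and continue behaviour described above can be observed with a short experiment. This is a sketch for Linux only: it assumes a sleep binary on the PATH and a mounted /proc, and all function names are made up. After SIGCONT the child usually goes straight back to sleeping, so the second state is typically S rather than R.

```python
import signal
import subprocess
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        # The state is the first field after the ')' that closes the command name.
        return f.read().rsplit(")", 1)[1].split()[0]


def wait_for_state(pid, wanted, timeout=2.0):
    """Poll until the process state is one of `wanted` or the timeout expires."""
    deadline = time.monotonic() + timeout
    state = proc_state(pid)
    while state not in wanted and time.monotonic() < deadline:
        time.sleep(0.01)
        state = proc_state(pid)
    return state


def stop_and_continue():
    """Stop a child with SIGSTOP, resume it with SIGCONT, return both states."""
    child = subprocess.Popen(["sleep", "30"])
    try:
        child.send_signal(signal.SIGSTOP)
        stopped = wait_for_state(child.pid, "T")
        child.send_signal(signal.SIGCONT)
        resumed = wait_for_state(child.pid, "SR")
        return stopped, resumed
    finally:
        child.kill()   # clean up the helper process
        child.wait()   # and reap it so no zombie is left behind


if __name__ == "__main__":
    print(stop_and_continue())
```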

Zombie process:
Among Linux process states, the zombie is a special one: it is a process that has ended but has not yet been removed from the process table.
Too many zombies can fill up the process table and bring the system down, but a zombie occupies no other system resources.
It has given up almost all memory space, does not have any executable code, and cannot be scheduled.
It only keeps a slot in the process list to record the exit status and other information of the process for other processes to collect. Apart from that, a zombie process no longer occupies any memory space.
A process is in the TASK_DEAD state while it is exiting. During exit, all resources held by the process are reclaimed, except for the task_struct structure (and a few other resources).
So the process is left with only the empty shell of task_struct, which is why it is called a zombie.
task_struct is retained because it stores the exit code of the process and some statistical information.
The parent process is likely to care about this information. For example, in the shell, the $? variable stores the exit code of the last foreground process that exited, and this exit code is often used as the judgment condition of the if statement.
Of course, the kernel can also save this information elsewhere and release the task_struct structure to save some space.
But it is more convenient to use the task_struct structure, because the search relationship from pid to task_struct has been established in the kernel, as well as the parent-child relationship between processes.
If task_struct were released, some new data structures would have to be created so that the parent process could still find the exit information of its child process.
When the child process exits, the kernel will send a signal to its parent process to notify the parent process to "collect the corpse".
The parent process can wait for one or more child processes to exit through the wait family of system calls (such as wait4, waitid) and obtain their exit information.
The wait family of system calls then also releases the corpse of the child process (task_struct).
The signal sent to the parent defaults to SIGCHLD, but a different signal can be chosen when the child process is created through the clone system call.
If the parent process neither installs a SIGCHLD handler that calls wait or waitpid() to wait for the child process to end, nor explicitly ignores the signal, the child remains in the zombie state and its corpse (task_struct) cannot be released.
If the parent process ends at this point, the init process automatically takes over the child process and collects its corpse, so it can still be cleared.
But if the parent process runs in a loop and never ends, the child process remains a zombie. This is why there are sometimes many zombie processes in the system.
When a process exits, all of its child processes are adopted by another process (they become children of that process).
The adopting process may be the next process in the process group of the exiting process (if one exists), or it may be process number 1.
So every process has a parent process at every moment, unless it is process number 1. Process number 1, the process with PID 1, is also called the init process.
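
The life cycle described above, where a child stays a zombie until its parent calls wait, can be reproduced with a few lines. This is a sketch for Linux only: it assumes a mounted /proc, the exit code 7 is arbitrary, and the function names are made up.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        # The state is the first field after the ')' that closes the command name.
        return f.read().rsplit(")", 1)[1].split()[0]


def make_and_reap_zombie(timeout=2.0):
    """Fork a child that exits at once, observe it as a zombie, then reap it.

    Returns (state before wait, exit code, whether /proc/<pid> still exists).
    """
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: exit immediately with an arbitrary code

    # Parent: the child has exited but has not been waited for yet.
    deadline = time.monotonic() + timeout
    state = proc_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = proc_state(pid)

    _, status = os.waitpid(pid, 0)        # "collect the corpse"
    exit_code = os.WEXITSTATUS(status)    # exit code kept for the parent
    still_listed = os.path.exists(f"/proc/{pid}")
    return state, exit_code, still_listed


if __name__ == "__main__":
    print(make_and_reap_zombie())
```

The exit code survives until waitpid is called, which is exactly the information the zombie entry exists to preserve.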

After the Linux system starts, the first user-mode process created is the init process. It has two missions:
1. Execute the system initialization script and create a series of processes (all of which are descendants of the init process);
2. Wait in an infinite loop for exit events from its child processes, and call the waitid system call to complete the "corpse collection" work.
The init process cannot be suspended or killed (this is guaranteed by the kernel). It is in the task_interruptible state while waiting for a child process to exit, and in the task_running state while it is collecting the corpse.

阿神 · Answer

Don’t use top, try htop.