This article explains how to troubleshoot a Linux instance whose bandwidth or CPU usage is running full or high, with a focus on the specific steps.
Troubleshooting bandwidth and CPU running full or high on Linux instances
When you use an ECS cloud server, if a service slows down or the ECS instance suddenly disconnects, check whether the server's bandwidth or CPU is running full or high. If you create an alarm task in advance, the system automatically sends an alert when bandwidth or CPU usage runs full or high. On Linux, you can troubleshoot as follows:
Locate the problem. Find the specific processes that are driving bandwidth or CPU usage full or high.
Analyze and handle it. Check whether those processes are normal, and handle them by category.
For normal processes: optimize the program or upgrade the server configuration.
For abnormal processes: find and kill the process manually, or use a third-party security tool to do so.
The relevant configurations and instructions in this article have been tested on the CentOS 6.5 64-bit operating system. The configuration of other types and versions of operating systems may be different. For details, please refer to the official documentation of the corresponding operating system.
If the CPU of a Linux ECS instance stays high, system stability and business operations are affected. This section briefly explains how to troubleshoot and analyze high CPU usage.
Locating the cause of CPU running full or high
In Linux, the common commands for viewing processes are:
ps aux
ps -ef
top
The top command is usually used to check system load and to locate the processes that consume the most CPU.
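For a non-interactive snapshot, the ranking that top performs can be approximated from a ps listing. The following is a minimal sketch, not part of the original procedure: rank_by_cpu is a hypothetical helper name, and it assumes input with one header row followed by PID, %CPU, and command columns, such as the output of ps -eo pid,pcpu,comm on a procps-based system.

```shell
#!/bin/sh
# Sketch only: rank_by_cpu is a hypothetical helper, not a standard tool.
# Reads a ps-style listing on stdin ("PID %CPU COMMAND" with a header row)
# and prints the N rows with the highest %CPU (default: 5).
rank_by_cpu() {
    rank_n="${1:-5}"
    # Skip the header row, sort numerically on column 2 in descending
    # order, and keep the first N rows.
    tail -n +2 | LC_ALL=C sort -k2,2nr | head -n "$rank_n"
}

# On a live system (commented out so the sketch has no side effects):
# ps -eo pid,pcpu,comm | rank_by_cpu 5
```

Because the helper only reads stdin, it can also be run against a listing saved earlier, which may help when comparing snapshots taken during and after a load spike.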
Operation steps
Connect to the ECS instance through the console management terminal. See Using the remote connection function to connect to the ECS instance.
Note: When the resource load is abnormal, remote connection through SSH is usually not possible. It is recommended that you connect through the console management terminal.
View the current running status of the system through the top command.
top - 17:27:13 up 27 days, 3:13, 1 user, load average: 0.02, 0.03, 0.05
Tasks: 94 total, 1 running, 93 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.3 us, 0.1 sy, 0.0 ni, 99.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.1 st
KiB Mem: 1016656 total, 946628 used, 70028 free, 169536 buffers
KiB Swap: 0 total, 0 used, 0 free. 448644 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 41412 3824 2308 S 0.0 0.4 0:19.01 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.04 kthreadd
For load issues, you only need the first and third lines of the output, described below.
The first line, 17:27:13 up 27 days, 3:13, 1 user, load average: 0.02, 0.03, 0.05, shows the current system time, how long the system has been running, the number of users currently logged in, and the system load averages. It matches the output of the uptime command.
The third line shows the current overall CPU usage; the resource usage of each process is listed below it.
Press P (uppercase) to sort processes by CPU usage in descending order, then locate the processes that use the most CPU.
Note: Press M (uppercase) to sort by memory usage. On a multi-core CPU, press the number key 1 to show the load of each core.
You can view the program file behind each process ID with ll /proc/PID/exe.
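Two of the checks in the steps above can also be scripted. The following is a minimal sketch under stated assumptions: load_per_core and exe_of are hypothetical helper names, and dividing the load average by the core count is a common rule of thumb rather than something the original procedure prescribes.

```shell
#!/bin/sh
# Sketch only: load_per_core and exe_of are hypothetical helper names.

# load_per_core LOAD CORES
# Prints the load average divided by the number of CPU cores (two decimals).
# Rule of thumb (an assumption, not a hard limit): values that stay above
# 1.00 per core suggest the CPUs are saturated.
load_per_core() {
    LC_ALL=C awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# exe_of PID [PROC_ROOT]
# Prints the executable path behind a PID by resolving the exe symlink,
# the same link that "ll /proc/PID/exe" displays. PROC_ROOT defaults to /proc.
exe_of() {
    readlink "${2:-/proc}/$1/exe"
}

# On a live Linux system (commented out so the sketch has no side effects):
# load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
# exe_of 86
```

Resolving the link for another user's process usually requires root privileges, which is normally the case when working from the console management terminal.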
Analyzing and handling CPU running full or high
Once you have identified the specific process behind the high CPU usage, handle it according to its type. Abnormal processes can be terminated from within top. If the kswapd0 process is consuming CPU because the system is short of memory, upgrade the instance specifications or optimize the program.
Use top to directly terminate the process that consumes a large amount of CPU
You can quickly terminate the corresponding abnormal process directly in the top running interface. The steps are as follows:
To kill a process, press the lowercase k key.
Enter the PID of the process you want to terminate (the first column of the top output). For example, to terminate the process with PID 86, enter 86 and press Enter.
top then shows a prompt similar to Send pid 86 signal [15/sigterm]. Press Enter to confirm and send the default signal 15 (SIGTERM).
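Outside of top, the same default signal can be sent from the shell with kill. The sketch below is an illustration under stated assumptions: terminate_pid is a hypothetical wrapper that only validates its argument and then sends SIGTERM (signal 15), the signal named in the top prompt.

```shell
#!/bin/sh
# Sketch only: terminate_pid is a hypothetical wrapper around kill.
# Usage: terminate_pid PID
terminate_pid() {
    # Refuse anything that is not a plain non-negative integer, so a typo
    # cannot turn into an unintended kill target.
    case "$1" in
        ''|*[!0-9]*)
            echo "invalid PID: $1" >&2
            return 2
            ;;
    esac
    # SIGTERM asks the process to exit and lets it clean up first.
    kill -TERM "$1"
}

# Example (commented out): terminate the process with PID 86.
# terminate_pid 86
```

If a process ignores SIGTERM, kill -KILL PID forces termination, at the cost of giving the process no chance to clean up.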
High CPU caused by the kswapd0 process
The operating system manages physical memory through paging and uses part of the disk as virtual memory. Because memory is much faster than disk, the system follows a paging policy to swap pages that are not needed out to disk and to bring required pages back into memory.
kswapd0 is the process responsible for paging in virtual memory management. When the server runs short of memory, kswapd0 performs paging operations, which consume a lot of CPU. Proceed as follows:
View the kswapd0 process through the top command.
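Check whether the process stays in a non-sleeping state and has accumulated a long run time. If so, you can tentatively conclude that the system is paging continuously and that kswapd0 is consuming a large share of CPU.
You can use commands such as free and ps to query the memory usage of the system and of individual processes for further analysis.
To relieve the current memory shortage, you can restart Apache to release memory.
Note: In the long run, you need to upgrade the memory.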
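To put a number on the memory shortage, the counters that free reports can also be read from /proc/meminfo. The following is a minimal sketch, assuming the classic approximation that free, buffer, and page-cache memory can be handed to programs; newer kernels expose a more accurate MemAvailable field, which this sketch deliberately does not rely on. mem_reclaimable_pct is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: mem_reclaimable_pct is a hypothetical helper.
# Reads /proc/meminfo-style text on stdin and prints the percentage of
# total memory that is free or reclaimable (MemFree + Buffers + Cached).
mem_reclaimable_pct() {
    LC_ALL=C awk '
        /^MemTotal:/ { total = $2 }
        /^MemFree:/  { free = $2 }
        /^Buffers:/  { buffers = $2 }
        /^Cached:/   { cached = $2 }   # anchored, so SwapCached is not matched
        END {
            if (total > 0)
                printf "%.1f\n", (free + buffers + cached) * 100 / total
        }'
}

# On a live Linux system (commented out so the sketch has no side effects):
# mem_reclaimable_pct < /proc/meminfo
```

With the figures from the sample top output in this article (1016656 total, 70028 free, 169536 buffers, 448644 cached), the helper would print about 67.7. A value that stays close to zero while kswapd0 is busy would be consistent with the memory shortage described above.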
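Analyzing and handling bandwidth running full or high
If normal processes are causing bandwidth to run full or high, you need to upgrade the server's bandwidth. For abnormal processes, the cause may be a malicious program, malicious access from certain IP addresses, or a CC attack on the service.
In most cases, you can use iftop or nethogs to view traffic usage and then locate the specific process.
Troubleshooting with iftop
Install the iftop traffic monitoring tool on the server.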
yum install iftop -y
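When the server's public bandwidth is saturated and you cannot log on remotely, you can access the server through the Alibaba Cloud management terminal and run the following command to view traffic usage: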
iftop -i eth1 -P
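Note: The -P option displays the ports in use. Running iftop -i eth0 -P shows which server ports the connections use, as well as private-network traffic.
In the example output, the connection consuming the most traffic is the one between port 53139 on the server and the address 115.205.150.235.
Run netstat to look up the process that corresponds to port 53139.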
netstat -tunlp |grep 53139
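The lookup shows that the vsftpd service on the server is generating heavy traffic. You can stop the service, or use iptables to handle the specified address, for example by blocking the IP address or limiting its rate, so that the server's bandwidth remains usable.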
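The lookup and the blocking step above can be sketched as two small helpers. This is an illustration under stated assumptions, not the original procedure: proc_for_port and block_ip_cmd are hypothetical names, the first assumes netstat -tunlp style columns with the PID/Program entry as a single last field, and the second only prints an iptables rule for review instead of applying it.

```shell
#!/bin/sh
# Sketch only: proc_for_port and block_ip_cmd are hypothetical helpers.

# proc_for_port PORT
# Reads "netstat -tunlp" style lines on stdin and prints the PID/Program
# entry of every socket whose local address (column 4) ends in :PORT.
# Assumes the program name contains no spaces.
proc_for_port() {
    awk -v port="$1" '$4 ~ (":" port "$") { print $NF }'
}

# block_ip_cmd IPV4_ADDRESS
# Prints (does not run) an iptables rule that drops all inbound packets
# from the address, so it can be reviewed before being applied as root.
block_ip_cmd() {
    case "$1" in
        ''|*[!0-9.]*)
            echo "not an IPv4 address: $1" >&2
            return 2
            ;;
    esac
    printf 'iptables -I INPUT -s %s -j DROP\n' "$1"
}

# On a live system (commented out so the sketch has no side effects):
# netstat -tunlp | proc_for_port 53139
# block_ip_cmd 115.205.150.235
```

Printing the rule first is a deliberate choice: a wrong address in a DROP rule can lock out legitimate clients, so it is worth reading the command before running it.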
Troubleshooting with nethogs
Install the nethogs traffic monitoring tool on the server.
yum install nethogs -y
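Use nethogs to view per-process traffic on a network interface. If it is not installed, you can install it with yum, apt-get, or a similar package manager. For example:
If the eth1 interface is saturated, run nethogs eth1.
View the network bandwidth used by each process and the PID of each process.
Identify the specific process that is causing bandwidth to run full or high.
If the process is confirmed to be a malicious program, you can terminate it by running kill -TERM
Note: For a web service, you can use tools such as iftop to find the source IP addresses, and then analyze the web access logs to determine whether the traffic is normal. Log analysis can be done with tools such as logwatch or awstats.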
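Before reaching for a full log analyzer, a quick per-address request count can show whether a few clients dominate the traffic. The following is a minimal sketch, assuming an access log whose first whitespace-separated field is the client IP address, as in the common and combined log formats; top_client_ips and the log path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch only: top_client_ips is a hypothetical helper.
# Reads an access log on stdin and prints "COUNT IP" for the N addresses
# with the most requests (default: 10), busiest first.
top_client_ips() {
    # Count requests per client address (first field of each line), then
    # sort by count in descending order; ties are ordered by address.
    awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' |
        LC_ALL=C sort -k1,1nr -k2,2 |
        head -n "${1:-10}"
}

# Example (commented out; the log path is an assumption and varies by setup):
# top_client_ips 10 < /var/log/httpd/access_log
```

An address that accounts for a disproportionate share of requests is a candidate for closer inspection in the logs, not proof of an attack by itself.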
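Using Web Application Firewall to defend against CC attacks
If your service is under a CC attack, enable CC security protection in the Web Application Firewall console as soon as possible.
Log on to the Web Application Firewall console.
Under CC Security Protection, turn on the status switch and set the mode to Normal.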