Linux system and kernel parameter optimization under high concurrency conditions

Release: 2023-08-04 16:41:19

It is well known that with its default settings Linux does not handle high concurrency well: it is mainly limited by the per-process limit on open files, the kernel TCP parameters, and the I/O event dispatch mechanism. The following sections adjust the Linux system in several areas to support a high-concurrency environment.

Iptables related

Unless it is required, disable or uninstall the iptables firewall and prevent the kernel from loading the iptables modules, since these modules can hurt performance under high concurrency.

Limit on the maximum number of files opened by a single process

Most distributions limit the number of files a single process may open to 1024, which is far from enough for high-concurrency workloads. Adjust it as follows. At the # prompt, type:

# ulimit -n 65535

This sets the maximum number of files that a single process started by root may open to 65535. If the system responds with something like "Operation not permitted", the change failed: the specified value exceeds the soft or hard limit that the Linux system places on the number of files the user may open. In that case, the system's soft and hard open-file limits for the user must be raised.
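
You can inspect the current per-process soft and hard limits directly (a quick check; -Sn and -Hn are bash's options for the soft and hard values, and the numbers shown are only typical defaults):

# ulimit -Sn
1024
# ulimit -Hn
4096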

The first step is to modify the limits.conf file and add:

# vim /etc/security/limits.conf
* soft nofile 65536
* hard nofile 65536

The '*' means the limits apply to all users; soft or hard selects whether the soft limit or the hard limit is being changed; and 65536 is the new limit value, i.e., the maximum number of open files (note that the soft limit must be less than or equal to the hard limit). Save the file after making the changes.

The second step is to modify the /etc/pam.d/login file and add the following line to it:

# vim /etc/pam.d/login
session required /lib/security/pam_limits.so

This tells Linux to call the pam_limits.so module after a user logs in, in order to apply the system's limits on the resources that user may consume (including the limit on the maximum number of files the user may open); pam_limits.so reads these limit values from /etc/security/limits.conf. Save the file after modification.
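
As an optional check that is not part of the original steps, confirm the PAM line is in place, then log in again as the target user and look at the per-process limit; the expected value below assumes the 65536 configured above:

# grep pam_limits /etc/pam.d/login
session required /lib/security/pam_limits.so
# ulimit -n
65536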

The third step is to check the Linux system-wide limit on the maximum number of open files, using the following command:

# cat /proc/sys/fs/file-max
32568

This shows that this Linux system allows at most 32568 files to be open at the same time (that is, the total across all users). This is the system-level hard limit, and no user-level open-file limit should exceed it. This system-level hard limit is normally the optimal maximum number of simultaneously open files that Linux calculates at boot time from the hardware resources, and it should not be changed unless there is a special need, for example when you want to set a user-level open-file limit higher than it. The way to modify this hard limit is to edit /etc/sysctl.conf and set fs.file-max = 131072.

This forces Linux to set the system-level open-file hard limit to 131072 once booting has completed. Save the file after modification.
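
As a sketch of that change (assuming root), the line can be appended and then applied immediately with sysctl -p, which prints the settings it loads, instead of waiting for a reboot:

# echo "fs.file-max = 131072" >> /etc/sysctl.conf
# sysctl -p
fs.file-max = 131072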

After completing the steps above, reboot the system; in general, the maximum number of files that a single process of the specified user may open simultaneously can then be set to the specified value. If, after rebooting, the ulimit -n command still reports a limit lower than the maximum set in the steps above, it may be because a ulimit -n command in the user login script /etc/profile has already capped the number of files the user may open at once.

Because a value changed with ulimit -n can only be less than or equal to the value set by the previous ulimit -n call, it is impossible to raise the limit with this command. So if the problem above exists, the only option is to open the /etc/profile script and check whether it uses ulimit -n to cap the maximum number of files the user may open simultaneously; if it does, delete that line or change it to an appropriate value, save the file, and have the user log out and log back in.
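
A simple way to find such a line (a convenience check, not one of the original steps):

# grep -n ulimit /etc/profile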

Through the steps above, the system's restrictions on the number of open files are removed for communication programs that need to handle a large number of concurrent TCP connections.

Kernel TCP parameters

On Linux, after a TCP connection is torn down, it is kept in the TIME_WAIT state for a certain period before the port is released. When there are too many concurrent requests, a large number of connections in the TIME_WAIT state build up; if they cannot be cleared in time, they tie up a large amount of port and server resources. At that point we can tune the kernel TCP parameters so that ports stuck in the TIME_WAIT state are cleaned up promptly.

The method described below is only effective when a large number of connections in the TIME_WAIT state are consuming system resources; otherwise the effect may not be noticeable. You can use the netstat command to inspect connections in the TIME_WAIT state. Enter the following command combination to see the current TCP connection states and the number of connections in each state:

# netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'

This command produces output similar to the following:

LAST_ACK 16
SYN_RECV 348
ESTABLISHED 70
FIN_WAIT1 229
FIN_WAIT2 30
CLOSING 33
TIME_WAIT 18098

We only need to care about the number of TIME_WAIT entries. Here we can see more than 18,000 TIME_WAIT connections, which means more than 18,000 ports are occupied. Keep in mind that there are only 65535 ports in total: each one occupied is one fewer available, which seriously affects subsequent new connections. In this situation it is necessary to adjust the Linux TCP kernel parameters so that the system releases TIME_WAIT connections faster.
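
On newer distributions where netstat may not be installed, the ss tool can produce the same per-state summary (an equivalent check, not part of the original steps):

# ss -ant | awk 'NR>1 {++S[$1]} END {for (a in S) print a, S[a]}'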

Edit the configuration file /etc/sysctl.conf and add the following lines to it:

# vim /etc/sysctl.conf
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_fin_timeout = 30

Enter the following command to make the kernel parameters take effect:

# sysctl -p
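
To confirm that the new values are active, they can be read back with the same parameter names (a quick check):

# sysctl net.ipv4.tcp_fin_timeout
net.ipv4.tcp_fin_timeout = 30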

A brief explanation of what the parameters above mean:

  • net.ipv4.tcp_syncookies = 1 enables SYN cookies: when the SYN wait queue overflows, cookies are used to handle the requests, which defends against small-scale SYN attacks. The default is 0 (disabled).
  • net.ipv4.tcp_tw_reuse = 1 enables reuse, allowing TIME-WAIT sockets to be used for new TCP connections. The default is 0 (disabled).
  • net.ipv4.tcp_tw_recycle = 1 enables fast recycling of TIME-WAIT sockets on TCP connections. The default is 0 (disabled).
  • net.ipv4.tcp_fin_timeout changes the system's default TIMEOUT duration.

After adjustments like these, besides further improving the server's load capacity, the system can also defend against low-volume DoS, CC, and SYN attacks.

In addition, if you already have a large number of connections, you can further tune the range of ports available to TCP to improve the server's concurrency. Add the following settings to the same configuration file:

net.ipv4.tcp_keepalive_time = 1200
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_max_tw_buckets = 5000

These parameters are only recommended on servers with very heavy traffic, where they have a noticeable effect; on ordinary low-traffic servers there is no need to set them.

  • net.ipv4.tcp_keepalive_time = 1200 sets how often TCP sends keepalive probes when keepalive is enabled. The default is 2 hours; this changes it to 20 minutes.
  • net.ipv4.ip_local_port_range = 1024 65535 sets the port range used for outbound connections. The default range is small; this widens it to 1024 through 65535.
  • net.ipv4.tcp_max_syn_backlog = 8192 sets the length of the SYN queue. The default is 1024; increasing it to 8192 lets the system hold more connections that are waiting to be established.
  • net.ipv4.tcp_max_tw_buckets = 5000 sets the maximum number of TIME_WAIT sockets the system keeps at any one time. If this number is exceeded, the excess TIME_WAIT sockets are cleared immediately and a warning is printed. The default is 180000; this lowers it to 5000.

Description of other kernel TCP parameters (a consolidated example is sketched after this list):

  • net.ipv4.tcp_max_syn_backlog = 65536: the maximum number of recorded connection requests that have not yet received an acknowledgment from the client. The default is 1024 for systems with 128 MB of memory and 128 for systems with less memory.
  • net.core.netdev_max_backlog = 32768: the maximum number of packets allowed to queue when a network interface receives packets faster than the kernel can process them.
  • net.core.somaxconn = 32768: the backlog passed to the listen() call in a web application is capped by the kernel's net.core.somaxconn, which defaults to 128, while the NGX_LISTEN_BACKLOG defined by nginx defaults to 511, so this value needs to be raised.
  • net.core.wmem_default = 8388608: default socket send (write) buffer, in bytes.
  • net.core.rmem_default = 8388608: default socket receive (read) buffer, in bytes.
  • net.core.rmem_max = 16777216: maximum socket read buffer; reference optimization value: 873200.
  • net.core.wmem_max = 16777216: maximum socket write buffer; reference optimization value: 873200.
  • net.ipv4.tcp_timestamps = 0: timestamps guard against wrapped sequence numbers. A 1 Gbps link is certain to reuse sequence numbers that have appeared before, and the timestamp lets the kernel accept such "abnormal" packets. Here it is turned off.
  • net.ipv4.tcp_synack_retries = 2: to open a connection to a peer, the kernel must send a SYN with an ACK acknowledging the earlier SYN, the second packet of the three-way handshake. This setting determines how many SYN+ACK packets the kernel sends before giving up on the connection.
  • net.ipv4.tcp_syn_retries = 2: the number of SYN packets the kernel sends before it gives up on establishing a connection.
  • #net.ipv4.tcp_tw_len = 1 (left commented out in the reference configuration).
  • net.ipv4.tcp_tw_reuse = 1: enable reuse, allowing TIME-WAIT sockets to be used for new TCP connections.
  • net.ipv4.tcp_wmem = 8192 436600 873200: TCP write buffer; reference optimization value: 8192 436600 873200.
  • net.ipv4.tcp_rmem = 32768 436600 873200: TCP read buffer; reference optimization value: 32768 436600 873200.
  • net.ipv4.tcp_mem = 94500000 91500000 92700000 also takes three values, which mean:
    • net.ipv4.tcp_mem[0]: below this value, TCP is under no memory pressure.
    • net.ipv4.tcp_mem[1]: at this value, TCP enters the memory-pressure stage.
    • net.ipv4.tcp_mem[2]: above this value, TCP refuses to allocate sockets. The units here are pages, not bytes. The reference optimization value is 786432 1048576 1572864.
  • net.ipv4.tcp_max_orphans = 3276800: the maximum number of TCP sockets in the system that are not attached to any user file handle. If this number is exceeded, such orphaned connections are reset immediately and a warning is printed. This limit exists only to fend off simple DoS attacks; do not rely on it too heavily or lower it artificially. If anything, increase it (along with the memory).
  • net.ipv4.tcp_fin_timeout = 30: if a socket is closed by the local end, this parameter determines how long it stays in the FIN-WAIT-2 state. The peer may misbehave and never close its side of the connection, or even crash unexpectedly. The default is 60 seconds; 2.2-series kernels usually used 180 seconds. You can keep this setting, but remember that even on a lightly loaded web server a large number of dead sockets carries a risk of memory exhaustion. FIN-WAIT-2 is less dangerous than FIN-WAIT-1 because it consumes at most about 1.5 KB of memory, but such sockets survive longer.
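
As an illustration only, the reference values above could be gathered into /etc/sysctl.conf in the same way as the earlier settings; these are the article's reference numbers, not universal recommendations, so tune them to your own memory and traffic before applying them:

# vim /etc/sysctl.conf
net.core.somaxconn = 32768
net.core.netdev_max_backlog = 32768
net.core.rmem_default = 8388608
net.core.wmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_rmem = 32768 436600 873200
net.ipv4.tcp_wmem = 8192 436600 873200
net.ipv4.tcp_max_orphans = 3276800
# sysctl -p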

TCP congestion control is also relevant. You can use the following command to see which congestion control modules are available on this machine:

# sysctl net.ipv4.tcp_available_congestion_control

For an analysis of the individual algorithms, refer to discussions of the advantages and disadvantages, applicable environments, and performance of the TCP congestion control algorithms. For example, you can try hybla on high-latency links and the htcp algorithm on links with moderate latency.

To set the TCP congestion algorithm to hybla: net.ipv4.tcp_congestion_control = hybla

In addition, for kernel versions above 3.7.1, tcp_fastopen can be enabled: net.ipv4.tcp_fastopen = 3
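
A minimal sketch of applying these last two settings in the same way as the earlier sysctl changes (this assumes the hybla module is listed as available on your kernel and that the kernel is new enough for TCP Fast Open):

# echo "net.ipv4.tcp_congestion_control = hybla" >> /etc/sysctl.conf
# echo "net.ipv4.tcp_fastopen = 3" >> /etc/sysctl.conf
# sysctl -p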

I/O event dispatch mechanism

To support high-concurrency TCP connections on Linux, you must confirm whether the application uses an appropriate network I/O technique and I/O event dispatch mechanism. The available I/O techniques are synchronous I/O, non-blocking synchronous I/O, and asynchronous I/O. Under high TCP concurrency, synchronous I/O will seriously block the program unless a thread is created for each connection's I/O, but too many threads incur huge overhead from the kernel's thread scheduling. Synchronous I/O is therefore not advisable at high TCP concurrency; instead, consider non-blocking synchronous I/O or asynchronous I/O. Non-blocking synchronous I/O techniques include select(), poll(), epoll and similar mechanisms; asynchronous I/O means using AIO.

From the standpoint of the I/O event dispatch mechanism, select() is a poor choice because it supports only a limited number of concurrent connections (usually at most 1024). If performance matters, poll() is also a poor choice: although it supports a higher number of concurrent TCP connections, its "polling" mechanism makes it quite inefficient at high concurrency, and it may distribute I/O events unevenly, causing I/O "starvation" on some connections. epoll and AIO do not have these problems (the early implementation of AIO in the Linux kernel created a kernel thread for each I/O request, so it actually had serious performance problems under a high number of concurrent TCP connections, but the AIO implementation has been improved in recent kernels).

To sum up, when developing Linux applications that must support a high number of concurrent TCP connections, try to use epoll or AIO for I/O control over those connections; this provides the efficient I/O the program needs to handle high TCP concurrency.

After this optimized configuration, the server's capacity for concurrent TCP processing will be significantly improved. The configuration above is for reference only; if you use it in production, observe and adjust it according to your actual situation.
