Let's work through a real scenario here; after all, a concrete scenario is the best way to see what works in practice. First, the server configuration and environment:
Alibaba Cloud ECS instance: 8 GB RAM, 4-core CPU, 20 Mbps bandwidth, 20 GB system disk + 200 GB data disk, CentOS 6.5 64-bit, with an integrated LNMP stack installed.
Scenario: grabbing red envelopes on WeChat.
This scenario is very common: the client pushes an advertisement from their WeChat official account on the hour, at which point server concurrency reaches roughly 3,000 to 5,000. Strictly speaking that is not even high concurrency, yet the server still went down and took about five minutes to recover, which is clearly unacceptable, so let's analyze the cause. CPU utilization was not high and memory usage was normal, but the Alibaba Cloud control panel showed that outbound network traffic was saturated. The problem appeared to be network-related.
First I checked the static resources and found that most of the images had not been optimized, so I pulled them down and ran lossless compression, shaving off roughly 1 MB in total. After deploying the change, the site still went down and the server frequently returned 502.
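As an illustration of the compression step (the tools and the path below are assumptions; the article does not say what was actually used), jpegoptim and optipng can do lossless passes over an image directory:

yum install -y jpegoptim optipng        # available from the EPEL repository on CentOS 6
# /data/wwwroot/static is a hypothetical path used only for this example
find /data/wwwroot/static -name '*.jpg' -exec jpegoptim --strip-all {} \;   # lossless recompression, strips metadata
find /data/wwwroot/static -name '*.png' -exec optipng -o2 {} \;             # lossless PNG optimization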
Next I checked the page's static CSS and JS and moved the commonly used JS libraries to a CDN to reduce the number of requests hitting the server. After deploying, there was still little change; the 502s continued.
So I checked the nginx connection states with the command
netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
The result showed:
TIME_WAIT 3828
SYN_SENT 1
FIN_WAIT1 107
FIN_WAIT2 27
ESTABLISHED 661
SYN_RECV 23
CLOSING 15
LAST_ACK 284
Good grief, TIME_WAIT is sky high. It is worth explaining what TIME_WAIT means here: it is the state a TCP connection enters on the side that actively closed it; the socket lingers for a while (2*MSL) to make sure the final ACK gets through and that stale packets from the old connection cannot pollute a new one. In other words, the server was actively closing huge numbers of connections, and every closed socket kept tying up resources while it waited. Clearly, the thing to do was to bring the TIME_WAIT count down.
This only requires adjusting a few kernel parameters. Edit the /etc/sysctl.conf file and check whether the relevant settings are already there; if any are missing, just add them at the end of the file. After saving, execute
/sbin/sysctl -p
for the configuration to take effect.
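The settings usually involved in taming TIME_WAIT look roughly like the following; treat this as an illustrative sketch, not the exact values used here, and note that tcp_tw_recycle in particular is risky behind NAT and was removed in later kernels:

cat >> /etc/sysctl.conf <<'EOF'
# TCP tuning for heavy short-lived connections -- illustrative values only
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_max_tw_buckets = 5000
EOF
# then reload with /sbin/sysctl -p as above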
Twenty minutes later I checked the nginx connection states again. The result:
TIME_WAIT 87
SYN_SENT 1
FIN_WAIT1 60
FIN_WAIT2 19
ESTABLISHED 477
SYN_RECV 12
CLOSING 2
LAST_ACK 100
Everything was back to normal, and network bandwidth usage had dropped as well.
But the good times did not last long. When the red envelope grab started on the next hour, the 502s appeared again. Checking the processes showed that mysqld's CPU usage was extremely high, pegging the CPU and bringing the server down. I modified the MySQL configuration file, raising max_connections to 30000 and tuning other related parameters, which relieved the situation, but within a few minutes the CPU was maxed out again.
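Beyond max_connections, the adjustments for this kind of burst load typically touch the thread cache and InnoDB buffer pool. The excerpt below is an illustrative sketch for an 8 GB machine, not the exact tuning used here; merge values like these into the existing [mysqld] section of /etc/my.cnf and restart mysqld:

# /etc/my.cnf, [mysqld] section -- illustrative values only
max_connections         = 30000
max_connect_errors      = 10000
thread_cache_size       = 64
innodb_buffer_pool_size = 4G
query_cache_size        = 64M   # MySQL 5.x only; removed in MySQL 8.0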
Strange! So I checked the processes inside MySQL and found the same SQL queries running over and over, against several tables each holding around 100,000 rows. My judgment was that no indexes had been set, and after checking with the back-end developers it turned out only the primary keys had been defined. We added the missing indexes immediately; five minutes after the change went live, CPU usage dropped and stabilized at around 10%, and the 502s were gone.
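The tables and columns are not named here, so the following diagnosis and fix use made-up names (t_red_packet and openid) purely to illustrate the workflow:

-- table and column names below are hypothetical
mysql> SHOW FULL PROCESSLIST;                                    -- spot the queries piling up
mysql> EXPLAIN SELECT * FROM t_red_packet WHERE openid = 'xxx';  -- type: ALL means a full table scan
mysql> ALTER TABLE t_red_packet ADD INDEX idx_openid (openid);   -- index the column used in the WHERE clause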