1. Basic knowledge
1.1 Swoole
Swoole is a production-ready asynchronous network communication engine for PHP. PHP developers can use it to build high-performance server applications. The server part of Swoole is extensive and involves many topics, so this article gives only a brief overview of it; the implementation details will be covered in subsequent articles.
1.2 Network Programming
1. Network communication means starting one (or more) processes on one (or more) machines, listening on one (or more) ports, and exchanging information with clients according to some protocol (a standard protocol such as HTTP or DNS, or a custom one).
2. Most network programming today is based on TCP, UDP, or higher-layer protocols. The server part of Swoole is based on TCP and UDP.
3. Programming with UDP is relatively simple, so this article mainly covers network programming based on TCP.
4. TCP network programming mainly involves 4 types of events:
●Connection establishment: the client initiates the connection (connect) and the server accepts it (accept).
●Message arrival: the server receives data sent by the client. This is the most important event in TCP network programming. The server can handle it in a blocking or non-blocking manner, and it also needs to deal with issues such as packet splitting and application-layer buffers.
●Message sent successfully: "sent successfully" means that the application layer has copied the data into the kernel's socket send buffer; it does not mean that the client has received the data. For low-traffic services the data can usually be sent in one call, and this event can be ignored. If the data cannot all be written to the kernel buffer at once, you need to track whether the send has completed: with blocking I/O, the send is complete once the system call (write, writev, send, etc.) returns, whereas with non-blocking I/O you must check whether the number of bytes actually written matches what was expected.
●Connection closed: you need to handle both the client closing the connection (read returns 0) and the server closing it (close, shutdown).
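As a minimal sketch of the four event types listed above, the following Python snippet runs one client/server exchange on the loopback interface and records the events seen on the server side. Python is used only for brevity; the function name `run_echo_once` and the event labels are illustrative and are not part of Swoole.

```python
import socket
import threading


def run_echo_once(payload):
    """Run one echo exchange on loopback and return the server-side event log."""
    events = []
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def server():
        conn, _addr = listener.accept()  # connection establishment
        events.append("connect")
        with conn:
            while True:
                data = conn.recv(4096)  # message arrival
                if not data:  # recv returned b"": the client closed
                    events.append("close")
                    break
                events.append("receive")
                conn.sendall(data)  # data is now in the kernel send buffer
                events.append("sent")

    thread = threading.Thread(target=server)
    thread.start()
    echoed = b""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(payload)
        while len(echoed) < len(payload):
            chunk = client.recv(4096)
            if not chunk:
                break
            echoed += chunk
    thread.join()
    listener.close()
    return echoed, events
```

Note that "sent" is logged as soon as `sendall` returns, which only says the data reached the kernel buffer, in line with the description above.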
5. The process of establishing a TCP connection is shown in the figure below
● In the figure, ACK and SYN are flag bits, while seq and ack are the sequence number and acknowledgment number of the TCP segment
6. The process of closing a TCP connection is shown in the figure below
● The figure above shows the case where the client actively closes the connection; the case where the server actively closes it is similar
● In the figure, FIN and ACK are flag bits, while seq and ack are the sequence number and acknowledgment number of the TCP segment
1.3 Inter-process communication
1. Inter-process communication mechanisms include unnamed pipes (pipe), named pipes (FIFO), signals, semaphores, sockets, shared memory, and others
2. Swoole uses Unix domain sockets (a type of socket) for communication between its internal processes
1.4 socketpair
1. socketpair creates a pair of connected sockets, similar to pipe. The difference is that a pipe is one-way, so two-way communication requires creating two pipes, whereas a single socketpair call provides two-way communication. In addition, because sockets are used, the way data is exchanged (for example, stream or datagram) can also be chosen
2. The socketpair system call
- After a successful call, sv[0] and sv[1] each hold a file descriptor
- Data written to sv[0] can be read from sv[1]
- Data written to sv[1] can be read from sv[0]
- If a process calls socketpair and then forks, the child process inherits both file descriptors sv[0] and sv[1] by default, which enables communication between parent and child. For example, the parent writes to sv[0] and the child reads from sv[1]; the child writes to sv[1] and the parent reads from sv[0]
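The parent/child pattern described above can be sketched with Python's `socket.socketpair` and `os.fork`, which wrap the system calls of the same names. This is a minimal illustration rather than Swoole's code; `parent_child_roundtrip` is a hypothetical helper, and the child simply upper-cases whatever it receives.

```python
import os
import socket


def parent_child_roundtrip(message):
    """Parent writes to sv[0]; the forked child reads from sv[1] and replies."""
    sv0, sv1 = socket.socketpair()  # both ends are inherited across fork
    pid = os.fork()
    if pid == 0:
        # Child process: keep sv[1], drop its copy of sv[0].
        try:
            sv0.close()
            data = sv1.recv(1024)
            sv1.sendall(data.upper())
            sv1.close()
        finally:
            os._exit(0)  # never fall back into the parent's code path
    # Parent process: keep sv[0], drop its copy of sv[1].
    sv1.close()
    sv0.sendall(message)
    reply = sv0.recv(1024)
    sv0.close()
    os.waitpid(pid, 0)  # reap the child
    return reply
```

Each side closes the end it does not use, so that a read on the remaining end reports end-of-file once the peer exits.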
1.5 Daemon
1. A daemon is a special background process that is detached from the terminal and is typically used to perform some task periodically
2. Process group
- Each process belongs to a process group
- Each process group has a process group ID, which is the process ID (PID) of the group leader
- A process can only set the process group ID of itself or of its child processes
3. Session
- A session can contain multiple process groups. Among these process groups, there can be at most one foreground process group (or none), and the rest are background process groups.
- A session can have at most one controlling terminal
- When a user logs in through the terminal or the network, a new session will be created
- A process creates a new session by calling the setsid system call. The calling process must not be a process group leader. After setsid returns, the process becomes the session leader (the first process of the session) and the leader of a new process group. If the process previously had a controlling terminal, its association with that terminal is also severed
4. How to create a daemon process
- After fork, the parent process exits and the child process calls setsid, which makes the child a daemon process. At this point the child is the session leader and could still reopen a terminal. It can therefore fork once more: the process produced by this second fork can no longer open a terminal (only the session leader can). The second fork is not required; it only prevents the process from acquiring a terminal again
- Linux provides the daemon function (this function is not a system call, but a library function) for creating a daemon process
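The fork, setsid, fork sequence described above can be sketched as follows. This is a simplified illustration, not the implementation of the daemon library function or of Swoole; `spawn_daemon` and its `task` callback are hypothetical names, and unlike a real daemonizer the calling process stays alive so that it can continue its own work.

```python
import os


def spawn_daemon(task):
    """Run task() in a detached grandchild process (double-fork sketch)."""
    pid = os.fork()
    if pid > 0:
        os.waitpid(pid, 0)  # reap the short-lived first child
        return
    try:
        # First child: it is not a process group leader, so setsid() succeeds
        # and makes it the leader of a new session without a terminal.
        os.setsid()
        if os.fork() > 0:
            os._exit(0)  # the session leader exits immediately
        # Grandchild: belongs to the new session but is not its leader,
        # so it cannot acquire a controlling terminal again.
        os.chdir("/")
        devnull = os.open(os.devnull, os.O_RDWR)
        for fd in (0, 1, 2):
            os.dup2(devnull, fd)  # detach stdin, stdout, and stderr
        task()
    finally:
        os._exit(0)
```

A caller could pass any function as `task`; it runs in a process whose session ID differs from the caller's.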
1.6 Swoole tcp server example
- When the above code is executed in CLI mode, it goes through lexical and syntax analysis to generate opcodes, which are then handed to the Zend virtual machine for execution
- When the Zend virtual machine executes $serv->start(), it starts the Swoole server
- The event callbacks set in the above code are executed in the worker process. The Swoole server model is described in detail below.
2. Swoole server
2.1 base mode
1. Description
- base mode uses a multi-process model, the same as nginx. Each process has only one thread. The main process manages the worker processes, and each worker process is responsible for listening on the port, accepting connections, processing requests, and closing connections
- If multiple processes listen on the same port at the same time, the thundering herd problem arises. On Linux kernels earlier than 3.9, Swoole does not solve the thundering herd problem
- Linux kernel 3.9 and later provide a new socket option, SO_REUSEPORT, which allows multiple processes to bind to the same port. When the kernel receives a new connection request, it wakes up only one of them to handle it and also performs load balancing at the kernel level, which solves the thundering herd problem described above. Swoole has added support for this option
- In base mode, the reactor_num parameter has no actual effect
- If the number of worker processes is set to 1, no worker process is forked and the main process handles requests directly. This mode is suitable for debugging
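To illustrate the SO_REUSEPORT behaviour mentioned above, here is a small sketch that binds several listening sockets to the same port. It assumes a platform that defines `SO_REUSEPORT` (for example Linux 3.9 or later). In a real base-mode server each worker process would create its own listener; a single process is used here only to show that the repeated bind succeeds. `bind_shared_listeners` is a hypothetical helper name.

```python
import socket


def bind_shared_listeners(count):
    """Bind `count` listening sockets to one port using SO_REUSEPORT."""
    listeners = []
    port = 0  # the first bind lets the kernel choose a free port
    for _ in range(count):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # The option must be set on every socket before bind().
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind(("127.0.0.1", port))
        sock.listen(128)
        port = sock.getsockname()[1]  # reuse the same port for the rest
        listeners.append(sock)
    return listeners
```

Without the option, the second `bind` to an already bound port would fail with an "address already in use" error.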
2. The startup process
- When the PHP code reaches $serv->start(), the main process enters the function int swServer_start(swServer *serv), which is responsible for starting the server
- swServer_start calls swReactorProcess_start, which forks multiple worker processes
- The main process and the worker processes each enter their own event loop to handle events
2.2 process mode
1. Description
- This mode is multi-process and multi-threaded. It consists of the main process, the manager process, the worker processes, and the task_worker processes
- The main process contains multiple threads. The main thread is responsible for accepting connections and handing them to the reactor threads. A reactor thread receives data packets, forwards the data to a worker process for processing, and then handles the data returned by the worker process
- The manager process is single-threaded and is mainly responsible for managing the worker processes, similar to the nginx main process. When a worker process exits abnormally, the manager process forks a new worker process
- A worker process is single-threaded and is responsible for actually processing requests
- The task_worker processes handle time-consuming tasks and are not enabled by default
- A worker process communicates with the reactor threads in the main process through Unix domain sockets; worker processes do not communicate with each other
2. Startup process
- Swoole server startup entry: swServer_start function
- If daemon mode is set, then after the necessary parameters are checked, the process first turns itself into a daemon, then forks the manager process, and then creates the reactor threads
- The main process first forks the manager process, and the manager process forks the worker processes and task_worker processes. Each worker process enters int swWorker_loop(swServer *serv, int worker_id), i.e. its own event loop; likewise, each task_worker process enters its own event loop
- The main process creates the reactor threads with pthread_create. The main thread and the reactor threads each enter their own event loop. Each reactor thread runs static int swReactorThread_loop(swThreadParam *param) and waits for events to process
3. Structure diagram
- The Swoole process mode structure is shown in the figure below.
- The figure above does not include task_worker processes. By default, the number of task_worker processes is 0
3. Request processing flow (process mode)
3.1 Communication between reactor thread and worker process
1. Communication between the Swoole master process and the worker processes is shown in the figure below
- Swoole uses SOCK_DGRAM rather than SOCK_STREAM here. Each reactor thread handles multiple requests and, after receiving a request, forwards it to a worker process for processing. With SOCK_STREAM, which is a byte stream without message boundaries, the worker process would be unable to split the incoming data into individual packets before processing the requests
- The swFactoryProcess_start function creates as many socket pairs as there are worker processes, for communication between the reactor threads and the worker processes (see the swPipeUnsock_create function for details)
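The difference between the two socket types can be observed with a small experiment. The sketch below, using a hypothetical helper `send_then_drain`, sends several messages over a Unix socket pair and then reads everything that is pending. With SOCK_DGRAM each read returns exactly one message; with SOCK_STREAM the reads return an undivided byte stream whose chunking is not tied to the original messages.

```python
import socket


def send_then_drain(sock_type, messages):
    """Send all messages on one end, then read whatever is pending on the other."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, sock_type)
    with sender, receiver:
        for message in messages:
            sender.send(message)
        receiver.setblocking(False)  # stop reading once nothing is pending
        received = []
        while True:
            try:
                received.append(receiver.recv(4096))
            except BlockingIOError:
                break
        return received


datagrams = send_then_drain(socket.SOCK_DGRAM, [b"req-1", b"req-2"])
stream = send_then_drain(socket.SOCK_STREAM, [b"req-1", b"req-2"])
print(datagrams)  # one entry per message sent
print(stream)     # same bytes, but message boundaries are not preserved
```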
2. Assume there are 2 reactor threads and 3 worker processes. Communication between the reactors and the workers is shown in the figure below
- Each reactor thread monitors several worker processes, and each worker process is monitored by exactly one reactor thread (reactor_num <= worker_num). By default, Swoole uses worker_process_id % reactor_num to assign each worker process to the reactor thread that monitors it
- When a reactor thread receives data from a worker process, it processes that data. Note that this reactor thread may not be the one that originally sent the request to the worker
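The default modulo assignment described above is easy to work through by hand. The sketch below computes it for the example of 2 reactor threads and 3 worker processes; the function name is illustrative only.

```python
def assign_workers_to_reactors(worker_num, reactor_num):
    """Map each reactor id to the worker ids it monitors (worker_id % reactor_num)."""
    mapping = {reactor_id: [] for reactor_id in range(reactor_num)}
    for worker_id in range(worker_num):
        mapping[worker_id % reactor_num].append(worker_id)
    return mapping


print(assign_workers_to_reactors(3, 2))  # {0: [0, 2], 1: [1]}
```

With these numbers, reactor 0 monitors workers 0 and 2, and reactor 1 monitors worker 1, so every worker is watched by exactly one reactor.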
3. Data packet for communication between reactor thread and worker process
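The actual packet definition is not reproduced here. Purely as an illustration of a header-plus-payload packet, the sketch below packs a header in front of the request data. The field layout (fd, payload length, reactor id) is an assumption made for this example and is not Swoole's real structure.

```python
import struct

# Hypothetical header layout: fd (int32), payload length (uint16), reactor id (uint16).
HEADER = struct.Struct("=iHH")


def pack_packet(fd, reactor_id, payload):
    """Prefix the payload with a header describing where it came from."""
    return HEADER.pack(fd, len(payload), reactor_id) + payload


def unpack_packet(packet):
    """Split a packet back into (fd, reactor_id, payload)."""
    fd, length, reactor_id = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    return fd, reactor_id, payload
```

Because each packet travels as one datagram, the receiver can read the header first and know which connection and which reactor thread the payload belongs to.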
3.2 Request processing
1. The main thread in the master process is responsible for listening on the port (listen) and accepting connections (accept, which produces an fd). After accepting a connection, it assigns the request to a reactor thread, by default via fd % reactor_num, and then adds the fd to that reactor thread through epoll_ctl. When the fd is first added, the write event is monitored: the socket write buffer of a newly accepted connection is empty, so the socket is necessarily writable and the event triggers immediately, after which the reactor thread performs some initialization
- There are situations where multiple threads operate on one epollfd (created through the system call epoll_create) at the same time
- It is thread-safe for multiple threads to call epoll_ctl on the same epollfd at the same time. While one thread is executing, the other threads are blocked (because the red-black tree underlying epoll is being modified)
- It is also thread-safe for multiple threads to call epoll_wait at the same time, but one event may then be received by several threads. In practice, having multiple threads call epoll_wait on the same epollfd is not recommended. This situation does not occur in Swoole, because each reactor thread has its own epollfd
- One thread calls epoll_wait while another thread calls epoll_ctl. According to the man page, if the fd newly added by epoll_ctl is ready, the thread blocked in epoll_wait is unblocked (see man epoll_wait for details)
2. The write event of the fd is triggered in the reactor thread, which handles it. If the reactor finds that the fd has just been added and there is no data to write, it switches to monitoring read events and is ready to receive data sent by the client
3. The reactor thread reads the client's request data. After receiving the data of one request, it forwards the data to a worker process, chosen by default via fd % worker_num
- The data packet sent by the reactor to the worker process contains a header, in which the reactor's information is recorded
- If the data to be sent is too large, it must be fragmented. Due to space limitations, data fragmentation will be described in a later article
- Multiple reactor threads may send data to the same worker process at the same time, which is why Swoole uses SOCK_DGRAM to communicate with the worker processes. From the header of each data packet, the worker process can tell which reactor thread sent the data and which request it belongs to
4. After the worker process receives the data packet sent by the reactor, it processes the request. When processing is complete, it sends the result to the main process
- The data packet sent by the worker process to the main process also contains a header. When a reactor thread receives the packet, it can determine the corresponding reactor thread, the fd of the request, and other information
5. When the main process receives a data packet sent by a worker process, one reactor thread is triggered to handle it
- This reactor thread is not necessarily the one that previously sent the request to the worker process
- The reactor threads of the main process monitor the data packets sent by worker processes. Each packet sent by a worker is monitored by only one reactor thread, so only one reactor thread is triggered
6. The reactor thread handles the result sent by the worker process. If the data can be sent to the client directly, it sends it directly. If the monitoring state of the connection needs to change (for example, on close), it must first find the reactor thread that monitors the connection and then change the monitoring state of the connection (by calling epoll_ctl)
- The reactor thread that handles the result and the reactor thread that monitors the connection may not be the same thread
- The monitoring reactor thread watches for data sent by the client and forwards it to a worker process
- The handling reactor thread watches for data sent by the worker process to the main process and then sends that data to the client
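Steps 1 and 2 of this section (watch a new fd for writability, which fires at once, then switch to watching for readability) can be reproduced with Python's `select.epoll`, a thin wrapper over epoll_create, epoll_ctl, and epoll_wait that is available on Linux. This is a sketch of the pattern only, not Swoole's reactor code; a socket pair stands in for a freshly accepted connection, and `watch_new_connection` is a hypothetical name.

```python
import select
import socket


def watch_new_connection():
    """Return (writable_at_once, readable_before_data, readable_after_data)."""
    client, conn = socket.socketpair()  # `conn` plays the newly accepted fd
    ep = select.epoll()
    try:
        # A new connection has an empty send buffer, so EPOLLOUT fires at once.
        ep.register(conn.fileno(), select.EPOLLOUT)
        writable_at_once = any(ev & select.EPOLLOUT for _fd, ev in ep.poll(1))

        # Nothing to write yet: switch the fd to read monitoring.
        ep.modify(conn.fileno(), select.EPOLLIN)
        readable_before_data = any(ev & select.EPOLLIN for _fd, ev in ep.poll(0))

        client.sendall(b"hello")  # the client sends its request
        readable_after_data = any(ev & select.EPOLLIN for _fd, ev in ep.poll(1))
        return writable_at_once, readable_before_data, readable_after_data
    finally:
        ep.close()
        client.close()
        conn.close()


print(watch_new_connection())  # (True, False, True)
```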
4. gdb debugging
4.1 Process mode startup
4.2 Base mode startup
5. Summary and Thoughts
1. This article introduced the two modes of the Swoole server, base mode and process mode, explained the network programming model of each, and focused on inter-process communication and the request processing flow in process mode.
2. In process mode, why doesn't the main process simply create multiple threads to handle requests directly (which would avoid the overhead of inter-process communication), instead of creating a manager process that in turn creates worker processes to handle the requests?
- My personal guess is that PHP's support for multi-threading is not very friendly, and PHP code is mostly written as single-threaded programs
- Although the TSRM provided by the Zend VM supports multi-threaded environments, it is really a scheme that isolates memory per thread, so multi-threading brings little benefit
3. In process mode, each reactor thread in the main process can handle multiple requests at the same time, i.e. requests are processed concurrently. This can be viewed from two perspectives
- From the main process's perspective, the main process handles multiple requests at the same time. Once all the data packets of a request have been received, the request is forwarded to a worker process for processing
- From a worker process's perspective, the requests it receives arrive serially, and by default it also processes them serially. If a single request blocks (the worker process calls back the event handler written by the PHP developer, and that function may block), subsequent requests cannot be processed. This is the head-of-line blocking problem. In this case Swoole's coroutines can be used: through coroutine scheduling, when a single request blocks, the worker process can continue to process other requests
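The head-of-line blocking effect and the way coroutines relieve it can be shown by analogy with Python's asyncio. This is only an analogy: asyncio stands in for Swoole's coroutine scheduler, `asyncio.sleep` stands in for a blocking I/O wait, and all function names are illustrative.

```python
import asyncio


async def handle(request_id, delay, finished):
    """Pretend to wait on I/O for `delay` seconds, then record completion."""
    await asyncio.sleep(delay)
    finished.append(request_id)


async def process_serially(requests):
    """One request at a time: a slow request delays every request behind it."""
    finished = []
    for request_id, delay in requests:
        await handle(request_id, delay, finished)
    return finished


async def process_concurrently(requests):
    """Coroutine scheduling: while one request waits, the others make progress."""
    finished = []
    await asyncio.gather(*(handle(rid, delay, finished) for rid, delay in requests))
    return finished


requests = [("slow", 0.2), ("fast", 0.01)]
print(asyncio.run(process_serially(requests)))      # ['slow', 'fast']
print(asyncio.run(process_concurrently(requests)))  # ['fast', 'slow']
```

In the serial case the fast request finishes only after the slow one; with coroutine scheduling it finishes first.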
4. When using Swoole to create a TCP server, the data must be split into packets because TCP is a byte-stream protocol, and Swoole cannot do this splitting without knowing the communication protocol between client and server. In process mode, the data handed from the reactor to the worker process can therefore only be a byte stream, which the user must parse. In most cases, however, there is no need to design a protocol and build on a raw TCP server, since Swoole already supports HTTP, HTTPS, and other protocols
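As an example of what user-side packet splitting can look like, the sketch below parses a byte stream under an assumed protocol in which every message is preceded by a 4-byte big-endian length. The protocol and the function name `split_packets` are assumptions for this illustration; a real application would use whatever framing its client and server agree on.

```python
import struct


def split_packets(buffer):
    """Return (complete_messages, leftover_bytes) for length-prefixed frames."""
    messages = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from(">I", buffer, offset)
        if len(buffer) - offset - 4 < length:
            break  # the body has not fully arrived yet
        start = offset + 4
        messages.append(buffer[start:start + length])
        offset = start + length
    return messages, buffer[offset:]
```

The leftover bytes are kept in an application-layer buffer and prepended to the next chunk read from the connection, so a message split across two reads is still delivered whole.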