When using the Socket model to implement network communication, you need to go through several steps: creating a socket, listening on a port, handling connections, and reading and writing requests. Let's take a closer look at the key operations in these steps, which will help us analyze the deficiencies of the Socket model.
First of all, for the server and the client to communicate, we create a listening socket (Listening Socket) on the server side that listens for client connections, through the following three steps:
Call the socket function to create a socket. We normally call this socket an active socket;
Call the bind function to bind the active socket to the IP and listening port of the current server;
Call the listen function to convert the active socket into a listening socket and start listening for client connections.
After completing the above three steps, the server can receive client connection requests. To receive them promptly, we can run a loop in which the accept function is called to accept each incoming connection request.
Note that accept is a blocking function. If no client connection request is pending, the server's execution flow stays blocked inside accept. Once a client connection request arrives, accept stops blocking, handles the request, establishes the connection with the client, and returns a connected socket (Connected Socket).
Finally, the server calls recv on the connected socket just returned to read the client's requests, and calls send on it to write data back to the client.
Code:
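listenSocket = socket();   // call the socket system call to create an active socket
bind(listenSocket);        // bind the address and port
listen(listenSocket);      // convert the default active socket into the passive socket used by the server, i.e. the listening socket
while (1) {                // loop, waiting for client connection requests to arrive
    connSocket = accept(listenSocket);  // accept a client connection
    recv(connSocket);      // read data from the client; only one client can be handled at a time
    send(connSocket);      // return data to the client; only one client can be handled at a time
}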
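The pseudocode above omits addresses, buffers, and error checks. As a rough illustration, the following is a minimal runnable sketch of the same steps. The helper names make_listener and serve_one are hypothetical, and binding to 127.0.0.1 with a 512-byte buffer is an assumption made only to keep the example small.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: socket -> bind -> listen.
 * Binds to 127.0.0.1 for this sketch; port 0 asks the kernel for a free port. */
int make_listener(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);      /* active socket */
    if (fd < 0) return -1;

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0) {                      /* now a listening socket */
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: serve exactly one client and echo one message back.
 * Returns the number of bytes echoed, 0 if the client closed without
 * sending anything, or -1 on error. */
int serve_one(int listen_fd) {
    int conn = accept(listen_fd, NULL, NULL);      /* blocks until a client connects */
    if (conn < 0) return -1;

    char buf[512];
    ssize_t n = recv(conn, buf, sizeof buf, 0);    /* blocks until data arrives */
    if (n > 0) n = send(conn, buf, (size_t)n, 0);  /* write the same bytes back */
    close(conn);
    return (int)n;
}
```

A server would call make_listener once and then call serve_one in a loop. While serve_one is blocked in accept or recv, no other client can make progress.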
However, you may notice from the above code that although it achieves communication between the server and the client, each pass through the loop handles only one client connection. Therefore, if we want to handle multiple concurrent client requests, we need to use multi-threading to handle the requests on the multiple client connections established through the accept function.
With this approach, after accept returns a connected socket we create a thread and pass the connected socket to it. That thread is then responsible for all subsequent reads and writes on the connected socket. Meanwhile, the server's main execution flow calls accept again and waits for the next client connection.
Multi-threading:
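listenSocket = socket();   // call the socket system call to create an active socket
bind(listenSocket);        // bind the address and port
listen(listenSocket);      // convert the default active socket into the passive socket used by the server, i.e. the listening socket
while (1) {                // loop, waiting for client connection requests to arrive
    connSocket = accept(listenSocket);        // accept a client connection
    pthread_create(processData, connSocket);  // create a new thread to handle the connected socket
}

processData(connSocket) {
    recv(connSocket);      // read data from the client; each thread serves one client
    send(connSocket);      // return data to the client; each thread serves one client
}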
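The thread hand-off in the pseudocode can be sketched with the real pthread_create signature as follows. This is only an illustration: the names spawn_worker and process_data are hypothetical, the worker simply echoes data back, and passing the descriptor through an intptr_t cast is one common convention rather than the only option.

```c
#include <pthread.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* Thread body: echo everything received on one connected socket
 * until the peer closes the connection. */
static void *process_data(void *arg) {
    int conn = (int)(intptr_t)arg;
    char buf[512];
    ssize_t n;
    while ((n = recv(conn, buf, sizeof buf, 0)) > 0) {
        send(conn, buf, (size_t)n, 0);
    }
    close(conn);
    return NULL;
}

/* Hands a connected socket to a new detached thread.
 * Returns 0 on success, -1 if the thread could not be created. */
int spawn_worker(int conn) {
    pthread_t tid;
    if (pthread_create(&tid, NULL, process_data, (void *)(intptr_t)conn) != 0) {
        return -1;
    }
    pthread_detach(tid);   /* the worker cleans up after itself */
    return 0;
}
```

The accept loop would call spawn_worker(connSocket) right after accept returns and then go straight back to accept. On older toolchains the program may need to be built with the -pthread flag.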
Although this method can improve the server's concurrent processing capability, the main execution flow of Redis runs in a single thread, so its concurrent processing capability cannot be improved this way. Therefore, this method does not work for Redis.
Are there any other methods that can help Redis handle concurrent clients better? This requires the IO multiplexing facilities provided by the operating system. In the basic Socket programming model, the accept function can only wait for client connections on one listening socket, and the recv function can only wait for requests sent by the client on one connected socket. With IO multiplexing, a single thread can monitor many sockets at once and act only on the ones that are ready.
Because the Linux operating system is widely used in practice, in this lesson we mainly study the IO multiplexing mechanisms on Linux. select, poll and epoll are the three main forms of IO multiplexing that Linux provides. We will first learn the implementation ideas and usage of these three mechanisms, and then explore why Redis usually chooses the epoll mechanism to implement network communication.
First, let’s understand the programming model of the select mechanism.
But before learning in detail, we need to know what key points we need to master for an IO multiplexing mechanism, which can help us quickly grasp the connections and differences between different mechanisms. In fact, when we learn the IO multiplexing mechanism, we need to be able to answer the following questions: First, what events on the socket will the multiplexing mechanism listen to? Second, how many sockets can the multiplexing mechanism listen to? Third, when a socket is ready, how does the multiplexing mechanism find the ready socket?
An important function in the select mechanism is the select function. Its parameters are __nfds, which bounds the range of descriptors to check and should be set to the highest-numbered monitored descriptor plus one; the three sets of monitored descriptors, readfds, writefds, and exceptfds; and timeout, the maximum time to block while waiting. The select function prototype is:
int select(int __nfds, fd_set *__readfds, fd_set *__writefds, fd_set *__exceptfds, struct timeval *__timeout)
Note that on Linux every socket has a file descriptor, a non-negative integer that identifies the socket within the process. It is common practice on Linux for the functions of a multiplexing mechanism to take file descriptors as arguments. The function finds the corresponding socket through the file descriptor in order to monitor it, read from it, or write to it.
The three set parameters of the select function specify the file descriptors that need to be monitored, which in effect means the sockets that need to be monitored. So why are there three sets?
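This brings us back to the first question raised earlier: which socket events does the multiplexing mechanism monitor? The select function uses three sets to represent the three kinds of events it monitors: read events, write events, and exception events.
Looking further, the parameters readfds, writefds and exceptfds are all of type fd_set, a structure that essentially wraps an array of fd_mask. Here fd_mask is an alias for the long int type. The macro __FD_SETSIZE defaults to 1024, and __NFDBITS is the number of bits in a long int: 32 when long int is 32 bits wide, and 64 on typical 64-bit Linux systems.
So the fd_set structure is really just an array of long int. With a 32-bit long int the array has 32 elements (1024/32 = 32), each 32 bits wide; with a 64-bit long int it has 16 elements of 64 bits each. Either way the array holds 1024 bits, and each bit represents the state of one file descriptor. Now that we understand the definition of fd_set, we can answer the second question raised earlier: each descriptor set lets select monitor up to 1024 descriptors.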
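The bitmap nature of fd_set can be checked with the standard FD_ZERO, FD_SET, FD_CLR and FD_ISSET macros. The sketch below is only illustrative, and the helper names fd_set_capacity and count_marked are hypothetical.

```c
#include <limits.h>
#include <sys/select.h>
#include <unistd.h>

/* Number of descriptors one fd_set can track: its size in bits.
 * On Linux this is typically 1024, matching FD_SETSIZE. */
int fd_set_capacity(void) {
    return (int)(sizeof(fd_set) * CHAR_BIT);
}

/* Counts how many descriptors in the range [0, maxfd] are marked in the set. */
int count_marked(fd_set *set, int maxfd) {
    int n = 0;
    for (int fd = 0; fd <= maxfd && fd < FD_SETSIZE; fd++) {
        if (FD_ISSET(fd, set)) n++;   /* tests the single bit for this fd */
    }
    return n;
}
```

For example, after FD_ZERO(&s), FD_SET(3, &s) and FD_SET(7, &s), calling count_marked(&s, 10) counts the two bits that were set.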
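That leaves the third question: how do we find the ready sockets? First, before calling select, we create the descriptor sets that will be passed to it, and then create the listening socket. For select to monitor the listening socket, we add that socket's descriptor to the descriptor set we created.
Next, we call select and pass in the descriptor sets as arguments. The program blocks inside the call. As soon as select detects that some descriptor is ready, it stops blocking and returns the number of ready file descriptors.
At that point we can search the descriptor sets to find out which descriptors are ready, and then handle the sockets that correspond to them. For example, if descriptors in the readfds set are ready, a read event has occurred on the corresponding sockets, so we read data from those sockets.
Because select can monitor up to 1024 file descriptors in one call, it may also report several ready descriptors in a single return. We can use a loop to perform the read, write, or exception handling for the socket behind each ready descriptor in turn.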
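The select workflow described above can be sketched as a single helper. This is an illustration under assumptions: the name wait_readable is hypothetical, only the read set is used, and the caller supplies the list of descriptors and an output array of the same length.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Waits up to timeout_ms for any of fds[0..n-1] to become readable.
 * Stores the ready descriptors in ready[] and returns how many there are,
 * 0 on timeout, or -1 on error. */
int wait_readable(const int *fds, int n, int timeout_ms, int *ready) {
    fd_set readfds;
    int maxfd = -1;

    FD_ZERO(&readfds);
    for (int i = 0; i < n; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE) return -1;  /* select cannot track it */
        FD_SET(fds[i], &readfds);                           /* add to the monitored set */
        if (fds[i] > maxfd) maxfd = fds[i];
    }

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* First argument is the highest monitored descriptor plus one. */
    int rc = select(maxfd + 1, &readfds, NULL, NULL, &tv);
    if (rc <= 0) return rc;

    int count = 0;
    for (int i = 0; i < n; i++) {        /* scan every descriptor for its ready bit */
        if (FD_ISSET(fds[i], &readfds)) ready[count++] = fds[i];
    }
    return count;
}
```

Note that select overwrites the set it is given, so the helper rebuilds readfds on every call, and the final scan visits every monitored descriptor even if only one is ready.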
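The select function has two shortcomings.
First, select limits the number of file descriptors a single process can monitor. The limit is determined by __FD_SETSIZE, whose default value is 1024.
Second, after select returns we have to traverse the descriptor sets to find out exactly which descriptors are ready. This traversal has a cost and lowers the program's performance.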
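The main function of the poll mechanism is the poll function. Let's first look at its prototype:
int poll(struct pollfd *__fds, nfds_t __nfds, int __timeout)
Here the parameter *__fds is an array of pollfd structures, __nfds is the number of elements in the *__fds array, and __timeout is the timeout for which poll may block.
The pollfd structure holds the descriptor to monitor and the event types to monitor on it. It has three member variables, fd, events and revents, which represent the file descriptor to monitor, the event types to monitor, and the event types that actually occurred.
The requested and actual event types in pollfd are expressed with macros, chiefly the following three: POLLRDNORM, POLLWRNORM and POLLERR, which represent readable, writable and error events respectively.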
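Now that we understand poll's parameters, let's see how to use poll for network communication. The process can be divided into three steps:
Step 1: create the pollfd array and the listening socket, and bind the socket;
Step 2: add the listening socket to the pollfd array and set it to monitor read events, that is, client connection requests;
Step 3: call poll in a loop to check whether any file descriptor in the pollfd array is ready.
Within the loop in step 3, the handling logic splits into two cases:
If the listening socket is ready, a client is connecting. We call accept to accept the connection, which creates a connected socket, then add that socket to the pollfd array and monitor it for read events;
If a connected socket is ready, the client has sent a read or write request, and we call recv/send to handle it.
Compared with select, the main improvement of poll is that it can monitor more than 1024 file descriptors at once. However, after poll returns we still have to traverse every file descriptor, check whether it is ready, and then handle it.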
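The core of step 3 can be sketched as follows. The helper name poll_readable and the MAX_WATCHED cap are hypothetical choices for this example. It requests POLLIN, which the Linux manual describes as equivalent to POLLRDNORM.

```c
#include <poll.h>
#include <unistd.h>

#define MAX_WATCHED 64   /* arbitrary cap for this sketch, not a poll limit */

/* Polls fds[0..n-1] for readability for up to timeout_ms milliseconds.
 * Stores the ready descriptors in ready[] and returns how many there are,
 * 0 on timeout, or -1 on error. */
int poll_readable(const int *fds, int n, int timeout_ms, int *ready) {
    struct pollfd pfds[MAX_WATCHED];
    if (n < 0 || n > MAX_WATCHED) return -1;

    for (int i = 0; i < n; i++) {
        pfds[i].fd = fds[i];       /* descriptor to monitor */
        pfds[i].events = POLLIN;   /* event type we are interested in */
        pfds[i].revents = 0;       /* filled in by poll */
    }

    int rc = poll(pfds, (nfds_t)n, timeout_ms);
    if (rc <= 0) return rc;

    int count = 0;
    for (int i = 0; i < n; i++) {  /* still a full scan, as with select */
        if (pfds[i].revents & POLLIN) ready[count++] = pfds[i].fd;
    }
    return count;
}
```

In a server, the caller would compare each ready descriptor with the listening socket: a match means accept should be called, and any other descriptor is a connected socket to read from.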
Now let's turn to the epoll mechanism. First of all, epoll uses the epoll_event structure to record the file descriptors to be monitored and the event types to monitor on them. This is similar to the pollfd structure used in the poll mechanism.
The epoll_event structure contains a union variable of type epoll_data_t and an integer variable named events. The epoll_data_t union has a member variable fd that records the file descriptor, and the events variable takes different macro values to represent the event types that this file descriptor is interested in. Common event types include the following:
EPOLLIN: Read event, indicating that the socket corresponding to the file descriptor has data to read.
EPOLLOUT: Write event, indicating that the socket corresponding to the file descriptor is ready for data to be written.
EPOLLERR: Error event, indicating that an error occurred on the socket corresponding to the file descriptor.
When using the select or poll function, once we have created the file descriptor set or the pollfd array, we simply add the file descriptors we need to monitor to it.
With the epoll mechanism, however, we first call the epoll_create function to create an epoll instance. This epoll instance maintains two structures internally: one records the file descriptors to be monitored, and the other records the file descriptors that are ready. Only the ready file descriptors are returned to the user program for processing.
So with epoll we do not need to traverse all monitored descriptors to find out which ones are ready, as we do with select and poll. This is why epoll is more efficient than select and poll.
After creating the epoll instance, we use the epoll_ctl function to register the monitored file descriptors together with the event types to listen for, and use the epoll_wait function to obtain the ready file descriptors.
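The epoll_ctl and epoll_wait calls can be sketched as two small helpers. The names watch_readable and collect_ready are hypothetical, the batch size of 32 is arbitrary, and the code is Linux-specific because it relies on sys/epoll.h.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Registers fd on the epoll instance epfd for read events (EPOLLIN).
 * Returns 0 on success, -1 on error. */
int watch_readable(int epfd, int fd) {
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;   /* event type to monitor */
    ev.data.fd = fd;       /* handed back to us by epoll_wait */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Waits up to timeout_ms and copies at most max ready descriptors into ready[].
 * Returns the number of ready descriptors, 0 on timeout, or -1 on error. */
int collect_ready(int epfd, int *ready, int max, int timeout_ms) {
    struct epoll_event events[32];
    if (max <= 0) return -1;
    if (max > 32) max = 32;

    int n = epoll_wait(epfd, events, max, timeout_ms);
    for (int i = 0; i < n; i++) {
        ready[i] = events[i].data.fd;   /* only ready descriptors are reported */
    }
    return n;
}
```

A caller would obtain epfd from epoll_create(1), register each socket once with watch_readable, and then call collect_ready in a loop. Unlike the select and poll versions, the loop after epoll_wait visits only descriptors that are actually ready.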
Now we understand how to use the epoll functions. It is precisely because epoll lets the number of monitored descriptors be customized and returns ready descriptors directly that Redis builds its network communication framework on the epoll_create, epoll_ctl and epoll_wait functions. Redis wraps these functions and the read and write events in an event-driven framework for network communication, so that although it runs in a single thread, it can still handle high-concurrency client access efficiently.
The Reactor model is a programming model that network servers use to handle high-concurrency network IO requests. The model has two main features:
Three types of events to process, namely connection events, write events, and read events;
Three key roles, namely reactor, acceptor, and handler.
The Reactor model deals with the interaction between the client and the server, and the three types of events correspond to the pending events that different kinds of client requests trigger on the server side during that interaction:
When a client wants to interact with the server, it sends a connection request to the server to establish a connection, which corresponds to a connection event on the server;
Once the connection is established, the client sends a read request to the server to read data. To process a read request, the server needs to write data back to the client, which corresponds to a write event on the server side;
Whether the client sends a read or a write request, the server must first read the request content from the client, so both read and write requests correspond to a read event on the server side.
The three key roles work as follows:
First, connection events are handled by the acceptor, which is responsible for accepting connections. After the acceptor accepts a connection, it creates a handler to process subsequent read and write events on that network connection;
Secondly, read and write events are handled by the handler;
Finally, in high-concurrency scenarios, connection events and read and write events occur at the same time, so we need a role that specializes in monitoring and distributing events. This is the reactor role. When there is a connection request, the reactor hands the resulting connection event to the acceptor; when there is a read or write request, the reactor hands the read and write events to the handler.
So, now that we know these three roles interact around the monitoring, forwarding and processing of events, how do we implement their interaction when programming? This is inseparable from event driving.
When implementing the Reactor model, the overall control logic that needs to be written is called an event-driven framework. In short, it consists of two parts: event initialization, and the main loop of event capture, distribution and processing.
Event initialization is executed when the server program starts. Its main job is to register the event types that need to be monitored and the handler for each type of event. Once the server finishes initialization, event initialization is complete, and the server program enters the main loop of event capture, distribution and processing.
The main loop is typically a while loop. Inside it, we capture the events that have occurred, determine each event's type, and call the handler registered for that type during initialization to actually process the event.
For example, when a connection event occurs, the server program calls the acceptor function to establish a connection with the client. When a read event occurs, a read or write request has been sent to the server, so the server program calls the corresponding request-processing function to read the request content from the client connection, which completes the processing of the read event.
To summarize the basic working mechanism of the Reactor model: different types of client requests trigger three types of events on the server side, namely connection, read and write events. The monitoring, distribution and processing of these events are carried out by three roles, the reactor, the acceptor and the handler, and these roles interact and process events through the event-driven framework.
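To make the three roles concrete, here is a minimal sketch of an event-driven framework built on epoll. It is not Redis's actual code. All names in it (reactor_add, reactor_run_once, accept_handler, echo_handler, listen_on_loopback) are hypothetical, the handler table indexed by descriptor and the MAX_FDS cap are simplifying assumptions, and the handler simply echoes data back.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_FDS 64                        /* arbitrary cap for this sketch */

typedef void (*event_handler)(int epfd, int fd);
static event_handler handlers[MAX_FDS];   /* handler registered for each fd */

/* Event initialization: watch fd for read events and record its handler. */
int reactor_add(int epfd, int fd, event_handler h) {
    if (fd < 0 || fd >= MAX_FDS) return -1;
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) return -1;
    handlers[fd] = h;
    return 0;
}

/* Handler role: a read event on a connected socket; echo the request back. */
void echo_handler(int epfd, int fd) {
    char buf[512];
    ssize_t n = recv(fd, buf, sizeof buf, 0);
    if (n <= 0) {                          /* peer closed or error: stop watching */
        epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
        handlers[fd] = NULL;
        close(fd);
        return;
    }
    send(fd, buf, (size_t)n, 0);
}

/* Acceptor role: a connection event on the listening socket. */
void accept_handler(int epfd, int listen_fd) {
    int conn = accept(listen_fd, NULL, NULL);
    if (conn < 0) return;
    if (reactor_add(epfd, conn, echo_handler) < 0) close(conn);
}

/* Reactor role: one pass of the main loop. Captures ready events and
 * dispatches each to its handler. Returns what epoll_wait returned. */
int reactor_run_once(int epfd, int timeout_ms) {
    struct epoll_event events[16];
    int n = epoll_wait(epfd, events, 16, timeout_ms);
    for (int i = 0; i < n; i++) {
        int fd = events[i].data.fd;
        if (fd >= 0 && fd < MAX_FDS && handlers[fd]) handlers[fd](epfd, fd);
    }
    return n;
}

/* Hypothetical helper: listening socket on 127.0.0.1 with a kernel-chosen port. */
int listen_on_loopback(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 || listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Initialization would register the listening socket with reactor_add(epfd, lfd, accept_handler), and the main loop would then be while (1) reactor_run_once(epfd, -1);. A single thread serves every client this way, because it only touches sockets that epoll reports as ready.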