An example tutorial on building a non-blocking download program using Python's Twisted framework

The first Twisted-based poetry client
Although Twisted is mostly used to write servers, we will start with a simple client so that things stay as simple as possible at the beginning.
Let's try out the Twisted client. The source code is in twisted-client-1/. First, start the three servers as before:

python blocking-server/ --port 10000 poetry/ecstasy.txt --num-bytes 30
python blocking-server/ --port 10001 poetry/fascination.txt
python blocking-server/ --port 10002 poetry/science.txt
and run the client:

python twisted-client-1/ 10000 10001 10002
You will see the client print something like this on the command line:

Task 1: got 60 bytes of poetry from
Task 2: got 10 bytes of poetry from
Task 3: got 10 bytes of poetry from
Task 1: got 30 bytes of poetry from
Task 3: got 10 bytes of poetry from
Task 2: got 10 bytes of poetry from
Task 1: 3003 bytes of poetry
Task 2: 623 bytes of poetry
Task 3: 653 bytes of poetry
Got 3 poems in 0:00:10.134220
This is close to what our non-blocking client without Twisted prints, which is not surprising since the two work the same way.
Next, let’s take a closer look at its source code.
Note: While we are learning Twisted, we will use some of its low-level APIs. We do this to strip away Twisted's abstraction layers so that we can learn Twisted from the inside out. It also means that the APIs we use while learning may rarely appear in real applications. Just remember: the code in this part is an exercise, not an example of how to write real software.
As you can see, a set of PoetrySocket instances is created first. When a PoetrySocket is initialized, it creates a network socket as one of its own attributes, connects it to the server, and switches it to non-blocking mode:

self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
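The excerpt above shows only the socket being created. As a minimal sketch of the whole sequence the text describes (create, connect, switch to non-blocking), the following uses only the standard library. The throwaway local server and the names `server`, `conn` and `would_block` are assumptions made for illustration and are not part of the tutorial's client.

```python
import select
import socket

# Throwaway local server so the sketch has something to connect to
# (a hypothetical stand-in for one of the poetry servers).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
server.listen(1)
address = server.getsockname()

# The sequence described in the text: create, connect, go non-blocking.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(address)
sock.setblocking(False)

conn, _ = server.accept()

# Nothing has been sent yet, so a non-blocking recv raises instead of waiting.
try:
    sock.recv(1024)
    would_block = False
except BlockingIOError:
    would_block = True

# Once the server sends something, wait until the socket is readable, then read.
conn.sendall(b"poem")
select.select([sock], [], [], 5)
data = sock.recv(1024)

conn.close()
sock.close()
server.close()
```

The `BlockingIOError` is the whole point of non-blocking mode: the call returns control immediately instead of stalling the program, which is what lets one loop service several sockets.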
Eventually we will reach a level of abstraction where we no longer use sockets directly, but for now we still need one. After creating the socket, PoetrySocket passes itself to the reactor through the addReader method:

# tell the Twisted reactor to monitor this socket for reading
from twisted.internet import reactor
reactor.addReader(self)
This method hands Twisted an object whose file descriptor should be monitored for incoming data. But why do we pass Twisted an object instance rather than a file descriptor or a callback function? And since Twisted contains no code specific to this poetry service, how does it know how to interact with our object? Trust me, I've looked into it. Open the twisted.internet.interfaces module and follow along as we figure out what's going on.

Twisted interfaces
Twisted contains a number of sub-modules called interfaces. Each one defines a set of interface classes. Since version 8.0, Twisted has used zope.interface as the basis for these classes, but we won't go into the details here. We are only interested in the Twisted subclasses themselves, the ones you see in this module.
One of the core purposes of interfaces is documentation. As a Python programmer you are no doubt familiar with duck typing. (The Python philosophy: "If it looks like a duck and sounds like a duck, treat it as a duck." Python objects therefore strive for simple, uniform interfaces, much like the interface-oriented programming found in other languages.) Reading through twisted.internet.interfaces, we find the definition of the addReader method in IReactorFDSet:

def addReader(reader):
  """
  I add reader to the set of file descriptors to get read events for.

  @param reader: An L{IReadDescriptor} provider that will be checked for
          read events until it is removed from the reactor with
          L{removeReader}.

  @return: C{None}.
  """
IReactorFDSet is one of the interfaces that Twisted reactors implement. Any reactor that implements it therefore has an addReader method that works as described above. The method declaration has no self parameter because it only describes the public interface; self is part of the implementation (and callers never pass it explicitly anyway). Interface classes are never instantiated or used as base classes for real implementations.

Technically speaking, IReactorFDSet is only implemented by reactors that support waiting on file descriptors. As far as I know, that currently includes every available reactor implementation.
Interfaces are not just for documentation. zope.interface lets you explicitly declare that a class implements one or more interfaces, and it provides a mechanism for checking those declarations at runtime. It also supports adaptation, which can dynamically provide an interface for an object that does not implement it directly. But we won't study that in depth here.
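To make the idea of a runtime check concrete, here is a rough analogy that uses only the standard library's `typing.Protocol`. It is not zope.interface and does not reproduce its API; the names `ReadDescriptorLike`, `GoodReader` and `NotAReader` are hypothetical and exist only for this sketch.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ReadDescriptorLike(Protocol):
    """Hypothetical description of what a reader object must offer."""

    def fileno(self) -> int: ...

    def doRead(self) -> None: ...


class GoodReader:
    # Provides both methods, so it "looks like" a reader.
    def fileno(self):
        return 0

    def doRead(self):
        pass


class NotAReader:
    # Missing doRead, so the runtime check rejects it.
    def fileno(self):
        return 0


good_ok = isinstance(GoodReader(), ReadDescriptorLike)
bad_ok = isinstance(NotAReader(), ReadDescriptorLike)
```

Neither class inherits from `ReadDescriptorLike`. The check looks only at which methods are present, which is duck typing with the expectations written down.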
You may have noticed the similarity between interfaces and the abstract base classes recently added to Python. We will not analyze the similarities and differences here. If you are interested, you can read an article on the topic written by Glyph, the founder of the Twisted project.
According to the documentation, the reader argument of addReader must implement the IReadDescriptor interface. This means our PoetrySocket must do so as well.
Reading further in the interfaces module, we find the following code:

class IReadDescriptor(IFileDescriptor):
  def doRead():
    """
    Some data is available for reading on your descriptor.
    """

At the same time, you will see that our PoetrySocket class has a doRead method. When Twisted's reactor calls it, it reads data from the socket without blocking. So doRead is really a callback, but instead of passing it to the reactor directly, we pass an object that implements it. This is a convention throughout Twisted: rather than passing a function, you pass an object that implements a given interface. That way a single argument can carry a set of related callbacks, and the callbacks can communicate with each other through data stored on the object.
What about the other callbacks implemented in PoetrySocket? Note that IReadDescriptor is a subclass of IFileDescriptor, which means that anything implementing IReadDescriptor must implement IFileDescriptor as well. Read a little further in the module and you will see the following:

class IFileDescriptor(ILoggingContext):
  """
  A file descriptor.
  """
  def fileno():
    ...
  def connectionLost(reason):
    ...
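To see why passing an object works, here is a toy loop built on the standard library's `select`. It is emphatically not Twisted's reactor, only a sketch of the same idea: the loop knows nothing about poetry and simply calls `fileno`, `doRead` and `connectionLost` on whatever it was given. `ToyReactor` and `Collector` are hypothetical names, and a `socketpair` stands in for a real server connection.

```python
import select
import socket


class ToyReactor:
    """Illustrative stand-in for a reactor; not Twisted code."""

    def __init__(self):
        self.readers = set()

    def addReader(self, reader):
        self.readers.add(reader)

    def removeReader(self, reader):
        self.readers.discard(reader)

    def run(self):
        while self.readers:
            # select accepts any object that has a fileno() method
            ready, _, _ = select.select(list(self.readers), [], [], 5)
            if not ready:
                break  # safety timeout so the sketch cannot hang
            for reader in ready:
                reader.doRead()


class Collector:
    """Hypothetical reader object, playing the role of PoetrySocket."""

    def __init__(self, sock, reactor):
        self.sock = sock
        self.reactor = reactor
        self.received = b""
        self.closed = False
        sock.setblocking(False)
        reactor.addReader(self)   # pass the object, not a function

    def fileno(self):
        return self.sock.fileno()

    def doRead(self):
        try:
            data = self.sock.recv(1024)
        except BlockingIOError:
            return
        if data:
            self.received += data
        else:
            # empty read: the peer closed the connection
            self.reactor.removeReader(self)
            self.connectionLost("connection closed")

    def connectionLost(self, reason):
        self.closed = True
        self.sock.close()


left, right = socket.socketpair()
reactor = ToyReactor()
collector = Collector(left, reactor)
right.sendall(b"some poetry")
right.close()
reactor.run()
```

Because the callbacks live on one object, `doRead` and `connectionLost` share state (`received`, `closed`) without any globals, which is the benefit the text attributes to passing objects instead of bare functions.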
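Our Twisted-based asynchronous client is very similar to the earlier asynchronous client that did not use Twisted. Both connect their own sockets and read data from them asynchronously. The biggest difference is that the Twisted client does not run its own select loop; it uses Twisted's reactor instead. The doRead callback is the most important one: Twisted calls it to tell us that data has arrived on the socket and is ready to be read. Figure 7 illustrates this process: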

[Figure 7: the doRead callback]


python twisted-client-1/ 10000 10001 10002
Task 1: got 3003 bytes of poetry from
Task 3: got 653 bytes of poetry from
Task 2: got 623 bytes of poetry from
Task 1: 3003 bytes of poetry
Task 2: 623 bytes of poetry
Task 3: 653 bytes of poetry
Got 3 poems in 0:00:10.132753
Building the client abstractly
First of all, this client still contains boring boilerplate such as creating network sockets and receiving data from them. Twisted is supposed to take care of these routine chores for us, so that we don't have to implement them again every time we write a new program. This is particularly useful because it frees us from some of the tricky exception handling involved in asynchronous I/O (see the previous client), which gets even trickier if the code has to be cross-platform. If you have a free afternoon, look through the win32-related parts of the Twisted source code and see how many small details have to be handled to stay cross-platform.
Another problem is error handling. When version 1 of the Twisted client tries to download poetry from a port where no server is listening, it crashes. Of course we could fix this ourselves, but it is easier to handle this kind of error through the Twisted APIs we will introduce below.
Finally, that client cannot be reused. What if another module needs to download poetry through our client? How would it know that the poem has finished downloading? We cannot write a method that simply downloads a poem and returns it, because the caller would be stuck waiting until the whole poem had arrived. This is a real problem, but we are not going to solve it in this section; it will be addressed in a later one.
We will use some higher-level APIs and interfaces to solve the first and second problems. The Twisted framework is loosely composed of many layers of abstraction, so learning Twisted also means learning what each layer provides: which APIs, interfaces and implementations are available at each level. Next, we will look at the most important parts of Twisted to get a better feel for how it is organized. Once you are familiar with the overall structure, learning new parts will be much easier.
Generally speaking, each Twisted abstraction is concerned with one specific concept. For example, the client in Part 4 uses IReadDescriptor, which is an abstraction of "a file descriptor you can read bytes from". An abstraction usually specifies, by defining an interface, how objects that want to implement it should behave. When learning new Twisted abstractions, the most important thing to remember is:
Most higher-level abstractions are built on top of lower-level ones; very few start from scratch.
So when you learn a new Twisted abstraction, always keep in mind what it does and what it does not do. In particular, if an earlier abstraction A implements feature F, then F is unlikely to be implemented again by any other abstraction. And if another abstraction B needs feature F, it will use A rather than implement F itself. (Usually B either subclasses A or holds a reference to an instance of A.)
The network is very complex, so Twisted contains many abstract concepts. By starting with a low-level abstraction, we hope to see more clearly how the various parts of a Twisted program are organized.
The core loop

The first abstraction we need to learn, and the most important one in Twisted, is the reactor. At the center of every program built with Twisted, no matter how many layers it has, there is always a reactor loop driving the program forward. Nothing is more fundamental than the reactor. In fact, the rest of Twisted can be understood this way: it all exists to make it easier to do some task X using the reactor. It is possible to stick with the low-level APIs as the previous client did, but then we have to implement a great deal ourselves. Working at a higher level means we can write a lot less code.
But when we think about and deal with problems at those higher levels, it is easy to forget that the reactor exists. In a Twisted program of any ordinary size, there is very little direct interaction with the reactor APIs. The same goes for the low-level abstractions: we rarely interact with them directly. The file descriptor abstraction we used in the previous client is so thoroughly subsumed by higher-level abstractions that we rarely encounter it in real Twisted programs. (It is still used internally; we just don't see it.)
As far as the file descriptor abstraction is concerned, its fading from view is not a problem. Letting Twisted take the helm of asynchronous I/O allows us to focus on the problem we are actually trying to solve. The reactor is different: it never goes away. Choosing Twisted means choosing the Reactor pattern, and that means programming in a "reactive" style using callbacks and cooperative multitasking.
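As a rough illustration of what "callbacks and cooperative multitasking" means, the sketch below runs two tasks on a single thread by having each callback do a small piece of work and then schedule its own continuation. It involves no I/O and is not how Twisted's reactor is implemented; `CallbackLoop`, `call_soon` and `count_down` are hypothetical names chosen for this example.

```python
from collections import deque


class CallbackLoop:
    """Toy single-threaded loop that runs queued callbacks in order."""

    def __init__(self):
        self.pending = deque()

    def call_soon(self, callback, *args):
        self.pending.append((callback, args))

    def run(self):
        while self.pending:
            callback, args = self.pending.popleft()
            callback(*args)


log = []


def count_down(loop, name, remaining):
    # Do one small step, then give control back to the loop.
    log.append((name, remaining))
    if remaining > 1:
        loop.call_soon(count_down, loop, name, remaining - 1)


loop = CallbackLoop()
loop.call_soon(count_down, loop, "A", 2)
loop.call_soon(count_down, loop, "B", 2)
loop.run()
# log is now [("A", 2), ("B", 2), ("A", 1), ("B", 1)]
```

The two tasks interleave only because each callback returns quickly. A callback that ran for a long time would stall every other task, which is why reactive code must never block.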



Protocol Factories
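So each connection needs its own Protocol, and that Protocol is an instance of a class we define ourselves. Since we hand the job of creating connections over to Twisted, Twisted needs a way to create a suitable protocol for each new connection. Creating protocols is the job of Protocol Factories.
As you may have guessed, the Protocol Factory API is defined by IProtocolFactory, which also lives in the interfaces module. A Protocol Factory is a concrete example of the Factory design pattern. The buildProtocol method returns a new Protocol instance each time it is called; it is the method Twisted uses to create a new Protocol for each new connection.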


factory = PoetryClientFactory(len(addresses))
from twisted.internet import reactor
for address in addresses:
  host, port = address
  reactor.connectTCP(host, port, factory)
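The function to focus on here is connectTCP. The meaning of the first two arguments is obvious. The third is an instance of our own PoetryClientFactory class. This is a Protocol Factory written specifically for the poetry download client, and passing it to the reactor lets Twisted create PoetryProtocol instances for us.
Note that we do not implement either the Factory or the Protocol from scratch, unlike the previous client, where we wrote and instantiated our own PoetrySocket class. We simply subclass the base classes that Twisted provides in twisted.internet.protocol. The base class for factories is twisted.internet.protocol.Factory, but we inherit from the ClientFactory subclass, which is specialized for clients (that is, for actively making connections rather than listening for them as a server does).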

class PoetryClientFactory(ClientFactory):
  task_num = 1
  protocol = PoetryProtocol # tell base class what proto to build

  def buildProtocol(self, address):
    # let the base class build the protocol, then tag it with a task number
    proto = ClientFactory.buildProtocol(self, address)
    proto.task_num = self.task_num
    self.task_num += 1
    return proto
Note that although a Protocol has an attribute pointing to the Factory that created it, and a Factory has a variable pointing to a Protocol class, in general a single Factory can create many Protocols.
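The relationship between one factory and its many protocols can be sketched without Twisted at all. The classes below are hypothetical stand-ins (`MiniFactory`, `MiniProtocol`), written on the assumption that the base buildProtocol instantiates the class named by the `protocol` attribute and points the new instance's `factory` attribute back at the factory.

```python
class MiniProtocol:
    """Stand-in for a Protocol: one instance per connection."""

    factory = None
    task_num = None


class MiniFactory:
    """Stand-in for a Factory: builds a fresh protocol per connection."""

    protocol = MiniProtocol   # which class to instantiate
    task_num = 1

    def buildProtocol(self, address):
        proto = self.protocol()
        proto.factory = self              # back-reference to the factory
        proto.task_num = self.task_num    # tag each protocol, as in the client
        self.task_num += 1
        return proto


factory = MiniFactory()
first = factory.buildProtocol(("127.0.0.1", 10000))
second = factory.buildProtocol(("127.0.0.1", 10001))
```

Each call yields a distinct protocol object, yet both point back to the same factory, so the factory is the natural place for state shared across connections.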



def dataReceived(self, data):
  self.poem += data
  msg = 'Task %d: got %d bytes of poetry from %s'
  print msg % (self.task_num, len(data), self.transport.getHost())
def dataReceived(self, data):
python twisted-client-2/ 10000

File "twisted-client-2/", line 125, in
... # I removed a bunch of lines here
File ".../twisted/internet/", line 463, in doRead # Note the doRead callback
  return self.protocol.dataReceived(data)
File "twisted-client-2/", line 58, in dataReceived
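See that? There is the doRead callback we used in the 1.0 client. As mentioned earlier, when Twisted builds a new abstraction it reuses existing implementations rather than starting from scratch. So there must be an IReadDescriptor implementation hard at work here, one implemented by Twisted's code rather than our own. If you have doubts, take a look at the implementation in twisted.internet.tcp. Browsing the code, you will find that the same class also implements IWriteDescriptor and ITransport. So the IReadDescriptor is really the Transport in disguise. Figure 10 illustrates the dataReceived callback process: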

[Figure 10: the dataReceived callback]
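The layering shown in the traceback, where a transport's doRead hands bytes to a protocol's dataReceived, can be sketched with the standard library alone. This is a simplified illustration, not the code in twisted.internet.tcp; `MiniTransport` and `PoemCollector` are hypothetical names, and a `socketpair` stands in for a server connection.

```python
import select
import socket


class PoemCollector:
    """Protocol-like object: knows about poems, not about sockets."""

    transport = None

    def __init__(self):
        self.poem = b""
        self.done = False

    def dataReceived(self, data):
        self.poem += data

    def connectionLost(self, reason):
        self.done = True


class MiniTransport:
    """Transport-like object: owns the socket and plays the reader role."""

    def __init__(self, sock, protocol):
        self.sock = sock
        self.protocol = protocol
        protocol.transport = self

    def fileno(self):
        return self.sock.fileno()

    def doRead(self):
        data = self.sock.recv(1024)
        if data:
            # the low-level read event becomes a high-level callback
            self.protocol.dataReceived(data)
        else:
            self.sock.close()
            self.protocol.connectionLost("connection closed")


left, right = socket.socketpair()
protocol = PoemCollector()
transport = MiniTransport(left, protocol)
right.sendall(b"a short poem")
right.close()

while not protocol.done:
    ready, _, _ = select.select([transport], [], [], 5)
    if not ready:
        break  # safety timeout so the sketch cannot hang
    transport.doRead()
```

The protocol never touches the socket. All it sees are `dataReceived` and `connectionLost` calls, which is what lets the same protocol code run over different kinds of transport.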


def connectionLost(self, reason):
  ...

def poemReceived(self, poem):
  self.factory.poem_finished(self.task_num, poem)

...
self.poetry_count -= 1
if self.poetry_count == 0:
  ...

def clientConnectionFailed(self, connector, reason):
  print 'Failed to connect to:', connector.getDestination()
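The fragments above suggest a simple bookkeeping scheme: count down as each poem finishes and act when the count reaches zero. The sketch below is one way this could fit together, not the client's actual code. `PoemTracker`, `connection_failed` and the `all_done` flag are hypothetical, and two things are assumptions: that a failed connection counts as a finished task, and that the real client would report and stop the reactor where this sketch sets a flag.

```python
class PoemTracker:
    """Hypothetical stand-in for the factory's completion bookkeeping."""

    def __init__(self, poetry_count):
        self.poetry_count = poetry_count   # poems still outstanding
        self.poems = {}                    # task number -> poem text
        self.failures = []
        self.all_done = False

    def poem_finished(self, task_num=None, poem=None):
        if task_num is not None:
            self.poems[task_num] = poem
        self.poetry_count -= 1
        if self.poetry_count == 0:
            # assumption: the real client would report and stop the reactor
            self.all_done = True

    def connection_failed(self, destination):
        # assumption: a failed connection still counts as a finished task,
        # otherwise the count would never reach zero
        self.failures.append(destination)
        self.poem_finished()


tracker = PoemTracker(3)
tracker.poem_finished(1, "first poem")
tracker.connection_failed("127.0.0.1:10001")
done_after_two = tracker.all_done
tracker.poem_finished(3, "third poem")
```

Counting failures as completions keeps the program from waiting forever on a server that was never there, which is the crash scenario described for version 1 of the client.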
