How to use the efficient Python universal object pooling library

WBOY
2023-05-11 16:16:06

The object pool pattern is mainly suited to the following scenarios:
  • Resource-constrained scenarios. For example, in an environment that does not need to scale (physical resources such as CPU and memory are fixed), where the CPU is weak and memory is tight, garbage collection and memory churn have a relatively large impact, so memory management needs to be more efficient and responsiveness matters more than throughput.

  • The number of objects in memory must be limited.

  • Objects that are expensive to create.

  • Pooling a large number of short-lived objects with low initialization costs, to reduce memory allocation and reallocation costs and avoid memory fragmentation.

  • In a dynamic language like Python, the GC relies on reference counting to ensure that objects are not reclaimed prematurely. In some scenarios an object is created and then sits idle with nothing using it, so it gets reclaimed. Such objects can be handed to the object pool for safekeeping.

Pond Introduction

Pond is an efficient, general-purpose object pool for Python, with good performance, a small memory footprint, and a high hit rate. It recycles automatically based on approximate usage-frequency statistics, adjusting the number of idle objects in each object pool.

Pond was written because Python currently lacks an object pooling library with complete test cases, complete code comments, and complete documentation, and because the mainstream object pooling libraries have no reasonably intelligent automatic recycling mechanism.

Pond may be the first community-published object pooling library in Python with complete test cases, test coverage above 90%, complete code comments, and complete documentation.

Pond is inspired by Apache Commons Pool, Netty Recycler, HikariCP, and Caffeine, and combines strengths of each.

Pond uses approximate counting to track how often each object pool is used in a very small amount of memory, and recycles automatically based on those counts.

When traffic is relatively random and evenly distributed, the default policy and weight reduce memory usage by 48.85% with a borrow hit rate of 100%.

When traffic roughly follows the 80/20 rule, the default policy and weight reduce memory usage by 45.7% with a borrow hit rate of 100%.

Design Overview

Pond consists mainly of three parts, FactoryDict, Counter, and PooledObjectTree, plus a separate recycling thread.

FactoryDict

Using Pond requires implementing an object factory, PooledObjectFactory. It provides object creation, initialization, destruction, validation, and similar operations, which Pond calls.

So that a single object pool can store completely different kinds of objects, Pond uses a dictionary that maps the name of each factory class to the instance of that factory class.

Each PooledObjectFactory should provide four functions: creating an object, destroying an object, validating whether an object is still usable, and resetting an object.
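
The name-to-factory dictionary can be sketched in a few lines. This is a minimal illustration of the idea, not Pond's source: SketchFactory, ListFactory, and FactoryRegistry are hypothetical names, and only the "keyed by factory class name" behavior comes from the description above.

```python
from abc import ABC, abstractmethod


class SketchFactory(ABC):
    """Hypothetical factory interface with the four operations (not Pond's class)."""

    @abstractmethod
    def create(self): ...

    @abstractmethod
    def destroy(self, obj): ...

    @abstractmethod
    def validate(self, obj) -> bool: ...

    @abstractmethod
    def reset(self, obj): ...


class ListFactory(SketchFactory):
    """Toy factory that pools plain lists."""

    def create(self):
        return []

    def destroy(self, obj):
        obj.clear()

    def validate(self, obj) -> bool:
        return isinstance(obj, list)

    def reset(self, obj):
        obj.clear()  # wipe leftover state from the previous borrower
        return obj


class FactoryRegistry:
    """Maps a name (the factory's class name by default) to a factory instance."""

    def __init__(self):
        self._factories = {}

    def register(self, factory, name=None):
        key = name or type(factory).__name__
        self._factories[key] = factory
        return key

    def get(self, name):
        return self._factories[name]


registry = FactoryRegistry()
key = registry.register(ListFactory())
print(key)  # the class name is the default key
```

Because the registry only stores factories, one pool manager can serve unrelated object types without knowing anything about them.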

Notably, Pond supports automatically resetting objects. In some scenarios an object is assigned values, passed along, and then returned to the pool after use. To avoid contaminating the next borrower, implementing the reset function is recommended in such scenarios.

Counter

Counter holds an approximate counter that tracks how often each object pool is used.
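
The article does not say which approximate counting technique Pond uses. As an illustration only, the sketch below shows one well-known technique, a Count-Min Sketch, which keeps frequency estimates in a fixed-size table regardless of how many keys are counted. The class name, table sizes, and hashing scheme are all assumptions made for this example.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter: may overestimate, never underestimates."""

    def __init__(self, width=64, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key, row):
        # One hash per row, derived by salting the key with the row number.
        digest = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.rows[row][self._index(key, row)] += count

    def estimate(self, key):
        # Collisions only inflate cells, so the minimum is the tightest estimate.
        return min(self.rows[row][self._index(key, row)] for row in range(self.depth))


sketch = CountMinSketch()
for name in ["DogFactory"] * 5 + ["CatFactory"] * 2:
    sketch.add(name)

print(sketch.estimate("DogFactory"))  # at least 5
```

Memory use is width × depth integers however many pool names are counted, which is the property that makes approximate counting attractive here.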

PooledObjectTree

PooledObjectTree is a dictionary. Each key maps to a thread-safe first-in, first-out queue.

Each queue holds multiple PooledObject instances. A PooledObject stores its creation time, the time it was last borrowed, and the actual object being pooled.
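
A rough picture of that structure, using only the standard library. The Wrapped class and its field names are hypothetical stand-ins for PooledObject, chosen for this sketch.

```python
import queue
import time
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Wrapped:
    """Hypothetical wrapper: the real object plus bookkeeping timestamps."""
    kept: Any
    created_at: float = field(default_factory=time.time)
    last_borrowed_at: float = 0.0


# One thread-safe FIFO queue per factory name.
tree: Dict[str, queue.Queue] = {}
tree["DogFactory"] = queue.Queue(maxsize=2)

tree["DogFactory"].put(Wrapped("dog-1"))
tree["DogFactory"].put(Wrapped("dog-2"))

# FIFO order: the object that has been idle longest comes out first.
first = tree["DogFactory"].get()
first.last_borrowed_at = time.time()
print(first.kept)
```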

Thread safety

Pond's borrowing and recycling are both thread-safe. Python's queue module provides a first-in, first-out (FIFO) data structure suited to multi-threaded programming, which can be used to pass messages or other data safely between producer and consumer threads.

Locking is handled for the caller, so multiple threads can work on the same Queue instance safely and easily. Because Pond's borrowing and recycling both operate on queues, they can essentially be considered thread-safe.
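
The following standalone sketch shows the queue behavior this relies on: several threads borrow from and return to one queue.Queue without any explicit locking around the queue itself. It uses only the standard library and is not Pond code. The separate lock protects the demo's own counter, which the queue does not cover.

```python
import queue
import threading

pool = queue.Queue()
for i in range(4):
    pool.put(f"object-{i}")

stats = {"borrows": 0}
stats_lock = threading.Lock()


def worker(rounds):
    for _ in range(rounds):
        obj = pool.get()          # blocks until an object is free
        try:
            with stats_lock:      # our own counter still needs a lock
                stats["borrows"] += 1
        finally:
            pool.put(obj)         # always hand the object back


threads = [threading.Thread(target=worker, args=(500,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(stats["borrows"], pool.qsize())
```

Eight threads share four objects, and every object is back in the queue once all threads finish.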

Lending mechanism

When you borrow an object, Pond first checks whether the requested type of object already exists in PooledObjectTree. If it does, Pond checks whether that type's object pool is empty and creates a new object if it is.

If the object pool has idle objects, Pond pops one from the queue and validates it. If the object is unusable, Pond calls the corresponding factory to clean up and destroy it, and drops its references in Python so that the GC can reclaim it sooner, then keeps taking the next object until it finds a usable one.

If the object is usable, it is returned directly. Whether an object is taken from the pool or newly created, Counter increments the count for that pool.
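
The borrowing flow described above can be sketched as a small function. This is an illustration under stated assumptions rather than Pond's implementation: the borrow function, its parameters, and the Conn class are hypothetical, and the sketch assumes the pool for the requested name has already been registered.

```python
import queue
from collections import Counter


class Conn:
    """Toy pooled resource used only for this illustration."""

    def __init__(self):
        self.alive = True


def borrow(pools, counts, name, create, validate, destroy):
    """Return a valid object for `name`, creating one if the pool runs dry."""
    counts[name] += 1                  # every borrow is counted
    pool = pools[name]                 # assumes `name` was registered earlier
    while True:
        try:
            obj = pool.get_nowait()    # pop the oldest idle object
        except queue.Empty:
            return create()            # pool empty: build a new one
        if validate(obj):
            return obj                 # pooled object is still usable
        destroy(obj)                   # stale: drop it and try the next one


pools = {"conn": queue.Queue()}
stale, fresh = Conn(), Conn()
stale.alive = False
pools["conn"].put(stale)
pools["conn"].put(fresh)

counts = Counter()
destroyed = []


def is_alive(conn):
    return conn.alive


got = borrow(pools, counts, "conn", Conn, is_alive, destroyed.append)
second = borrow(pools, counts, "conn", Conn, is_alive, destroyed.append)
```

The first call skips and destroys the stale object and returns the fresh one. The second call finds the pool empty and creates a new object.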

Recycling mechanism

When an object is returned, Pond checks whether the target object pool exists. If it does, Pond checks whether the pool is full; if it is full, the returned object is destroyed automatically.

Pond then checks whether the object has been on loan for too long. If it has exceeded the configured maximum time, it is also destroyed.
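
A minimal sketch of the return path, following the same order of checks: full pool first, then borrow time. The recycle function and its parameters are hypothetical, and resetting the object at return time is an assumption of this sketch, since the article does not say when Pond applies the reset. The full-then-put sequence is also only safe single-threaded, which is enough for an illustration.

```python
import queue
import time


def recycle(pool, obj, borrowed_at, borrowed_timeout, destroy, reset, now=None):
    """Return True if `obj` went back into `pool`, False if it was destroyed."""
    now = time.time() if now is None else now
    if pool.full():
        destroy(obj)                   # no room left in the pool
        return False
    if now - borrowed_at > borrowed_timeout:
        destroy(obj)                   # held longer than the allowed borrow time
        return False
    pool.put_nowait(reset(obj))        # assumed: reset before the next borrower
    return True


def identity(obj):
    return obj                         # no-op reset for the demo


pool = queue.Queue(maxsize=2)
destroyed = []

results = [
    recycle(pool, "a", 100.0, 2, destroyed.append, identity, now=101.0),
    recycle(pool, "late", 100.0, 2, destroyed.append, identity, now=105.0),
    recycle(pool, "b", 100.0, 2, destroyed.append, identity, now=101.0),
    recycle(pool, "c", 100.0, 2, destroyed.append, identity, now=101.0),
]
print(results, destroyed)
```

Here "late" is destroyed because it was held for 5 seconds against a 2 second limit, and "c" is destroyed because the pool already holds two objects.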

Automatic recycling

Automatic recycling runs at a fixed interval, 300 s by default, and cleans up objects in pools that are not used frequently.
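
One common way to run such a periodic task is a background thread that waits on a threading.Event between runs. The sketch below shows that pattern with the standard library. PeriodicEvictor is a hypothetical name and this is not how Pond is necessarily implemented. The interval is shortened to 0.01 s so that the demo finishes quickly.

```python
import threading
import time


class PeriodicEvictor:
    """Runs `evict` every `interval` seconds on a background thread."""

    def __init__(self, evict, interval=300.0, daemon=True):
        self._evict = evict
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=daemon)

    def _loop(self):
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval):
            self._evict()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def is_running(self):
        return self._thread.is_alive()


runs = []
evictor = PeriodicEvictor(lambda: runs.append(time.time()), interval=0.01)
evictor.start()
time.sleep(0.2)   # let a few eviction rounds happen
evictor.stop()
print(len(runs) >= 1)
```

Waiting on an Event instead of calling time.sleep lets stop() interrupt the wait immediately, so shutdown does not have to wait out a full interval.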

Instructions

First install the Pond library and import it in your project.

pip install pondpond
from pond import Pond, PooledObjectFactory, PooledObject

First, declare a factory class for the type of object you want to pool. In the following example the pooled object is Dog, so we declare a PooledDogFactory class that implements PooledObjectFactory.

class Dog:
    name: str
    validate_result: bool = True


class PooledDogFactory(PooledObjectFactory):
    # Create a new Dog and wrap it in a PooledObject.
    def creatInstantce(self) -> PooledObject:
        dog = Dog()
        dog.name = "puppy"
        return PooledObject(dog)

    # Destroy an object that is no longer needed.
    def destroy(self, pooled_object: PooledObject):
        del pooled_object

    # Restore the object to its initial state.
    def reset(self, pooled_object: PooledObject) -> PooledObject:
        pooled_object.keeped_object.name = "puppy"
        return pooled_object

    # Report whether the object is still usable.
    def validate(self, pooled_object: PooledObject) -> bool:
        return pooled_object.keeped_object.validate_result

Then you need to create the Pond object:

pond = Pond(borrowed_timeout=2,
            time_between_eviction_runs=-1,
            thread_daemon=True,
            eviction_weight=0.8)

Pond accepts several parameters:

borrowed_timeout: in seconds, the maximum time an object may be borrowed. Objects held longer than this are destroyed automatically when returned instead of being put back into the object pool.

time_between_eviction_runs: in seconds, the interval between automatic recycling runs.

thread_daemon: whether the recycling thread is a daemon thread. If True, the automatic recycling thread is shut down when the main thread exits.

eviction_weight: the weight used during automatic recycling. It is multiplied by the maximum usage frequency, and objects in pools whose usage frequency is below the result enter the cleanup step.
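
To make the eviction_weight rule concrete, here is the arithmetic as a small function. The function name and the sample frequencies are made up for this illustration. Only the rule itself (threshold = maximum frequency × weight, pools below the threshold are cleaned) comes from the description above.

```python
def pools_to_evict(frequencies, eviction_weight=0.8):
    """Names whose usage frequency is below max_frequency * eviction_weight."""
    if not frequencies:
        return []
    threshold = max(frequencies.values()) * eviction_weight
    return sorted(name for name, freq in frequencies.items() if freq < threshold)


# Hypothetical usage counts per pool.
frequencies = {"DogFactory": 100, "CatFactory": 85, "FishFactory": 10}

cold = pools_to_evict(frequencies, eviction_weight=0.8)
colder = pools_to_evict(frequencies, eviction_weight=0.9)
print(cold, colder)
```

With a weight of 0.8 the threshold is 80, so only FishFactory falls below it. Raising the weight to 0.9 moves the threshold to 90 and also selects CatFactory, so a higher weight means more aggressive cleanup.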

Instantiate the factory class:

factory = PooledDogFactory(pooled_maxsize=10, least_one=False)

Every class that inherits from PooledObjectFactory gets a constructor that accepts the pooled_maxsize and least_one parameters.

pooled_maxsize: the maximum number of objects that the object pool generated by this factory class can hold.

least_one: if True, automatic cleanup leaves at least one object in the pool of objects generated by this factory class.

Register this factory object with Pond. By default, the class name of the factory will be used as the key of PooledObjectTree:

pond.register(factory)

You can also give it a custom name, which is then used as the key in PooledObjectTree:

pond.register(factory, name="PuppyFactory")

After successful registration, Pond automatically starts creating objects, up to the pooled_maxsize set on the factory, until the object pool is full.

Borrowing and returning objects:

pooled_object: PooledObject = pond.borrow(factory)
dog: Dog = pooled_object.use()
pond.recycle(pooled_object, factory)
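
If an exception is raised between borrow and recycle, the object is never returned. One way to guard against that is a small context manager around the three calls shown above. The helper name borrowed is hypothetical and is not part of Pond. It relies only on the borrow, use, and recycle calls from this article.

```python
from contextlib import contextmanager


@contextmanager
def borrowed(pond, factory):
    """Borrow from `pond`, yield the real object, and always return it."""
    pooled_object = pond.borrow(factory)
    try:
        yield pooled_object.use()
    finally:
        # Runs even if the body of the `with` block raises.
        pond.recycle(pooled_object, factory)
```

With the pond and factory objects created earlier, `with borrowed(pond, factory) as dog:` yields the Dog and hands it back to the pool when the block exits.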

You can also borrow and return objects by name:

pooled_object: PooledObject = pond.borrow(name="PuppyFactory")
dog: Dog = pooled_object.use()
pond.recycle(pooled_object, name="PuppyFactory")

Completely clear an object pool:

pond.clear(factory)

Clear an object pool by name:

pond.clear(name="PuppyFactory")

Under normal circumstances these methods are all you need; object creation and recycling are fully automatic.

Statement:
This article is reproduced from yisu.com. If there is any infringement, please contact admin@php.cn for removal.