Introduction
Descriptors are one of the more obscure but important pieces of "black magic" in Python. They are used extensively inside the core of the language, and mastering them adds an extra trick to a Python programmer's toolbox. In this article I describe what descriptors are and walk through some common use cases. At the end I also cover __getattr__, __getattribute__, and __getitem__, three magic methods that are likewise involved in attribute or item access.
descr.__get__(self, obj, objtype=None) --> value
descr.__set__(self, obj, value) --> None
descr.__delete__(self, obj) --> None
Any class that defines at least one of the three methods above is a descriptor class. An instance of such a class acts as a descriptor when it is stored as a class attribute of another class.
In the following example we create a RevealAccess class that implements the __get__ method, which makes it a descriptor class.
    class RevealAccess(object):

        def __get__(self, obj, objtype):
            print('self in RevealAccess: {}'.format(self))
            print('self: {}\nobj: {}\nobjtype: {}'.format(self, obj, objtype))


    class MyClass(object):

        x = RevealAccess()

        def test(self):
            print('self in MyClass: {}'.format(self))
EX1 instance attribute
Next, let's look at the meaning of each parameter of the __get__ method. In the following example, self is x, the RevealAccess instance; obj is m, the MyClass instance; and objtype, as the name suggests, is the MyClass class itself. As the output shows, accessing the descriptor through m.x calls the __get__ method.

    >>> m = MyClass()
    >>> m.test()
    self in MyClass: <__main__.MyClass object at 0x7f19d4e42160>
    >>> m.x
    self in RevealAccess: <__main__.RevealAccess object at 0x7f19d4e420f0>
    self: <__main__.RevealAccess object at 0x7f19d4e420f0>
    obj: <__main__.MyClass object at 0x7f19d4e42160>
    objtype: <class '__main__.MyClass'>
EX2 class attribute
If you access attribute x directly through the class, obj is simply None. This makes sense, because no instance of MyClass is involved.

    >>> MyClass.x
    self in RevealAccess: <__main__.RevealAccess object at 0x7f53651070f0>
    self: <__main__.RevealAccess object at 0x7f53651070f0>
    obj: None
    objtype: <class '__main__.MyClass'>
In the examples above we looked at descriptors from the perspective of instance attributes and class attributes. Now let's analyze how this works internally:
When you access an attribute on an instance, the lookup goes through object.__getattribute__(), which translates obj.d into type(obj).__dict__['d'].__get__(obj, type(obj)).
When you access an attribute on a class, the lookup goes through type.__getattribute__(), which translates cls.d into cls.__dict__['d'].__get__(None, cls). Expressed in Python code, this is:
    def __getattribute__(self, key):
        "Emulate type_getattro() in Objects/typeobject.c"
        v = object.__getattribute__(self, key)
        if hasattr(v, '__get__'):
            return v.__get__(None, self)
        return v
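The two translations above can be checked by hand. The following is a minimal sketch, not part of the original example: Echo and Holder are hypothetical names, and the descriptor simply returns the arguments it receives so that the automatic lookup and the manual call can be compared.

```python
class Echo:
    """Hypothetical non-data descriptor used only for this check."""

    def __get__(self, obj, objtype=None):
        # Return the arguments so we can see what Python passed in.
        return (obj, objtype)


class Holder:
    d = Echo()


h = Holder()

# Instance access: obj.d -> type(obj).__dict__['d'].__get__(obj, type(obj))
via_instance = h.d
manual_instance = type(h).__dict__['d'].__get__(h, type(h))

# Class access: cls.d -> cls.__dict__['d'].__get__(None, cls)
via_class = Holder.d
manual_class = Holder.__dict__['d'].__get__(None, Holder)

print(via_instance == manual_instance)  # True
print(via_class == manual_class)        # True
```

Reading the descriptor out of `__dict__` matters here: it returns the raw descriptor object without triggering `__get__`, which is what lets us call it manually.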
A brief word on the __getattribute__ magic method: it is called unconditionally whenever we access an attribute of an object. I cover the details, such as how it differs from __getattr__ and __getitem__, at the end of the article, so we will not dig into it for now.
Descriptors come in two kinds:
If an object defines __set__() or __delete__(), usually together with __get__(), it is called a data descriptor.
If an object defines only the __get__() method, it is called a non-data descriptor.
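As a rough illustration of the two kinds, the sketch below classifies a few objects with the standard library function `inspect.isdatadescriptor`. The classes Data and NonData and the function plain_function are hypothetical names introduced only for this example.

```python
import inspect


class NonData:
    """Hypothetical descriptor: defines only __get__."""

    def __get__(self, obj, objtype=None):
        return 'computed by NonData'


class Data:
    """Hypothetical descriptor: defines __get__ and __set__."""

    def __get__(self, obj, objtype=None):
        return 'computed by Data'

    def __set__(self, obj, value):
        raise AttributeError('this attribute is read-only')


def plain_function():
    pass


print(inspect.isdatadescriptor(Data()))          # True
print(inspect.isdatadescriptor(NonData()))       # False
print(inspect.isdatadescriptor(property()))      # True
print(inspect.isdatadescriptor(plain_function))  # False
```

Plain functions have a `__get__` method (that is how methods get bound) but no `__set__`, so they count as non-data descriptors, while `property` objects are data descriptors.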
When we access an attribute, it can be resolved by one of four mechanisms:
data descriptor
instance dict
non-data descriptor
__getattr__()
Their priority, from highest to lowest, is:
data descriptor > instance dict > non-data descriptor > __getattr__()
What does this mean? Suppose the class of obj defines a data descriptor named d, and the instance dictionary of obj also contains an entry named d. When obj.d is accessed, the data descriptor has the higher priority, so Python calls type(obj).__dict__['d'].__get__(obj, type(obj)) instead of returning obj.__dict__['d']. If the descriptor is a non-data descriptor, however, Python returns obj.__dict__['d'].
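The priority chain can be observed directly. This is a minimal sketch under assumed names (DataDesc, NonDataDesc, and Demo are hypothetical); it writes into the instance dictionary by hand so that every level of the chain has a competing entry.

```python
class DataDesc:
    def __get__(self, obj, objtype=None):
        return 'data descriptor'

    def __set__(self, obj, value):
        # Stores nothing; defined only to make this a data descriptor.
        pass


class NonDataDesc:
    def __get__(self, obj, objtype=None):
        return 'non-data descriptor'


class Demo:
    a = DataDesc()
    b = NonDataDesc()
    c = NonDataDesc()

    def __getattr__(self, name):
        # Reached only when the normal lookup fails.
        return '__getattr__ fallback'


obj = Demo()
# Write straight into the instance dict, bypassing any __set__.
obj.__dict__['a'] = 'instance dict'
obj.__dict__['b'] = 'instance dict'

print(obj.a)        # data descriptor
print(obj.b)        # instance dict
print(obj.c)        # non-data descriptor
print(obj.missing)  # __getattr__ fallback
```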
Defining a descriptor class every time we need a descriptor is cumbersome. Python provides a concise way to turn an attribute into a data descriptor:
property(fget=None, fset=None, fdel=None, doc=None) -> property attribute
fget, fset, and fdel are the getter, setter, and deleter functions for the attribute, respectively. The following example shows how to use property:
    class Account(object):

        def __init__(self):
            self._acct_num = None

        def get_acct_num(self):
            return self._acct_num

        def set_acct_num(self, value):
            self._acct_num = value

        def del_acct_num(self):
            del self._acct_num

        acct_num = property(get_acct_num, set_acct_num, del_acct_num, '_acct_num property.')
If acct is an instance of Account, acct.acct_num calls the getter, acct.acct_num = value calls the setter, and del acct.acct_num calls the deleter.

    >>> acct = Account()
    >>> acct.acct_num = 1000
    >>> acct.acct_num
    1000
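To see why property counts as a data descriptor, it can help to sketch a simplified equivalent in pure Python. The Property class below is a hypothetical stand-in written for illustration; it is not the built-in implementation and omits details such as the getter/setter/deleter decorator methods.

```python
class Property:
    """Simplified, hypothetical stand-in for the built-in property."""

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget, self.fset, self.fdel = fget, fset, fdel
        self.__doc__ = doc

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class, not on an instance
        if self.fget is None:
            raise AttributeError('unreadable attribute')
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)


class Account:
    def __init__(self):
        self._acct_num = None

    def get_acct_num(self):
        return self._acct_num

    def set_acct_num(self, value):
        self._acct_num = value

    def del_acct_num(self):
        del self._acct_num

    acct_num = Property(get_acct_num, set_acct_num, del_acct_num)


acct = Account()
acct.acct_num = 1000
print(acct.acct_num)               # 1000
del acct.acct_num
print(hasattr(acct, '_acct_num'))  # False
```

Because the sketch defines `__set__` and `__delete__`, an entry in the instance dictionary can never shadow it, which matches the priority rules described earlier.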
Python also provides the @property decorator for creating properties in simple cases. A property object has getter, setter, and deleter methods that can be used as decorators; each one returns a copy of the property with the corresponding accessor set to the decorated function.
    class Account(object):

        def __init__(self):
            self._acct_num = None

        @property
        def acct_num(self):
            # The getter. On its own, @property creates a read-only property.
            return self._acct_num

        @acct_num.setter
        def acct_num(self, value):
            # The setter makes the property writable.
            # It must reuse the name acct_num, otherwise a second,
            # separately named property is created instead.
            self._acct_num = value

        @acct_num.deleter
        def acct_num(self):
            del self._acct_num
If you want the property to be read-only, just remove the setter method.
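A minimal sketch of the read-only case follows. The constructor argument is an assumption added here so that the example has a value to read.

```python
class Account:
    def __init__(self, acct_num):
        self._acct_num = acct_num

    @property
    def acct_num(self):
        # Only a getter is defined, so the property is read-only.
        return self._acct_num


acct = Account(1000)
print(acct.acct_num)  # 1000

try:
    acct.acct_num = 2000
except AttributeError as exc:
    print('rejected:', type(exc).__name__)  # rejected: AttributeError
```

The assignment fails rather than silently creating an instance attribute, because a property is a data descriptor even when it has no setter.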
We can add property attributes at runtime:
    class Person(object):

        def addProperty(self, attribute):
            # create local setter and getter with a particular attribute name
            getter = lambda self: self._getProperty(attribute)
            setter = lambda self, value: self._setProperty(attribute, value)

            # construct property attribute and add it to the class
            setattr(self.__class__, attribute, property(fget=getter,
                                                        fset=setter,
                                                        doc="Auto-generated method"))

        def _setProperty(self, attribute, value):
            print("Setting: {} = {}".format(attribute, value))
            setattr(self, '_' + attribute, value.title())

        def _getProperty(self, attribute):
            print("Getting: {}".format(attribute))
            return getattr(self, '_' + attribute)

    >>> user = Person()
    >>> user.addProperty('name')
    >>> user.addProperty('phone')
    >>> user.name = 'john smith'
    Setting: name = john smith
    >>> user.phone = '12345'
    Setting: phone = 12345
    >>> user.name
    Getting: name
    'John Smith'
    >>> user.__dict__
    {'_name': 'John Smith', '_phone': '12345'}
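One consequence worth spelling out is that the property is attached to the class, so it becomes available on every instance, including ones created earlier. The sketch below shows this with a hypothetical module-level helper, add_property, which is an assumption of this example rather than part of the code above.

```python
def add_property(cls, name):
    """Hypothetical helper: attach a property backed by '_<name>'."""
    private = '_' + name

    def getter(self):
        return getattr(self, private)

    def setter(self, value):
        setattr(self, private, value)

    setattr(cls, name, property(getter, setter))


class Person:
    pass


alice = Person()              # created before the property exists
add_property(Person, 'name')
bob = Person()

alice.name = 'Alice'
bob.name = 'Bob'
print(alice.name, bob.name)       # Alice Bob
print('name' in Person.__dict__)  # True
print(alice.__dict__)             # {'_name': 'Alice'}
```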
We can use descriptors to emulate Python's @staticmethod and @classmethod. First, take a look at the following table:
Transformation | Called from an Object | Called from a Class |
---|---|---|
function | f(obj, *args) | f(*args) |
staticmethod | f(*args) | f(*args) |
classmethod | f(type(obj), *args) | f(klass, *args) |
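The rows of the table can be confirmed with the built-in decorators. In this small sketch the class C and the method names f, s, and k are placeholders chosen for the example.

```python
class C:
    def f(self, *args):
        return (self, args)

    @staticmethod
    def s(*args):
        return args

    @classmethod
    def k(cls, *args):
        return (cls, args)


c = C()

# function: c.f(1) passes the instance as the first argument
print(c.f(1) == C.f(c, 1))            # True

# staticmethod: nothing is prepended, from the object or from the class
print(c.s(1) == C.s(1) == (1,))       # True

# classmethod: the class is prepended in both cases
print(c.k(1) == C.k(1) == (C, (1,)))  # True
```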
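For a static method f, c.f and C.f are equivalent: both amount to a direct lookup, object.__getattribute__(c, 'f') or object.__getattribute__(C, 'f'). An obvious characteristic of a static method is that it has no self parameter.
What are static methods good for? Suppose we have a container class for some specialized data that provides methods for computing statistics such as the mean and the median; those methods depend on the data. The class may also have methods that do not depend on the data at all. We can declare those as static methods, which also makes the code more readable.
Let's emulate static methods with a non-data descriptor: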
    class StaticMethod(object):

        def __init__(self, f):
            self.f = f

        def __get__(self, obj, objtype=None):
            return self.f
Let's try it out:
    class MyClass(object):

        @StaticMethod
        def get_x(x):
            return x

    print(MyClass.get_x(100))  # output: 100
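The same descriptor also works when the method is called through an instance, because `__get__` ignores obj and always hands back the undecorated function. The sketch below repeats the StaticMethod class so that it runs on its own; Stats and clamp are hypothetical names.

```python
class StaticMethod:
    def __init__(self, f):
        self.f = f

    def __get__(self, obj, objtype=None):
        # obj is ignored, so no instance is ever bound to the function.
        return self.f


class Stats:
    @StaticMethod
    def clamp(value, low, high):
        return max(low, min(value, high))


print(Stats.clamp(15, 0, 10))        # 10
print(Stats().clamp(-3, 0, 10))      # 0
print(Stats.clamp is Stats().clamp)  # True
```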
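Python's @classmethod is used in much the same way as @staticmethod, but there are differences: use classmethod when a method needs only a reference to the class and does not care about the data stored in an instance.
Let's emulate class methods with a non-data descriptor: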
    class ClassMethod(object):

        def __init__(self, f):
            self.f = f

        def __get__(self, obj, klass=None):
            if klass is None:
                klass = type(obj)

            def newfunc(*args):
                return self.f(klass, *args)

            return newfunc
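A typical use of a class method is an alternative constructor. The sketch below applies the ClassMethod descriptor (repeated so that the example is self-contained) to two hypothetical classes, Point and Point3D, to show that the class passed in is the one the lookup started from.

```python
class ClassMethod:
    def __init__(self, f):
        self.f = f

    def __get__(self, obj, klass=None):
        if klass is None:
            klass = type(obj)

        def newfunc(*args):
            # Prepend the class, whether called from an instance or a class.
            return self.f(klass, *args)

        return newfunc


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    @ClassMethod
    def origin(cls):
        return cls(0, 0)


class Point3D(Point):
    pass


p = Point.origin()
q = Point3D(1, 2).origin()
print(type(p).__name__, p.x, p.y)  # Point 0 0
print(type(q).__name__)            # Point3D
```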
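When I first encountered Python's magic methods, I too was confused by the differences between __get__, __getattribute__, __getattr__, and __getitem__. They are all magic methods related to attribute or item access, and overriding __getattr__ and __getitem__ to build your own collection class is very common. Let's look at how they are used through some examples.
By default, every attribute access on a class or an instance goes through __getattribute__, which is called unconditionally; if the attribute is not found, __getattr__ is called. When customizing a class we should normally override __getattr__ rather than __getattribute__; overriding __getattribute__ is rarely seen.
As the output below shows, __getattr__ is called when an attribute cannot be found through __getattribute__.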
    In [1]: class Test(object):
       ...:     def __getattribute__(self, item):
       ...:         print('call __getattribute__')
       ...:         return super(Test, self).__getattribute__(item)
       ...:     def __getattr__(self, item):
       ...:         return 'call __getattr__'
       ...:

    In [2]: Test().a
    call __getattribute__
    Out[2]: 'call __getattr__'
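The reverse case is just as important: when the normal lookup succeeds, __getattr__ is never consulted. A minimal sketch, with the attribute names present and absent chosen for illustration:

```python
class Test:
    present = 'found normally'

    def __getattr__(self, item):
        # Reached only after the normal lookup raised AttributeError.
        return 'fallback for {}'.format(item)


t = Test()
print(t.present)  # found normally
print(t.absent)   # fallback for absent
```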
A plain dict supports only the obj['foo'] form of access, not obj.foo. By overriding __getattr__ we can make a dictionary support obj.foo as well. This is a classic and very common idiom:
    class Storage(dict):
        """
        A Storage object is like a dictionary except `obj.foo` can be used
        in addition to `obj['foo']`.
        """

        def __getattr__(self, key):
            try:
                return self[key]
            except KeyError as k:
                raise AttributeError(k)

        def __setattr__(self, key, value):
            self[key] = value

        def __delattr__(self, key):
            try:
                del self[key]
            except KeyError as k:
                raise AttributeError(k)

        def __repr__(self):
            return '<Storage ' + dict.__repr__(self) + '>'
Let's use our enhanced dictionary:

    >>> s = Storage(a=1)
    >>> s['a']
    1
    >>> s.a
    1
    >>> s.a = 2
    >>> s['a']
    2
    >>> del s.a
    >>> s.a
    ...
    AttributeError: 'a'
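One caveat follows from the lookup rules: because __getattr__ is only a fallback, a key that collides with a real dict method cannot be reached with dot syntax. The sketch below uses a trimmed copy of Storage (only __getattr__ is kept) to show this.

```python
class Storage(dict):
    """Trimmed copy of the class above, keeping only __getattr__."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError as k:
            raise AttributeError(k)


s = Storage(keys='my value', a=1)
print(s.a)                    # 1
print(s['keys'])              # my value
print(callable(s.keys))       # True
print(hasattr(s, 'missing'))  # False
```

Here `s.keys` is still the ordinary `dict.keys` method, since the normal lookup finds it before __getattr__ gets a chance. Raising AttributeError for unknown keys is also what keeps `hasattr` working as expected.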
__getitem__ is used to fetch an element from an object with the subscript syntax []. Below we override __getitem__ to implement our own list.
    class MyList(object):

        def __init__(self, *args):
            self.numbers = args

        def __getitem__(self, item):
            return self.numbers[item]

    my_list = MyList(1, 2, 3, 4, 6, 5, 3)
    print(my_list[2])  # print() call so that this also runs on Python 3
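This implementation is very crude: a slice, for example, comes back as a plain tuple rather than a MyList. Improving it is left to the reader, so I will not go further here.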
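One possible improvement, offered as a sketch rather than the only way to do it, is to detect slice objects in __getitem__ and wrap the result in a new MyList. The __len__ and __repr__ methods are additions made here for convenience.

```python
class MyList:
    def __init__(self, *args):
        self.numbers = list(args)

    def __getitem__(self, item):
        if isinstance(item, slice):
            # A slice (with or without a step) yields a new MyList.
            return MyList(*self.numbers[item])
        return self.numbers[item]

    def __len__(self):
        return len(self.numbers)

    def __repr__(self):
        return 'MyList({})'.format(', '.join(map(repr, self.numbers)))


my_list = MyList(1, 2, 3, 4, 6, 5, 3)
print(my_list[2])    # 3
print(my_list[1:4])  # MyList(2, 3, 4)
print(my_list[::2])  # MyList(1, 3, 6, 3)
print(my_list[-1])   # 3
```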
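Below is a use of __getitem__ modeled on the requests library: a dictionary class whose keys are case-insensitive.
The program is a little involved, so let me explain briefly. Because the case is simple and does not call for a full descriptor class, the @property decorator is used instead. lower_keys maps the lowercased form of every key in the dictionary to the original key and caches that mapping in self._lower_keys. __getitem__ is overridden so that a lookup first lowercases the key and then, instead of hitting the dictionary directly, goes through self._lower_keys to find the real key. Because assignment and deletion change the dictionary, self._lower_keys is cleared on those operations to keep the two in sync; the next lookup through __getitem__ rebuilds it.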
    class CaseInsensitiveDict(dict):

        @property
        def lower_keys(self):
            # Build (or rebuild) the lowercase-key -> original-key cache on demand.
            if not hasattr(self, '_lower_keys') or not self._lower_keys:
                self._lower_keys = dict((k.lower(), k) for k in self.keys())
            return self._lower_keys

        def _clear_lower_keys(self):
            if hasattr(self, '_lower_keys'):
                self._lower_keys.clear()

        def __contains__(self, key):
            return key.lower() in self.lower_keys

        def __getitem__(self, key):
            # Implicitly returns None when the key is missing.
            if key in self:
                return dict.__getitem__(self, self.lower_keys[key.lower()])

        def __setitem__(self, key, value):
            dict.__setitem__(self, key, value)
            self._clear_lower_keys()

        def __delitem__(self, key):
            dict.__delitem__(self, key)
            # Use the helper: _lower_keys may not exist yet if nothing was looked up.
            self._clear_lower_keys()

        def get(self, key, default=None):
            if key in self:
                return self[key]
            else:
                return default
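Let's try the class out. The output below assumes Python 3.7 or later, where dicts preserve insertion order. When two stored keys differ only in case, every spelling resolves to whichever of them was inserted last:

    >>> d = CaseInsensitiveDict()
    >>> d['ziwenxie'] = 'ziwenxie'
    >>> d['ZiWenXie'] = 'ZiWenXie'
    >>> print(d)
    {'ziwenxie': 'ziwenxie', 'ZiWenXie': 'ZiWenXie'}
    >>> print(d['ziwenxie'])
    ZiWenXie
    # d['ZiWenXie'] resolves to the same stored key as d['ziwenxie']
    >>> print(d['ZiWenXie'])
    ZiWenXie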