Abstract
In this article I argue that using Django's low-level ORM query methods (such as filter) directly in views is usually an anti-pattern. Instead, we should build domain-specific query APIs in the model layer, where the business logic belongs. Django does not make this particularly easy, but with a deeper understanding of how the ORM works internally, I will show you some simple ways to achieve it.
Overview
When writing Django applications, we are used to adding methods to models to encapsulate business logic and hide implementation details. This approach feels completely natural, and it is in fact used in Django's built-in apps:

    >>> from django.contrib.auth.models import User
    >>> user = User.objects.get(pk=5)
    >>> user.set_password('super-sekrit')
    >>> user.save()
The set_password here is a method defined in the django.contrib.auth.models.User model, which hides the specific implementation of hashing passwords. The corresponding code should look like this:
    from django.contrib.auth.hashers import make_password

    class User(models.Model):

        # fields go here..

        def set_password(self, raw_password):
            self.password = make_password(raw_password)
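The same encapsulation idea can be sketched without Django. The following is a minimal, hypothetical stand-in: the Account class, the salt handling and the iteration count are all assumptions made for illustration, and this is not how django.contrib.auth is implemented. The point is only that callers see set_password and check_password, never the hashing scheme.

```python
import hashlib
import hmac
import os


class Account:
    """Toy model (hypothetical, not Django's User)."""

    ITERATIONS = 10_000  # assumed value, chosen only for the example

    def __init__(self, username):
        self.username = username
        self.password = None  # will hold "salt$hash", never the raw password

    def set_password(self, raw_password):
        # The hashing scheme is an implementation detail hidden from callers.
        salt = os.urandom(16)
        digest = self._hash(raw_password, salt)
        self.password = salt.hex() + "$" + digest.hex()

    def check_password(self, raw_password):
        salt_hex, digest_hex = self.password.split("$")
        candidate = self._hash(raw_password, bytes.fromhex(salt_hex))
        # Constant-time comparison of the stored and candidate digests.
        return hmac.compare_digest(candidate.hex(), digest_hex)

    def _hash(self, raw_password, salt):
        return hashlib.pbkdf2_hmac(
            "sha256", raw_password.encode("utf-8"), salt, self.ITERATIONS
        )


account = Account("alice")
account.set_password("super-sekrit")
```

If the hashing scheme ever changes, only the private _hash helper needs to be touched; code that calls set_password stays the same.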
What we are doing here is building a domain-specific API on top of the general-purpose, low-level ORM tools that Django gives us. This raises the level of abstraction and reduces the amount of code needed to interact with the model. The benefit is code that is more readable, reusable and robust.
We already do this for individual model instances. Why not apply the same idea to the code that retrieves sets of instances from the database?
To illustrate the approach, we will use a simple todo list app.
NOTE: This is a toy example, because it is hard to show a realistic one in a small amount of code. Don't worry too much about the implementation of the todo list itself; focus on how to make the approach work.
The following is the models.py file:
    from django.db import models

    PRIORITY_CHOICES = [(1, 'High'), (2, 'Low')]

    class Todo(models.Model):
        content = models.CharField(max_length=100)
        is_done = models.BooleanField(default=False)
        owner = models.ForeignKey('auth.User')
        priority = models.IntegerField(choices=PRIORITY_CHOICES, default=1)
Now imagine we want to query this data and build a view that displays the incomplete, high-priority Todos for the current user. Here is the code:
    def dashboard(request):

        todos = Todo.objects.filter(
            owner=request.user
        ).filter(
            is_done=False
        ).filter(
            priority=1
        )

        return render(request, 'todos/list.html', {
            'todos': todos,
        })
Note: Yes, this could be written as request.user.todo_set.filter(is_done=False, priority=1). Remember, it is only a toy example.
Why is it not good to write like this?
First, the code is verbose. The query alone takes seven lines, and in a real project it would be more complicated.
Second, it leaks implementation details. For example, the view has to know that is_done is a BooleanField. If the field's type changes, this code breaks.
Third, the intent is unclear, so the code is hard to understand at a glance.
Finally, it invites duplication. Suppose you need to write a management command, run from cron, that sends every user their todo list each week. You would have to copy and paste the same seven lines, which violates DRY (don't repeat yourself).
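To make the second point concrete, here is a small plain-Python sketch (no Django involved; the Todo dataclass, the completed_at field and the incomplete helper are hypothetical names used only for illustration). When the rule for "is it done?" lives in one named function, changing the underlying field touches one place instead of every caller.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Todo:
    content: str
    # Assumed change: a timestamp replaces the original is_done boolean.
    completed_at: Optional[datetime] = None


def incomplete(todos: List[Todo]) -> List[Todo]:
    # The only place that knows how "done" is stored.
    return [t for t in todos if t.completed_at is None]


todos = [
    Todo("write report"),
    Todo("file taxes", completed_at=datetime(2024, 1, 31)),
]

# Callers express intent and never mention the field itself.
pending = incomplete(todos)
```

A caller that had written the equivalent of is_done=False inline would have needed editing when the field changed; a caller of incomplete does not.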
Let me make a bold claim: using low-level ORM code directly in a view is an anti-pattern.
How to improve?
Using Managers and QuerySets
First, let’s understand the concepts.
Django has two closely related constructs for table-level operations: managers and querysets.
A manager (an instance of django.db.models.manager.Manager) is described as "the interface through which database query operations are provided to Django models". The Manager is the gateway to the ORM for table-level functionality. Each model has a default manager called objects.
A QuerySet (django.db.models.query.QuerySet) is "a collection of objects from your database". It is essentially an abstraction of a SELECT query, and it can be filtered, ordered and otherwise manipulated to restrict or modify the rows it represents. It is responsible for creating and manipulating django.db.models.sql.query.Query instances, which the database backend then turns into real SQL.
Huh? Still confused?
The difference between a Manager and a QuerySet only becomes clear once you understand the ORM's internals in some depth.
The confusion is made worse by the familiar Manager interface, because it is not what it seems.
The Manager interface is a lie.
QuerySet methods are chainable. Each call to a QuerySet method (such as filter) returns a cloned queryset, ready for the next call. This is part of the fluent beauty of Django's ORM.
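The chaining behaviour can be sketched in a few lines of plain Python. ToyQuerySet below is a hypothetical in-memory stand-in, not Django's class; it only illustrates the idea that filter returns a new object and leaves the original untouched.

```python
class ToyQuerySet:
    """In-memory stand-in for a queryset (hypothetical, not Django's class)."""

    def __init__(self, rows, conditions=None):
        self._rows = rows
        self._conditions = dict(conditions or {})

    def filter(self, **kwargs):
        # Return a new object carrying the combined conditions;
        # the queryset that filter() was called on is left untouched.
        combined = dict(self._conditions)
        combined.update(kwargs)
        return ToyQuerySet(self._rows, combined)

    def __iter__(self):
        # Evaluation happens only when the result is iterated.
        for row in self._rows:
            if all(row.get(k) == v for k, v in self._conditions.items()):
                yield row


rows = [
    {"content": "a", "is_done": False, "priority": 1},
    {"content": "b", "is_done": True, "priority": 1},
    {"content": "c", "is_done": False, "priority": 2},
]

base = ToyQuerySet(rows)
urgent = base.filter(is_done=False).filter(priority=1)
```

Because every call hands back a fresh object of the same type, the chain can continue for as long as needed, and base can still be reused for other queries.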
But the fact that Model.objects is a Manager, not a QuerySet, presents a problem: we need to start the chain of calls on objects, yet continue it on the resulting QuerySet.
So how does Django solve it?
This is where the lie is exposed: all of the QuerySet methods are reimplemented on the Manager. The Manager versions of these methods simply proxy to a newly created QuerySet via self.get_query_set():

    class Manager(object):

        # SNIP some housekeeping stuff..

        def get_query_set(self):
            return QuerySet(self.model, using=self._db)

        def all(self):
            return self.get_query_set()

        def count(self):
            return self.get_query_set().count()

        def filter(self, *args, **kwargs):
            return self.get_query_set().filter(*args, **kwargs)

        # and so on for 100+ lines...

Now let's return to the todo list and fix its query interface. The approach recommended by Django is to create a custom Manager subclass and attach it to the model.
You can either add several extra managers to a model, or redefine objects, keeping a single manager but adding your own custom methods to it.
Let's try each of these approaches in turn:
Method 1: Multiple managers
    class IncompleteTodoManager(models.Manager):
        def get_query_set(self):
            return super(IncompleteTodoManager, self).get_query_set().filter(is_done=False)

    class HighPriorityTodoManager(models.Manager):
        def get_query_set(self):
            return super(HighPriorityTodoManager, self).get_query_set().filter(priority=1)

    class Todo(models.Model):
        content = models.CharField(max_length=100)
        # other fields go here..

        objects = models.Manager()  # the default manager

        # attach our custom managers:
        incomplete = IncompleteTodoManager()
        high_priority = HighPriorityTodoManager()
Our API now looks like this:

    >>> Todo.incomplete.all()
    >>> Todo.high_priority.all()
Unfortunately, this approach has several problems.
First, the implementation is cumbersome. You need to define a whole class for each custom piece of query functionality.
Second, it pollutes the namespace. Django developers expect Model.objects to be the entry point to the table, and adding extra managers breaks that convention.
Third, it is not chainable. There is no way to combine the managers: to get todos that are both incomplete and high-priority, we are back to low-level ORM code, either Todo.incomplete.filter(priority=1) or Todo.high_priority.filter(is_done=False).
In short, using multiple managers is not the best choice.
Method 2: Manager methods
Now let's try the other approach Django allows: several methods on a single custom Manager.
    class TodoManager(models.Manager):
        def incomplete(self):
            return self.filter(is_done=False)

        def high_priority(self):
            return self.filter(priority=1)

    class Todo(models.Model):
        content = models.CharField(max_length=100)
        # other fields go here..

        objects = TodoManager()
Our API now looks like this:

    >>> Todo.objects.incomplete()
    >>> Todo.objects.high_priority()
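This is clearly better. It is much less verbose (there is only one Manager class), and the query methods stay neatly namespaced under objects.
It is still not good enough, though. Todo.objects.incomplete() returns an ordinary QuerySet, so we cannot write Todo.objects.incomplete().high_priority(). We are stuck with Todo.objects.incomplete().filter(priority=1), which is not much use.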
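The limitation is easy to reproduce outside Django. In the sketch below, ToyQuerySet and ToyTodoManager are hypothetical in-memory stand-ins (not Django classes) that mirror the shape of Method 2: the custom methods exist on the manager only, so the object returned by the first call no longer has them.

```python
class ToyQuerySet:
    """Plain queryset stand-in: it only knows the generic filter()."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, **kwargs):
        return ToyQuerySet(
            r for r in self.rows
            if all(r.get(k) == v for k, v in kwargs.items())
        )


class ToyTodoManager:
    """Custom methods live on the manager only, as in Method 2."""

    def __init__(self, rows):
        self._rows = rows

    def filter(self, **kwargs):
        return ToyQuerySet(self._rows).filter(**kwargs)

    def incomplete(self):
        return self.filter(is_done=False)

    def high_priority(self):
        return self.filter(priority=1)


objects = ToyTodoManager([
    {"content": "a", "is_done": False, "priority": 1},
    {"content": "b", "is_done": False, "priority": 2},
])

# The first call works and returns a ToyQuerySet.
first_step = objects.incomplete()
# The returned object does not have the manager's custom methods.
can_chain = hasattr(first_step, "high_priority")
# So the second step has to fall back to the generic call.
fallback = first_step.filter(priority=1)
```

Calling first_step.high_priority() here would raise AttributeError, which is the same dead end the Django version runs into.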
Method 3: Custom QuerySet
Now we are entering uncharted territory: you won't find this in the Django documentation...
    class TodoQuerySet(models.query.QuerySet):
        def incomplete(self):
            return self.filter(is_done=False)

        def high_priority(self):
            return self.filter(priority=1)

    class TodoManager(models.Manager):
        def get_query_set(self):
            return TodoQuerySet(self.model, using=self._db)

    class Todo(models.Model):
        content = models.CharField(max_length=100)
        # other fields go here..

        objects = TodoManager()
Here is what this looks like from the calling code:

    >>> Todo.objects.get_query_set().incomplete()
    >>> Todo.objects.get_query_set().high_priority()
    >>> # (or)
    >>> Todo.objects.all().incomplete()
Nearly there! This is not much more verbose than Method 2, it has the same benefits, and in addition (drum roll, please...) it is finally chainable!

    >>> Todo.objects.all().incomplete().high_priority()
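It is still not perfect, though. The custom Manager is nothing more than boilerplate, and that all() is a wart: it is awkward to type and, more importantly, it is inconsistent, which makes our code look a little odd.

    >>> Todo.objects.all().high_priority()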
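Method 3a: Copy Django, proxy everything
This is where the earlier discussion of the "fake Manager API" becomes useful: we know how to fix this problem. We simply redefine all of our QuerySet methods on the Manager and have them proxy to our custom QuerySet: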
    class TodoQuerySet(models.query.QuerySet):
        def incomplete(self):
            return self.filter(is_done=False)

        def high_priority(self):
            return self.filter(priority=1)

    class TodoManager(models.Manager):
        def get_query_set(self):
            return TodoQuerySet(self.model, using=self._db)

        def incomplete(self):
            return self.get_query_set().incomplete()

        def high_priority(self):
            return self.get_query_set().high_priority()
This gives us exactly the API we want:

    Todo.objects.incomplete().high_priority() # yay!
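Except that it is a lot of typing, and very un-DRY. Every time you add a new method to your QuerySet, or change the signature of an existing one, you have to remember to make the same change on your Manager, or it may not work properly. This is a recipe for problems.
Method 3b: django-model-utils
Python is a dynamic language. Surely we can avoid all this boilerplate? It turns out we can, with a little help from a third-party app called django-model-utils. Just run pip install django-model-utils, then...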
    from model_utils.managers import PassThroughManager

    class TodoQuerySet(models.query.QuerySet):
        def incomplete(self):
            return self.filter(is_done=False)

        def high_priority(self):
            return self.filter(priority=1)

    class Todo(models.Model):
        content = models.CharField(max_length=100)
        # other fields go here..

        objects = PassThroughManager.for_queryset_class(TodoQuerySet)()
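This is much nicer. We simply define our custom QuerySet subclass as before, and attach it to our model through the PassThroughManager class provided by django-model-utils.
PassThroughManager works by implementing __getattr__, which intercepts calls to methods that do not exist on the Manager and automatically proxies them to the QuerySet. Some care is needed to avoid infinite recursion on certain attributes, which is why I recommend using the tried-and-tested implementation supplied by django-model-utils rather than hand-rolling your own.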
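The __getattr__ technique itself can be sketched in plain Python. The classes below are hypothetical in-memory stand-ins written for this illustration; they are not the django-model-utils implementation, and the underscore guard is just one simple way to sidestep the recursion issue mentioned above.

```python
class ToyQuerySet:
    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, **kwargs):
        # type(self) keeps the subclass, so custom methods survive chaining.
        return type(self)(
            r for r in self.rows
            if all(r.get(k) == v for k, v in kwargs.items())
        )


class TodoQuerySet(ToyQuerySet):
    def incomplete(self):
        return self.filter(is_done=False)

    def high_priority(self):
        return self.filter(priority=1)


class ToyPassThroughManager:
    def __init__(self, queryset_class, rows):
        self._queryset_class = queryset_class
        self._rows = rows

    def get_query_set(self):
        return self._queryset_class(self._rows)

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup fails.
        # Refusing private and dunder names avoids recursing when
        # _queryset_class or _rows is looked up before __init__ has set it
        # (for example while the object is being copied or unpickled).
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self.get_query_set(), name)


rows = [
    {"content": "a", "is_done": False, "priority": 1},
    {"content": "b", "is_done": True, "priority": 1},
    {"content": "c", "is_done": False, "priority": 2},
]

objects = ToyPassThroughManager(TodoQuerySet, rows)
result = objects.incomplete().high_priority()
```

The manager defines neither incomplete nor high_priority; both calls are resolved on a freshly created TodoQuerySet, so adding a method to the queryset class is enough to make it available from the manager as well.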
How does this help?
Remember the view code we defined earlier?
    def dashboard(request):

        todos = Todo.objects.filter(
            owner=request.user
        ).filter(
            is_done=False
        ).filter(
            priority=1
        )

        return render(request, 'todos/list.html', {
            'todos': todos,
        })
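With a few small changes, we can make it look like this (for_user is not defined in the earlier snippets; presumably it is one more TodoQuerySet method that filters on owner):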
    def dashboard(request):

        todos = Todo.objects.for_user(
            request.user
        ).incomplete().high_priority()

        return render(request, 'todos/list.html', {
            'todos': todos,
        })
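Hopefully you will agree that the second version is simpler, clearer and more readable than the first.
Can Django help?
Ways of making this whole thing easier have been discussed on the Django developers mailing list, and there is an associated ticket. Zachary Voase proposed the following: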
    class TodoManager(models.Manager):

        @models.querymethod
        def incomplete(query):
            return query.filter(is_done=False)
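With this single decorated method definition, incomplete would magically become available on both the Manager and the QuerySet.
Personally, I am not completely convinced by the decorator-based idea. It obscures the details and feels a bit hacky. My feeling is that adding methods to a QuerySet subclass (rather than a Manager subclass) is a better, simpler approach.
Perhaps we could go even further. By stepping back and re-examining Django's API design decisions, maybe we could make real, deeper improvements. Could the distinction between Managers and QuerySets be removed, or at least clarified?
I am fairly sure that if a refactoring this large were ever to happen, it would have to be in Django 2.0 or later.
So, to recap:
Using raw ORM query code in views and other high-level parts of your application is not a good idea. Instead, use PassThroughManager from django-model-utils to add a custom QuerySet API to your models. This gives you the following benefits:
Less verbose, more robust code.
More DRY, with a higher level of abstraction.
Business logic is pushed down into the domain model layer, where it belongs.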