使用自定义 Django 命令自动重新加载 Celery 工作线程

WBOY
发布: 2024-07-22 09:40:11
原创
1003 人浏览过

Automatically reload Celery workers with a custom Django command

Celery previously had an --autoreload flag that has since been removed. However, Django has automatic reloading built into its manage.py runserver command. The absence of automatic reloading in Celery workers creates a confusing development experience: updating Python code causes the Django server to reload with the current code, but any tasks that the server fires will run stale code in the Celery worker.

This post will show you how to build a custom manage.py runworker command that automatically reloads Celery workers during development. The command will be modeled after runserver, and we will take a look at how Django's automatic reloading works under-the-hood.

Before we begin

This post assumes that you have a Django app with Celery already installed (guide). It also assumes you have an understanding of the differences between projects and applications in Django.

All links to source code and documentation will be for current versions of Django and Celery at the time of publication (July, 2024). If you're reading this in the distant future, things may have changed.

Finally, the main project directory will be named my_project in the post's examples.

Solution: a custom command

We will create a custom manage.py command called runworker. Because Django provides automatic reloading via its runsever command, we will use runserver's source code as the basis of our custom command.

You can create a command in Django by making a management/commands/ directory within any of your project's applications. Once the directories have been created, you may then put a Python file with the name of the command you'd like to create within that directory (docs).

Assuming your project has an application named polls, we will create a file at polls/management/commands/runworker.py and add the following code:

# polls/management/commands/runworker.py

import sys
from datetime import datetime

from celery.signals import worker_init

from django.conf import settings
from django.core.management.base import BaseCommand
from django.utils import autoreload

from my_project.celery import app as celery_app


class Command(BaseCommand):
    help = "Starts a Celery worker instance with auto-reloading for development."

    # Validation is called explicitly each time the worker instance is reloaded.
    requires_system_checks = []
    suppressed_base_arguments = {"--verbosity", "--traceback"}

    def add_arguments(self, parser):
        parser.add_argument(
            "--skip-checks",
            action="store_true",
            help="Skip system checks.",
        )
        parser.add_argument(
            "--loglevel",
            choices=("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL", "FATAL"),
            type=str.upper,  # Transforms user input to uppercase.
            default="INFO",
        )

    def handle(self, *args, **options):
        autoreload.run_with_reloader(self.run_worker, **options)

    def run_worker(self, **options):
        # If an exception was silenced in ManagementUtility.execute in order
        # to be raised in the child process, raise it now.
        autoreload.raise_last_exception()

        if not options["skip_checks"]:
            self.stdout.write("Performing system checks...\n\n")
            self.check(display_num_errors=True)

        # Need to check migrations here, so can't use the
        # requires_migrations_check attribute.
        self.check_migrations()

        # Print Django info to console when the worker initializes.
        worker_init.connect(self.on_worker_init)

        # Start the Celery worker.
        celery_app.worker_main(
            [
                "--app",
                "my_project",
                "--skip-checks",
                "worker",
                "--loglevel",
                options["loglevel"],
            ]
        )

    def on_worker_init(self, sender, **kwargs):
        quit_command = "CTRL-BREAK" if sys.platform == "win32" else "CONTROL-C"

        now = datetime.now().strftime("%B %d, %Y - %X")
        version = self.get_version()
        print(
            f"{now}\n"
            f"Django version {version}, using settings {settings.SETTINGS_MODULE!r}\n"
            f"Quit the worker instance with {quit_command}.",
            file=self.stdout,
        )
登录后复制

IMPORTANT: Be sure to replace all instances of my_project with the name of your Django project.

If you want to copy-and-paste this code and continue with your programming, you can safely stop here without reading the rest of this post. This is an elegant solution that will serve you well as you develop your Django & Celery project. However, if you want to learn more about how it works then keep reading.

How it works (optional)

Rather than review this code line-by-line, I'll discuss the most interesting parts by topic. If you aren't already familiar with Django custom commands, you may want to review the docs before proceeding.

Automatic reloading

This part feels the most magical. Within the body of the command's handle() method, there is a call to Django's internal autoreload.run_with_reloader(). It accepts a callback function that will execute every time a Python file is changed in the project. How does that actually work?

Let's take a look at a simplified version of the autoreload.run_with_reloader() function's source code. The simplified function rewrites, inlines, and deletes code to provide clarity about its operation.

# NOTE: This has been dramatically pared down for clarity.

def run_with_reloader(callback_func, *args, **kwargs):
    # NOTE: This will evaluate to False the first time it is run.
    is_inside_subprocess = os.getenv("RUN_MAIN") == "true"

    if is_inside_subprocess:
        # The reloader watches for Python file changes.
        reloader = get_reloader()

        django_main_thread = threading.Thread(
            target=callback_func, args=args, kwargs=kwargs
        )
        django_main_thread.daemon = True
        django_main_thread.start()

        # When the code changes, the reloader exits with return code 3.
        reloader.run(django_main_thread)

    else:
        # Returns Python path and the arguments passed to the command.
        # Example output: ['/path/to/python', './manage.py', 'runworker']
        args = get_child_arguments()

        subprocess_env = {**os.environ, "RUN_MAIN": "true"}
        while True:
            # Rerun the manage.py command in a subprocess.
            p = subprocess.run(args, env=subprocess_env, close_fds=False)
            if p.returncode != 3:
                sys.exit(p.returncode)
登录后复制

When manage.py runworker is run in the command line, it will first call the handle() method which will call run_with_reloader().

Inside run_with_reloader(), it will check to see if an environment variable called RUN_MAIN has a value of "true". When the function is first called, RUN_MAIN should have no value.

When RUN_MAIN is not set to "true", run_with_reloader() will enter a loop. Inside the loop, it will start a subprocess that reruns the manage.py [command_name] that was passed in, then wait for that subprocess to exit. If the subprocess exits with return code 3, the next iteration of the loop will start a new subprocess and wait. The loop will run until a subprocess returns an exit code that is not 3 (or until the user exits with ctrl + c). Once it gets a non-3 return code, it will exit the program completely.

The spawned subprocess runs the manage.py command again (in our case manage.py runworker), and again the command will call run_with_reloader(). This time, RUN_MAIN will be set to "true" because the command is running in a subprocess.

Now that run_with_reloader() knows it is in a subprocess, it will get a reloader that watches for file changes, put the provided callback function in a thread, and pass it to the reloader which begins watching for changes.

When a reloader detects a file change, it runs sys.exit(3). This exits the subprocess, which triggers the next iteration of the loop from the code that spawned the subprocess. In turn, a new subprocess is launched that uses an updated version of the code.

系统检查和迁移

默认情况下,Django 命令在运行其handle() 方法之前执行系统检查。但是,对于 runserver 和我们的自定义 runworker 命令,我们希望推迟运行这些命令,直到进入我们提供给 run_with_reloader() 的回调中。在我们的例子中,这是我们的 run_worker() 方法。这使我们能够运行自动重新加载的命令,同时修复损坏的系统检查。

为了推迟运行系统检查,需要将requires_system_checks属性的值设置为空列表,并通过在run_worker()主体中调用self.check()来执行检查。与 runserver 一样,我们的自定义 runworker 命令也会检查所有迁移是否已运行,如果有待处理的迁移,它会显示警告。

因为我们已经在 run_worker() 方法中执行 Django 的系统检查,所以我们通过向 Celery 传递 --skip-checks 标志来禁用系统检查,以防止重复工作。

所有与系统检查和迁移相关的代码都是直接从 runserver 命令源代码中提取的。

celery_app.worker_main()

我们的实现使用 celery_app.worker_main() 直接从 Python 启动 Celery Worker,而不是向 Celery 发起攻击。

on_worker_init()

此代码在工作进程初始化时执行,显示日期和时间、Django 版本以及退出命令。它是根据 runserver 启动时显示的信息建模的。

其他 runserver 样板

以下行也从 runserver 源代码中删除:

  • suppressed_base_arguments = {"--verbosity", "--traceback"}
  • autoreload.raise_last_exception()

日志级别

我们的自定义命令具有可配置的日志级别,以防开发人员希望在不修改代码的情况下从 CLI 调整设置。

更进一步

我研究了 Django 和 Celery 的源代码来构建这个实现,并且有很多扩展它的机会。您可以配置该命令以接受更多 Celery 的工作参数。或者,您可以创建一个自定义的 manage.py 命令,它会自动重新加载任何 shell 命令,就像 David Browne 在本要点中所做的那样。

如果您觉得本文有用,请随时留下点赞或评论。感谢您的阅读。

以上是使用自定义 Django 命令自动重新加载 Celery 工作线程的详细内容。更多信息请关注PHP中文网其他相关文章!

来源:dev.to
本站声明
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系admin@php.cn
热门教程
更多>
最新下载
更多>
网站特效
网站源码
网站素材
前端模板
关于我们 免责声明 Sitemap
PHP中文网:公益在线PHP培训,帮助PHP学习者快速成长!