Celery Best Practices

Source: Internet
Author: User
Tags: rabbitmq

Original article: http://my.oschina.net/siddontang/blog/284107

Contents

  • 1. Do not use the database as your amqp Broker
  • 2. Use more Queue (do not use the default value only)
  • 3. Use workers with priority
  • 4. Use the celery Error Handling Mechanism
  • 5. Use flower
  • 6. Don't worry too much about the task exit status.
  • 7. Do not pass the database/ORM object to the task.
  • Finally

As a heavy Celery user, I couldn't resist reading the article "Celery Best Practices." Here is a simple translation of it, with practical Celery experience from our own project added.

When using Django, you may need to execute some long-running background tasks, perhaps with some kind of ordered task queue, and Celery is a good choice for that.

After using Celery as a task queue across many projects, the author has accumulated some best practices: how to use Celery appropriately, and some features Celery provides that are rarely used to their full extent.

1. Do not use the database as your amqp Broker

The database was not designed to be an AMQP broker; in a production environment it is quite likely to fall over one day under the load (PS: I doubt any database can hold up well in this role!).

The author guesses that many people use a database as the broker mainly because they already have one providing storage for their web app, so they simply reuse it: pointing Celery at it as the broker is easy and requires no additional components (such as RabbitMQ).

Suppose you have four backend workers fetching and processing tasks placed in the database. That means four processes frequently polling the database for the latest tasks, and each worker may run multiple concurrent threads doing the same.

Then one day you find that tasks are being produced far faster than four workers can process them, so you keep adding workers. Suddenly your database slows to a crawl under all the polling, disk I/O stays pegged at peak levels, and your web application is affected too. All of this because the workers are effectively running a DDoS attack against the database.

None of this happens with a proper AMQP broker such as RabbitMQ. Taking RabbitMQ as the example: first, it keeps the task queue in memory, so there is no hard-disk access; second, consumers (the workers above) do not need to poll, because RabbitMQ pushes new tasks to them. And if RabbitMQ does run into a problem, at least it will not take your web application down with it.

This is why the author says not to use the database as a broker. In addition, prebuilt RabbitMQ images are available in many places and can be used directly.

I agree with this. Our system uses Celery to process a large number of asynchronous tasks, averaging millions per day. We used MySQL previously, and task-processing delays were always a serious problem that adding workers did not fix. So we switched to Redis, and performance improved a great deal. We never dug into exactly why MySQL was so slow, but we may well have hit the same DDoS-style polling problem described above.
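Switching brokers is a one-line configuration change. A minimal sketch, assuming a local Redis or RabbitMQ instance (the app name and URLs are illustrative):

```python
from celery import Celery

# Use Redis as the broker instead of a database-backed one.
app = Celery("myapp", broker="redis://localhost:6379/0")

# Or, for RabbitMQ:
# app = Celery("myapp", broker="amqp://guest:guest@localhost:5672//")
```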

2. Use more Queue (do not use the default value only)

Celery is very easy to set up, and by default it puts every task into a single default queue unless you specify otherwise. A typical setup looks like this:

```python
@app.task()
def my_taskA(a, b, c):
    print("doing something here...")

@app.task()
def my_taskB(x, y):
    print("doing something here...")
```

Both tasks are executed from the same queue, which is appealing because a single decorator gives you an asynchronous task. The author's concern is that my_taskA and my_taskB may do totally different things, or one may be far more important than the other, so why put them in one basket? (Don't put all your eggs in one basket, right?) Maybe my_taskB is not very important but extremely high-volume, so that the important my_taskA tasks cannot reach a worker quickly. Adding workers does not solve this, because both tasks are still consumed from the same queue.

3. Use workers with priority

To solve the problem in section 2, put my_taskA in one queue, Q1, and my_taskB in another queue, Q2. Assign some number X of workers to process Q1, and use the remaining workers to process Q2. This way my_taskB still gets enough workers, while a few dedicated, higher-priority workers handle my_taskA without long waits.

First, define the queues manually:

```python
from kombu import Exchange, Queue

CELERY_QUEUES = (
    Queue('default', Exchange('default'), routing_key='default'),
    Queue('for_task_A', Exchange('for_task_A'), routing_key='for_task_A'),
    Queue('for_task_B', Exchange('for_task_B'), routing_key='for_task_B'),
)
```

Then define routes to decide which queue each task goes to:

```python
CELERY_ROUTES = {
    'my_taskA': {'queue': 'for_task_A', 'routing_key': 'for_task_A'},
    'my_taskB': {'queue': 'for_task_B', 'routing_key': 'for_task_B'},
}
```

Finally, start a dedicated worker for each queue:

```shell
celery worker -E -l INFO -n workerA -Q for_task_A
celery worker -E -l INFO -n workerB -Q for_task_B
```

In our project there are a large number of file-conversion tasks: many conversions of files under 1 MB, and a small number of files close to 20 MB. Small-file conversions have the highest priority and do not take much time, while converting large files takes a long time. If all conversion tasks go into a single queue, small-file conversions are very likely to be delayed behind the time-consuming large files.

Therefore, we set up three priority queues according to file size and assigned different workers to each queue, which effectively solved the file-conversion delay problem.
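The size-based routing can be sketched as a plain function that picks a queue at dispatch time. The queue names and size thresholds below are invented for illustration; our real cutoffs and names differ:

```python
SMALL_LIMIT = 1 * 1024 * 1024    # 1 MB (hypothetical threshold)
LARGE_LIMIT = 10 * 1024 * 1024   # 10 MB (hypothetical threshold)

def queue_for_size(size_bytes):
    """Map a file size to one of three priority queues."""
    if size_bytes < SMALL_LIMIT:
        return "convert_small"   # highest priority, most workers
    if size_bytes < LARGE_LIMIT:
        return "convert_medium"
    return "convert_large"

# At call time, route explicitly instead of relying on static CELERY_ROUTES:
# convert_file.apply_async(args=[file_id], queue=queue_for_size(size))
```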

4. Use the celery Error Handling Mechanism

Most tasks have no error handling: if a task fails, it just fails. That is fine in some cases, but most failed tasks the author has seen were calling a third-party API and hit a network error or a resource-unavailable error. For those, the simplest fix is to retry: the third party's service or network may be having a temporary problem that clears up quickly, so why not try again?

```python
@app.task(bind=True, default_retry_delay=300, max_retries=5)
def my_task_A(self):
    try:
        print("doing stuff here...")
    except SomeNetworkException as e:
        print("maybe do some cleanup here....")
        raise self.retry(exc=e)
```


The author prefers to define a default retry delay and a maximum number of retries for each task, as above. There are more detailed parameter settings available; read the documentation yourself.
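One of those finer-grained options in newer Celery versions is exponential backoff (the `retry_backoff` task option). The idea can be sketched independently of Celery; this helper and its numbers are hypothetical, not Celery's implementation:

```python
def retry_delay(attempt, base=60, cap=3600):
    """Exponential backoff: double the delay on each attempt, capped at `cap` seconds."""
    return min(base * (2 ** attempt), cap)

# attempt 0 -> 60s, attempt 1 -> 120s, ... until the 3600s cap is reached
```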

Our use of error handling depends on the scenario. For example, if a file fails to convert, it will fail no matter how many times we retry it, so we do not add a retry mechanism there.

5. Use flower

Flower is a very powerful tool for monitoring Celery tasks and workers.

We didn't use this because most of the time we directly connect to redis to check the celery information. It seems silly. No, especially the data that celery stores in redis cannot be retrieved conveniently.

6. Don't worry too much about the task exit status.

A task's state indicates whether it ended in success or failure, which may be useful in some statistical scenarios. But note that the exit state is not the *result* of the task: results that affect the program are usually written to the database during execution (for example, updating a user's friend list).

Most of the projects the author has seen store the final task state in SQLite or in their own database. But is it really necessary to save it? It may well affect your web service, so the author usually sets CELERY_IGNORE_RESULT = True and discards the results.
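As a configuration sketch (note that newer Celery versions spell the global setting `task_ignore_result`, and the option can also be set per task):

```python
# celeryconfig.py (sketch)
CELERY_IGNORE_RESULT = True      # discard all task results globally

# Or selectively, per task:
# @app.task(ignore_result=True)
# def my_taskA(a, b, c):
#     ...
```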

For us, since these are fire-and-forget asynchronous tasks, the state after completion is of no use, so we discard it without hesitation.

7. Do not pass the database/ORM object to the task.

The point is: do not pass database objects (a user instance, for example) to a task, because the serialized data may already be stale by the time the task runs. It is better to pass the user ID and fetch the fresh object from the database while the task executes.

We follow the same rule: only IDs are passed to tasks. For file conversion, for example, we pass only the file ID, and the task fetches all other file information from the database using that ID.
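A minimal, framework-free sketch of the pattern (the in-memory `DB` dict and all names are invented stand-ins for a real ORM query):

```python
# Stand-in for the real database/ORM; the table contents are invented.
DB = {42: {"name": "report.pdf", "status": "new"}}

def convert_file(file_id):
    """Task-style function: receives only the ID, never a serialized row."""
    record = DB[file_id]          # fetched fresh at execution time, never stale
    record["status"] = "converted"
    return record["name"]

# Enqueue with the ID only — in real Celery this would be convert_file.delay(42).
```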

Finally

These are our own takeaways: following the practices above has served us well, and at least we have not had any big problems with Celery, though there are still some pitfalls. As for RabbitMQ, we have never used it, so I cannot say how well it performs — only that it should at least beat MySQL.

Finally, here is the author's Celery talk: https://denibertovic.com/talks/celery-best-practices/.
