Using Celery to understand Celery


Original: http://www.dongwm.com/archives/shi-yong-celeryzhi-liao-jie-celery/

Preface

I think many people doing development and operations run into the same thing: crontab, that is, setting up scheduled tasks on a server to run jobs periodically. But if you have thousands of servers and thousands of tasks, managing those scheduled tasks becomes a real headache. Even with only dozens of machines, how do you manage them properly and implement the following?

    1. View the execution status of a scheduled task: whether it succeeded, its current state, how long it took to run.
    2. A friendly interface or command line for adding and deleting tasks.
    3. A simple way to give different machines different kinds of tasks, with some machines consuming different queues.
    4. Generating a task without blocking the rest of the process (i.e. asynchronously).
    5. Executing tasks concurrently.
Several options
    1. If you have money and time, build your own. The advantages: it fully matches your company's business needs, and a dedicated team maintains and operates it.
    2. Use Gearman. I've heard of it but never used it: it's written in C/Java/Perl, so for us Python developers and ops people without that background, it's hard to understand the underlying implementation or do secondary development on it.
    3. Use RQ. RQ is by the author of git-flow, and its introduction is very clear: "Simple job queues for Python". I was afraid it wasn't sophisticated enough, but if your business isn't that complicated, or the requirements aren't that strict, you can try it.
    4. In the end I chose Celery. I've been using it for nearly half a year now. Perhaps because of legacy issues my version is old and there are many pitfalls, but overall it's good.
Message Queuing

Message queues like RabbitMQ and ZeroMQ keep appearing in front of us, and the idea is actually very simple: a message is a piece of data to be transferred, and Celery is a distributed task queue. A "task" here is really just a message: the task is produced into a queue, received and stored by a container such as RabbitMQ, and later taken off and executed by some machine at the appropriate time.

Celery can use RabbitMQ, Redis, SQLAlchemy, Django's ORM, MongoDB and other containers as what it calls the broker. I use RabbitMQ, because the author recommends it clearly in the introduction on the project's GitHub homepage.
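With django-celery, pointing at a RabbitMQ broker is only a few lines in settings.py. A minimal sketch — the host, credentials and vhost below are placeholder values, not from the original article:

```python
# settings.py -- minimal django-celery broker setup (illustrative values)
import djcelery

djcelery.setup_loader()

# RabbitMQ as the broker; user/password/vhost are placeholders
BROKER_URL = "amqp://guest:guest@localhost:5672//"

# let celerybeat read periodic tasks from the database (used later in this article)
CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
```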

As for queues, imagine this problem: I have a big pile of tasks to execute, but I don't need every server to execute every one of them, because the business differs. So you define queues. Say tasks a, b and c may run on servers X and Y, but should not (or cannot) run on server Z. Then on X and Y you start workers (which are really the consumers) subscribed to that queue; server Z does not specify the queue, and so never executes its tasks.
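A sketch of that routing in settings.py — the app, task and queue names here are made up for illustration:

```python
# settings.py -- send each task type to a named queue (illustrative names)
CELERY_ROUTES = {
    "myapp.tasks.fetch_page": {"queue": "fetch"},
    "myapp.tasks.parse_feed": {"queue": "feed"},
}
```

Workers on X and Y would then be started with `python manage.py celery worker -Q fetch,feed`, while server Z simply omits those queues from its `-Q` option.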

Now for how Celery works. My angle here is django + celery + django-celery.

First, let's talk about the process:

    1. Add tasks with django-celery, or by manipulating the database directly (the one specified in settings.py), setting the relevant properties (such as the interval of the scheduled task) in the database.
    2. celerybeat periodically (every 5s by default) checks the tasks you set in Django via djcelery.schedulers.DatabaseScheduler. When it finds a task that needs to run, it throws it into the broker you configured (RabbitMQ in my case), into the queue specified in settings.py.
    3. On the servers where you started a Celery worker (formerly celeryd), the worker also periodically (every 5s by default) asks the broker whether there are tasks in its queues that it should execute.
    4. When the worker finds a task in the queue, it takes it off and executes it. celerycam (taking snapshots every 30s by default, so that process must also be started) writes the result into the database configured in Django's settings, updating the task's status, time spent, and so on.
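The four steps above can be sketched with stdlib pieces only: `queue.Queue` stands in for the RabbitMQ broker, a plain dict stands in for the result store that celerycam would write to, and the task names are made up. This is purely illustrative of the flow, not how Celery is implemented internally:

```python
import queue
import threading
import time

broker = queue.Queue()   # stand-in for a RabbitMQ queue
results = {}             # stand-in for the result/state database

def beat(due_tasks):
    # like celerybeat: put every task that is due onto the broker
    for name in due_tasks:
        broker.put(name)

def worker():
    # like a celery worker: pull tasks off the broker and run them
    while True:
        try:
            name = broker.get(timeout=0.2)
        except queue.Empty:
            return  # nothing left to do
        started = time.time()
        # ... the actual task body would run here ...
        results[name] = {"status": "SUCCESS", "runtime": time.time() - started}

beat(["tasks.fetch", "tasks.feed"])
t = threading.Thread(target=worker)
t.start()
t.join()
```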
Supervisor Process Management

I don't know whether anyone has used supervise; early in my projects I often used it to monitor my programs and restart them automatically when they died. supervisor is a similar process-management tool, and here I use it to manage the celery programs and uWSGI.

Here is the configuration from one of my local environments, with the explanation inline as comments:

; program name
[program:celery-queue-fetch]
; the command to run; -Q specifies the queues to produce to and consume from
; (comma-separated for several), -c is the number of workers, i.e. the concurrency
command=python /home/dongwm/work/manage.py celery worker -E --settings=settings_local --loglevel=info -Q fetch_ -c 30
; the directory the program runs in
directory=/home/dongwm/work/
; the user the program runs as
user=dongwm
; number of instances to start, default 1
numprocs=1
stdout_logfile=/home/dongwm/work/celerylog/celery.log
stderr_logfile=/home/dongwm/work/celerylog/celery.log
; start automatically when supervisord starts
autostart=true
; restart automatically if the program exits for some unexpected reason
autorestart=true
; seconds the process must stay up after starting to be considered successfully started
startsecs=10
; seconds to wait for the OS to deliver SIGCHLD after sending the stop signal
stopwaitsecs=10
; lower priority values start first and shut down last
priority=998
; the next two lines make sure the child processes are killed along with the master;
; otherwise supervisor kills only the master process and the children are orphaned
stopsignal=KILL
stopasgroup=true

[program:celery-queue-feed]
command=python /home/dongwm/work/manage.py celeryd -E --settings=settings_local --loglevel=info -Q feed
directory=/home/dongwm/work/
user=dongwm
numprocs=1
stdout_logfile=/home/dongwm/work/celerylog/celery.log
stderr_logfile=/home/dongwm/work/celerylog/celery.log
autostart=true
autorestart=true
startsecs=10
stopwaitsecs=10
priority=998
stopsignal=KILL
stopasgroup=true

[program:celerycam]
; the task snapshot interval is 10s
command=python /home/dongwm/work/manage.py celerycam -F 10 --settings=settings_local
directory=/home/dongwm/work/
user=dongwm
numprocs=1
stdout_logfile=/home/dongwm/work/celerylog/celerycam.log
stderr_logfile=/home/dongwm/work/celerylog/celerycam.log
autostart=true
autorestart=true
startsecs=5
stopwaitsecs=5
priority=998
stopsignal=KILL
stopasgroup=true

[program:celerybeat]
command=python /home/dongwm/work/manage.py celerybeat --settings=settings_real_old --loglevel=debug
directory=/home/dongwm/work/
user=dongwm
numprocs=1
stdout_logfile=/home/dongwm/work/celerylog/celery_beat.log
stderr_logfile=/home/dongwm/work/celerylog/celery_beat.log
autostart=true
autorestart=true
startsecs=10
priority=999
stopsignal=KILL
stopasgroup=true

; this is the official superlance script that watches for processes exiting
; abnormally; I changed it heavily so that it e-mails me whenever a program misbehaves
[eventlistener:crashmail]
command=python /home/dongwm/superlance/superlance/crashmail.py -a -m [email protected]
events=PROCESS_STATE_EXITED

[program:uwsgi]
user=dongwm
numprocs=1
command=/usr/local/bin/uwsgi -s /tmp/uwsgi-sandbox.sock --processes 4 --enable-threads --pythonpath /home/dongwm/uwsgi --buffer-size 32768 --listen --daemonize /home/dongwm/ulog/uwsgi_out.log
directory=/home/dongwm/work
autostart=true
autorestart=true
redirect_stderr=true
stopsignal=KILL
stopasgroup=true
Nginx + uWSGI in practice

The number of Nginx worker processes directly affects performance. If none of the modules you use makes blocking calls, worker_processes should match the number of CPUs; otherwise you need to configure a few more. For example, if your users read a lot of local static files and the server has little memory, I/O calls to the disk may block a worker for short periods of time.

To take advantage of multiple cores, I bind each worker to its corresponding core:

worker_processes     4;
worker_cpu_affinity  0001 0010 0100 1000;

