Distributed Task Queue Celery: Detailed Workflow

Source: Internet
Author: User
Table of Contents

  • Preface
  • Task signature (signature)
  • Partial function
  • Callback function
  • Celery workflow
  • group: task group
  • chain: task chain
  • chord: composite task
  • chunks: task blocks
  • map/starmap: task mapping

Preface

Celery's workflows have a strongly functional programming style. Before looking at workflows, we need to understand three concepts: the "signature", the "partial function", and the "callback function".

The example code in this article reuses the tasks.py module from the previous article, with a few small modifications.

# filename: tasks.py
from proj.celery import app

@app.task
def add(x, y, debug=False):
    # Optionally print the operands before returning their sum
    if debug:
        print("x: %s; y: %s" % (x, y))
    return x + y

@app.task
def log(msg):
    return "LOG: %s" % msg

Task Signature (signature)

Celery's signature() function (also available under the older name subtask) generates a special object: the task signature. A task signature is similar to a function signature, but besides the usual declaration information (formal parameters, return value) it also carries the calling convention and all or part of the argument list needed to execute the function. You can pass a signature around and invoke it anywhere without worrying about how its arguments are supplied, because they may already have been bound when the signature was generated. This property makes it easy to combine, nest, and schedule different tasks.
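
To make the idea concrete, here is a minimal pure-Python sketch of what a signature bundles together. The SimpleSignature class is hypothetical and exists only for illustration; it is not Celery's implementation, and it ignores execution options such as countdown.

```python
class SimpleSignature:
    """Toy stand-in for a task signature (hypothetical, not Celery's class)."""

    def __init__(self, func, args=(), kwargs=None):
        self.func = func
        self.args = tuple(args)
        self.kwargs = dict(kwargs or {})

    def __call__(self, *extra_args, **extra_kwargs):
        # In this sketch, call-time positional arguments are placed before the
        # stored ones, and call-time keyword arguments override the stored ones.
        merged_kwargs = {**self.kwargs, **extra_kwargs}
        return self.func(*extra_args, *self.args, **merged_kwargs)


def add(x, y):
    return x + y


full = SimpleSignature(add, args=(2, 3))        # every argument already bound
partial_sig = SimpleSignature(add, args=(1,))   # one argument still missing

print(full())            # 5
print(partial_sig(10))   # 11
```

Because the object carries both the function and its arguments, it can be stored, passed to other code, and invoked later without the caller knowing what the arguments are.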

A task signature can be executed in two ways: "direct execution" and "worker execution".

Generate a task signature and execute it directly (the signature runs in the current process):

>>> from celery import signature
>>> from proj.task.tasks import add

# Way one
>>> signature('proj.task.tasks.add', args=(2, 3), countdown=10)
proj.task.tasks.add(2, 3)
>>> s_add = signature('proj.task.tasks.add', args=(2, 3), countdown=10)
>>> s_add()
5

# Way two
>>> add.signature((3, 4), countdown=10)
proj.task.tasks.add(3, 4)
>>> s_add = add.signature((3, 4), countdown=10)
>>> s_add()
7

# Way three
>>> add.subtask((3, 4), countdown=10)
proj.task.tasks.add(3, 4)
>>> s_add = add.subtask((3, 4), countdown=10)
>>> s_add()
7

# Way four
>>> add.s(3, 4)
proj.task.tasks.add(3, 4)
>>> s_add = add.s(3, 4)
>>> s_add()
7

Generate a task signature and hand it to a worker for execution (the signature runs in the worker process):

# Call delay/apply_async to send the signature to a worker for execution
>>> s_add = add.s(2, 2)
>>> s_add.delay()
<AsyncResult: 75be3776-b36b-458e-9a89-512121cdaa32>
>>> s_add.apply_async()
<AsyncResult: 4f1bf824-331a-42c0-9580-48b5a45c2f7a>

>>> s_add = add.s(2, 2)
>>> s_add.delay(debug=True)     # A task signature also accepts arguments at call time
<AsyncResult: 1a1c97c5-8e81-4871-bb8d-def39eb539fc>
>>> s_add.apply_async(kwargs={'debug': True})
<AsyncResult: 36d10f10-3e6f-46c4-9dde-d2eabb24c61c>

Partial function

Partial function application (PFA) converts a function with any number (and order) of parameters into a new function in which some of the arguments are already bound. Put simply, it returns a new function object with some of the original function's parameters fixed to actual arguments. From one angle this resembles default parameter values, but the two are not the same: a default parameter list is fixed once in the function definition, whereas with PFA the set of fixed arguments can be chosen freely. From a single function you can derive many partial functions, each with a different set of fixed arguments. For example:

# Ordinary partial functions:
>>> from functools import partial
>>> add_1 = partial(add, 1)
>>> add_1(2)
3
# add_1(x) == add(1, x)

>>> int_base2 = partial(int, base=2)
>>> int_base2.__doc__ = 'Convert base 2 string to an int.'
>>> int_base2('10010')
18
# int_base2(x) == int(x, base=2)
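
The contrast with default parameters can be sketched as follows. The power function is a hypothetical example: its default is fixed once in the definition, whereas partial can produce any number of variants that each freeze a different set of arguments.

```python
from functools import partial


def power(base, exponent=2):
    # The default exponent is fixed once, in the function definition.
    return base ** exponent


# Each partial freezes a different set of arguments of the same function.
square = partial(power, exponent=2)
cube = partial(power, exponent=3)
two_to_the = partial(power, 2)   # freezes the positional argument `base`

print(square(4))        # 16
print(cube(4))          # 64
print(two_to_the(10))   # 1024
```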

A partial function in Celery is simply a task signature with some of its arguments already bound. For example:

# Celery partial functions
>>> add.s(1)
proj.task.tasks.add(1)
>>> s_add_1 = add.s(1)
>>> s_add_1(10)
11
>>> s_add_1.delay(10)
<AsyncResult: eb88ad9c-31f6-484f-8fd5-735a498aedbc>

Callback function

A callback function is a function that is invoked through a function pointer (in Python, a reference to the function object). If you pass a function pointer as an argument to another function, and that pointer is later used to invoke the function it points to, the invoked function is called a callback. A callback responds to an event or condition: it is called when a particular event occurs or a specified condition is met.
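
As a plain-Python illustration of this definition, the sketch below passes functions as arguments and calls them back on success or on failure. The helper run_with_callbacks and the divide function are hypothetical names introduced only for this example.

```python
def run_with_callbacks(func, args, on_success=None, on_error=None):
    """Run func(*args) and invoke a callback depending on the outcome."""
    try:
        result = func(*args)
    except Exception as exc:
        # Condition "execution failed": call back with the exception.
        if on_error is not None:
            on_error(exc)
        return None
    # Condition "execution succeeded": call back with the result.
    if on_success is not None:
        on_success(result)
    return result


def divide(x, y):
    return x / y


events = []
run_with_callbacks(divide, (6, 3),
                   on_success=lambda r: events.append(("ok", r)))
run_with_callbacks(divide, (1, 0),
                   on_error=lambda e: events.append(("failed", type(e).__name__)))
print(events)   # [('ok', 2.0), ('failed', 'ZeroDivisionError')]
```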

A callback function in Celery is again a task signature, and the events that trigger it are "task execution succeeded" and "task execution failed".
Pass the link or link_error option to apply_async to specify the callback. For example, a callback on task success:

# By default, the callback receives the result of the parent task as its argument
>>> from proj.task.tasks import add, log
>>> result = add.apply_async(args=(1, 2), link=log.s())
>>> result.get()
3

Callback on task failure:

# add succeeds here, so the link_error callback is not triggered
>>> result = add.apply_async(args=(1, 2), link_error=log.s())
>>> result.status
u'SUCCESS'
>>> result.get()
3

If you do not want the callback to receive the result of the parent task, make the callback's signature immutable:

>>> add.apply_async((2, 2), link=log.signature(args=('Task SUCCESS',), immutable=True))
<AsyncResult: c136ad34-68b4-49a9-8462-84ac8cd75810>
# Shorthand
>>> add.apply_async((2, 2), link=log.si('Task SUCCESS'))
<AsyncResult: bbb35212-5a6b-427b-a6a6-d1eb5359365e>
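
The effect of immutability can be sketched in plain Python. The run_callback helper below is hypothetical and only models the behaviour described in this section: by default the parent result is passed to the callback as its first argument, while an immutable callback uses only the arguments fixed in advance.

```python
def log(msg):
    return "LOG: %s" % msg


def run_callback(parent_result, callback, args=(), immutable=False):
    """Call `callback`, optionally ignoring the parent task's result."""
    if immutable:
        # Immutable: use only the arguments that were fixed in advance.
        return callback(*args)
    # Default: the parent's result is passed as the first argument.
    return callback(parent_result, *args)


parent_result = 2 + 2

print(run_callback(parent_result, log))
# LOG: 4
print(run_callback(parent_result, log, ("Task SUCCESS",), immutable=True))
# LOG: Task SUCCESS
```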

Of course, the callback function and the partial function can be used in combination for better flexibility:

>>> result = add.apply_async((2, 2), link=add.s(2))

Note: the result of the callback is not returned to the caller, so result.get() only yields the result of the first task.

>>> result = add.apply_async((2, 2), link=add.s(2))
>>> result.get()
4

Celery Workflow

group: Task Group

The group function takes a list of task signatures and returns a new task signature, the group signature. Calling a group signature executes all the task signatures it contains in parallel and returns a list of all their results. It is commonly used to create many tasks at once.

>>> from celery import group
>>> from proj.task.tasks import add
>>> add_group_sig = group(add.s(i, i) for i in range(10))
>>> result = add_group_sig.delay()
>>> result.get()
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
# Returns multiple results
>>> result.results
[<AsyncResult: 1716cfd0-e87c-4b3d-a79f-1112958111b1>,
 <AsyncResult: a7a18bde-726e-49b2-88ed-aeba5d3bf5f2>,
 <AsyncResult: b9d9c538-2fad-475a-b3d1-bd1488278ce2>,
 <AsyncResult: 6f370fdd-ed7e-430a-a335-af4650ca15cf>,
 <AsyncResult: a6ddbe14-5fbd-4079-9f12-35ebbc89d89b>,
 <AsyncResult: 65dece11-9f38-4940-9fa0-7fcf09266c7a>,
 <AsyncResult: 8205ffc0-1056-469a-a642-96676d1518e7>,
 <AsyncResult: e77b7e2b-66d2-48b8-9ffd-4f8fa7d9f4a4>,
 <AsyncResult: 355b7d01-72c1-4b00-8572-407e751d76c3>,
 ...]

(Parallel execution)
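
The "run everything in parallel, collect a list of results" behaviour of a group can be approximated with the standard library. This is only an analogy using local threads, not how Celery distributes work to workers; run_group is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor


def add(x, y):
    return x + y


def run_group(calls, max_workers=4):
    """Run (func, args) pairs concurrently; return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(func, *args) for func, args in calls]
        return [future.result() for future in futures]


results = run_group([(add, (i, i)) for i in range(10)])
print(results)   # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```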

chain: Task Chain

The chain function takes several task signatures and returns a new task signature, the chain signature. Calling a chain signature executes the task signatures it contains serially; the result of each task signature is passed as the first argument to the next one, and only a single final result is returned.

>>> from celery import chain
>>> from proj.task.tasks import add
>>> add_chain_sig = chain(add.s(1, 2), add.s(3))
# Simplified syntax
>>> add_chain_sig = (add.s(1, 2) | add.s(3))
>>> result = add_chain_sig.delay()          # ((1 + 2) + 3)
>>> result.status
u'SUCCESS'
>>> result.get()
6
# Only the final result is returned
>>> result.results
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'AsyncResult' object has no attribute 'results'
# Combined with partial functions
>>> add_chain_sig = chain(add.s(1), add.s(3))
>>> result = add_chain_sig.delay(3)        # ((3 + 1) + 3)
>>> result.get()
7

(Serial execution)
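
The way a chain threads each result into the next step can be sketched with functools.reduce. This is a local, synchronous analogy rather than Celery code; run_chain is a hypothetical helper, and each later step is a partial that still lacks its first argument.

```python
from functools import partial, reduce


def add(x, y):
    return x + y


def run_chain(first, *rest):
    """Call `first`, then feed each result to the next step as its first argument."""
    return reduce(lambda result, step: step(result), rest, first())


# ((1 + 2) + 3)
print(run_chain(partial(add, 1, 2), partial(add, y=3)))   # 6
```

Only the value produced by the last step is returned, which matches the single result of a chain.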

chord: Composite Task

When a task signature generated by the chord function is called, it first executes a group signature (a chain signature is not supported in this position) and then executes a callback function once every task in the group has completed.

>>> from proj.task.tasks import add, log
>>> from celery import chord, group, chain
>>> add_chord_sig = chord(group(add.s(i, i) for i in range(10)), log.s())
>>> result = add_chord_sig.delay()
>>> result.status
u'SUCCESS'
>>> result.get()
u'LOG: [0, 2, 4, 6, 8, 10, 12, 16, 14, 18]'
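
The two phases of a chord (a parallel header followed by a single callback) can be mimicked locally with the standard library. This is only an analogy under the assumption that threads stand in for workers; run_chord is a hypothetical helper, and unlike the output above it collects results in submission order.

```python
from concurrent.futures import ThreadPoolExecutor


def add(x, y):
    return x + y


def log(msg):
    return "LOG: %s" % msg


def run_chord(header, callback):
    """Run the header calls concurrently, then pass all results to callback."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(func, *args) for func, args in header]
        results = [future.result() for future in futures]   # wait for every call
    # The callback runs only after the whole header has finished.
    return callback(results)


print(run_chord([(add, (i, i)) for i in range(10)], log))
# LOG: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```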
  

As you can see, the tasks in the group are still executed in parallel, but the group and the callback are executed serially; this is why chord is called a composite task.

chunks: Task Blocks

The chunks function lets you split a large number of objects to be processed into task blocks: with 1 million objects, you can create 10 task blocks that each handle 100,000 objects. You might worry that processing in blocks reduces parallelism and hurts performance, but in practice it can improve performance considerably, because it avoids much of the message-passing overhead.

>>> add_chunks_sig = add.chunks(zip(range(100), range(100)), 10)
>>> result = add_chunks_sig.delay()
>>> result.get()
[[0, 2, 4, 6, 8, 10, 12, 14, 16, 18],
 [20, 22, 24, 26, 28, 30, 32, 34, 36, 38],
 [40, 42, 44, 46, 48, 50, 52, 54, 56, 58],
 [60, 62, 64, 66, 68, 70, 72, 74, 76, 78],
 [80, 82, 84, 86, 88, 90, 92, 94, 96, 98],
 [100, 102, 104, 106, 108, 110, 112, 114, 116, 118],
 [120, 122, 124, 126, 128, 130, 132, 134, 136, 138],
 [140, 142, 144, 146, 148, 150, 152, 154, 156, 158],
 [160, 162, 164, 166, 168, 170, 172, 174, 176, 178],
 [180, 182, 184, 186, 188, 190, 192, 194, 196, 198]]
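
The splitting step itself can be sketched in plain Python. The helpers split_into_chunks and process_chunk are hypothetical; they only show how 100 argument tuples become 10 units of work, so that 10 messages would be needed instead of 100.

```python
from itertools import islice


def add(x, y):
    return x + y


def split_into_chunks(iterable, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def process_chunk(chunk):
    # One unit of work handles a whole chunk of argument tuples.
    return [add(*args) for args in chunk]


pairs = zip(range(100), range(100))
results = [process_chunk(chunk) for chunk in split_into_chunks(pairs, 10)]
print(len(results))   # 10
print(results[0])     # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```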

map/starmap: Task Mapping

The mapping functions are similar to Python's built-in map function from functional programming: they pass each element of a sequence, in turn, as the argument to a given function.
The difference between map and starmap is that the former passes each element as a single argument, while the latter unpacks each element into multiple arguments.

>>> add.starmap(zip(range(10), range(10)))
[proj.task.tasks.add(*x) for x in [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9)]]
>>> result = add.starmap(zip(range(10), range(10))).delay()
>>> result.get()
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

If you use map with the add function, you get an error, because map passes only a single argument to each call.

>>> add.map(zip(range(10), range(10)))
[proj.task.tasks.add(x) for x in [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9)]]
>>> result = add.map(zip(range(10), range(10))).delay()
>>> result.status
u'FAILURE'
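
The same distinction exists in the standard library, which makes the failure easy to reproduce locally. The sketch below uses the built-in map and itertools.starmap, not Celery's task methods, with a plain two-argument add function.

```python
from itertools import starmap


def add(x, y):
    return x + y


pairs = list(zip(range(10), range(10)))

# starmap unpacks each tuple: add(*(0, 0)), add(*(1, 1)), ...
print(list(starmap(add, pairs)))   # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

# map passes each tuple as one argument: add((0, 0)) raises TypeError
try:
    list(map(add, pairs))
    map_error = None
except TypeError as exc:
    map_error = exc
print(type(map_error).__name__)    # TypeError
```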
