Queue module and source code analysis

The Queue module provides queue operations. Queues are one of the most common ways to exchange data between threads. The module provides three kinds of queue:

Queue.Queue(maxsize): a first-in, first-out (FIFO) queue. maxsize is the upper bound on the queue size; if maxsize is less than or equal to zero, the queue size is unbounded.

Queue.LifoQueue(maxsize): a last-in, first-out (LIFO) queue, which behaves like a stack.

Queue.PriorityQueue(maxsize): a priority queue; entries with the lowest priority value are retrieved first.

LifoQueue and PriorityQueue are both subclasses of Queue, and the three classes share the following methods (a short usage sketch follows the list):

qsize(): returns the approximate size of the queue. Why "approximate"? Because the queue is shared between threads, the reported size may be stale by the time you act on it: qsize() > 0 does not guarantee that a later get() will not block, and qsize() < maxsize does not guarantee that a later put() will not block.

empty(): returns a Boolean; True if the queue is empty, False otherwise.

full(): if a queue size was set, returns True when the queue is full and False otherwise.

put(item[, block[, timeout]]): adds element item to the queue. If block is False and the queue is full, a Full exception is raised immediately. If block is True and timeout is None, the call waits until a slot becomes free. If block is True and a timeout is given, the call waits at most timeout seconds and then raises Full.

put_nowait(item): equivalent to put(item, False), i.e. a non-blocking put.

get([block[, timeout]]): removes an element from the queue and returns it. If block is False and the queue is empty, Empty is raised immediately. If block is True and timeout is None, the call waits until an item is available. If a positive timeout is given, it blocks at most timeout seconds and raises Empty if no item arrives in that time.

get_nowait(): equivalent to get(False).

task_done(): sends a signal indicating that a previously enqueued task has been completed; commonly called in consumer threads.

join(): blocks until all elements in the queue have been retrieved and processed, i.e. until task_done() has been called for every item that was put.
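
Before diving into the source, here is a minimal usage sketch (not from the module itself) showing that the three classes differ only in retrieval order; the values are arbitrary:

from Queue import Queue, LifoQueue, PriorityQueue

fifo, lifo, prio = Queue(), LifoQueue(), PriorityQueue()
for n in (1, 2, 3):
    fifo.put(n)                        # FIFO: comes out 1, 2, 3
    lifo.put(n)                        # LIFO: comes out 3, 2, 1
    prio.put((4 - n, 'job%d' % n))     # priority: lowest priority number first

print [fifo.get() for _ in range(3)]   # [1, 2, 3]
print [lifo.get() for _ in range(3)]   # [3, 2, 1]
print [prio.get() for _ in range(3)]   # [(1, 'job3'), (2, 'job2'), (3, 'job1')]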

(1) Source code analysis

The Queue module is easy to use, but it is worth pasting the module's source code here for analysis; there is a lot to learn from how clean and well structured code written by experienced developers is, especially compared with our own. To keep the listing short, the comments have been removed.

from time import time as _time
try:
    import threading as _threading
except ImportError:
    import dummy_threading as _threading
from collections import deque
import heapq

__all__ = ['Empty', 'Full', 'Queue', 'PriorityQueue', 'LifoQueue']

class Empty(Exception):
    "Exception raised by Queue.get(block=0)/get_nowait()."
    pass

class Full(Exception):
    "Exception raised by Queue.put(block=0)/put_nowait()."
    pass

class Queue:
    def __init__(self, maxsize=0):
        self.maxsize = maxsize
        self._init(maxsize)
        self.mutex = _threading.Lock()
        self.not_empty = _threading.Condition(self.mutex)
        self.not_full = _threading.Condition(self.mutex)
        self.all_tasks_done = _threading.Condition(self.mutex)
        self.unfinished_tasks = 0

    def get_nowait(self):
        return self.get(False)

    def _init(self, maxsize):
        self.queue = deque()

    def _qsize(self, len=len):
        return len(self.queue)

    def _put(self, item):
        self.queue.append(item)

    def _get(self):
        return self.queue.popleft()

From the functions above you can see that the Queue object is built on top of deque from the collections module (for details about the collections module, see "Python: count statistics using Counter and collections module"), wrapped with a mutex lock and condition variables from the threading module.

deque is a double-ended queue and is suitable for implementing both queues and stacks. The Queue class above is a first-in, first-out queue: its _init() method creates a deque, and its _put() and _get() methods append elements on the right end and remove elements from the left end, which yields FIFO behaviour. From this it is easy to guess how LifoQueue (a last-in, first-out queue) is implemented: just make sure elements are also removed from the right end. Here is its source code:

class LifoQueue(Queue):
    '''Variant of Queue that retrieves most recently added entries first.'''

    def _init(self, maxsize):
        self.queue = []

    def _qsize(self, len=len):
        return len(self.queue)

    def _put(self, item):
        self.queue.append(item)

    def _get(self):
        return self.queue.pop()

Although its "queue" does not use queue (), the same applies to the list, because the list append () and pop () operations add and delete elements on the rightmost side.

Now let's look at PriorityQueue, the priority queue. It uses the heappush() and heappop() functions of the heapq module, which implements the heap data structure and provides the corresponding operations on it. In the _init() method, self.queue = [] can be regarded as an empty heap; heappush() inserts a new value into the heap, and heappop() pops the smallest value off the heap, which is what gives the queue its priority ordering (this is only a brief introduction to heapq). The source code is as follows:

class PriorityQueue(Queue):
    '''Variant of Queue that retrieves open entries in priority order (lowest first).

    Entries are typically tuples of the form:  (priority number, data).
    '''

    def _init(self, maxsize):
        self.queue = []

    def _qsize(self, len=len):
        return len(self.queue)

    def _put(self, item, heappush=heapq.heappush):
        heappush(self.queue, item)

    def _get(self, heappop=heapq.heappop):
        return heappop(self.queue)
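
As the docstring says, entries are typically (priority number, data) tuples. A brief illustrative example (the job names are made up):

from Queue import PriorityQueue

pq = PriorityQueue()
pq.put((3, 'low priority job'))
pq.put((1, 'urgent job'))
pq.put((2, 'normal job'))
while not pq.empty():
    print pq.get()   # (1, 'urgent job'), then (2, 'normal job'), then (3, 'low priority job')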

That covers the underlying data structures; now let's analyse the remaining parts.

mutex is a threading.Lock() object, i.e. a mutual exclusion lock. not_empty, not_full and all_tasks_done are all threading.Condition() objects, i.e. condition variables, and all three are bound to the same mutex (for details about the Lock and Condition objects of the threading module, see the earlier post "Python: thread, process, and coroutine (2) -- threading module"). A stripped-down sketch of this one-lock, multi-condition pattern follows the list below.

Where:

self.mutex: any operation that reads the queue's state (empty(), qsize()) or modifies its contents (get(), put(), etc.) must hold this mutex; acquire() takes the lock and release() releases it. The same mutex is shared by the three condition variables.

self.not_empty condition variable: after a thread adds data to the queue, it calls self.not_empty.notify() to wake up a thread that is waiting to remove an element.

self.not_full condition variable: when an element is removed from the queue, this condition is used to wake up a thread that is waiting to add an element.

self.all_tasks_done condition variable: when the number of unfinished tasks drops to 0, all waiting threads are notified that every task has been completed.

self.unfinished_tasks: the number of tasks that have not yet been completed.
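
Before reading the real put()/get() code, here is a stripped-down, standalone sketch (not from Queue.py) of the same one-lock, two-condition pattern these attributes implement; the buffer size of 5 is arbitrary:

import threading
from collections import deque

buf = deque()
MAXSIZE = 5
mutex = threading.Lock()
not_empty = threading.Condition(mutex)   # consumers wait on this
not_full = threading.Condition(mutex)    # producers wait on this

def produce(item):
    with not_full:                       # acquires the shared mutex
        while len(buf) == MAXSIZE:
            not_full.wait()              # full: wait until a consumer frees a slot
        buf.append(item)
        not_empty.notify()               # wake one waiting consumer

def consume():
    with not_empty:                      # acquires the same mutex
        while not buf:
            not_empty.wait()             # empty: wait until a producer adds an item
        item = buf.popleft()
        not_full.notify()                # wake one waiting producer
        return item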


Let's take a look at the main methods:

(1) put()

The source code is as follows:

def put(self, item, block=True, timeout=None):
    self.not_full.acquire()              # acquire the lock through not_full
    try:
        if self.maxsize > 0:             # only check capacity if the queue is bounded
            if not block:                # non-blocking mode
                if self._qsize() == self.maxsize:
                    raise Full           # queue is full, raise immediately
            elif timeout is None:        # blocking mode with no timeout: wait forever
                while self._qsize() == self.maxsize:
                    self.not_full.wait()
            elif timeout < 0:
                raise ValueError("'timeout' must be a non-negative number")
            else:                        # blocking mode with a timeout: deadline = now + timeout
                endtime = _time() + timeout
                while self._qsize() == self.maxsize:
                    remaining = endtime - _time()
                    if remaining <= 0.0:
                        raise Full       # timed out, raise Full
                    # queue is still full: wait until a slot is freed or time runs out
                    self.not_full.wait(remaining)
        self._put(item)                  # call _put() to append the element
        self.unfinished_tasks += 1       # one more unfinished task
        self.not_empty.notify()          # notify a waiting consumer that the queue is not empty
    finally:
        self.not_full.release()          # release the lock

By default, block is True and timeout is None. If the queue is full, the call waits; once there is room, the _put() method is called to append the item to the deque (described above), unfinished_tasks is increased by 1, and not_empty is notified that the queue is no longer empty.

If block is set to False, an exception is raised as soon as the queue is full. If a timeout is given, the call blocks until the deadline is reached and then raises Full. This method does its waiting and notification through the not_full condition variable.
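
A small sketch of the three put() modes (the queue size and values are arbitrary):

from Queue import Queue, Full

q = Queue(maxsize=1)
q.put('a')                      # succeeds, queue is now full

try:
    q.put('b', False)           # non-blocking: raises Full immediately
except Full:
    print 'queue is full (non-blocking put)'

try:
    q.put('b', True, 0.1)       # blocking with timeout: raises Full after ~0.1 s
except Full:
    print 'queue is still full after the timeout'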

(2) get()

The source code is as follows:

def get(self, block=True, timeout=None):
    self.not_empty.acquire()             # acquire the lock through not_empty
    try:
        if not block:                    # non-blocking mode
            if not self._qsize():
                raise Empty              # queue is empty, raise immediately
        elif timeout is None:            # blocking mode with no timeout: wait forever
            while not self._qsize():
                self.not_empty.wait()
        elif timeout < 0:
            raise ValueError("'timeout' must be a non-negative number")
        else:                            # blocking mode with a timeout
            endtime = _time() + timeout
            while not self._qsize():
                remaining = endtime - _time()
                if remaining <= 0.0:
                    raise Empty          # timed out, raise Empty
                self.not_empty.wait(remaining)
        item = self._get()               # call _get() to remove and obtain an item
        self.not_full.notify()           # notify a waiting producer that the queue is not full
        return item                      # return the item
    finally:
        self.not_empty.release()         # release the lock

The logic mirrors put(): by default, if the queue is empty the call waits; otherwise _get() is called to remove an item, not_full is notified, and finally the item is returned. This method does its waiting and notification through the not_empty condition variable.
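
And a matching sketch for get() (again with arbitrary values):

from Queue import Queue, Empty

q = Queue()

try:
    q.get_nowait()              # equivalent to get(False): raises Empty immediately
except Empty:
    print 'queue is empty (non-blocking get)'

try:
    q.get(True, 0.1)            # blocking with timeout: raises Empty after ~0.1 s
except Empty:
    print 'queue is still empty after the timeout'

q.put(42)
print q.get()                   # 42, returned immediately because an item is available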

It is easier to understand put() and get() together. not_full and not_empty serve two different kinds of thread. not_full can be read as "the queue is not full": a producer thread acquires the lock through this condition, checks whether there is room, and only adds an element when the check passes. After a successful add it notifies the not_empty condition, because it has just added an element and the basic precondition for a remove (the queue is not empty; removing from an empty queue makes no sense) is now satisfied; this wakes up consumer threads that were suspended waiting to remove an element, and those threads re-check the condition and remove an element if they can. Symmetrically, after a consumer removes an element it notifies not_full, because the basic precondition for an add (the queue is not full; adding to a full queue makes no sense) is now satisfied; this wakes up producer threads that were suspended waiting to add an element, and they re-check the condition and either add an element or go back to waiting. And so on, with thread safety guaranteed throughout: removing an element wakes a thread that wants to add one, and adding an element wakes a thread that wants to remove one.
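
To see this hand-off from the caller's side, here is a minimal two-thread sketch (not from the module): the consumer blocks in get() on an empty queue and is woken as soon as the producer calls put().

import threading, time
from Queue import Queue

q = Queue()

def consumer():
    print 'consumer: waiting for an item...'
    item = q.get()               # blocks on not_empty until put() notifies it
    print 'consumer: got', item

t = threading.Thread(target=consumer)
t.start()
time.sleep(1)                    # give the consumer time to block first
print 'producer: putting an item'
q.put('hello')                   # _put() + not_empty.notify() wakes the consumer
t.join()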

(3) task_done()

The source code is as follows:

def task_done(self):
    self.all_tasks_done.acquire()            # acquire the lock
    try:
        unfinished = self.unfinished_tasks - 1
        if unfinished <= 0:                  # all queued tasks are done
            if unfinished < 0:
                raise ValueError('task_done() called too many times')
            self.all_tasks_done.notify_all() # wake every thread waiting in join()
        self.unfinished_tasks = unfinished   # otherwise just record unfinished tasks - 1
    finally:
        self.all_tasks_done.release()        # release the lock at the end

This method records that one queued task has been completed. It first acquires the lock through the all_tasks_done object, decrements the unfinished-task counter, notifies all waiting threads once the counter reaches zero, and finally releases the lock.


(4) join()

The source code is as follows:

def join(self):
    self.all_tasks_done.acquire()
    try:
        while self.unfinished_tasks:     # while tasks remain, wait on all_tasks_done
            self.all_tasks_done.wait()
    finally:
        self.all_tasks_done.release()

This is a blocking method: while there are unfinished tasks in the queue, join() blocks until task_done() has been called for every one of them.
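
A typical pairing of put(), get(), task_done() and join() looks roughly like this (the worker logic and item values are made up for illustration):

import threading
from Queue import Queue

q = Queue()

def worker():
    while True:
        item = q.get()           # blocks until an item is available
        print 'processing', item
        q.task_done()            # one unfinished task fewer

t = threading.Thread(target=worker)
t.setDaemon(True)                # daemon thread: exits when the main thread exits
t.start()

for n in range(5):
    q.put(n)                     # each put() increments unfinished_tasks

q.join()                         # blocks until task_done() has been called 5 times
print 'all items processed'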


The other methods are relatively simple and easy to understand; if you are interested, look at the rest of the source in Queue.py. The one thing to note is that any operation that reads the queue's state (empty(), qsize()) or modifies its contents (get(), put(), etc.) must hold the mutex.
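
For example, qsize() in Queue.py is roughly the following (empty() and full() follow the same acquire/release pattern):

def qsize(self):
    self.mutex.acquire()         # hold the mutex while reading the state
    n = self._qsize()
    self.mutex.release()
    return n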

(2) Simple example

Implement a thread that continuously produces random numbers and puts them into a queue;

Implement a thread that continuously takes odd numbers out of that queue;

Implement another thread that continuously takes even numbers out of that queue.

import random, threading, time
from Queue import Queue

is_product = True

class Producer(threading.Thread):
    """Produce data"""
    def __init__(self, t_name, queue):
        threading.Thread.__init__(self, name=t_name)
        self.data = queue

    def run(self):
        global is_product
        while 1:
            if self.data.full():
                is_product = False
            else:
                if self.data.qsize() <= 7:          # add elements while the queue length is <= 7
                    is_product = True
                    for i in range(2):              # put two elements into the queue each time
                        randomnum = random.randint(1, 99)
                        print "%s: %s is producing %d to the queue!" % (time.ctime(), self.getName(), randomnum)
                        self.data.put(randomnum, False)   # store the data in the queue
                    time.sleep(1)
                    print "deque length is %s" % self.data.qsize()
                else:
                    if is_product:
                        for i in range(2):
                            randomnum = random.randint(1, 99)
                            print "%s: %s is producing %d to the queue!" % (time.ctime(), self.getName(), randomnum)
                            self.data.put(randomnum, False)
                        time.sleep(1)
                        print "deque length is %s" % self.data.qsize()
                    else:
                        pass
        print "%s: %s finished!" % (time.ctime(), self.getName())

# Consumer thread that consumes even numbers
class Consumer_even(threading.Thread):
    def __init__(self, t_name, queue):
        threading.Thread.__init__(self, name=t_name)
        self.data = queue

    def run(self):
        while 1:
            if self.data.qsize() > 7:               # consume only while the queue length is greater than 7
                val_even = self.data.get(False)
                if val_even % 2 == 0:
                    print "%s: %s is consuming. %d in the queue is consumed!" % (time.ctime(), self.getName(), val_even)
                    time.sleep(2)
                else:
                    self.data.put(val_even)         # not an even number, put it back
                    time.sleep(2)
                print "deque length is %s" % self.data.qsize()
            else:
                pass

# Consumer thread that consumes odd numbers
class Consumer_odd(threading.Thread):
    def __init__(self, t_name, queue):
        threading.Thread.__init__(self, name=t_name)
        self.data = queue

    def run(self):
        while 1:
            if self.data.qsize() > 7:
                val_odd = self.data.get(False)
                if val_odd % 2 != 0:
                    print "%s: %s is consuming. %d in the queue is consumed!" % (time.ctime(), self.getName(), val_odd)
                    time.sleep(2)
                else:
                    self.data.put(val_odd)          # not an odd number, put it back
                    time.sleep(2)
                print "deque length is %s" % self.data.qsize()
            else:
                pass

# Main thread
def main():
    queue = Queue(20)
    producer = Producer('Pro.', queue)
    consumer_even = Consumer_even('Con_even.', queue)
    consumer_odd = Consumer_odd('Con_odd.', queue)
    producer.start()
    consumer_even.start()
    consumer_odd.start()
    producer.join()
    consumer_even.join()
    consumer_odd.join()

if __name__ == '__main__':
    main()
