Python multi-threaded reading of the same file

Source: Internet
Author: User


When several threads read the same file, every line must be read exactly once: no line may be read twice, and none may be skipped.

 

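At first I tried an approach that turned out not to work in practice.

The main thread assigns each reader thread the range of lines it is responsible for. For example, thread 1 reads lines 1-10 and thread 2 reads lines 11-30.

Each thread then reads the file with readline(). If the line it has just read falls outside its own range, it skips that line with continue.

In practice, the threads did not read the lines they were expected to.

My guess is that the threads ended up with the same file handle from open(), or at least that there was only one file position shared among them. With a single shared position, every readline() call in one thread advances the position for all the others, so a thread can consume and discard lines that belong to another thread's range. (Separate open() calls return independent file objects, each with its own position, so this explanation applies when one file object is shared.)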
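The guess above can be checked without any threads at all. The following is a minimal sketch, not the original test: the helper names make_sample_file and demo_shared_vs_separate are made up for this illustration. It compares two readline() calls on one shared file object with one readline() call on each of two separately opened file objects.

```python
import os
import tempfile


def make_sample_file(n_lines=5):
    """Create a temporary text file with lines 'line 1' .. 'line n' (hypothetical helper)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        for i in range(1, n_lines + 1):
            f.write(f"line {i}\n")
    return path


def demo_shared_vs_separate(path):
    """Return what two readers see with a shared file object and with separate ones."""
    # One file object used by both readers: there is a single position,
    # so the second readline() continues where the first one stopped.
    with open(path) as shared:
        shared_reads = (shared.readline(), shared.readline())

    # Each reader opens the file itself: the positions are independent,
    # so both readers start at the first line.
    with open(path) as first, open(path) as second:
        separate_reads = (first.readline(), second.readline())

    return shared_reads, separate_reads


if __name__ == "__main__":
    sample = make_sample_file()
    try:
        shared_reads, separate_reads = demo_shared_vs_separate(sample)
        print(shared_reads)    # ('line 1\n', 'line 2\n')
        print(separate_reads)  # ('line 1\n', 'line 1\n')
    finally:
        os.remove(sample)
```

If the threads in the failed attempt shared one file object, each of them would see only the lines the others had not already consumed, which matches the behaviour described.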

 

After searching online and experimenting, I found that the following methods support multi-threaded reading of the same file.

1. Use a Queue. The main thread starts one thread that reads the file and puts its contents into a queue, then starts several worker threads that take the data from the queue. Python's Queue is thread-safe.

Related discussion, "Is iterating over a Python file object thread safe?":

http://stackoverflow.com/questions/18781354/is-iterating-over-a-python-file-object-thread-safe
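The Queue method might look roughly like the sketch below. The function name read_file_via_queue, the sentinel object, the worker count and the queue size are all assumptions made for this example, and the "work" done on each line is only a placeholder that records the line.

```python
import os
import queue
import tempfile
import threading

_DONE = object()  # sentinel telling a worker that no more lines will arrive


def read_file_via_queue(path, num_workers=4, max_queued=1000):
    """Return (line_number, line) pairs; each line is handled by exactly one worker."""
    line_queue = queue.Queue(maxsize=max_queued)
    handled = []
    handled_lock = threading.Lock()

    def reader():
        # Only this thread touches the file, so there is a single reader
        # and no shared file position to fight over.
        try:
            with open(path) as f:
                for line_number, line in enumerate(f, start=1):
                    line_queue.put((line_number, line))
        finally:
            # One sentinel per worker, even if reading failed, so all workers exit.
            for _ in range(num_workers):
                line_queue.put(_DONE)

    def worker():
        while True:
            item = line_queue.get()
            if item is _DONE:
                break
            # Placeholder for real per-line processing.
            with handled_lock:
                handled.append(item)

    threads = [threading.Thread(target=reader)]
    threads += [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled


if __name__ == "__main__":
    fd, sample = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        f.writelines(f"line {i}\n" for i in range(1, 101))
    try:
        pairs = read_file_via_queue(sample, num_workers=3)
        print(len(pairs), "lines handled")
    finally:
        os.remove(sample)
```

Workers receive lines in no particular order, so the line number travels with each line in case the order matters later.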

 

2. Use linecache. linecache can read any line of a file given its line number. The main thread first assigns line numbers to each reader thread, and each thread then uses linecache to read its assigned lines.

How well this method works depends on how quickly linecache can fetch an arbitrary line. For a large file it is slow.

For example, suppose thread 1 needs to read lines 10-20. If thread 1 had its own file pointer, then after reading line 10 it could move straight on to line 11. With linecache, each line is requested by its number, so that is not necessarily the case. I have not looked into how linecache actually locates a line.
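A rough sketch of the linecache method follows. The function name read_ranges_with_linecache and the way ranges are passed in are assumptions for this example. In CPython, linecache loads the whole file into memory the first time a line from it is requested and serves later requests from that cache, which is one reason the method suits small files better than large ones. The sketch requests one line in the main thread before starting the workers, as a precaution, so that the worker threads only perform lookups.

```python
import linecache
import os
import tempfile
import threading


def read_ranges_with_linecache(path, ranges):
    """Read each inclusive (first, last) line range in its own thread.

    Returns a dict mapping each range to the list of lines read for it.
    """
    # Drop any stale cache entry, then load the file once in the main thread.
    linecache.checkcache(path)
    linecache.getline(path, 1)

    results = {}
    results_lock = threading.Lock()

    def worker(first, last):
        # linecache line numbers start at 1; a missing line comes back as ''.
        lines = [linecache.getline(path, n) for n in range(first, last + 1)]
        with results_lock:
            results[(first, last)] = lines

    threads = [threading.Thread(target=worker, args=r) for r in ranges]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    fd, sample = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        f.writelines(f"line {i}\n" for i in range(1, 31))
    try:
        # Same assignment as in the example earlier: lines 1-10 and lines 11-30.
        by_range = read_ranges_with_linecache(sample, [(1, 10), (11, 30)])
        for line_range, lines in sorted(by_range.items()):
            print(line_range, len(lines))
    finally:
        os.remove(sample)
```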

 

3. Split the file. Python first calls the Linux commands head and tail to split the file into several smaller files, and each reader thread is then responsible for reading one of them.
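The split method can be sketched as follows. Note the substitution: instead of calling head and tail, this example does the splitting in plain Python so that it runs on any platform. The function names split_file and read_parts_in_threads and the part_N.txt naming scheme are made up for this illustration.

```python
import itertools
import os
import tempfile
import threading


def split_file(path, num_parts, out_dir):
    """Split path into at most num_parts files of consecutive lines; return their paths."""
    with open(path) as src:
        total_lines = sum(1 for _ in src)
    lines_per_part = max(1, -(-total_lines // num_parts))  # ceiling division

    part_paths = []
    with open(path) as src:
        for index in range(num_parts):
            # Take the next block of lines from the single source iterator.
            chunk = list(itertools.islice(src, lines_per_part))
            if not chunk:
                break
            part_path = os.path.join(out_dir, f"part_{index}.txt")
            with open(part_path, "w") as dst:
                dst.writelines(chunk)
            part_paths.append(part_path)
    return part_paths


def read_parts_in_threads(part_paths):
    """Read every part file in its own thread; return lines keyed by part path."""
    results = {}
    results_lock = threading.Lock()

    def worker(part_path):
        # Each thread opens its own file, so no file position is shared.
        with open(part_path) as f:
            lines = f.readlines()
        with results_lock:
            results[part_path] = lines

    threads = [threading.Thread(target=worker, args=(p,)) for p in part_paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work_dir:
        source = os.path.join(work_dir, "source.txt")
        with open(source, "w") as f:
            f.writelines(f"line {i}\n" for i in range(1, 11))
        parts = split_file(source, 3, work_dir)
        lines_by_part = read_parts_in_threads(parts)
        for part in parts:
            print(os.path.basename(part), len(lines_by_part[part]))
```

Because every thread works on its own file, no locking is needed around the reads themselves; the cost is one extra pass over the data to write the parts.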

