Python multithreading reads the same file

Source: Internet
Author: User

Python multithreading reads the same file

Multi-threaded Read the same file, the requirements can not be duplicated, can not be omitted.


A method was first tried (and later proved ineffective)

The main thread is assigned to each read thread to read which rows in the file,

For example, thread 1 reads 1-10 lines, thread 2 reads 11-30 rows.

Then each thread reads through readline () , and the read line is skipped if it does not fall within the scope of this thread. Continue

Practice has shown that several threads are not read as we expect.

My guess is that by opening a file with open, multiple threads return the same handle,

Or a file with only one file pointer.

After online search and practice, summed up the following methods to support multithreading to read the same file.

1 is achieved through queue queues. The main thread initiates a threads to read the file and puts the contents of the file into the queue.

It then starts several threads, all of which fetch data from the queue. The Queue in python is thread-safe.

Http://stackoverflow.com/questions/18781354/is-iterating-over-a-python-file-object-thread-safe

Is iterating over a Python file object thread safe?

2 is achieved through Linecache . linecache can specify line numbers to read any line of a file. The main thread first assigns the line number that each read thread reads separately, and then the threads are read with Linecache according to the line number.

This method relies on the speed at which Linecache reads any line, which is slower if it is a large file.

For example, thread 1 needs to read 10-20 rows. Assuming thread 1 has its own file pointer, read the line, you can directly quickly locate to the first row . But with Linecache reading, it doesn't matter if you read a line every time. Of course, I have not explored the principles of how Linecache can be positioned on any line.

3 -minute file read. python first calls the Linux command head and tail, dividing a file into several files. Then each read thread is responsible for reading a file.


Python multithreading reads the same file

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.