While writing a search engine recently, I needed an intermediate program that parses the word segmentation result files and builds an inverted index. The first version was single-threaded and painfully slow, so I changed it to multi-threaded. I thought that would settle it, but by the time it had processed nearly 2000 files, it was no faster than the single-threaded version. Task Manager showed a thread count of 3, even though I had set the maximum number of child threads to 15 and, counting the initial threads, the count reached 20 right after the program started.
A search on Baidu turned up the cause: on Windows, the number of threads a single process can create is capped, generally at around 2000. For convenience, my program put every child thread in the detached state. As far as I can tell, when a thread in this state ends, other threads cannot reclaim its resources, and they are only released when the program exits. In other words, by the time the program had processed nearly 2000 files, the resources available to it had been exhausted, so efficiency dropped.
Once I knew this, the fix was simple: create the threads in the joinable state, and have one thread wait for each child thread to end.
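A minimal sketch of this pattern, written in Python for brevity rather than taken from the actual program. Python's `threading` module has no detached state, so the sketch only illustrates the joinable side of the fix: a bounded number of child threads, each of which is joined by the thread that started it. The names `build_index` and `worker`, the in-memory `docs` dictionary standing in for the segmentation result files, and the use of a fixed worker pool are all assumptions made for illustration.

```python
import queue
import threading
from collections import defaultdict

MAX_WORKERS = 15  # the cap on child threads mentioned in the text


def build_index(docs, max_workers=MAX_WORKERS):
    """Build an inverted index {word: set of doc ids} from {doc_id: text}.

    `docs` is a hypothetical stand-in for the word segmentation result
    files; each text is assumed to be already split by whitespace.
    """
    tasks = queue.Queue()
    for item in docs.items():
        tasks.put(item)

    index = defaultdict(set)
    lock = threading.Lock()  # guards the shared index

    def worker():
        while True:
            try:
                doc_id, text = tasks.get_nowait()
            except queue.Empty:
                return  # no work left: the thread ends and can be joined
            words = set(text.split())
            with lock:
                for word in words:
                    index[word].add(doc_id)

    threads = [threading.Thread(target=worker) for _ in range(max_workers)]
    for t in threads:
        t.start()
    # One thread (the caller) waits for every child thread to end.
    for t in threads:
        t.join()
    return dict(index)
```

Calling `build_index({1: "search engine", 2: "search index"})` returns a dictionary in which `"search"` maps to `{1, 2}`. Because the same 15 threads are reused for all files and every one of them is joined, the number of live threads stays bounded no matter how many files are processed.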
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
Analysis of the efficiency drop in long-running multi-threaded programs