A large number of files need to be processed, and a single PHP process would be too slow.
How can a file be locked while one process is reading it, so that other processes do not read it but simply skip it and move on to the remaining files?
Renaming the file while it is being read and renaming it back afterwards works, but is very inefficient. If there is no better method, that is the fallback.
flock: I tested it and it did not seem to help; I could not get a non-blocking read lock on the files.
Assigning different files to different processes up front is difficult.
There is no database, and even if there were, locking through a database seems even less efficient than rename.
Is there a better way to lock the files? The processes only need to read them.
Reply content:
Your problem is:
1. Many files need to be processed by multiple processes in order to improve throughput and shorten the total processing time.
2. These processes only need to read the files, not write to them.
3. Each file only needs to be handled by one process; once it has been handled, no other process should touch it again.
What you actually need is to split the files into groups (without having to create separate directories on the file system) and divide and conquer. No lock is required for that.
A lock is not what this scenario calls for. Locks are for situations like the following:
1. file.txt records the sales of user1 and user2, plus the total sales of user1 + user2.
2. Process php1 is responsible for writing user1's data and process php2 for writing user2's data, and both processes also update the user1 + user2 total.
3. user1 and user2 may need to be written at the same instant, not seconds apart.
The recommended approach:
1. Start multiple PHP processes (nohup php your_script.php your_dir &).
2. Assign each PHP process a sequence number (with four processes: 0, 1, 2, 3). You can derive it from the process's own pid via a modulo operation, or pass it in on the command line when starting the process.
3. Before handling a file, each process computes crc32() of the file name modulo the total number of processes, i.e. crc32(file_name) % 4. If the result equals the process's own sequence number, it reads and handles the file; otherwise it skips it (see the sketch below).
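A minimal sketch of this scheme (the script name, directory layout, and argument handling are illustrative assumptions, not part of the original answer):

<?php
// worker.php: each worker only handles the file names that hash to its own slot.
// Usage: php worker.php <dir> <worker_id> <total_workers>
list(, $dir, $workerId, $totalWorkers) = $argv;

foreach (glob(rtrim($dir, '/') . '/*') as $file) {
    // sprintf('%u', ...) normalizes crc32(), which can be negative on 32-bit builds
    $slot = sprintf('%u', crc32(basename($file))) % $totalWorkers;
    if ($slot != $workerId) {
        continue; // this file belongs to another worker
    }
    $content = file_get_contents($file);
    // ... process $content ...
}

Started as in step 1 (nohup php worker.php your_dir 0 4 & through nohup php worker.php your_dir 3 4 &), the four workers cover every file exactly once without any locking.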
Finally, let me lay out a version of my own ......
- As @felix021 said, flock($res, LOCK_EX | LOCK_NB) does work. Please read the documentation carefully ......
- As @sell-your-underwear-to-get-online noted, memcached is purely in-memory but still carries network or unix domain socket overhead, and starting a memcached instance just for a file lock is rather wasteful. On Linux, shared memory can serve as the lock instead; see shm_has_var/shm_put_var in the PHP manual (a sketch follows after this list).
- As you said yourself, allocating a private file list to each process is also possible and not that troublesome. It is even easier if all worker processes are forked from one master process: the simplest (and dirtiest) method is to keep the file names in an array in the master process and pop one off before each fork ......
- The file-rename method is similar to @sell-your-underwear-to-get-online's approach; its overhead is about the same as checking whether a lock file exists. If there are not many files and the lock is not contended often, it is workable ......
Apart from file locks, every self-implemented lock needs its own mechanism to release the lock when the locking process exits unexpectedly, so I recommend file locks, which are released automatically ......
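A rough sketch of the shared-memory lock idea mentioned above (requires the sysvshm extension; the ftok key, segment size, and file name are placeholders):

<?php
$filename = 'data-001.txt';                    // example file to claim
$shm = shm_attach(ftok(__FILE__, 'l'), 65536); // attach a small SysV shared-memory segment
$lockKey = crc32($filename);                   // one shm variable per file

if (!shm_has_var($shm, $lockKey)) {
    shm_put_var($shm, $lockKey, getmypid());   // take the lock
    // ... read and process $filename ...
    shm_remove_var($shm, $lockKey);            // release it
} else {
    // another process holds it, skip this file
}
shm_detach($shm);

Note that the has_var/put_var pair is not atomic by itself; for strict mutual exclusion it would need to be wrapped in sem_acquire()/sem_release(), and, as warned above, a crashed process leaves the lock variable behind.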
Flock demo
function do_flock() {
    ob_implicit_flush(true); // flush every echo immediately instead of buffering it
    $file = __FILE__;
    $f = fopen($file, 'r');
    $count = 0;
    while (1) {
        $locked = flock($f, LOCK_NB | LOCK_EX);
        if ($locked) {
            echo "got lock\n";
            sleep(10);
            flock($f, LOCK_UN);
            echo "release lock\n";
            break;
        } else {
            echo 'locked by other, wait: ' . ($count++) . "\n";
            sleep(1);
        }
    }
}
Test method:
time curl --no-buffer "http://localhost/flock"   # run the same command in a second terminal within 10 seconds
Terminal 1 output:
got lock
release lock

real    0m10.023s
user    0m0.008s
sys     0m0.008s
Terminal 2 output:
locked by other, wait: 0
locked by other, wait: 1
locked by other, wait: 2
locked by other, wait: 3
locked by other, wait: 4
locked by other, wait: 5
locked by other, wait: 6
locked by other, wait: 7
locked by other, wait: 8
got lock
release lock

real    0m19.025s
user    0m0.008s
sys     0m0.008s
Tested on Ubuntu 12.04; I have no Mac to test on, but it should behave the same there. After all they share the same roots, and the PHP source only has special-case implementations for Windows ......
Note: The following situations will affect the output results.
- The browser has a rendering buffer: WebKit-based browsers need roughly 4 KB of output before they start rendering.
- If PHP is deployed in FastCGI mode, the web server may have its own output buffer as well; the Cherokee server I use buffers around 4 KB too.
To get past these buffers, use str_pad to pad each echo out to 4096 bytes, as in the line below.
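For instance, each status line in the demo could be padded like this (4096 simply matches the buffer size estimated above):

echo str_pad("got lock\n", 4096); // pad with spaces up to 4096 bytes so the buffers flush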
1. Why do you think rename is inefficient? As long as a directory does not hold a huge number of files, it is not slow (see the sketch below).
2. You clearly have not read the flock documentation carefully:
If you do not want flock() to block while acquiring the lock, add LOCK_NB to operation (use the value 4 in PHP versions earlier than 4.0.1).
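For reference, a minimal sketch of the rename-as-claim idea from point 1 (the file name and suffix are made up; rename() within one filesystem is atomic, so only one process wins the race):

<?php
$file = 'data-001.txt';                   // example file
$claimed = $file . '.lock.' . getmypid(); // arbitrary per-process suffix
if (@rename($file, $claimed)) {
    $content = file_get_contents($claimed);
    // ... process $content ...
    rename($claimed, $file);              // rename it back (or move it to a "done" directory)
} else {
    // another process got there first, skip
}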
Use memcached.
For example, to read the file $filename = "t.txt":

if (!$memcached->get($filename)) {
    // no lock exists for this file, so take the lock first ...
    $memcached->set($filename, '1');
    // ... then read the file
    $fs = fopen($filename, 'r+');
    // ... read and process ...
    fclose($fs);
    // release the lock once reading is done
    $memcached->delete($filename);
} else {
    // the lock already exists, skip this file
}
Since memcached works purely in memory, this is very fast and performance is not a concern. Of course you could also use a real file-based lock, i.e. create an extra lock file and contend on that, but it adds I/O overhead.
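One caveat with the snippet above: the get-then-set pair is not atomic, so two processes can both see "no lock" at the same instant. Memcached::add() only succeeds when the key does not yet exist, so it can serve as the lock directly; a sketch, assuming the pecl memcached extension and a local server:

<?php
$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);

$filename = 't.txt';
// add() fails if the key already exists, so only one process wins the lock;
// the 60-second expiry is an arbitrary safety net in case the holder dies mid-read.
if ($memcached->add('lock:' . $filename, getmypid(), 60)) {
    $content = file_get_contents($filename);
    // ... process $content ...
    $memcached->delete('lock:' . $filename);
} else {
    // another process holds the lock, skip this file
}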
"flock: I tested it and it did not seem to help; I could not get a non-blocking read lock on the files."
As I recall (see http://php.net/manual/zh/function.flo...):
LOCK_SH is the shared (read) lock: once taken, other programs may still read but may not write.
LOCK_EX is the exclusive (write) lock: once taken, other programs may neither read nor write.
LOCK_NB (not supported on Windows) is non-blocking mode: if the lock cannot be obtained, flock() returns immediately.
I think a combination of these three flags fully covers the OP's requirements (see the sketch below).
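A sketch of what that combination might look like for this case (the directory path is a placeholder; LOCK_EX | LOCK_NB makes each worker skip any file another worker is already holding):

<?php
foreach (glob('/path/to/files/*') as $file) {
    $f = fopen($file, 'r');
    if (flock($f, LOCK_EX | LOCK_NB)) {
        $content = stream_get_contents($f);
        // ... process $content ...
        flock($f, LOCK_UN);
    }
    // if the lock was not obtained, another process has the file; just move on
    fclose($f);
}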
The simplest alternative: use two files, one being written and one being read; once the writer has closed its file, swap the file names so the reader picks up the fresh one.