Solutions to Concurrent File Read/Write Conflicts in PHP | PHP Digest

Source: Internet
Author: User
Tags: flock
Here we present four solutions to high-concurrency file reads and writes, each with its own advantages; choose the one that fits your situation. If your application sees few daily visitors and little concurrency, you do not need any of this: ordinary file operations work fine. Under high concurrency, however, multiple processes are very likely to operate on the same file at once, and if access to the file is not made exclusive, data can easily be lost.
For example, consider an online chat room that writes the chat log to a file. Suppose users A and B must both save data to that file at the same time. User A opens the file and updates it, but B has also opened the same file and is likewise about to update it. A writes the file back first, while B is still working from the copy it opened earlier. When B then saves the file back, data is lost: B did not know that A had changed the file in the meantime, so B's save silently overwrites A's update.
The usual answer to this problem is locking: when a process operates on the file, it first locks it, declaring that only it may write to the file for the moment. Other processes may still read it without any trouble, but if another process attempts to update it, the operation is rejected until the locking process finishes its update and releases the exclusive lock, returning the file to a writable state. Likewise, if a process finds the file unlocked when it wants to operate on it, it can safely take the lock itself and have the file to itself.
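The shared/exclusive distinction described above can be sketched with PHP's flock(): readers take LOCK_SH (many readers may hold it at once), while a writer takes LOCK_EX and blocks everyone else until it releases the lock. The file name here is an illustrative assumption.

```php
<?php
// Minimal sketch of shared vs. exclusive locks, assuming a hypothetical
// chat-log file at /tmp/chat.txt.
$file = '/tmp/chat.txt';
$contents = '';

// Writer: takes an exclusive lock (LOCK_EX) while appending a message,
// so no reader or other writer can touch the file mid-write.
$fp = fopen($file, 'a');
if ($fp && flock($fp, LOCK_EX)) {
    fwrite($fp, "A: hello\n");
    flock($fp, LOCK_UN); // release so others may proceed
}
if ($fp) {
    fclose($fp);
}

// Reader: takes a shared lock (LOCK_SH); other readers are not blocked,
// but a writer asking for LOCK_EX must wait until readers finish.
$fp = fopen($file, 'r');
if ($fp && flock($fp, LOCK_SH)) {
    $contents = stream_get_contents($fp);
    flock($fp, LOCK_UN);
}
if ($fp) {
    fclose($fp);
}
echo $contents;
```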
The general solution is:

The code is as follows:


$fp = fopen('/tmp/lock.txt', 'w+');
if (flock($fp, LOCK_EX)) {
    fwrite($fp, "Write something here\n");
    flock($fp, LOCK_UN);
} else {
    echo "Couldn't lock the file!";
}
fclose($fp);


But in PHP, flock() does not always behave so well. Under heavy concurrency, locks often appear to be held exclusively without being released promptly, or without being released at all, leading to deadlock; the server's CPU usage climbs sharply and the machine is sometimes brought down entirely. This can happen on many Linux/Unix systems, so think carefully before relying on flock().
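One common way to avoid a process blocking indefinitely on flock() is the LOCK_NB flag: if the lock cannot be taken immediately, flock() returns false instead of waiting. A minimal sketch (the file path and messages are illustrative assumptions):

```php
<?php
// Non-blocking lock attempt with LOCK_NB: never wait for a lock that
// another process holds; fail fast and let the caller decide what to do.
$fp = fopen('/tmp/lock.txt', 'c'); // 'c' opens for writing without truncating
if ($fp === false) {
    exit("cannot open file\n");
}
if (flock($fp, LOCK_EX | LOCK_NB)) {
    // We got the exclusive lock immediately.
    fwrite($fp, "got the lock\n");
    flock($fp, LOCK_UN);
    $result = 'locked';
} else {
    // Another process holds the lock; give up instead of blocking.
    $result = 'busy';
}
fclose($fp);
echo $result, "\n";
```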
Is there no solution, then? Not at all. If flock() is used properly, deadlock can be avoided; and even if you decide not to use flock() at all, there are good alternatives. From my own collection of material, I have summarized the following solutions.
Solution 1: set a timeout when acquiring the lock. A typical implementation:

The code is as follows:


if ($fp = fopen($fileName, 'a')) {
    $startTime = microtime(true);
    do {
        // LOCK_NB makes flock() return immediately instead of blocking
        $canWrite = flock($fp, LOCK_EX | LOCK_NB);
        if (!$canWrite) {
            usleep(round(rand(0, 100) * 1000));
        }
    } while (!$canWrite && (microtime(true) - $startTime) < 1);
    if ($canWrite) {
        fwrite($fp, $dataToSave);
        flock($fp, LOCK_UN);
    }
    fclose($fp);
}


The timeout here is one second: if the lock cannot be obtained, the process keeps retrying (sleeping a random interval between attempts) until it either gains write permission or hits the timeout. Once the timeout is reached, it must give up immediately so that other processes can take their turn at the lock.

Solution 2: use temporary files instead of the flock() function to resolve read/write conflicts. The idea is:
(1) Copy the file to be updated into a temporary directory, record the file's last modification time in a variable, and give the temporary copy a random name so that names are unlikely to collide.
(2) Update the temporary file, then check whether the original file's last modification time still matches the value recorded earlier.
(3) If the modification times match, rename the modified temporary file over the original. To keep the file's status in sync, clear the stat cache.
(4) If the modification times differ, the original file was modified in the meantime; delete the temporary file and return false to signal that another process was operating on the file.
The implementation code is as follows:



$dir_fileopen = 'tmp';
function randomid() {
    return time() . substr(md5(microtime()), 0, rand(5, 12));
}
function cfopen($filename, $mode) {
    global $dir_fileopen;
    clearstatcache();
    do {
        $id = md5(randomid());
        $tempfilename = $dir_fileopen . '/' . $id . md5($filename);
    } while (file_exists($tempfilename));
    if (file_exists($filename)) {
        $newfile = false;
        copy($filename, $tempfilename);
    } else {
        $newfile = true;
    }
    $fp = fopen($tempfilename, $mode);
    // Return the handle together with the original name, the random id,
    // the original modification time, and whether the file is new.
    return $fp ? array($fp, $filename, $id, @filemtime($filename), $newfile) : false;
}
function cfwrite($fp, $string) {
    return fwrite($fp[0], $string);
}
function cfclose($fp, $debug = 'off') {
    global $dir_fileopen;
    $success = fclose($fp[0]);
    clearstatcache();
    $tempfilename = $dir_fileopen . '/' . $fp[2] . md5($fp[1]);
    if ((@filemtime($fp[1]) == $fp[3]) || ($fp[4] == true && !file_exists($fp[1]))) {
        rename($tempfilename, $fp[1]);
    } else {
        unlink($tempfilename);
        // Another process has modified the target file; reject this update.
        $success = false;
    }
    return $success;
}
$fp = cfopen('lock.txt', 'a+');
cfwrite($fp, "welcome to beijing.\n");
cfclose($fp, 'on');


A few notes on the functions used in the code above:
(1) rename(): renames a file or directory. This function really behaves more like the mv command on Linux: a convenient way to change a file's or directory's path or name. However, when I tested the code above on Windows, a notice was raised whenever the new file name already existed, saying the file exists; on Linux it works without complaint.
(2) clearstatcache(): clears the file status cache. PHP caches file attributes (such as the last modification time) to improve performance, but when several processes are deleting or updating a file, PHP may not refresh the cached attributes in time, and the cached modification time no longer reflects reality. This function clears that stale cache.
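The effect of the stat cache can be shown with a tiny experiment: change a file's modification time, call clearstatcache(), and observe that filemtime() now reports the fresh value. The temp-file name below is generated on the fly and is purely illustrative.

```php
<?php
// Demonstrate correct use of clearstatcache() around filemtime().
$f = tempnam(sys_get_temp_dir(), 'mt');
touch($f, time() - 100);   // set mtime 100 seconds in the past
$before = filemtime($f);   // this value may now be cached by PHP
touch($f, time());         // the file is modified again
clearstatcache();          // drop the cached stat data before re-reading
$after = filemtime($f);
var_dump($after > $before);
unlink($f);
```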

Solution 3: perform random reads and writes on the target file to reduce the probability of contention.
This approach seems widely used when recording user access logs. We first define a random space; the larger the space, the lower the probability of contention. Suppose the random write space is [1, 500]; then the log is spread across files log1 through log500, and each user access is written to a randomly chosen one of them. If two processes record logs at the same moment, process A might be updating log32 while process B is updating, say, log399; for B to land on log32 as well, the probability is only 1/500, practically zero. To analyze the access logs, we simply merge the files first. The benefit of this scheme for logging is that processes rarely have to queue behind one another, so each operation completes quickly.
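Solution 3 can be sketched in a few lines: pick one of N log files at random and append to it, so two concurrent writers rarely touch the same file. The directory, file-name pattern, and bucket count below are illustrative assumptions.

```php
<?php
// Spread log writes across log1..log500 chosen at random, so the chance
// of two processes contending for the same file is about 1/500.
function random_log_write($dir, $line, $buckets = 500) {
    $n = rand(1, $buckets);          // pick one of log1..log500
    $path = $dir . '/log' . $n;
    $fp = fopen($path, 'a');
    if ($fp === false) {
        return false;
    }
    $ok = false;
    if (flock($fp, LOCK_EX)) {       // contention here is now rare
        $ok = (fwrite($fp, $line) !== false);
        flock($fp, LOCK_UN);
    }
    fclose($fp);
    return $ok;
}

// To analyze the logs later, merge log1..logN back into one stream.
$dir = sys_get_temp_dir();
var_dump(random_log_write($dir, "user visited /index\n"));
```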

Solution 4: put all processes that want to operate on the file into a queue, and let a single dedicated service perform the file operations. Each queued process hands its operation over in turn, so the service only ever needs to take the next item from the front of the queue. It does not matter how many file-operation requests arrive; they simply line up at the back, and however long the queue grows, it still drains in order.
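One simple way to realize Solution 4 is a spool directory acting as the queue: each producer drops its operation into its own uniquely named job file (so producers never contend), and a single worker process, which is the only one allowed to touch the target file, applies the jobs one at a time. The directory layout and function names are illustrative assumptions, not from the article.

```php
<?php
// Spool-directory queue: producers enqueue, one worker applies the jobs.
$queueDir = sys_get_temp_dir() . '/fileop-queue';
@mkdir($queueDir);

// Producer: enqueue a write operation. tempnam() guarantees a unique
// file name, so concurrent producers never collide.
function enqueue_write($queueDir, $data) {
    $job = tempnam($queueDir, 'job');
    return file_put_contents($job, $data) !== false;
}

// Worker: the only process that touches the target file, so no lock
// conflicts are possible. Returns the number of jobs applied.
function process_queue($queueDir, $targetFile) {
    $jobs = glob($queueDir . '/job*');
    sort($jobs); // tempnam names are not time-ordered; a real worker would sort by mtime
    $applied = 0;
    foreach ($jobs as $job) {
        file_put_contents($targetFile, file_get_contents($job), FILE_APPEND);
        unlink($job);
        $applied++;
    }
    return $applied;
}

enqueue_write($queueDir, "first\n");
enqueue_write($queueDir, "second\n");
echo process_queue($queueDir, sys_get_temp_dir() . '/fileop-target.txt'), "\n";
```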

Each of these solutions has its advantages. They fall roughly into two categories:
(1) Those that require queuing (slower): solutions 1, 2, and 4.
(2) Those that require no queuing (faster): solution 3.
When designing a cache system, we generally do not use solution 3, because its analysis program and its writing program are decoupled: at write time nothing is done to ease later reading, and the rows are simply written out. If we used random file writes to update a cache, reading the cache back would require a good deal of extra work to gather the scattered data. Methods 1 and 2 are entirely different: although writing may have to wait (retrying whenever the lock cannot be obtained), reading the file is straightforward. Since the whole point of adding a cache is to relieve the data-read bottleneck and improve system performance, that trade-off fits well.
This summary comes from my personal experience plus some reference material; please tell me what is wrong or what I have left out.
