For a site whose daily traffic or concurrency is low, none of this matters: ordinary file operations work fine. But under high concurrency, several processes may well be operating on the same file at once, and without some form of exclusive access it is easy to lose data.
For example, take an online chat room that saves chat content to a file. At the same moment, user A and user B both want to save data to the file. A opens the file first and starts updating it; B opens the same file just afterwards, also intending to update it. By the time A has written the file back, B still has its (now stale) copy open. When B then saves its copy back, data is lost: B had no idea that, between B opening the file and saving changes to it, A had also changed the file. Because B saves last, A's update is overwritten.
The usual solution to this kind of problem is locking: while one process is operating on the file, it first locks out everyone else, meaning only that process has the right to modify the file. Other processes that merely read it are fine, but any process that tries to update it is refused. Once the process holding the lock finishes its update, it releases the exclusive hold and the file returns to a modifiable state. Likewise, if a process finds the file unlocked when it goes to operate on it, it can confidently lock the file and have it all to itself.
So the general solution would be:
$fp = fopen('/tmp/lock.txt', 'w+');
if (flock($fp, LOCK_EX)) {
    fwrite($fp, "Write something here\n");
    flock($fp, LOCK_UN);
} else {
    echo "Couldn't lock the file!";
}
fclose($fp);
But in PHP, flock() doesn't seem to work all that well. Under heavy concurrency it often appears to hold a resource exclusively without releasing it promptly, or without releasing it at all, producing deadlock: the server's CPU usage climbs very high, and sometimes the server grinds to a complete halt. This seems to happen on many Linux/Unix systems. So think carefully before using flock().
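One way to keep flock() from hanging a process indefinitely is its non-blocking mode, which returns immediately instead of waiting for the lock. A minimal sketch (the lock-file path is illustrative):

```php
<?php
// Non-blocking lock attempt: with LOCK_NB, flock() fails fast instead of waiting.
$lockFile = sys_get_temp_dir() . '/lock_demo.txt';
$fp = fopen($lockFile, 'c');              // 'c': create if missing, don't truncate
$wouldBlock = 0;
if (flock($fp, LOCK_EX | LOCK_NB, $wouldBlock)) {
    fwrite($fp, "got the lock\n");
    flock($fp, LOCK_UN);                  // release promptly so others aren't starved
} elseif ($wouldBlock) {
    // another process holds the lock; back off instead of blocking forever
    echo "busy, try again later\n";
}
fclose($fp);
```

The third argument is set to true by flock() when the lock is held elsewhere, letting the caller distinguish "busy" from a genuine error.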
Is there no solution, then? Not quite. If we use flock() properly, the deadlock problem is entirely solvable. And even if you prefer not to use flock() at all, there are good alternatives. From my own collection and notes, the solutions roughly boil down to the following.
Scenario one: set a timeout when locking the file. Generally implemented as follows:
if ($fp = fopen($fileName, 'a')) {
    $startTime = microtime(true);
    do {
        // LOCK_NB keeps the call non-blocking, so the loop can enforce the timeout
        $canWrite = flock($fp, LOCK_EX | LOCK_NB);
        if (!$canWrite) {
            usleep(round(rand(0, 100) * 1000));   // back off 0-100 ms before retrying
        }
    } while (!$canWrite && (microtime(true) - $startTime) < 1);
    if ($canWrite) {
        fwrite($fp, $dataToSave);
        flock($fp, LOCK_UN);
    }
    fclose($fp);
}
The timeout here is one second. If the lock is not obtained, the process keeps retrying until it wins the right to operate on the file; but once the timeout limit is reached, it must give up immediately and leave the lock to other processes.
Scenario two: avoid the flock() function and use temporary files to resolve the read/write conflict. The general idea:
(1) Copy the file that needs updating into a temporary-file directory, save the file's last-modified time in a variable, and give the temporary file a random, hard-to-collide name.
(2) After updating the temporary file, check whether the original file's last-modified time still matches the saved one.
(3) If the two modification times match, rename the modified temporary file over the original. To guarantee the file's status is seen as updated, clear the file status cache.
(4) If the modification times do not match, the original file was modified in the meantime; delete the temporary file and return false, meaning another process is operating on the file.
The approximate implementation code is as follows:
$dir_fileopen = 'tmp';   // temporary-file directory; must already exist

function randomid() {
    return time() . substr(md5(microtime()), 0, rand(5, 12));
}

function cfopen($filename, $mode) {
    global $dir_fileopen;
    clearstatcache();
    do {
        $id = md5(randomid());
        $tempfilename = $dir_fileopen . '/' . $id . md5($filename);
    } while (file_exists($tempfilename));
    if (file_exists($filename)) {
        $newfile = false;
        copy($filename, $tempfilename);
    } else {
        $newfile = true;
    }
    $fp = fopen($tempfilename, $mode);
    // also remember whether the file was new, for the check in cfclose()
    return $fp ? array($fp, $filename, $id, @filemtime($filename), $newfile) : false;
}

function cfwrite($fp, $string) {
    return fwrite($fp[0], $string);
}

function cfclose($fp, $debug = 'off') {
    global $dir_fileopen;
    $success = fclose($fp[0]);
    clearstatcache();
    $tempfilename = $dir_fileopen . '/' . $fp[2] . md5($fp[1]);
    if (@filemtime($fp[1]) == $fp[3] || ($fp[4] == true && !file_exists($fp[1]))) {
        rename($tempfilename, $fp[1]);
    } else {
        unlink($tempfilename);
        // another process touched the target file in the meantime,
        // so this process's update is rejected
        $success = false;
    }
    return $success;
}

$fp = cfopen('lock.txt', 'a+');
cfwrite($fp, "welcome to beijing.\n");
cfclose($fp, 'on');
A few notes on the functions used in the code above:
(1) rename(): renames a file or directory, rather like mv on Linux. It is a convenient way to update a file's or directory's path or name. But when I test the code above on Windows, if the new file name already exists a notice is raised saying the file already exists; on Linux it works fine.
(2) clearstatcache(): clears the file status cache. PHP caches file attribute information to get better performance, but sometimes, when several processes are deleting or updating a file, PHP has not yet refreshed the cached attributes, and it is easy to read a last-modified time that is no longer real. So this function is needed here to clear the stale cache.
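The effect of the stat cache can be shown in a few lines. This sketch (file path and timestamps are illustrative) sets a known mtime, simulates another process changing it, and clears the cache before re-reading:

```php
<?php
// Sketch: why clearstatcache() matters when re-checking filemtime().
$file = sys_get_temp_dir() . '/statcache_demo.txt';
file_put_contents($file, 'x');

touch($file, 1000000000);          // set a known modification time
clearstatcache();
$before = filemtime($file);        // 1000000000

touch($file, 1000000100);          // another "process" updates the file
clearstatcache();                  // without this, filemtime() may return a cached value
$after = filemtime($file);         // 1000000100, the fresh value

unlink($file);
```

In the cfclose() check above, a stale cached mtime would make two conflicting writes both look safe, which is exactly what the clearstatcache() call prevents.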
Scenario three: write to files chosen at random, to reduce the probability of contention.
This scenario seems most useful when recording user access logs. Define a random space in advance; the larger the space, the lower the likelihood of contention. Suppose the random read/write space is [1, 500]: the log files are then spread across log1 through log500. On each user visit, the data is written at random to one of the files between log1 and log500. If two processes are logging at the same time, process A may be updating log32 while process B updates log399; for B to also land on log32, the probability is 1/500, practically zero. When we need to analyze the access logs, we simply merge the log files first and then analyze them. One benefit of logging this way is that process operations rarely need to queue, so each process finishes its operation quickly.
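The bucketed logging described above can be sketched in a few lines. This is a minimal illustration (the directory, file-name pattern, and bucket count are assumptions, not part of the original):

```php
<?php
// Sketch: spread log writes across N files to reduce lock contention.
function randomLogWrite($logDir, $line, $buckets = 500) {
    $n = mt_rand(1, $buckets);               // pick one of log1..logN at random
    $file = $logDir . '/log' . $n;
    // still lock the single bucket; contention on any one file is ~1/N
    return file_put_contents($file, $line, FILE_APPEND | LOCK_EX);
}

// Analysis side: merge all buckets back into one stream.
function mergeLogs($logDir, $buckets = 500) {
    $all = '';
    for ($i = 1; $i <= $buckets; $i++) {
        $file = $logDir . '/log' . $i;
        if (is_file($file)) {
            $all .= file_get_contents($file);
        }
    }
    return $all;
}
```

Note the trade-off the text describes: writes are fast and rarely collide, but reading requires a merge pass over all the buckets.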
Scenario four: put all processes that want to write into a queue, and let one dedicated service perform the actual file operations. Each process in the queue corresponds to one concrete operation, so the service only needs to take the next operation off the queue and carry it out. If there are large numbers of processes wanting file operations, it does not matter: they just line up at the back of the queue; as long as they are willing to wait, the queue can be as long as it likes.
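The single-writer pattern can be sketched as follows. In production the queue would live outside the process (for example a SysV message queue via msg_get_queue(), or Redis); here an in-process SplQueue stands in for it, and the file name is illustrative:

```php
<?php
// Sketch of scenario four: workers enqueue write requests;
// one dedicated writer drains the queue and is the ONLY code touching the file,
// so no file locking is needed at all.
$queue = new SplQueue();

// "Workers" enqueue instead of writing directly:
$queue->enqueue("line from process A\n");
$queue->enqueue("line from process B\n");

// The dedicated writer drains the queue in arrival order.
$logFile = sys_get_temp_dir() . '/queue_demo.log';
$fp = fopen($logFile, 'a');
while (!$queue->isEmpty()) {
    fwrite($fp, $queue->dequeue());
}
fclose($fp);
```

Because only one process ever opens the file, writes are serialized by construction rather than by locks, which is why this scheme trades latency (queueing) for safety.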
Each of the schemes above has its own advantages. They fall roughly into two categories:
(1) those that require queueing (slower): scenarios one, two and four;
(2) those that do not require queueing (faster): scenario three.
When designing a cache system, we generally would not use scenario three, because its writer and reader are out of step: at write time it pays no attention to how hard reading will be later, it just writes. Imagine updating a cache with random file writes: reads of the cache would then involve a great deal of extra work. Scenarios one and two are quite different: although writing must wait (retrying whenever the lock is not obtained), reading the file back is very convenient. And the whole point of adding a cache is to cut the data-read bottleneck and so improve system performance.
This comes from personal experience plus a summary of some reference material. If anything here is wrong, or something has been missed, corrections are welcome.