A summary of developing a persistent-storage module for PHP. Projects frequently need to load large, fixed-format, read-only data, such as skill and item tables used in combat. In our case there are tens of thousands of complex records, roughly 20 MB of plain text once serialized. I first tried putting the array directly in a PHP file, but require-ing it turned out to be slow, taking dozens of milliseconds, and the I/O was heavy since dozens of megabytes had to be loaded into memory. SQLite looked fairly solid on inspection, but wrapping its operations in functions was awkward to use. So I got the idea of writing an extension myself, and that is where the journey began.
My first idea was to call zend_execute_script in MINIT to load a PHP file, get back a zval, and keep it in a global variable. On closer consideration this turned out to be wishful thinking: during MINIT the PHP VM is not yet fully initialized, so zend_execute_script cannot be called there, and in any case it does not return a zval; the result would have to be dug out of EG, which is very troublesome.
Next I turned to serialize/unserialize. It turns out php_var_unserialize can be called during the MINIT phase, so I used it to build a zval, kept it in a global variable, and returned it from a get() function. Once the code was written, testing showed that merely calling it produced a core dump. After checking the documentation and thinking it over, I found the cause: all memory not allocated with pemalloc is freed during request shutdown (PHP_RSHUTDOWN_FUNCTION). So the data was intact during the MINIT stage but had already been freed by the time a request touched it.
Checking the documentation further, I found that PHP provides pemalloc for persistent allocation. So the logic had to change: allocate the HashTable in the global variable with pemalloc and mark the HashTable as persistent (thankfully PHP's HashTable also has to back code and the VM, so this capability exists). The snag is that php_var_unserialize returns a single zval and gives you no control over whether it is persistent. The only option is to copy it into the persistent table with zend_hash_copy. After writing that, testing still produced a core dump. Why? Over lunch it suddenly struck me that it could be a shallow-copy problem: zend_hash_copy takes a copy-constructor argument that I had not set. After adding a deep-copy function and testing again, it worked, and worked nicely.
The next round of testing showed intolerable memory usage: loading the 20 MB data file took about 100 MB of memory, and multiplied across a fleet of php-cgi processes that adds up to around 10 GB, which cannot be tolerated. Shared memory solves this, since the processes only need to read the data: the php-cgi master process performs MINIT, and the child processes just read. The trouble is that PHP provides no interface for users to manage their own memory, so that part had to be written by hand.
Taking a closer look at PHP's HashTable implementation, I found it complicated, and the key problem is its use of realloc, which left me speechless; I was not about to write a full memory manager. For now I implemented a trivial allocator over the shared memory that simply hands out space sequentially. Fortunately, resize support is not needed: the goal is just to copy the zval obtained from php_var_unserialize into shared memory, and its size is known in advance. An update function is not needed either, since each load is a brand-new copy. Once this was done, memory usage dropped to an acceptable level.
Next came the stress test, and suddenly the core dumps were back, which was intolerable. Why? The core file showed the HashTable's refcount dropping to 0. Various tests showed that a single worker was fine; it only broke under high concurrent load. Multiple workers were modifying the refcount at once, with no synchronization, scrambling its value. What to do? Locking was not an option.
Then it occurred to me: if, every time the zval is returned, I set the top-level zval's refcount to a value larger than the number of php-cgi processes, then even if concurrent updates scramble it, it can never reach 0. After making that change, testing showed it was reliable.
At this point the whole problem was basically solved. One issue remains: restarting php-cgi can still core, because variables still in use get forcibly zeroed. Strictly speaking, the correct use of shared memory is for one process to write and another to read. But my design stores absolute addresses inside the shared memory, so it cannot be written in one place and read from another, unless the second argument of shmat is fixed to the same address in every process. That requires understanding the process's address-space layout well enough to pick a region that will never otherwise be used. It should be feasible, since php-cgi has a memory limit, so there ought to be a range of addresses that php-cgi can never touch while running, but the specifics need further study.
Author does not care about cloud