In the past few days, we have extended some DFS functions, most notably a batch upload function and a configurable file upload path. These changes make DFS easier to manage and save resources.
Let's start with the file path change, which one of my colleagues made. He found the storage_gen_filename function in storage_service.c and, at the point where the storage path is assembled, added a "yyyymm" segment. Note that changing this one place is not enough: DFS creates its directory tree the first time the storage service starts, so the directory-creation code has to be changed as well. Even then, one issue remains, which we only discovered later in our production environment. DFS rolls over to the next folder once a folder reaches its maximum file count, and this is where a "counter" problem appears. When the month changes, the modified code stores images under the next "yyyymm" folder, but because the DFS file counters are not zero at that moment, the generated path comes out as yyyymm/XX instead of yyyymm/00/00; each month's paths should start over from the minimum starting point. To fix this, we need to add a few lines to the storage_gen_filename function:
    time_t times;
    time(&times);
    struct tm *timep = gmtime(&times);
    /* on the first day of the month, reset the path counters exactly once,
       guarded by the month-change state flag */
    if (1 == timep->tm_mday && 0 == storage_path_month_change_state)
    {
        g_dist_path_index_high = 0;
        g_dist_path_index_low = 0;
        g_dist_write_file_count = 0;
        storage_path_month_change_state++;
    }
    if (1 != timep->tm_mday)
    {
        storage_path_month_change_state = 0;
    }
This code is cheap in practice, since the reset only fires once a month. But because it sits in storage_gen_filename, it runs on every single upload; at the very least, the date check executes each time. If you are interested, you could move this logic out into an event that runs once a month instead.
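If you do want the reset out of the per-upload path, it can be factored into a standalone check that a daily timer thread or cron-style job calls. A minimal sketch follows; the g_dist_* counters are declared locally here to stand in for the FastDFS globals of the same name, and maybe_reset_month_counters is a hypothetical helper name, not a FastDFS function:

```c
#include <time.h>

/* Stand-ins for the FastDFS globals of the same name (sketch only). */
static int g_dist_path_index_high = 0;
static int g_dist_path_index_low = 0;
static int g_dist_write_file_count = 0;

/* Reset the path counters on the first day of the month so the new
 * "yyyymm" folder starts from 00/00. Returns 1 if a reset happened. */
int maybe_reset_month_counters(const struct tm *tm_now)
{
    if (tm_now->tm_mday != 1)
    {
        return 0;
    }
    g_dist_path_index_high = 0;
    g_dist_path_index_low = 0;
    g_dist_write_file_count = 0;
    return 1;
}
```

A background thread could call this once a day with the current gmtime result, which removes the per-upload date check entirely.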
We also added a batch file upload function. This one was more complicated to implement, involving both the DFS communication protocol and the thread_stack_size setting of the DFS system, but with some help it finally got done.
First, the protocol. DFS uses single-byte command values in its communication protocol; for details, see the protocol values defined in tracker_proto.h. For example, the tracker command for checking storage health status is 83. Transmission works as follows: the DFS client first sends a protocol header of 10 bytes. Eight of those bytes hold the length of the body, where the body is the actual payload the client sends to the server. The next byte is the protocol value just mentioned, telling the DFS server what to do. The last byte is an error code; it is effectively unused when the client sends to the server and mainly carries server errors back to the client.
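The 10-byte header described above can be sketched as a plain C struct. The layout mirrors the header defined in tracker_proto.h, but the struct and helper names here are written out for illustration and the encoding helpers are my own versions of the big-endian conversions FastDFS performs:

```c
#include <stdint.h>

#define FDFS_PROTO_PKG_LEN_SIZE 8

/* The 10-byte protocol header: 8 bytes of body length, 1 command byte,
 * 1 status byte. All members are single bytes, so there is no padding. */
typedef struct
{
    unsigned char pkg_len[FDFS_PROTO_PKG_LEN_SIZE]; /* body length, big-endian */
    unsigned char cmd;    /* protocol value, e.g. 83 for the health check */
    unsigned char status; /* error code, used mainly server -> client */
} ProtoHeader;

/* Encode a 64-bit length into the 8-byte big-endian field. */
void long2buff(int64_t n, unsigned char *buff)
{
    for (int i = 0; i < 8; i++)
    {
        buff[i] = (unsigned char)((n >> (8 * (7 - i))) & 0xFF);
    }
}

/* Decode the 8-byte big-endian field back into a 64-bit length. */
int64_t buff2long(const unsigned char *buff)
{
    int64_t n = 0;
    for (int i = 0; i < 8; i++)
    {
        n = (n << 8) | buff[i];
    }
    return n;
}
```

Because every member is a single byte, the struct is exactly 10 bytes and can be written to the socket as-is.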
The body that follows consists of several parts. The lengths come first, such as the file extension length, the metadata length, and the file byte length. Each length is a long value transmitted as a byte array of length 8. Because this fixed-length prefix is sent first, the server can parse it and then read exactly the right number of bytes for each field from the network stream, so it knows what each position in the stream means. The actual content follows: first the extension, then the metadata, and finally the file byte array. These can be merged into a single byte array for transmission, since the lengths parsed above tell the server where to cut the stream.
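Packing the body in this order can be sketched as follows. pack_upload_body is a hypothetical helper, not a FastDFS function; the three-lengths-then-three-payloads layout simply follows the description above:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Encode a 64-bit length into an 8-byte big-endian field. */
static void long2buff(int64_t n, unsigned char *buff)
{
    for (int i = 0; i < 8; i++)
    {
        buff[i] = (unsigned char)((n >> (8 * (7 - i))) & 0xFF);
    }
}

/* Build the upload body: three 8-byte lengths first (extension,
 * metadata, file), then the three payloads back to back.
 * Returns a malloc'd buffer; *out_len receives its total size. */
unsigned char *pack_upload_body(const char *ext, size_t ext_len,
                                const char *meta, size_t meta_len,
                                const unsigned char *file, size_t file_len,
                                size_t *out_len)
{
    *out_len = 3 * 8 + ext_len + meta_len + file_len;
    unsigned char *buf = malloc(*out_len);
    if (buf == NULL)
    {
        return NULL;
    }
    long2buff((int64_t)ext_len, buf);
    long2buff((int64_t)meta_len, buf + 8);
    long2buff((int64_t)file_len, buf + 16);
    memcpy(buf + 24, ext, ext_len);
    memcpy(buf + 24 + ext_len, meta, meta_len);
    memcpy(buf + 24 + ext_len + meta_len, file, file_len);
    return buf;
}
```

The server does the reverse: it reads the 24-byte length prefix, then slices the extension, metadata, and file bytes out of the stream at the offsets those lengths imply.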
Batch upload works the same way, but with a new DFS protocol value; I used 127. You can pick this value yourself, as long as DFS does not already use it; just make sure the client and server define the same value, or the request cannot be parsed. Also note the thread_stack_size value in the config file, which limits the memory available to a single DFS thread. Batch upload processes more content per request, so it can easily overflow; it is best to raise this value to 2 MB once batch upload is added. To keep things simple and fast, I reused the DFS server's single-file parsing for each file in the batch, so in both the server and client code you will see a for loop that processes the byte array of one file at a time.
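That server-side loop can be sketched like this. The record layout (an 8-byte big-endian length before each file's bytes) and the names parse_batch and handle_one are assumptions for illustration, not the exact wire format or identifiers from my code:

```c
#include <stdint.h>
#include <stddef.h>

/* Decode an 8-byte big-endian length field. */
static int64_t buff2long(const unsigned char *buff)
{
    int64_t n = 0;
    for (int i = 0; i < 8; i++)
    {
        n = (n << 8) | buff[i];
    }
    return n;
}

/* Encode an 8-byte big-endian length field (used to build buffers). */
void long2buff(int64_t n, unsigned char *buff)
{
    for (int i = 0; i < 8; i++)
    {
        buff[i] = (unsigned char)((n >> (8 * (7 - i))) & 0xFF);
    }
}

/* Treat the batch body as a sequence of single-file records, each an
 * 8-byte length followed by that many file bytes, so the single-file
 * handling can be reused per iteration. handle_one may be NULL.
 * Returns the number of complete records found. */
size_t parse_batch(const unsigned char *body, size_t body_len,
                   void (*handle_one)(const unsigned char *file, int64_t len))
{
    size_t off = 0;
    size_t count = 0;
    while (off + 8 <= body_len)
    {
        int64_t flen = buff2long(body + off);
        off += 8;
        if (flen < 0 || off + (size_t)flen > body_len)
        {
            break; /* truncated or corrupt record */
        }
        if (handle_one != NULL)
        {
            handle_one(body + off, flen);
        }
        off += (size_t)flen;
        count++;
    }
    return count;
}
```

Keeping each file as an independent record is what lets the larger batch buffer stay within the raised thread_stack_size: the loop only ever hands one file's bytes to the existing single-file path.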