Block ID
Each block in TFS has a unique identifier. The current implementation uses a uint32_t integer. Each time a block is added, it is assigned a new ID: a global counter, global_block_id, is persisted, and each allocation increments it by 1 and uses the result as the new block ID. The implementation is roughly as follows; generate is called on every allocation.
class BlockIdFactory {
public:
    // Called once for every block allocation.
    uint32_t generate() { return ++global_block_id; }
private:
    uint32_t global_block_id;  // persisted, and reloaded at each start
};
Recently, to meet the needs of the erasure code project, we upgraded the block ID to 64 bits. The code uses uint32_t directly wherever a block ID appears, and block IDs appear in almost every source file. Upgrading to uint64_t therefore meant changing every declaration from uint32_t to uint64_t, and switching every place where a block ID is serialized (stored on disk or transmitted over the network) to the uint64_t serialization interface. That is a lot of places to modify.
A simple type change caused this much work because we did not properly hide the implementation details of the block ID. From the start we stipulated that a block ID is a uint32_t integer, when in fact it could be of any type: a number, a string, or a more complex struct. The code above uses the generate function (instead of scattering new_block_id = ++global_block_id statements through the code) to hide how block IDs are generated, so if the generation policy changes, only the implementation of generate needs to change. But because the code exposes the block ID's type, the type itself is hard to change.
Suppose instead that the block ID type is abstracted as BlockidType. Because the initial implementation is a uint32_t integer, a typedef defines BlockidType as uint32_t, and BlockidType is used everywhere a block ID appears, in place of uint32_t. To upgrade the block ID to 64 bits, or to change it to a string, we only need to change the typedef and the BlockIdFactory class (and whatever serializes a BlockidType); none of these details leak out, and the rest of the code still sees a block ID of type BlockidType. The general implementation is as follows.
typedef uint32_t BlockidType;

class BlockIdFactory {
public:
    BlockidType generate() { return ++global_block_id; }
private:
    BlockidType global_block_id;  // same type as the IDs it produces
};
In many cases, if you cannot tell intuitively what type an object should be, define it as an abstract data type so that it can be changed later. The length of an array, the number of pages in a book, and the weight of a person are intuitively numeric values, so uint32_t or uint64_t can be used directly, as the situation requires. In this example we cannot tell intuitively what a block identifier is, so it should have been designed as an abstract type from the start.
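As a minimal sketch of this idea, the block ID's width can be confined to one typedef plus one pair of serialization helpers. The byte buffer (a plain std::vector<uint8_t>) and the helper names write_block_id and read_block_id are assumptions made for illustration; they are not part of TFS.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// The only line that changes when the ID is widened (for example from uint32_t).
typedef uint64_t BlockidType;

class BlockIdFactory {
 public:
  explicit BlockIdFactory(BlockidType last_id = 0) : global_block_id_(last_id) {}
  // Callers never see how the next ID is produced.
  BlockidType generate() { return ++global_block_id_; }

 private:
  BlockidType global_block_id_;  // a real system would persist and reload this
};

// Hypothetical helpers: encode the ID little-endian using sizeof(BlockidType),
// so that callers never name a concrete integer width.
inline void write_block_id(std::vector<uint8_t>& out, BlockidType id) {
  for (std::size_t i = 0; i < sizeof(BlockidType); ++i) {
    out.push_back(static_cast<uint8_t>((id >> (8 * i)) & 0xFF));
  }
}

inline BlockidType read_block_id(const std::vector<uint8_t>& in,
                                 std::size_t offset = 0) {
  BlockidType id = 0;
  for (std::size_t i = 0; i < sizeof(BlockidType); ++i) {
    id |= static_cast<BlockidType>(in.at(offset + i)) << (8 * i);
  }
  return id;
}
```

With this shape, code that stores or sends a block ID calls write_block_id and read_block_id and never mentions uint32_t or uint64_t. If the ID later became a non-integer type, these two helpers would have to be rewritten, but their callers would not.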
Message serialization
All messages in TFS that are transmitted over the network (client requests, server responses, and so on) implement the serialize/deserialize interface. Each time you add a new message, you have to write this interface for it, which is basically mechanical work: for each member, call the serialization function that matches its type. The code is roughly as follows.
class SomeMessage: public BaseMessage {
public:
    void serialize(DataBuffer& output) {
        output.write_int32(foo);
        output.write_int64(bar);
    }
    // Fields must be read back in the same order they were written.
    void deserialize(DataBuffer& input) {
        input.read_int32(&foo);
        input.read_int64(&bar);
    }
private:
    int32_t foo;
    int64_t bar;
};
As you can see, writing serialization and deserialization code is tedious, repetitive work that a better design can avoid. I have seen people replace it with macros, which makes adding a message very cheap. However, heavy use of macros is not a good choice: it hurts readability and makes problems harder to locate. Google protobuf solves the problem easily, and its encoding and decoding are efficient in both time and space. With protobuf you only define the message content; serialization and deserialization are handled for you, and adding new messages or modifying existing ones is easy. There are many open-source tools, and many of them are of higher quality than code you would write yourself for the same function. Using mature open-source products lets you do more with less effort and focus on what matters more.
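Protobuf is what the text recommends. Purely to illustrate that the repetition can be designed away without macros, here is a small sketch using C++11 variadic templates instead. This is a substitute technique, not protobuf and not what TFS does. The DataBuffer class below is a hypothetical stand-in with overloaded write/read methods, and write_fields and read_fields are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for TFS's DataBuffer: little-endian bytes plus a read cursor.
class DataBuffer {
 public:
  DataBuffer() : pos_(0) {}
  void write(int32_t v) { put(static_cast<uint32_t>(v), 4); }
  void write(int64_t v) { put(static_cast<uint64_t>(v), 8); }
  void read(int32_t* v) { *v = static_cast<int32_t>(static_cast<uint32_t>(get(4))); }
  void read(int64_t* v) { *v = static_cast<int64_t>(get(8)); }
  std::size_t size() const { return bytes_.size(); }

 private:
  void put(uint64_t v, int n) {
    for (int i = 0; i < n; ++i) bytes_.push_back(static_cast<uint8_t>(v >> (8 * i)));
  }
  uint64_t get(int n) {
    uint64_t v = 0;
    for (int i = 0; i < n; ++i) v |= static_cast<uint64_t>(bytes_.at(pos_++)) << (8 * i);
    return v;
  }
  std::vector<uint8_t> bytes_;
  std::size_t pos_;
};

// Write every field in order; overload resolution picks the right encoder.
inline void write_fields(DataBuffer&) {}  // base case: nothing left to write
template <typename T, typename... Rest>
void write_fields(DataBuffer& out, const T& first, const Rest&... rest) {
  out.write(first);
  write_fields(out, rest...);
}

// Read every field back in the same order.
inline void read_fields(DataBuffer&) {}  // base case: nothing left to read
template <typename T, typename... Rest>
void read_fields(DataBuffer& in, T* first, Rest*... rest) {
  in.read(first);
  read_fields(in, rest...);
}

class SomeMessage {
 public:
  SomeMessage() : foo_(0), bar_(0) {}
  SomeMessage(int32_t foo, int64_t bar) : foo_(foo), bar_(bar) {}
  // Each direction is one line, however many fields the message has.
  void serialize(DataBuffer& output) const { write_fields(output, foo_, bar_); }
  void deserialize(DataBuffer& input) { read_fields(input, &foo_, &bar_); }
  int32_t get_foo() const { return foo_; }
  int64_t get_bar() const { return bar_; }

 private:
  int32_t foo_;
  int64_t bar_;
};
```

The field list still has to be written twice, once per direction, and the two lists must match in order. A schema-driven tool such as protobuf removes even that duplication, which is part of why the text prefers it.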
Code reuse
To write fewer messages, some lazy people reuse one message for several purposes, with each member meaning something different in each scenario. The result is code like the following.
class GeneralMessage: public BaseMessage {
public:
    void serialize(DataBuffer& output);
    void deserialize(DataBuffer& input);
private:
    int32_t type;
    int64_t value1;
    int64_t value2;
    int64_t value3;
    int64_t value4;
};
In practice, each request corresponds to a type, and value1 through value4 mean different things under different types. For example, when querying which server a block is on, value1 is the block ID; when querying which blocks are on a server, value1 is the server ID. The server interprets a received message according to its type. Any request with at most four members can reuse this message, which reduces the amount of code but badly hurts readability. To this day I have not fully figured out what each value means in each case. In fact, building on the same idea, we can write less code without sacrificing readability. The general idea is as follows.
class GeneralMessage: public BaseMessage {
public:
    void serialize(DataBuffer& output);
    void deserialize(DataBuffer& input);
protected:
    int64_t value1;
    int64_t value2;
    int64_t value3;
    int64_t value4;
};

// serialize/deserialize are inherited as-is; there is no need to reimplement them
class SpecialMessage: public GeneralMessage {
public:
    void set_block_id(const uint64_t block_id) { value1 = block_id; }
    uint64_t get_block_id() const { return value1; }
    void set_server_id(const uint64_t server_id) { value2 = server_id; }
    uint64_t get_server_id() const { return value2; }
};
The encapsulation above makes it clear that SpecialMessage has two important fields, the block ID and the server ID. The set/get interface hides the fact that the data is actually stored in poorly named variables, and people reading the code can tell from the set/get calls which fields are being set. Encapsulation may add some overhead, but in most cases its benefits make that overhead negligible. Just do not over-encapsulate.
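To make the wrapper pattern concrete, here is a self-contained sketch that can be compiled and run. It assumes a simplified wire format (one 64-bit slot per field, with DataBuffer defined as a std::vector<int64_t>) and omits BaseMessage and the type field. These simplifications are for illustration only and do not reflect TFS's actual classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the wire format: one 64-bit slot per field.
typedef std::vector<int64_t> DataBuffer;

class GeneralMessage {
 public:
  GeneralMessage() : value1(0), value2(0), value3(0), value4(0) {}

  // Written once here; every derived message reuses it unchanged.
  void serialize(DataBuffer& output) const {
    output.push_back(value1);
    output.push_back(value2);
    output.push_back(value3);
    output.push_back(value4);
  }

  void deserialize(const DataBuffer& input) {
    value1 = input.at(0);
    value2 = input.at(1);
    value3 = input.at(2);
    value4 = input.at(3);
  }

 protected:
  int64_t value1;
  int64_t value2;
  int64_t value3;
  int64_t value4;
};

// Adds names only: no new data members and no new serialization code.
class SpecialMessage : public GeneralMessage {
 public:
  void set_block_id(uint64_t block_id) { value1 = static_cast<int64_t>(block_id); }
  uint64_t get_block_id() const { return static_cast<uint64_t>(value1); }
  void set_server_id(uint64_t server_id) { value2 = static_cast<int64_t>(server_id); }
  uint64_t get_server_id() const { return static_cast<uint64_t>(value2); }
};
```

A caller would build a SpecialMessage, call set_block_id and set_server_id, serialize it, and read the same fields back by name on the receiving side. Because value1 through value4 are protected, code outside the message classes can only reach them through the named accessors.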