Recently, I have been studying deduplication (dedup) storage technology and implementing a dedup prototype system. While coding, I ran into a baffling problem. The simplified code is as follows:
#include "dedup.h"
#ifndef block_len
#define block_len 32*1024 /* 32 K Bytes */
#endif
#define backet_size 10240
...
int fd_src, fd_dest;
char buf[block_len] = {0};
unsigned int rwsize, pos, block_id;
unsigned char md5_checksum[16] = {0};
unsigned int *metadata = NULL;
unsigned int block_num = 0;
struct stat stat_buf;
hashtable *htable = NULL;
dedup_file_header dedup_hdr;
...
if (-1 == (fd_src = open(argv[1], O_RDONLY)))
{
    perror("open source file");
    return errno;
}
...
htable = create_hashtable(backet_size);
if (NULL == htable)
{
    perror("create_hashtable");
    return errno;
}
pos = 0;
block_id = 0;
if (-1 == fstat(fd_src, &stat_buf))
{
    perror("fstat source file");
    goto _exit;
}
block_num = stat_buf.st_size/block_len;
metadata = (unsigned int *)malloc(sizeof(unsigned int) * block_num);
if (metadata == NULL)
{
    perror("malloc metadata");
    goto _exit;
}
I used a 1.9 MB file as the source file. The computed block_num was 61881344, which was hard to believe. With a block size of block_len = 32 KB, block_num should be 59. Where was the problem? The code is very simple, with no complicated logic. I reviewed it several times and found nothing wrong. Then, after pacing around the room a couple of times, I happened to notice the macro definition of block_len. A #define is always an easy place to make a low-level mistake. Could I really have made the most basic mistake of all?
#define block_len 32*1024 /* 32 K Bytes */
When I saw this line, I felt foolish: I had made the most basic, most primitive mistake.
block_num = stat_buf.st_size/block_len;
After the preprocessor expands the macro, this becomes:
block_num = stat_buf.st_size/32*1024;
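Finally, I understood where the problem was. Since / and * have the same precedence and associate left to right, the expanded expression is evaluated as (stat_buf.st_size/32)*1024 instead of stat_buf.st_size/(32*1024). Adding parentheses to the macro, that is, #define block_len (32*1024), fixes it. Everything is OK!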
I learned a lot from this experience, and I expect I will rarely make this kind of mistake again. It also brought me a small reward and a bit of fun ^-^.
Note: macros are convenient, but use them with caution. Pay special attention to how they are written, and use parentheses wherever possible to avoid ambiguity. The devil is in the details!