Experiencing a memory alignment error on Solaris

Source: Internet
Author: User

A while ago I was working on a program module. Its job was to read a file in a specified format and organize the file's content into a data structure in memory. Our task was only to generate the in-memory data; processing it was the responsibility of another group, so the data structure was also defined by them. After design, development, and testing, we completed the module on Windows, and the other group integrated it into their program. Not long afterwards, however, a bug ticket arrived saying that the entire program crashed when our module ran under Solaris, which might delay the release of the whole product. As the main designer and developer of the module, I immediately started debugging. Unfortunately, our organization could not set up a debugging environment under Solaris, so all debugging had to be done on Windows; the code was then sent off to be compiled on Solaris and run there so that we could observe the result. After careful investigation I confirmed that the program had no logic errors, yet it really did crash on Solaris. What to do? I consulted a C++ expert, who tentatively identified the cause as a memory alignment error. After nearly a week of repeated modifications we finally fixed the bug. What follows is a brief account of it from beginning to end.

First, we had to build a layered tree of structures defined with struct. Suppose there are the following types (the real system is more complex; they are simplified here for the description):
typedef struct db {
    struct module **modulelist; /* dynamic list of modules */
} DB;

typedef struct module {
    struct inst **instlist; /* dynamic list of instances */
} Module;

typedef struct inst {
    char *name; /* instance name */
} Inst;

Their relationships are as follows.

The final data has a single DB node at the top. Its modulelist does not point at the Module nodes directly; it points to an array of Module * that locates the Module nodes indirectly. Each Module in turn has an instlist, which refers to the Inst nodes hanging under that module, again through an array of Inst *. Both arrays must occupy contiguous memory. But since the whole in-memory DB is built dynamically from the file content, how does the consuming program know how many entries modulelist and instlist contain? The answer is an int length stored just above the Module * array (and the Inst * array): it sits in memory immediately before the address that modulelist points to and holds the length of the array. To traverse the modulelist, you read the length from the memory just before the pointer and then access each Module as you would with an ordinary array. I have to admire the designer's cleverness here. Arrays are the fastest list structure to access, but their biggest disadvantage is that they cannot grow dynamically (unless realloc() is used to reallocate the memory). With this structure the dynamic data is organized into an array, which makes data access very efficient.
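
To make the layout concrete, here is a minimal C++ sketch of a length-prefixed pointer array. The helper names make_list, list_length, and free_list are hypothetical, and the sketch assumes the length is kept in a full pointer-sized slot in front of the entries, which is only one possible way to realize the design described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

struct Inst {
    const char *name;  // instance name
};

// Hypothetical helper: reserves `count` Inst* entries plus one pointer-sized
// header slot, stores the length in the header, and returns the address of
// the first entry (just past the header).
Inst **make_list(std::size_t count) {
    void *raw = std::calloc(count + 1, sizeof(Inst *));
    if (raw == nullptr) {
        return nullptr;
    }
    int length = static_cast<int>(count);
    std::memcpy(raw, &length, sizeof length);  // length at the start of the header slot
    return static_cast<Inst **>(raw) + 1;
}

// Reads the length back from the header slot in front of the list.
int list_length(Inst *const *list) {
    int length;
    std::memcpy(&length, list - 1, sizeof length);
    return length;
}

// Releases the whole allocation, header included.
void free_list(Inst **list) {
    std::free(list - 1);
}
```

A caller only ever holds the pointer to the first entry; the length is recovered by stepping back one slot, so the list can be indexed like an ordinary array.
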
However, this structure posed a great challenge for our data generation module. We could not know in advance how much data a file contained, and if we appended each item as we read it, repeated memory reallocation would be unavoidable. After some study we settled on this approach: traverse the entire file once in advance, count the number of objects of each type (fortunately each object is usually one line of text, so counting is easy), allocate all the required memory, and then read the file again from the beginning, handing out space from the pre-allocated memory to each object as it is created.
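
The two-pass idea can be sketched roughly as follows. This is an illustration only: the record format (one record per non-empty line) and the function names count_records and load_records are assumptions, and a std::vector stands in for the pre-allocated memory block.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Pass 1: count the records (assumed here: one record per non-empty line).
std::size_t count_records(std::istream &in) {
    std::size_t count = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            ++count;
        }
    }
    return count;
}

// Pass 2: rewind, reserve all storage once, then fill it without reallocating.
std::vector<std::string> load_records(std::istream &in) {
    const std::size_t count = count_records(in);
    in.clear();   // reset the end-of-file state left by pass 1
    in.seekg(0);  // back to the beginning of the input
    std::vector<std::string> records;
    records.reserve(count);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            records.push_back(line);
        }
    }
    return records;
}
```

Reading the input twice costs some I/O, but it means the storage is sized exactly once and never has to move.
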
The processing itself is relatively simple. Suppose we already have a DB node whose modulelist pointer is initially NULL. The memory block pre-allocated for the module list is called moduleblock (of type byte *), and moduleblockcur (of type int) records how far into the block memory has already been handed out; instblock and instblockcur work the same way. To append a Module to the modulelist, the function does the following:
1) Check whether the modulelist pointer is NULL. If it is, this is the first entry of the array: carve an int-sized space from the current position of moduleblock, set it to zero, and make the modulelist pointer point to the position just after that int.
The code is as follows:
if (*pmodulelist == NULL) { // create a new module list
    int *length = (int *)(mcurdbmem->moduleblock + mcurdbmem->moduleblockcur);
    *length = 0;
    mcurdbmem->moduleblockcur += sizeof(int);
    M_assert(mcurdbmem->moduleblockcur <= mcurdbmem->moduleblocksize);
    *pmodulelist = (Module **)(mcurdbmem->moduleblock + mcurdbmem->moduleblockcur);
}

In this way, the modulelist is initialized during the first allocation.

2) Starting from the modulelist pointer (after step 1 it is guaranteed to hold a valid address), move back sizeof(int) bytes to reach the current length of the array and add 1 to it. Then carve sizeof(Module *) bytes from the current position of moduleblock to serve as a new Module * entry, which grows the array by one.

3) Allocate memory for a Module and make the newly carved Module * entry point to it. The module is now allocated and appended to the list.

The code for these two steps is:
int *length = (int *)((t_int8 *)*pmodulelist - sizeof(int)); // get the modulelist's length
Module **ppmodule = (Module **)(moduleblock + moduleblockcur); // get new memory address
moduleblockcur += sizeof(Module *);
M_assert(moduleblockcur <= moduleblocksize);
*length += 1;
*ppmodule = (Module *)allocmem(sizeof(Module)); // allocate Module

The allocmem function can be regarded as equivalent to malloc(); for performance we manage these scattered allocations ourselves with a list of memory blocks, which I will not go into here.

Through the steps above, all the Module * entries are allocated in one contiguous piece of memory, and the Inst entries within each module are built the same way, so there is no significant loss of speed. After development was complete we delivered the module to the other group, and it passed various tests (on the Windows platform only, of course), which seemed to prove that the approach was robust and efficient.

However, after a week of investigation I realized that it contained a serious error:
Everything above was developed on Windows. Our Windows build was 32-bit, so a data pointer is 4 bytes and an int is also 4 bytes. The modulelist I generated is therefore laid out in memory as follows (assuming that modulelist points to address 0x0004): the int length occupies 0x0000 to 0x0003, and the Module * entries follow at 0x0004, 0x0008, and so on.

The space occupied by the int length field is exactly the same size as a Module * pointer field, but that is just a coincidence. If the length were a byte instead of an int and modulelist were still at address 0x0004, the length would start at 0x0003. On Windows this would still not go wrong, thanks to the tolerance of the platform: the x86 processors that Windows runs on accept misaligned memory accesses.

Now look at the memory layout on Solaris. We use a 64-bit Solaris system, where an int is still 4 bytes but a pointer is 8 bytes (note!). After the code above has run, memory looks like this (again assuming that modulelist is at address 0x0004): the int length occupies 0x0000 to 0x0003, and the 8-byte Module * entries follow at 0x0004, 0x000C, and so on.

The layout is indeed different, but that in itself looked harmless, or so I thought as someone used to developing on Windows. Unfortunately, this layout causes an alignment error on the Solaris system. Put simply, to keep memory access efficient, the processor requires every pointer to be stored at an address that is a multiple of its size, here 8 (unless otherwise specified); otherwise an access error occurs when the pointer is read or written. Memory returned by malloc on Solaris is always suitably aligned (on our system, at a multiple of 16), and that was true of the moduleblock we allocated. But once an int length has been carved from the start of moduleblock, the address given to the next Module * is a multiple of 4 but not of 8. All we did was a forced type conversion, which left the pointer slot at an address that is invalid for that type on Solaris, so the program crashed as soon as the code accessed that memory.
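
The arithmetic behind the crash can be checked with a few lines of C++. The two helpers below are hypothetical and work on plain integer addresses, so the sketch runs anywhere and does not itself perform a misaligned access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when `address` is a multiple of `alignment`.
bool is_aligned(std::uintptr_t address, std::size_t alignment) {
    return address % alignment == 0;
}

// Address of the first pointer slot when a header of `header_size` bytes is
// carved from the start of a block located at `block_start`.
std::uintptr_t first_slot_after_header(std::uintptr_t block_start,
                                       std::size_t header_size) {
    return block_start + header_size;
}
```

With a 16-byte-aligned block and a 4-byte header, the first slot is fine for a 4-byte pointer but not for an 8-byte one; making the header as large as a pointer restores the alignment.
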

Once the problem is known, it is not hard to solve: we only need to make sure that each Module * is placed at a properly aligned address. After modification, the program changed as follows:
1) moduleblock is now of type Module **, and moduleblockcur no longer holds a byte offset but the number of Module * slots already handed out (this part does not change behavior; it is a refactoring that makes the code clearer). Whenever a new modulelist is started, we no longer carve out an int-sized space for the length but a whole Module * sized slot (Module * is a pointer type: on the Solaris platform it is larger than the 4-byte int, while on Windows it happens to be the same size as int).
2) Calls to the allocmem() function used earlier were changed to a new function, allocstructmem(). Although I did not describe it in detail, allocmem() serves all the scattered structs and char * strings of the whole program from the same pre-prepared memory. Because the strings have arbitrary lengths, a struct allocated after a string may not start at a properly aligned address, which also causes an alignment error. So struct memory is now allocated separately by allocstructmem(). Since no struct-packing compiler directive is used, the size of every struct is a multiple of its alignment requirement, so allocating one struct never misaligns the next.
3) The function was turned into a template so that several parts of the program can call it. The code became:

template <class Type>
Type *allocgeneralmem(Type **pblock, size_t &pblockcur, size_t pblocksize, Type **&plist)
{
    if (plist == NULL) {
        // First entry: reserve one pointer-sized slot for the length and zero it.
        Type **pplength = pblock + pblockcur;
        memset(pplength, 0, sizeof(Type *));
        pblockcur++;

        M_assert(pblockcur <= pblocksize);
        plist = pblock + pblockcur;
    }

    // Increment the length stored sizeof(int) bytes before the list.
    char *pplength = ((char *)plist) - sizeof(int);
    int length;
    memcpy(&length, pplength, sizeof(int));
    length++;
    memcpy(pplength, &length, sizeof(int));
    // Take the next pointer slot and allocate the struct it points to.
    Type **ppdata = pblock + pblockcur;
    pblockcur++;
    M_assert(pblockcur <= pblocksize);
    *ppdata = (Type *)allocstructmem(sizeof(Type));
    return *ppdata;
}

When allocating memory for a Module:
Module *pmodule = allocgeneralmem(moduleblock, moduleblockcur, moduleblocksize, pmodulelist);
M_assert(pmodule);
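
The text does not show allocstructmem(), so the following is only a guess at the general technique it relies on: a bump allocator that rounds each request up to the required alignment. The names Arena, align_up, and arena_alloc are hypothetical, and the sketch assumes the buffer itself starts at a suitably aligned address.

```cpp
#include <cassert>
#include <cstddef>

// Rounds `offset` up to the next multiple of `alignment`,
// which must be a power of two.
std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Hypothetical bump allocator over a caller-supplied, aligned buffer.
struct Arena {
    unsigned char *base;  // start of the pre-allocated block
    std::size_t size;     // total bytes available
    std::size_t used;     // bytes handed out so far
};

// Returns `bytes` of memory whose offset in the arena is a multiple of
// `alignment`, or nullptr when the arena is exhausted.
void *arena_alloc(Arena &arena, std::size_t bytes, std::size_t alignment) {
    const std::size_t start = align_up(arena.used, alignment);
    if (start > arena.size || bytes > arena.size - start) {
        return nullptr;
    }
    arena.used = start + bytes;
    return arena.base + start;
}
```

A 5-byte string followed by an 8-byte-aligned struct leaves 3 bytes unused between them, which is the price of keeping every struct accessible on a strict-alignment machine.
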

All right: compile and run the program under Solaris, and it no longer crashes. (After nearly four days of struggling, when I saw the program run without crashing for the first time I was so excited that I almost fell off my chair!)
However, something was still wrong: the same test file displayed graphics on Windows, but nothing was displayed on Solaris. The only explanation was that when the external program read the DB memory data we had generated, it did not find the data it expected. Why? We had to look at how the external program did it. In the data definition file the other group had given us, I found two macros:
/* ==========================================================================
 * Dynamic list
 *
 * Struct members named "*List" point to a list of objects plus
 * list length (before the first array element). All lists must be built up
 * like anylist.
 * Eventually we should check the pointer offset:
 *   assert((((char *)NULL) - ((char *)&(((struct anylist *)NULL)->entry))) %
 *          sizeof(int) == 0)
 * ==========================================================================
 */
struct anylist {
    int length;
    void *entry[1];
};
#define extenff (((int *)NULL) - ((int *)&(((struct anylist *)NULL)->entry)))
#define zlistlength(list) (((int *)(list))[extenff])

Oh my God! The other group had considered data alignment all along: with these two macros, their code behaves the same on every platform. Unfortunately I had not noticed these two treasures until now. Here is a brief analysis:
struct anylist has two fields, length and entry. length is the length value stored just before the memory that modulelist (or instlist) points to, and entry is an array of void pointers. Because no packing directive is used, the compiler aligns each field of the struct to an address the hardware can access efficiently. In our 32-bit Windows build a pointer needs only 4-byte alignment, so length occupies 4 bytes and entry follows immediately. On 64-bit Solaris length is also 4 bytes, but the 8-byte pointer must be 8-byte aligned, so length is followed by 4 bytes of padding and then entry. The struct therefore has different sizes on different systems, and the distance between length and entry differs as well.
The first macro, extenff, takes the same base address (NULL), treats it as a struct anylist, and subtracts the address of entry from the address of length. The result is the offset between the two fields on the current system, measured in ints (and negative, since length comes first).
The second macro, zlistlength, applies the offset computed by extenff to a given list to locate the int length and read its value, which gives the length of the list.
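
The same offset can be computed without subtracting null pointers by using the standard offsetof macro. The sketch below is an alternative formulation, not the other group's code; the function names length_index and padding_after_length are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct anylist {
    int length;
    void *entry[1];
};

// Index, counted in ints relative to the first entry, at which the length
// field is found. It is negative because length precedes entry.
constexpr std::ptrdiff_t length_index() {
    return -static_cast<std::ptrdiff_t>(offsetof(anylist, entry) / sizeof(int));
}

// Number of padding bytes the compiler inserts between length and entry.
constexpr std::size_t padding_after_length() {
    return offsetof(anylist, entry) - sizeof(int);
}
```

On a platform with 4-byte int and 4-byte pointers the index is typically -1 with no padding; with 8-byte pointers it is typically -2 with 4 bytes of padding, which is exactly the platform difference the macros absorb.
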
In my code, wherever the list length was needed, I had written:
int *length = (int *)((byte *)*modulelist - sizeof(int));

Can you see the difference? Right: on Windows the two ways of locating the length give the same address, but on Solaris they do not.

My program stored the length 4 bytes before the list, while the other group's program read it from 8 bytes before the list, at the start of the pointer-sized slot. Naturally their program concluded that the length was 0 (I had zeroed the whole block; otherwise the value would have been indeterminate). Once again I have to admire the other group's designers.
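
The mismatch can be reproduced on any machine with a small experiment. The helper names below are hypothetical; memcpy is used for the reads so that the sketch itself never performs a misaligned access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct anylist {
    int length;
    void *entry[1];
};

// Reads an int located `bytes_back` bytes before the first list entry.
int read_int_before(void *const *list, std::size_t bytes_back) {
    int value;
    std::memcpy(&value, reinterpret_cast<const char *>(list) - bytes_back,
                sizeof value);
    return value;
}

// The consumer's view: the length sits where struct anylist puts it.
int length_via_layout(void *const *list) {
    return read_int_before(list, offsetof(anylist, entry));
}

// The view of my first fix: the length sits sizeof(int) bytes before the list.
int length_via_sizeof_int(void *const *list) {
    return read_int_before(list, sizeof(int));
}
```

When the length is written at the struct's own offset into zeroed memory, the two readers agree only on platforms where entry directly follows length; where padding exists, the sizeof(int) reader lands in the padding and sees 0.
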

Now the code is modified as follows:
template <class Type>
Type *allocgeneralmem(Type **pblock, size_t &pblockcur, size_t pblocksize, Type **&plist)
{
    if (plist == NULL) {
        // First entry: skip one pointer-sized slot, then zero the length
        // at the position the consumer's macro will read.
        pblockcur++;
        M_assert(pblockcur <= pblocksize);
        plist = pblock + pblockcur;
        int *plength = &(((int *)plist)[extenff]);
        *plength = 0;
    }

    int *plength = &(((int *)plist)[extenff]);
    M_assert(zlistlength(plist) == *plength);
    *plength = *plength + 1;
    Type **ppdata = pblock + pblockcur;
    pblockcur++;
    M_assert(pblockcur <= pblocksize);
    *ppdata = (Type *)allocstructmem(sizeof(Type));
    return *ppdata;
}

Module *pmodule = allocgeneralmem(moduleblock, moduleblockcur, moduleblocksize, pmodulelist);
M_assert(pmodule);

Here we use the extenff macro above directly to obtain the address of the length. Compile and run on Solaris again... and the result is correct!

After a whole week of fixing this bug, I had learned a lot about compilers and operating systems, and I had typed shell commands so often that I could enter the program's full path with my eyes closed. I really did gain a lot. Once again I realized how much I still have to learn technically.

Welcome to graph software:
Http://www.tonixsoft.com
