Intermediate C: data serialization, simple use and discussion (2)
Introduction-a better way
Whatever the language or framework, development eventually runs into serialization. Serialization can be understood as a protocol for exchanging data between two parties, A and B.
An earlier article implemented a serialization scheme on top of the printf and scanf formats:
Simple use and Discussion of C basic data serialization
This article tries a new approach on that basis. The idea is to rely on the compiler's uniform in-memory layout for a struct.
Concretely, the layout is fixed by wrapping the struct definition in the packing pragmas #pragma pack(push, 1) ... #pragma pack(pop):
#pragma pack(push, 1)
struct person {
    int id;
    char sex;
    int age;
    char name[65];
    double high;
    double weight;
};
#pragma pack(pop)
When the compiler parses struct person, it uses 1-byte alignment, so no padding is inserted and the binary layout of the structure is the same on both sides (tested with VS and GCC).
An identical byte layout is one of the invariants for communication between different systems. Based on this serialization idea, we can design a verification demo as follows:
Windows producer code
// Set the data, write it to the test file, and then read it back
struct person per = { 1, 1, 19, "simplec 王志", 179.0, 70.1 };

// Write the data to the file
const char * path = "person.txt";
FILE * txt = fopen(path, "wb");
if (NULL == txt)
    exit(EXIT_FAILURE);
fwrite(&per, sizeof(struct person), 1, txt);
fclose(txt);
Linux consumer code
const char * path = "person.txt";
FILE * txt = fopen(path, "rb");
if (NULL == txt)
    exit(EXIT_FAILURE);

struct person np;
fread(&np, sizeof(struct person), 1, txt);
printf("[%d, %d, %d, %s, %lf, %lf]\n",
    np.id, np.sex, np.age, np.name, np.high, np.weight);
fclose(txt);
In the actual output on Linux, the name comes out garbled.
You may be curious why the final result is incorrect. This is something every veteran developer has to face sooner or later: uniform encoding.
To understand the problem better, try the demo sizeofname.c.
#include <stdio.h>
#include <wchar.h>

int main(int argc, char * argv[]) {
    // Default system encoding, 2 + 3 + 7 + 1 = 13 characters in total
    char as[] = "王志 - simplec";
    // Wide characters; on Windows 2 bytes represent one character
    wchar_t bs[] = L"王志 - simplec";
    // UTF-8 encoding, important
    char cs[] = u8"王志 - simplec";

    printf("sizeof as = %zu.\n", sizeof as);
    printf("sizeof bs = %zu.\n", sizeof bs);
    printf("sizeof cs = %zu.\n", sizeof cs);

    return 0;
}
On Windows the result depends on the system encoding. My system defaults to GBK (a multi-byte extension of ASCII). An English-language Windows typically defaults to a single-byte code page such as Windows-1252, not UTF-8.
In my VS settings, source files are saved as UTF-8 with BOM by default. For detailed configuration, refer to the blog post "Visual Studio saves it as UTF8 encoding by default".
On Linux, where the default encoding is UTF-8, as and cs come out the same size.
As the above shows, the main problem is inconsistent encoding. fread restores the bytes faithfully, but the name was written as GBK and is then displayed as UTF-8, so it comes out garbled. Next we will solve this problem.
Introduction summary
1. #pragma pack(push, 1) ... #pragma pack(pop) is an efficient way to serialize directly between C/C++ systems.
2. Use UTF-8 uniformly as the encoding. It is the default on Linux; the Chinese version of Windows defaults to GBK.
Next we will propose a uniform encoding scheme. What we have so far basically works, so the rest is optional. Haha~
Preface-I need some help.
GNU libiconv is a long-standing cross-platform library that solves encoding problems across platforms. Its latest version does not provide direct support for Windows.
So I ported it to Windows myself and finally generated libiconv.lib. For details, refer to the following project.
libiconv-for-window https://github.com/wangzhione/libiconv-for-window
The detailed configuration steps of the project are as follows:
========================================================================
    Static library: libiconv for window project overview
========================================================================

/////////////////////////////////////////////////////////////////////////////

The current port is based on the GNU project libiconv-1.15
    http://www.gnu.org/software/libiconv/
Ported to Windows 10 14393.953 | Visual Studio 2017
Project initiator: simplec-wz | wangzhione@163.com

/////////////////////////////////////////////////////////////////////////////

Specific steps:
1. Download the libiconv package from the official website and decompress it [xxx = path after decompression]
2. Under the $(ProjectDir) project directory, create the include directory
    2.1 Copy xxx/include/iconv.h.build.in to the include directory and rename it iconv.h
    2.2 Copy xxx/config.h.in to include and name it config.h
    2.3 Copy all *.h and *.def files under xxx/lib to the include directory
    2.4 Copy xxx/libcharset/include/localcharset.h.build.in to the include directory and rename it
3. Copy xxx/libcharset/lib/localcharset.c to the $(ProjectDir) directory
4. Copy xxx/lib/iconv.c to the $(ProjectDir) directory
5. Add localcharset.c iconv.c iconv.h localcharset.h config.h to the project
6. VC++ Directories -> Include Directories -> add [$(ProjectDir)include]
7. C/C++ -> Preprocessor -> add _CRT_SECURE_NO_WARNINGS to silence the insecure-call warnings
8. General -> Target Name -> change to libiconv; in Debug, change to libiconvd

Then compile and make the following modifications:
1. iconv.h modifications
    1.1 Lines 25-29: delete the invalid macro
    1.2 Lines 55-61: delete
    1.3 Delete every LIBICONV_DLL_EXPORTED that follows (replace-all works)
    1.4 Delete every @ICONV_CONST@ that follows
        1.4.1 Globally delete ICONV_CONST
    1.5 Then delete the large @xxx@ segments
    1.6 For details, refer to my final file
    1.7 Change the encoding of this file to UTF-8 with BOM; I used Notepad++ for the conversion
2. localcharset.c modifications
    2.1 Lines 79-83: delete
3. localcharset.h modifications
    3.1 Lines 20-26: delete
    3.2 Line 31: invalid macro, delete
4. config.h modifications
    4.1 Lines 28-30: delete, reverting to the default EILSEQ
5. Resolve the hundreds of warnings
    5.1 After that, this code can serve as the libiconv 1.15 Windows lib source project

/////////////////////////////////////////////////////////////////////////////
With the steps above, the libiconv project on Windows is basically done. A simple usage follows. iconv has three commonly used interfaces:
/* Allocates descriptor for code conversion from encoding ‘fromcode’ to
   encoding ‘tocode’. */
extern iconv_t iconv_open (const char* tocode, const char* fromcode);

/* Converts, using conversion descriptor ‘cd’, at most ‘*inbytesleft’ bytes
   starting at ‘*inbuf’, writing at most ‘*outbytesleft’ bytes starting at
   ‘*outbuf’.
   Decrements ‘*inbytesleft’ and increments ‘*inbuf’ by the same amount.
   Decrements ‘*outbytesleft’ and increments ‘*outbuf’ by the same amount. */
extern size_t iconv (iconv_t cd, char* * inbuf, size_t *inbytesleft, char* * outbuf, size_t *outbytesleft);

/* Frees resources allocated for conversion descriptor ‘cd’. */
extern int iconv_close (iconv_t cd);
The English comments above explain the functions in detail. For more, see the implementation.
One more point: when iconv returns, *outbytesleft holds the number of bytes still free in the output buffer, so the number of bytes written is the initial value minus the value on return.
To be honest, the iconv interface is awkward to use, and having to account for signal interruptions on Linux intrudes on our code.
So here is a handy helper interface, sciconv.h:
#ifndef _H_SIMPLEC_SCICONV
#define _H_SIMPLEC_SCICONV

#include <iconv.h>
#include <stdbool.h>

//
// iconv for window helper
//        by simplec wz
//

//
// si_isutf8 - determine whether the string is UTF-8 encoded
// in       : the string to be tested
// return   : true if UTF-8 encoded, false otherwise
//
extern bool si_isutf8(const char * in);

//
// si_iconv - transcode the string in, from code -> to code
// in       : string to be transcoded
// from     : name of the source encoding
// to       : name of the target encoding
// rlen     : returns the length of the converted string; pass NULL if not needed
// return   : the transcoded string; the caller is responsible for destroying it
//
extern char * si_iconv(const char * in, const char * from, const char * to, size_t * rlen);

//
// si_aconv - transcode the character array in and store the result back in in
// in       : character array
// from     : name of the source encoding
// to       : name of the target encoding
// return   : void
//
extern void si_aconv(char in[], const char * from, const char * to);

//
// si_gbktoutf8 - convert the character array in to UTF-8 encoding
// in       : character array
// return   : void
//
extern void si_gbktoutf8(char in[]);

//
// si_utf8togbk - convert the character array in to GBK encoding
// in       : character array
// return   : void
//
extern void si_utf8togbk(char in[]);

#endif
It makes development easier. Everything is ready, so let's get started.
Body-solves the problem of internationalization (encoding)
The library problem is solved. Let's try a demo, reusing the Windows producer code from above.
#include <stdio.h>
#include <stdlib.h>
#include <sciconv.h>

#pragma pack(push, 1)
struct person {
    int id;
    char sex;
    int age;
    char name[65];
    double high;
    double weight;
};
#pragma pack(pop)

//
// New idea for testing data serialization: byte alignment
//
int main(int argc, char * argv[]) {
    // Set the data, write it to the test file, and then read it
    struct person per = {
        1, 1, 19, "simplec 王志", 179.0, 70.1
    };

    // Print the data
    printf("[%d, %d, %d, %s, %lf, %lf]\n",
        per.id, per.sex, per.age, per.name, per.high, per.weight);

    // Write the data to the file
    const char * path = "person.txt";
    FILE * txt = fopen(path, "wb+");
    if (NULL == txt)
        exit(EXIT_FAILURE);

    // Convert the name from GBK to UTF-8 before writing
    si_gbktoutf8(per.name);
    fwrite(&per, sizeof(struct person), 1, txt);

    fclose(txt);
    return EXIT_SUCCESS;
}
Upload person.txt to Linux and test it. The Linux test code, personread.c, is as follows:
#include <stdio.h>
#include <stdlib.h>
#include <iconv.h>

#pragma pack(push, 1)
struct person {
    int id;
    char sex;
    int age;
    char name[65];
    double high;
    double weight;
};
#pragma pack(pop)

//
// New idea for testing data serialization: byte alignment
//
int main(int argc, char * argv[]) {
    struct person np;

    // Read the data from the file
    const char * path = "person.txt";
    FILE * txt = fopen(path, "rb");
    if (NULL == txt)
        exit(EXIT_FAILURE);

    // Stop if a complete record could not be read
    if (fread(&np, sizeof(struct person), 1, txt) != 1) {
        fclose(txt);
        exit(EXIT_FAILURE);
    }
    fclose(txt);

    // Print the data
    printf("[%d, %d, %d, %s, %lf, %lf]\n",
        np.id, np.sex, np.age, np.name, np.high, np.weight);

    return EXIT_SUCCESS;
}
The final results are all correct. Oh yeah!
That settles the solution. If you are interested, take a look at the libiconv for window source code on GitHub linked above.
Postscript-next step
Errors are inevitable; this post is mostly meant for discussion and exchange. After all, the protocols and standards we use all come from abroad, which is a bit of a pity. Haha. Corrections are welcome.
Singing and smiling - http://music.163.com/#/m/song?id=395677&userid=16529894