Love and hate for JSON

Source: Internet
Author: User

This article looks back on my love-hate relationship with JSON. C++ is risky and should be used with caution.

The relevant code in this article is:

The test data is not included: it comes from project development and has to be kept confidential.

When I first worked with JSON, I read a short introduction: the syntax is simple, it is easy to extend, and it is a pleasure to use. Most of our interactions with the backend are JSON as well. The project uses a third-party library, SimpleJSON, which is also a delight to use. I have only been bitten by the library in a few places:

1. Forgetting that the parsed JSONValue has to be deleted.
2. SimpleJSON represents numbers as double, but the server sends 64-bit integers. So I changed the library slightly so that the original numeric literal is kept when a number is parsed. [Using a double to hold an integer is generally fine for integers of up to about 53 bits. The rough calculation: take the 64 bits of an IEEE 754 double, remove the 1 sign bit and the 11 exponent bits, and a 52-bit mantissa remains. 2^52 is 4503599627370496, roughly 4.5 * 10^15.]
3. Watch how the JSON library handles a comma after the last element of an array.
4. What encoding is the JSON string itself in, and does the library handle characters in the \uXXXX form correctly when parsing? (I remember running into a problem with this in JsonCpp, and later used Value from Chromium's base instead.) If this is not agreed with the server side and the other parties in advance, it can become yet another pitfall. That one, though, came after my faith had already been shaken.
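The precision limit in item 2 can be checked directly. The following is a minimal sketch, not code from SimpleJSON; the helper name survives_double_round_trip is hypothetical. It reports whether a 64-bit integer comes back unchanged after being stored in a double.

```cpp
#include <cstdint>

// Hypothetical helper (not part of SimpleJSON): returns true when a 64-bit
// integer is unchanged after being converted to double and back.
bool survives_double_round_trip(std::int64_t v) {
    const double d = static_cast<double>(v);
    // Values near INT64_MAX round up to 2^63, which no longer fits in
    // int64_t; converting that back would be undefined behaviour.
    if (d >= 9223372036854775808.0) {
        return false;
    }
    return static_cast<std::int64_t>(d) == v;
}
```

With this helper, 2^52 (4503599627370496) and 2^53 (9007199254740992) round-trip, while 2^53 + 1 (9007199254740993) does not, which is why a server-side 64-bit id can be silently altered when it passes through a double.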

But none of these pitfalls could shake my faith in JSON, until one day my mentor gave me a 1,685,366-byte JSON file and I spent more than 10 seconds (on a laptop) parsing it with SimpleJSON. After that I could not look JSON in the eye for the next few years. Whenever someone else used JSON, I would say: "I have a 2 MB JSON file that will make your parsing very slow, more than 10 seconds." Whenever I needed to handle structured content, I ran back into the arms of XML, and the library I know best there is RapidXML. But RapidXML has pitfalls too:

1. RapidXML modifies the original document in place, so it is best to parse a copy of the input.
2. Building a document for output is mostly pointer manipulation: a node must not point at something allocated temporarily with new or at a temporary variable. You need a dedicated storage area to hold the strings, and you release it all at once after the output has been produced.
3. It is best to use sstream when producing output, otherwise it will be slow.
4. You need to catch exceptions.
5. It is better to handle only UTF-8 strings. If the data is UTF-16, put the Chinese character "一" (one) in a node attribute or a text node and you will see the problem immediately.

Later I got even lazier and came to like the HTTP header format: a key, then ":", then the value, with the key-value pairs separated by "\r\n". Code that parses this kind of data is very quick to write.
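A parser for this header-style format might look like the sketch below. It is an illustration only, not the author's code; the function name parse_kv and the choice to skip leading spaces in the value are assumptions.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: parses "key: value" pairs separated by "\r\n".
// Lines without a ':' are skipped; spaces directly after the ':' are dropped.
std::map<std::string, std::string> parse_kv(const std::string& text) {
    std::map<std::string, std::string> result;
    std::size_t pos = 0;
    while (pos < text.size()) {
        // Find the end of the current line (or the end of the input).
        std::size_t eol = text.find("\r\n", pos);
        if (eol == std::string::npos) {
            eol = text.size();
        }
        const std::string line = text.substr(pos, eol - pos);
        pos = (eol == text.size()) ? eol : eol + 2;

        // Split the line at the first ':'.
        const std::size_t colon = line.find(':');
        if (colon == std::string::npos) {
            continue;
        }
        std::size_t value_start = colon + 1;
        while (value_start < line.size() && line[value_start] == ' ') {
            ++value_start;
        }
        result[line.substr(0, colon)] = line.substr(value_start);
    }
    return result;
}
```

Calling parse_kv("Host: example.com\r\nContent-Length: 42\r\n") yields two entries, "Host" and "Content-Length". The format has no nesting and no escaping, which is exactly why the parser stays this small.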


Then yesterday, some n days after Lao Luo's much-discussed showdown, I thought: I would like to run an evaluation of my own, of JSON libraries. The contenders are SimpleJSON, JsonCpp, libjson, and RapidJSON. The challenge is the legendary 1,685,366-byte JSON file. Only parsing is tested.

These libraries have a few things in common. They all seem to be moving to GitHub. For example, after SimpleJSON moved to GitHub, I could only download the code on master, because its old home answers with "file not found". After comparing the code, though, I found that master can be used directly as a release. The developers use little of the black magic that hurts software maintainability [reference:], and the libraries are fairly humane: very easy to compile. The slightly darker one is libjson: you have to turn off the C interface yourself, and the debug build needs a macro to be defined. JsonCpp is gray: it requires scons, which I did not get working. I found a Visual Studio project under makefiles, and it works after being upgraded.

I am lazy, so the test code is only just good enough to use. Roughly: one function reads the file and stores a UTF-8 copy and a UTF-16 copy in two global variables. Then these functions are declared:
void simple_json_test(const char* UTF8, const wchar_t* UTF16);
void rapid_json_test(const char* UTF8, const wchar_t* UTF16);
void json_cpp_test(const char* UTF8, const wchar_t* UTF16);
void lib_json_test(const char* UTF8, const wchar_t* UTF16);
These functions are called from main and timed with the clock() function. Each time I switch to a different function, I recompile and run again to get the result. The test function bodies add up to 11 lines in total; some of them do not even bother to release resources.
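The timing around each call can be sketched as follows. This is an assumption about the shape of the harness, not the author's actual code; parse_test_fn, time_parse_seconds, and dummy_test are hypothetical names, and dummy_test merely stands in for one of the four *_test functions.

```cpp
#include <ctime>

// Signature shared by the four test functions declared above.
typedef void (*parse_test_fn)(const char* utf8, const wchar_t* utf16);

// Hypothetical harness: runs one test function and returns the time
// reported by clock(), converted to seconds.
double time_parse_seconds(parse_test_fn fn,
                          const char* utf8, const wchar_t* utf16) {
    const std::clock_t start = std::clock();
    fn(utf8, utf16);
    const std::clock_t end = std::clock();
    return static_cast<double>(end - start) / CLOCKS_PER_SEC;
}

// Placeholder for a real library test; it does no parsing at all.
void dummy_test(const char* utf8, const wchar_t* utf16) {
    (void)utf8;
    (void)utf16;
}
```

A call such as time_parse_seconds(dummy_test, utf8_data, utf16_data) returns the elapsed time for one run. Taking the function as a parameter would also allow all four libraries to be timed in a single build instead of recompiling for each one.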


The result restored my faith in JSON, but I did not understand why SimpleJSON's numbers then and now were so far apart. I dug out the old version of SimpleJSON and ran the test once more; the result was:

I did not look closely at the Visual Studio performance analysis, and a quick code review showed that not much had changed, but in the end I found the tricky spot.

There is a function in JSON.h:

// Simple function to check a string 's' has at least 'n' characters
static inline bool simplejson_wcsnlen(const wchar_t *s, size_t n) {
    if (s == 0)
        return false;

    const wchar_t *save = s;
    while (n-- > 0)
    {
        if (*(save++) == 0) return false;
    }

    return true;
}

It is used at line 181 of JSON.cpp and at lines 62 and 70 of JSONValue.cpp. The corresponding code in the old version is:

// We need 5 chars (4 hex + the 'u') or its not valid
if (wcslen(*data) < 5)

// Is it a boolean?
else if ((wcslen(*data) >= 4 && wcsncasecmp(*data, L"true", 4) == 0) || (wcslen(*data) >= 5 && wcsncasecmp(*data, L"false", 5) == 0))

// Is it a null?
else if (wcslen(*data) >= 4 && wcsncasecmp(*data, L"null", 4) == 0)

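Changing the old code to the new version fixes the problem. What the problem was is also obvious once you compare the two: wcslen(*data) walks all the way to the end of the remaining input on every one of these checks, so on a file of well over a million characters each small check rescans the rest of the document, whereas simplejson_wcsnlen never looks at more than n characters.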
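The difference between the two checks can be shown side by side. The sketch below is an illustration with hypothetical names (has_at_least, has_at_least_slow), not code copied from SimpleJSON: both functions answer "does s contain at least n characters?", but the first stops after n characters while the second measures the whole remaining string first.

```cpp
#include <cstddef>
#include <cwchar>

// Bounded check in the spirit of simplejson_wcsnlen: inspects at most n
// characters, so its cost does not depend on how long the input is.
inline bool has_at_least(const wchar_t* s, std::size_t n) {
    if (s == nullptr) {
        return false;
    }
    for (; n > 0; --n, ++s) {
        if (*s == L'\0') {
            return false;
        }
    }
    return true;
}

// Old-style check: wcslen scans to the terminating zero before comparing,
// so each call costs time proportional to the remaining input.
inline bool has_at_least_slow(const wchar_t* s, std::size_t n) {
    return s != nullptr && std::wcslen(s) >= n;
}
```

Both functions return the same answers, for example true for (L"true", 4) and false for (L"tru", 4). When a parser applies the slow variant at value after value in a megabyte-sized buffer, the total work grows roughly with the square of the input size, which is consistent with the 10-second parse described earlier.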
