The data collection service hit java.lang.OutOfMemoryError: GC overhead limit exceeded roughly once an hour, and each OOM coincided with the JSON Atom feed being downloaded for processing. The suspicion was that memory spikes while processing the feed consume too much heap and trigger frequent full GCs. The analysis is as follows.
Analysis process
The service downloads 36 data files from the feed server every 15 minutes: 12 files of about 17 MB, 12 of about 18 MB, and 12 of 100+ MB, all in JSON format. The service loads each JSON file into memory in its entirety and then converts it into Java objects, so memory consumption at this point is likely to be large. The following set of tests confirms this:
Test Preparation:
- One 17 MB JSON data file and one 100 MB JSON data file
- Jackson 2.3.4.final (JSON parsing library)
- JDK 1.6.0_30
Test method:
- Parse the JSON files with the document model API, recording processing time and newly allocated memory
- Parse the JSON files with the streaming API, recording processing time and newly allocated memory (a sketch of both parse paths follows this list)
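Below is a minimal sketch of the two parse paths under test, written against the Jackson 2.x API. The class name and the timing harness are illustrative; the memory figures in the results would be taken from GC logs or a profiler rather than measured in code.

```java
import java.io.File;

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonParseTest {

    // Document model: materializes the whole file as a JsonNode tree,
    // so the heap cost is a multiple of the file size.
    static void parseWithDocumentModel(File file) throws Exception {
        long start = System.currentTimeMillis();
        ObjectMapper mapper = new ObjectMapper();
        JsonNode root = mapper.readTree(file);
        System.out.println("document model: " + (System.currentTimeMillis() - start)
                + " ms, " + root.size() + " top-level nodes");
    }

    // Streaming API: pulls one token at a time, so only the current token
    // (plus whatever the caller chooses to keep) stays on the heap.
    static void parseWithStreamingApi(File file) throws Exception {
        long start = System.currentTimeMillis();
        JsonParser parser = new JsonFactory().createParser(file);
        long tokens = 0;
        try {
            while (parser.nextToken() != null) {
                tokens++;
            }
        } finally {
            parser.close();
        }
        System.out.println("streaming API: " + (System.currentTimeMillis() - start)
                + " ms, " + tokens + " tokens");
    }

    public static void main(String[] args) throws Exception {
        File file = new File(args[0]); // e.g. the 17 MB or 100 MB test file
        parseWithDocumentModel(file);
        parseWithStreamingApi(file);
    }
}
```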
Test results:
- Document Model (17 MB JSON file, memory usage chart): minor GC took 3.024 s, full GC took 5.244 s
- Streaming API (17 MB JSON file, memory usage chart): 394 minor GCs took 78 ms, with the longest GC at 557 ms
Conclusion:
- Downloads run on 5 concurrent threads. Assuming files of around 100 MB, each consuming roughly 330 MB when fully loaded (per the tests above), peak memory can reach 330 MB × 5, about 1.6 GB, which is roughly half of the total memory allocated to the process.
- Every OOM occurred between about 14:00 and 20:00, the window in which the data collection service is processing its data; a single process has to handle about 180,000 devices, so starting a feed download in this window is almost certain to trigger an OOM.
Fixes and improvements
A careful review of the feed download and parsing process showed that each file is loaded in full. From the data in the table above, this approach holds on to memory for a long time, and about half of the fields in the source files are never needed downstream. The new scheme is therefore as follows:
- Reduce the number of concurrent feed-download threads from 5 to 2. Feed download and preprocessing are not the bottleneck, so there is no need for many threads, which only causes a sharp spike in memory use.
- Parse the JSON with the streaming API, discarding fields that are not needed downstream as early as possible, and save the remaining data to the cache for later processing (see the sketch after this list).
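A minimal sketch of the streaming filter, assuming each feed file is a top-level JSON array of flat entry objects; the field names deviceId, status, and timestamp are hypothetical stand-ins for the fields that are actually kept.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;

public class FeedStreamFilter {

    // Hypothetical field names standing in for the fields actually kept.
    private static final Set<String> KEEP =
            new HashSet<String>(Arrays.asList("deviceId", "status", "timestamp"));

    // Streams a top-level JSON array of flat entry objects, dropping any
    // field not in KEEP before it ever becomes a long-lived object.
    static List<Map<String, String>> parse(File file) throws Exception {
        List<Map<String, String>> entries = new ArrayList<Map<String, String>>();
        JsonParser parser = new JsonFactory().createParser(file);
        try {
            parser.nextToken(); // position on START_ARRAY (assumed feed layout)
            while (parser.nextToken() == JsonToken.START_OBJECT) {
                Map<String, String> entry = new HashMap<String, String>();
                while (parser.nextToken() != JsonToken.END_OBJECT) {
                    String field = parser.getCurrentName();
                    parser.nextToken(); // advance from field name to its value
                    if (KEEP.contains(field)) {
                        entry.put(field, parser.getValueAsString());
                    } else {
                        parser.skipChildren(); // no-op for scalars, skips nested values
                    }
                }
                entries.add(entry); // the real service would write this to its cache
            }
        } finally {
            parser.close();
        }
        return entries;
    }
}
```

Because only whitelisted scalar values are copied out of the parser, the rest of the file never survives beyond the current token, which is what keeps the footprint flat compared with the document model.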
The memory consumption after the change is as follows:
Compared with the earlier memory analysis chart (the first image in this article), the total memory footprint after the improvement has decreased, and memory is reclaimed quickly.