Large amount of data export Excel program __ Big Data

Source: Internet
Author: User
Tags access database

http://lqw.iteye.com/blog/525982

Test common conditions:
The total number of data is 110,011, each data bar number is 19 fields.
The computer is configured to: P4 2.67ghz,1g memory.

A comparison of POI, JXL, Fastexcel
POI, JXL, fastexcel are all open source projects for Java Third party open source Excel.

Export Scenario One: all at once exported to an Excel file.
The actual situation is reported outofmemery error, the following data is reported outofmemery data, data to the maximum number of data, as shown in table 1:
Table 1: Amount of data that can be processed when reporting outofmemery errors
Fastexecl POI JXL
10000 data/sheet 37465 28996 42270
5000 data/sheet 39096 31487 46270
3000 data/sheet 39000 32493 47860
Summary:
Sheet can reduce the use of memory to a certain extent, but all because the cell created in the program (that is, a cell in Excel) can not be released, consumes a lot of memory, resulting in outofmemery error; JXL is the best performing, with less memory to create.

Export Scenario Two: Export the data in multiple Excel files first, and then merge multiple Excel files.
First, test the time taken to export all the data, as shown in table 2, where the data is tested three times.
Table 2: Time spent exporting all data
Fastexecl POI JXL
10000 Data/file 68s 33s 30s
5000 Data/File 68s 32s 33s
3000 data/file 59s 33s 39s
Summary:
Excel files are exported successfully because an Excel file is exported and the memory used to create the cell is freed.
FASTEXECL performance is the worst, poi performance is stable, jxl with the increase of data, the speed of some degree increases quickly.

Then, for consolidation, the ability to combine multiple Excel into an Excel file is only poi, so using POI tests results as shown in table 3.
Note: The large amount of data combined will also report outofmemery error, so the total number of combined data is 50,000.
Table 3: Time spent merging 50,000 data
Time
10000 Data/File 11s
5000 Data/File 11s
3000 Data/File 11s
Summary:
Use POI to merge files faster, but with a limited amount of data.


Conclusion: Scenario two is more feasible, but the data quantity is limited, it is 50,000.


Second, the export of XML Spreadsheet
The exported format is similar to plain text, can store large amount of data, can realize sheet view, and can add simple style to meet project requirements. After the actual test Excel2003 and Excel2007 are able to identify and open the view normally. Use time tests as shown in table 4, the data are tested 3 times to average.
Table 4: Time spent generating all data
Time
10000 data/sheet 28 seconds
20000 Data/sheet 30.1 seconds
30000 Data/sheet 28.1 seconds
40000 data/sheet 26.5 seconds
50000 data/sheet 28.2 seconds
55000 Data/sheet 26.8 seconds
59000 Data/sheet 30.1 seconds
59500 data/sheet occurrence false deadlock phenomenon
60000 data/sheet occurrence false deadlock phenomenon

However, the exported data is XML is not a pure Excel file, such as using the XLS suffix of Excel file to save, open the file will pop-up warning, but does not affect reading.
As a practical test, the exported XML can be imported into an Access database in Access2007 and Access2003 by importing external data.

Third, summary
Project requirements are large amount of data export Excel files, POI, JXL, Fastexcel can not fully meet the requirements; Using XML Spreadsheet export to achieve large amount of data export, but the format of XML is not a pure Excel file, for the curve to save the nation. A comparison of the two export forms, as shown in table 5.
Table 5: Time spent merging 50,000 data
Spreadsheets for POI, JXL, Fastexcel XML
Export data format as plain execl file as XML file
Small amount of export data
Can the division sheet be able to
Can I add a style to
Can I add a picture POI can
Exporting data can be imported into access to

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.