Java notes-How to efficiently bulk write millions data to an Excel form

Source: Internet
Author: User

Today, a friend asked me to use Java there is no way to export millions data to Excel worksheets.

My first thought was that it was really a crazy idea. Then I think if there really is such a demand, what should I do?

  

PS: First, the basic knowledge of science

  Versions of Excel 2003 and below . One table supports 65536 row data, 256 columns. This means that excel2003 is completely unlikely to meet the needs of millions of data exports.

Excel 2007-2010 version. One table maximum support 1048576 row, 16384 columns;

I am using Office 2010, the higher version of the author has not been used, temporarily unable to judge.

Thus it seems that the amount of data millions to excel itself is already close to the limit of the extent.

What if we have a bigger demand?

Since the single-table support is the largest 104w data, then the larger demand we can only through the program-level table Operation in a way to achieve. O (∩_∩) o haha ~

For the class library that operates Excel. I actually know not a lot. It was a very good feeling that I used the POI library a long way ago. and decided to start with it. See if POI has any more effective and better solution. Because the author previously used the POI version is relatively low. It is also used in the Excel 2003 version. So I met a lot of problems.

  

Editor: Intellij Idea 13.2

Class Library Requirements: poi-3.10-final

1 <Dependency>2             <groupId>Org.apache.poi</groupId>3             <Artifactid>Poi</Artifactid>4             <version>3.10-final</version>5         </Dependency>

Create a new MAVEN project.

According to the author's previous experience, I wrote a code directly using POI. The execution of the time the direct error.

1  Public Static voidExcel2003operate (String FilePath)throwsException {2Hssfworkbook Hssfworkbook =NewHssfworkbook (NewFileInputStream (NewFile (FilePath ));3Hssfsheet sheet = hssfworkbook.getsheetat (0);4          for(inti = 0; I < 10000; i++) {5Hssfrow Hssfrow =Sheet.createrow (i);6              for(intj = 0; J < 10; J + +) {7 Hssfcellutil.createcell (Hssfrow, J, String.valueof (Math.random ()));8             }9         }TenFileOutputStream out =NewFileOutputStream ("Workbook.xlsx"); One Hssfworkbook.write (out); A out.close (); -}
1Connected to the target VM, address: ' 127.0.0.1:62382 ', Transport: ' Socket '2Exception in thread "main" Org.apache.poi.poifs.filesystem.OfficeXmlFileException:The supplied data appears to IS in the Office 2007+ XML. You is calling the part of the POI this deals with OLE2 Office Documents. You need to call a different part of the POI to process Thisdata (eg XSSF instead of HSSF)3At Org.apache.poi.poifs.storage.headerblock.<init> (headerblock.java:131)4At Org.apache.poi.poifs.storage.headerblock.<init> (headerblock.java:104)5At Org.apache.poi.poifs.filesystem.poifsfilesystem.<init> (poifsfilesystem.java:128)6At Org.apache.poi.hssf.usermodel.hssfworkbook.<init> (hssfworkbook.java:342)7At Org.apache.poi.hssf.usermodel.hssfworkbook.<init> (hssfworkbook.java:323)8At Dev.tinyz.excel.POIUtil.Excel2003Operate (poiutil.java:23)9At Dev.tinyz.excel.Main.main (main.java:16)TenDisconnected from the target VM, address: ' 127.0.0.1:62382 ', Transport: ' Socket '

Run a direct error. Carefully read the error message after the discovery. Poi to manipulate versions of Excel 2007 and above need to use XSSF instead of the HSSF of the above code.

  

found that the class library actually does not have XSSF related classes. The author dumbfounded said. So go to the POI website to view. Discover that the complete POI class library contains a lot of content. So I learned a little bit about the specific role of each part:

Poi-ooxml and Poi-ooxml-schemas are POI's extensions to the 2007 and above versions. This increases the number of MAVEN dependencies:

1 <Dependency>2             <groupId>Org.apache.poi</groupId>3             <Artifactid>Poi-ooxml</Artifactid>4             <version>3.10-final</version>5         </Dependency>6         <Dependency>7             <groupId>Org.apache.poi</groupId>8             <Artifactid>Poi-ooxml-schemas</Artifactid>9             <version>3.10-final</version>Ten         </Dependency>

Quickly revise your code. The support for Excel 2010 version is implemented. There is a feeling of being accomplished in a moment, there is wood. O (∩_∩) o haha ~. A good sense of accomplishment says.

1  Public Static voidExcel2007aboveoperateold (String FilePath)throwsIOException {2Xssfworkbook Workbook =NewXssfworkbook (NewFileInputStream (NewFile (FilePath ));3         //get the first form4Sheet first = Workbook.getsheetat (0);5          for(inti = 0; I < 100000; i++) {6Row row =First.createrow (i);7              for(intj = 0; J < 11; J + +) {8                 if(i = = 0) {9                     //First lineTenRow.createcell (j). Setcellvalue ("column" +j); One}Else { A                     //Data -                     if(j = = 0) { - Cellutil.createcell (Row, J, string.valueof (i)); the}Else - Cellutil.createcell (Row, J, String.valueof (Math.random ())); -                 } -             } +         } -         //Write File +FileOutputStream out =NewFileOutputStream ("Workbook.xlsx"); A Workbook.write (out); at out.close (); -}

Run quickly running up. The first Test writes 1w data. Takes more than 8 seconds. Feel write speed is slow, 1w 8 seconds, 100w. Oh, my God. This efficiency is totally unacceptable. So test 10w, see if the test is really slow write speed. The test results are a crash.

1 Cast time:49699

Testing the export of 10w data to Excel takes nearly 50 seconds. So this way was temporarily abandoned. The sense of accomplishment was knocked down in a moment.

  

Return to POI's official website again. Http://poi.apache.org/spreadsheet/index.html

The official mentions that since the POI3.8 version has begun to provide a SXSSF way to operate on very large data volumes. So...

Original:

SXSSF is a api-compatible streaming extension of XSSF to being used when very large spreadsheets has to be produced ...

  

Move it now. Change the code. The code is as follows:

1  Public Static voidExcel2007aboveoperate (String FilePath)throwsIOException {2Xssfworkbook Workbook1 =NewXssfworkbook (NewFileInputStream (NewFile (FilePath ));3Sxssfworkbook Sxssfworkbook =NewSxssfworkbook (WORKBOOK1, 100);4 //Workbook Workbook = workbookfactory.create (new FileInputStream (New File (FilePath)));5Sheet first = Sxssfworkbook.getsheetat (0);6          for(inti = 0; I < 100000; i++) {7Row row =First.createrow (i);8              for(intj = 0; J < 11; J + +) {9                 if(i = = 0) {Ten                     //First line OneRow.createcell (j). Setcellvalue ("column" +j); A}Else { -                     //Data -                     if(j = = 0) { the Cellutil.createcell (Row, J, string.valueof (i)); -}Else - Cellutil.createcell (Row, J, String.valueof (Math.random ())); -                 } +             } -         } +FileOutputStream out =NewFileOutputStream ("Workbook.xlsx"); A Sxssfworkbook.write (out); at out.close (); -}

Run the tests multiple times. View data

1 Cast time:11604

See the moment the data feels, wow. Good for the power to say. Incredibly short from nearly 50 seconds with 11 seconds ...

Why is the code gap so big?

Originally, SXSSF implemented a set of automatic data-brushing mechanism. When the amount of data reaches a certain level (users can set this limit themselves). Brush part of the data like text. This alleviates the pressure of the program when it is running. Achieve a high efficiency goal. O (∩_∩) o haha ~

Once again, test a single table to write 100w data.

1 Cast time:87782

Write 100w data in almost 90 seconds.   O (∩_∩) o haha ~. Although it still looks a little slow. But considering the amount of data that is so time-consuming, it is acceptable to think. 100w data generated by the Excel form actually has 136MB. It took a lot of time to open this document. Ha ha

Bask in your achievements:

  

  

SOURCE Download: Http://pan.baidu.com/s/1bnw9pYB

I have limited ability. For the time being, a relatively efficient bulk write is implemented using the POI class library only. If there is a better class library or method of friends. Welcome message to share. Thank you for your advice. O (∩_∩) o haha ~

Author : Tinyz
Source: http://www.cnblogs.com/zou90512/
About study hard, day up. Constantly explore learning, enhance their own value. Record experience sharing.
This article is copyrighted by the author and the blog Park, welcome reprint, but without the consent of the author must retain this statement, and in the article page obvious location to give the original link
If you have any questions, you can contact me by [email protected] , thank you very much.
Author shop: Http://aoleitaisen.taobao.com. Welcome to the masses of readers onlookers

Java notes-How to efficiently bulk write millions data to an Excel form

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.