Java POI writes large amounts of data into Excel

Source: Internet
Author: User
Tags throwable

Recently in the use of poi to write large amounts of data in Excel, found that excel2003 up to 65,535, a large number of data easily caused by oom, the internet looked up the API, found that for 2003, each sheet up to 65,535, if the amount of data is far more than 65535, It is recommended to sheet processing, and after poi3.8, there is sxssfworkbook, can support the large data volume write Excel operation, but currently only support excel2007

HSSF is a pure Java implementation of POI Engineering for Excel 97 (-2007) file operations
XSSF is a pure Java implementation of the POI Project for Excel OOXML (. xlsx) file operations

Starting with the POI 3.8 release, a low-memory-based API for XSSF is provided----SXSSF

SXSSF has a sliding window to limit access to the number of rows to reach a low memory footprint, and XSSF can access all lines. The old row data no longer appears in the sliding window and becomes inaccessible while writing to disk.
In auto-refresh mode, you can specify the number of Access row in the window to maintain a certain number of row in memory. When this quantity is reached, the new row data is generated in the window and the low-index data is moved from the window to the disk.
Alternatively, the number of rows in the sliding window can be set to automatically grow. It can be modified according to a definite flushrow (int keeprows) call based on the desired period.

SXSSF (streaming Usermodel API)

SXSSF (package:org.apache.poi.xssf.streaming) is a api-compatible streaming extension of XSSF to being used when very large Spreadsheets has to be produced, and the heap space is limited. SXSSF achieves its low memory footprint by limiting access to the rows that is within a sliding window, while XSSF gives Access to all rows in the document. Older rows that is no longer in the window become inaccessible, as they is written to the disk.

You can specify the window size @ Workbook construction time via new Sxssfworkbook (int windowsize) or you can SE T it per-sheet via sxssfsheet#setrandomaccesswindowsize (int windowsize)

When a new row was created via CreateRow () and the total number of unflushed records would exceed the specified window size , then the row with the lowest index value was flushed and cannot be accessed via GetRow () anymore.

The default window size is A and defined by Sxssfworkbook.default_window_size.

A windowsize of-1 indicates unlimited access. In the This is records that has not been flushed by a call to Flushrows () is available for random access.

Note that SXSSF allocates temporary files, and must always clean up explicitly, by calling the Dispose method .

Sxssfworkbook defaults to using inline strings instead of a shared strings table. This was very efficient, since no document content needs to being kept in memory, but was also known to produce documents is incompatible with some clients. With GKFX strings enabled all unique strings in the document have to is kept in memory. Depending on your document content this could use a lot more resources than with GKFX strings disabled.

Carefully review your memory budget and compatibility needs before deciding whether to enable GKFX strings or not.

The example below writes a sheet with a window of $ rows. When the row was Count reaches 101, the row with rownum=0 was flushed to disk and removed from memory, when RowNum reaches 102 Then the row with Rownum=1 is flushed, etc.

Import Junit.framework.assert;import Org.apache.poi.ss.usermodel.cell;import Org.apache.poi.ss.usermodel.Row; Import Org.apache.poi.ss.usermodel.sheet;import Org.apache.poi.ss.usermodel.workbook;import    Org.apache.poi.ss.util.cellreference;import Org.apache.poi.xssf.streaming.SXSSFWorkbook; public static void Main (string[] args) throws Throwable {Sxssfworkbook wb = new Sxssfworkbook (+);//Keep RO        WS in memory, exceeding rows is flushed to disk Sheet sh = wb.createsheet ();            for (int rownum = 0; rownum < rownum++) {row row = Sh.createrow (rownum);                for (int cellnum = 0; cellnum < cellnum++) {cell cell = Row.createcell (Cellnum);                String address = new Cellreference (cell). formatasstring ();            Cell.setcellvalue (address); }}//Rows with rownum < is flushed and not accessible for (int rownum = 0; rownum < 900; rownum++) {Assert.assertnUll (Sh.getrow (rownum));//Call the GetRow method, write to disk, free memory}//ther last rows is still in memory for (int RowNum = 900; RowNum < 1000; rownum++) {//Assert.assertnotnull (Sh.getrow (rownum));//not called, reserved in memory} FileOutputStream out        = new FileOutputStream ("d:\\sxssf.xlsx");        Wb.write (out);        Out.close ();    Dispose of temporary files backing this workbook on disk wb.dispose (); }

The next example turns off Auto-flushing (windowsize=-1) and the code manually controls how portions of data is Writte N to disk

Import Org.apache.poi.ss.usermodel.cell;import Org.apache.poi.ss.usermodel.row;import Org.apache.poi.ss.usermodel.sheet;import Org.apache.poi.ss.usermodel.workbook;import    Org.apache.poi.ss.util.cellreference;import Org.apache.poi.xssf.streaming.SXSSFWorkbook; public static void Main (string[] args) throws Throwable {Sxssfworkbook wb = new Sxssfworkbook ( -1);//Turn off AUT        O-flushing and accumulate all rows in memory Sheet sh = wb.createsheet ();            for (int rownum = 0; rownum < rownum++) {row row = Sh.createrow (rownum);                for (int cellnum = 0; cellnum < cellnum++) {cell cell = Row.createcell (Cellnum);                String address = new Cellreference (cell). formatasstring ();            Cell.setcellvalue (address); }//Manually control how rows is flushed to disk if (rownum% = 0) {(Sxssfshee T) sh). flushrows (100);      Retain the rows and flush all others          ((sxssfsheet) SH). Flushrows () is a shortcut for ((sxssfsheet) SH). Flushrows (0),//This method F        Lushes All Rows}} fileoutputstream out = new FileOutputStream ("/temp/sxssf.xlsx");        Wb.write (out);        Out.close ();   Dispose of temporary files backing this workbook on disk wb.dispose (); }

SXSSF flushes sheet data in temporary files (a temp file per sheet) and the size of these temporary files can grow to a ve Ry large value. For example, a-MB CSV data, the size of the temp XML becomes more than a gigabyte. If the size of the temp files is a issue, you can tell SXSSF to use gzip compression:

  Sxssfworkbook wb = new Sxssfworkbook ();   Wb.setcompresstempfiles (TRUE); Temp files would be gzipped


The above content comes from API and personal summary, see source API http://poi.apache.org/spreadsheet/how-to.html

Java POI writes large amounts of data into Excel

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.