Java uses XML to export large volumes of data to excel

Source: Internet
Author: User

In Java Development, I often encounter requirements for exporting database data to excel. For example, in my project, the customer requires that all query results be exported to excel, this is easy to implement for a small amount of data (tens of thousands of records), but for a large amount of data (hundreds of thousands or more, depending on the number and length of exported fields ), it is troublesome to query databases and generate EXCEL files.

 

Take a table in my actual project as an example. There are more than 40 fields in the Oracle 10 Gb database, the content is personal information such as name, age, gender, address, birthday, occupation, and work unit. More than 20 fields must be exported. If the table is exported as HTML, the data is placed in <Table> <tr> <TD>, and the size of 10 thousand data entries is about 7.5-8 Mb.

 

You can use either of the following methods to query a database:

 

1. Use paging query. The size of each page is determined based on the database server performance and database operation frequency. For example, the size of each page ranges from 5000 to 10000 rows. One page is queried each time, and an Excel file is generated. The disadvantage of this method is that the database needs to be queried multiple times, and the temporary table is required for paging query of the database, which is quite performance-consuming.

 

2. Use an intermediate table. Put the query results in the intermediate table at one time. You can use "Select... into..." and then extract some of the results from the intermediate table to generate an Excel file. You can delete a part or mark it as "exported ". This method requires operations on the intermediate table, which is troublesome but efficient.

 

 

Below are several ways to generate an Excel file (Office 2000 or above ):

 

1. Use the poi or jxl component to generate an Excel file. This method can control the content format. The disadvantage is that the Excel files are loaded into the memory for processing. When you append data to the Excel file, you also need to load all the original files first. When the data volume is large, the generated Excel files will be large, which may cause memory overflow on the Web Server due to too much memory.

 

Second, use the excelfile to support the HTML format function, that is, change the suffix of the HTML file to .xls, you can open it in Excel. This is the simplest example. In jsp, you only need to change the HTTP header so that the client can download the generated. xls file, instead of allowing the browser to open the HTML file. The JSP file code is as follows:

<% @ Page Language = "Java" contenttype = "text/html; charset = GBK" %> <% string filename = "export_file_name.xls"; // save the file name response in the download dialog box. setheader ("content-disposition", "attachment; filename =" + filename); %> <HTML xmlns: O = "urn: Schemas-Microsoft-com: Office: office "xmlns: x =" urn: Schemas-Microsoft-com: Office: excel "xmlns =" http://www.w3.org/TR/REC-html40 "> 

Note that the above <HTML> tag, coupled with the Microsoft Office namespace, mainly deals with Excel format problems, such as the ID number "440957198402125468", will be considered as a number in Excel, convert to scientific notation such as "4.41e + 17. Therefore, you need to control the format. The simplest way to convert the format is to add a single quotation mark (') before the string, so that it will not be automatically converted in Excel, and the same is true for the time format. The single quotation mark is implemented by adding the attribute X: STR to the cell in the Table: <td x: Str = "'Number or time string content"> </TD>: After this attribute is added, no additional content is required in the cell, which reduces the source code of the generated xls file.

 

In this way, when JSP is used to generate and download the xls file, the JSP page loads the content to the memory at one time. If the data volume is large, the generated file is too large, the Web server memory overflow may throw an outofmemory exception. You can use another method. When a file is generated, it is first written to the server file system. When the file is written, the stream is used to continuously append the file and then download the file.

 

However, there is another problem in this method, that is, only one worksheet can be generated, while a worksheet in Excel can only have up to 65535 rows of data. When there is a large amount of data, more sheet needs to be created, I have not found a method to add multiple sheet in HTML.

 

3. generate xml files that comply with the XML specifications of Excel. In the second example, the generated extension is. xls, the content is XML format text, where XML must fully comply with the Excel XML specification (Excel XML Schema/DTD, the relevant content in the http://msdn.microsoft.com/en-us/library/aa140062 (office.10 ). aspx). Otherwise, the Excel file cannot be opened. In the form of a dataoutstream stream, the content in XML format is first written to the server file system, and then downloaded. If the data volume is large, multiple sheets are allowed. The XML format is as follows:

<?xml version="1.0" encoding="GBK" ?><?mso-application progid="Excel.Sheet"?><Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40"> <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">  <Version>11.9999</Version> </DocumentProperties> <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">  <WindowHeight>10005</WindowHeight>  <WindowWidth>10005</WindowWidth>  <WindowTopX>120</WindowTopX>  <WindowTopY>135</WindowTopY>  <ActiveSheet>1</ActiveSheet>  <ProtectStructure>False</ProtectStructure>  <ProtectWindows>False</ProtectWindows> </ExcelWorkbook> <Styles>  <Style ss:ID="Default" ss:Name="Normal">   <Alignment ss:Vertical="Center"/>   <Borders/>   <Font ss:FontName="ËÎÌå" x:CharSet="134" ss:Size="12"/>   <Interior/>   <NumberFormat/>   <Protection/>  </Style>  <Style ss:ID="s23">   <Font ss:FontName="ËÎÌå" x:CharSet="134" ss:Size="12" ss:Color="#0000FF"    ss:Bold="1"/>   <Interior ss:Color="#C0C0C0" ss:Pattern="Solid"/>  </Style> </Styles> <Worksheet ss:Name="Sheet1">  <Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="3" x:FullColumns="1"   x:FullRows="1" ss:DefaultColumnWidth="54" ss:DefaultRowHeight="16.5">   <Row ss:AutoFitHeight="0">    <Cell><Data ss:Type="Number">1</Data></Cell>    <Cell><Data ss:Type="String">aaa</Data></Cell>   </Row>   <Row ss:AutoFitHeight="0">    <Cell><Data ss:Type="Number">2</Data></Cell>    <Cell><Data ss:Type="String">bbb</Data></Cell>   </Row>   <Row ss:AutoFitHeight="0">    <Cell><Data ss:Type="Number">3</Data></Cell>    <Cell><Data ss:Type="String">ccc</Data></Cell>   </Row>  </Table>  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">   <Unsynced/>   <ProtectObjects>False</ProtectObjects>   <ProtectScenarios>False</ProtectScenarios>  </WorksheetOptions> </Worksheet> <Worksheet ss:Name="Sheet2">  <Table ss:ExpandedColumnCount="4" ss:ExpandedRowCount="4" x:FullColumns="1"   x:FullRows="1" ss:DefaultColumnWidth="54" ss:DefaultRowHeight="14.25">   <Column ss:AutoFitWidth="0" ss:Width="73.5"/>   <Column ss:AutoFitWidth="0" ss:Width="119.25"/>   <Column ss:AutoFitWidth="0" ss:Width="89.25"/>   <Column ss:AutoFitWidth="0" ss:Width="141"/>   <Row>    <Cell ss:StyleID="s23"><Data ss:Type="String">aaaa</Data></Cell>    <Cell ss:StyleID="s23"><Data ss:Type="String">bbbb</Data></Cell>    <Cell ss:StyleID="s23"><Data ss:Type="String">cccc</Data></Cell>    <Cell ss:StyleID="s23"><Data ss:Type="String">dddd</Data></Cell>   </Row>   <Row>    <Cell><Data ss:Type="String">aaaa</Data></Cell>    <Cell><Data ss:Type="String">bbbb</Data></Cell>    <Cell><Data ss:Type="String">cccc</Data></Cell>    <Cell><Data ss:Type="String">dddd</Data></Cell>   </Row>   <Row>    <Cell><Data ss:Type="String">aaaa</Data></Cell>    <Cell><Data ss:Type="String">bbbb</Data></Cell>    <Cell><Data ss:Type="String">cccc</Data></Cell>    <Cell><Data ss:Type="String">dddd</Data></Cell>   </Row>   <Row>    <Cell><Data ss:Type="String">aaaa</Data></Cell>    <Cell><Data ss:Type="String">bbbb</Data></Cell>    <Cell><Data ss:Type="String">cccc</Data></Cell>    <Cell><Data ss:Type="String">dddd</Data></Cell>   </Row>  </Table>  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">   <Print>    <ValidPrinterInfo/>    <PaperSizeIndex>9</PaperSizeIndex>    <HorizontalResolution>600</HorizontalResolution>    <VerticalResolution>600</VerticalResolution>   </Print>   <Selected/>   <ProtectObjects>False</ProtectObjects>   <ProtectScenarios>False</ProtectScenarios>  </WorksheetOptions> </Worksheet> <Worksheet ss:Name="Sheet3">  <Table ss:ExpandedColumnCount="0" ss:ExpandedRowCount="0" x:FullColumns="1"   x:FullRows="1" ss:DefaultColumnWidth="54" ss:DefaultRowHeight="16.5"/>  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">   <Unsynced/>   <ProtectObjects>False</ProtectObjects>   <ProtectScenarios>False</ProtectScenarios>  </WorksheetOptions> </Worksheet></Workbook>

Note: The xml header and namespace must be correct. The workbook is the root tag, the worksheet is a worksheet, the table is the table in the worksheet, the row is a row in the table, and the cell is the cell, data is the data in cells and can be in different formats, such as string and number. Styles is similar to HTML style, which can be referenced in the following rows and cells after definition. For other tags and attributes, see the Microsoft csdn document. During output, you can append worksheet, table, row, and cell to the target file.

 SS: expandedrowcount = "3" note that WPS is not required, and the number of rows in the office must be the same as that in the current sheet. Number starting from 1, which can exceed.

This method is better than the second method because it uses stream output and does not need to be loaded into the memory at a time, so it is more efficient than poi and jxl and can generate multiple sheets, can be used to export large volumes of data.

 

The above methods can be used according to the actual situation. In addition, there are other XML methods, such as combining XSLT and mime ing, which are more complex and not studied yet.

From: http://blog.csdn.net/yrsheng/article/details/4100393

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.