Apache POI is a free, open-source, cross-platform Java API from the Apache Software Foundation that lets Java programs read and write Microsoft Office format files.
Project download page: http://poi.apache.org/download.html
Apache POI is a Java API for working with files based on the Office Open XML (OOXML) standard and Microsoft's OLE 2 Compound Document format (OLE2). You can use it from Java to read, create, and modify MS Excel files, and to read and create MS Word and MS PowerPoint files. In short, Apache POI provides a solution for working with Excel from Java. Its main components are:
- HSSF: provides the ability to read and write Microsoft Excel xls format files.
- XSSF: provides the ability to read and write Microsoft Excel OOXML xlsx format files.
- HWPF: provides the ability to read and write Microsoft Word doc format files.
- HSLF: provides the ability to read and write Microsoft PowerPoint format files.
- HDGF: provides the ability to read Microsoft Visio format files.
- HPBF: provides the ability to read Microsoft Publisher format files.
- HSMF: provides the ability to read Microsoft Outlook format files.
Reading an Excel Document sample
We use HSSFWorkbook in POI to read the Excel data.
public void test(File file) throws IOException {
    InputStream inp = new FileInputStream(file);
    HSSFWorkbook workbook = new HSSFWorkbook(inp);
    // ... traverse the workbook
}
The code above reads Excel 2003 (xls) files without problems, but as soon as it reads an Excel 2007 (xlsx) file it throws an exception: "The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)"
After looking into it, Excel 2007 files need to be read with XSSFWorkbook, as follows:
public void test(File file) throws IOException {
    InputStream inp = new FileInputStream(file);
    XSSFWorkbook workbook = new XSSFWorkbook(inp);
    // ... traverse the workbook
}
Note: XSSFWorkbook additionally requires poi-ooxml-3.9.jar and poi-ooxml-schemas-3.9.jar on the classpath.
With this change, importing Excel 2007 files works, but importing Excel 2003 files now throws an exception.
So when importing Excel files, you need to determine which Excel version the file is and call the matching class.
I considered using the file extension to determine the type, but that is unreliable. If someone renames an xlsx file to xls, the extension selects the xls reader and reading fails: the extension looks right, but the file contents are still in the other format.
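The renamed-file problem can be reproduced with the JDK alone. The sketch below is only an illustration: ExtensionGuess and guessByExtension are hypothetical names, not part of POI, and the temporary file is just an empty ZIP container standing in for a real xlsx file (xlsx files are ZIP archives).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class ExtensionGuess {

    /** Naive guess that looks only at the file name (hypothetical helper). */
    static String guessByExtension(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".xlsx")) return "XSSF";
        if (lower.endsWith(".xls")) return "HSSF";
        return "UNKNOWN";
    }

    public static void main(String[] args) throws IOException {
        // Simulate an xlsx (a ZIP container) that someone renamed to .xls.
        Path renamed = Files.createTempFile("renamed", ".xls");
        try (ZipOutputStream zip = new ZipOutputStream(Files.newOutputStream(renamed))) {
            zip.putNextEntry(new ZipEntry("[Content_Types].xml"));
            zip.closeEntry();
        }

        // The name says "xls", so the naive guess picks the HSSF reader.
        System.out.println(guessByExtension(renamed.getFileName().toString())); // prints HSSF

        // The content still starts with the ZIP signature "PK", which HSSF cannot read.
        try (InputStream in = Files.newInputStream(renamed)) {
            char first = (char) in.read();
            char second = (char) in.read();
            System.out.println("" + first + second); // prints PK
        }
        Files.delete(renamed);
    }
}
```

The name and the content disagree, so any dispatch based on the name alone picks the wrong reader for this file.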
In the end, the recommended approach is to create the workbook with WorkbookFactory.create(InputStream) from poi-ooxml. This works because HSSFWorkbook and XSSFWorkbook both implement the Workbook interface. The code is as follows:
Workbook wb = WorkbookFactory.create(is);
WorkbookFactory.create() must contain some check of the file type, so let's look at the source to see how it decides:
/**
 * Creates the appropriate HSSFWorkbook / XSSFWorkbook from
 * the given InputStream.
 * Your input stream MUST either support mark/reset, or
 * be wrapped as a {@link PushbackInputStream}!
 */
public static Workbook create(InputStream inp) throws IOException, InvalidFormatException {
    // If clearly doesn't do mark/reset, wrap up
    if (! inp.markSupported()) {
        inp = new PushbackInputStream(inp, 8);
    }
    if (POIFSFileSystem.hasPOIFSHeader(inp)) {
        return new HSSFWorkbook(inp);
    }
    if (POIXMLDocument.hasOOXMLHeader(inp)) {
        return new XSSFWorkbook(OPCPackage.open(inp));
    }
    throw new IllegalArgumentException("Your InputStream was neither an OLE2 stream, nor an OOXML stream");
}
As you can see, the factory creates the appropriate Workbook object according to the file type. The decision is based on the file's header bytes rather than its name, so a file still loads correctly even if its extension has been changed.
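To make the header check concrete, here is a JDK-only sketch of the same idea. It is an approximation, not POI's implementation: HeaderSniffer, classify, and sniff are hypothetical names. It relies on two well-known signatures: OLE2 files start with the bytes D0 CF 11 E0 A1 B1 1A E1, and OOXML files are ZIP archives starting with 50 4B 03 04. A ZIP signature only shows that the file is a ZIP container, so a real reader still has to validate the package contents.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.util.Arrays;

class HeaderSniffer {

    enum Kind { OLE2, OOXML, UNKNOWN }

    // OLE2 compound document signature (xls, doc, ppt).
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // ZIP local file header signature (xlsx, docx, pptx are ZIP archives).
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    private static boolean startsWith(byte[] data, int[] magic) {
        if (data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) return false;
        }
        return true;
    }

    /** Classifies a file by its leading bytes only. */
    static Kind classify(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return Kind.OLE2;
        if (startsWith(header, ZIP_MAGIC)) return Kind.OOXML;
        return Kind.UNKNOWN;
    }

    /**
     * Peeks at up to 8 bytes and pushes them back, so the caller can still
     * read the stream from the start. The stream needs a pushback buffer
     * of at least 8 bytes.
     */
    static Kind sniff(PushbackInputStream in) throws IOException {
        byte[] header = new byte[8];
        int n = 0;
        while (n < header.length) {
            int r = in.read(header, n, header.length - n);
            if (r < 0) break; // end of stream
            n += r;
        }
        if (n > 0) in.unread(header, 0, n);
        return classify(Arrays.copyOf(header, n));
    }

    public static void main(String[] args) throws IOException {
        // First bytes of a ZIP container, standing in for an xlsx file.
        byte[] fakeXlsx = {0x50, 0x4B, 0x03, 0x04, 0x14, 0x00, 0x06, 0x00, 0x08};
        PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(fakeXlsx), 8);
        System.out.println(sniff(in)); // prints OOXML
        System.out.println(in.read()); // prints 80 (0x50): the stream is still at its start
    }
}
```

The pushback step is why the factory insists on mark/reset support or a PushbackInputStream: the header bytes have to be inspected without being consumed, because the workbook constructor needs to read the stream from the first byte.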