Introduction to Apache POI
Apache POI is a set of Java APIs for accessing Microsoft Office format documents (Word, Excel, and PowerPoint). HSSF is the API for Excel files, HWPF is the API for Word files, and HSLF is the API for PowerPoint files.
POI's official website is http://poi.apache.org, where you can download the latest release (3.6 at the time of writing). Unpacking the download yields three jar packages (poi-3.6-20091214.jar, poi-contrib-3.6-20091214.jar, and poi-scratchpad-3.6-20091214.jar). Copy the three jars into the lib directory of your Eclipse project, then refresh the project to load the POI class library.
POI Main components
POIFS for OLE 2 file operations: POIFS is the oldest and most stable part of the project. It is a pure Java implementation of the OLE 2 compound document format and supports both reading and writing. All other POI components ultimately depend on it. For more information, see the POIFS project page.
HSSF for Excel file operations: HSSF is a pure Java implementation of the Microsoft Excel 97-2003 file format (BIFF8). It supports both reading and writing. For more information, see the HSSF project page.
HWPF for Word file operations: HWPF is a pure Java implementation of the Microsoft Word 97 file format. This component is at an early stage of development, so its ability to read and write Word documents is limited and it handles only simple Word files. For more information, see the HWPF project page.
HSLF for PowerPoint file operations: HSLF is a pure Java implementation of the Microsoft PowerPoint 97-2003 file format. It supports both reading and writing. For more information, see the HSLF project page.
HDGF for Visio file operations: HDGF is a pure Java implementation of the Microsoft Visio 97-2003 file format. It currently supports only reading, at a very low level, along with simple text extraction. For more information, see the HDGF project page.
HPSF for document properties: HPSF is a pure Java implementation of OLE 2 property sets. Property sets are used primarily to store a document's properties (such as title, author, and last-modified date), but they can also be used for application-specific purposes. For more information, see the HPSF project page.
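To make the role of the OLE 2 container more concrete, here is a minimal sketch that checks whether a file starts with the 8-byte OLE 2 compound document signature shared by legacy .xls, .doc, and .ppt files. It uses only the Java standard library and is not POI's own API; the class name Ole2Signature and its methods are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of POI: detects the OLE 2 container signature.
class Ole2Signature {

    // The 8-byte magic number at the start of an OLE 2 compound document.
    private static final byte[] MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Returns true if the given bytes begin with the OLE 2 signature.
    static boolean looksLikeOle2(byte[] header) {
        if (header == null || header.length < MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(header, MAGIC.length), MAGIC);
    }

    // Reads the first 8 bytes of a file and checks them against the signature.
    static boolean isOle2File(Path file) throws IOException {
        byte[] head = new byte[MAGIC.length];
        try (InputStream in = Files.newInputStream(file)) {
            int read = 0;
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    break; // file is shorter than the signature
                }
                read += n;
            }
            return read == head.length && Arrays.equals(head, MAGIC);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            System.out.println(isOle2File(Paths.get(args[0])));
        } else {
            System.out.println("usage: java Ole2Signature <file>");
        }
    }
}
```

A check like this only tells you that a file is an OLE 2 container. Which component (HSSF, HWPF, HSLF, and so on) can read it depends on the streams stored inside, which is the part POIFS exposes.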
Here's a quick introduction to the interfaces that are often used in projects to manipulate Excel and Word format files:
HSSF interface
HSSF is currently the most mature part of POI and handles MS Excel (97-2003) files. Unlike writing unformatted CSV for Excel to convert, HSSF produces real Excel objects, so you can control properties of cells, sheets, and so on. HSSF also has drawbacks: it does not directly support Excel charts, and its packages and package dependencies are fairly complex.
Counting the sheets in a workbook, for example, is easy to do with the HSSF interface. Here's a brief introduction to the HSSF interface:
The objects that HSSF provides are in the org.apache.poi.hssf.usermodel package. They fall into three groups: Excel objects, styles and formatting, and auxiliary operations. The main object types are:
HSSFWorkbook: corresponds to an Excel document (workbook)
HSSFSheet: corresponds to an Excel worksheet
HSSFRow: corresponds to an Excel row
HSSFCell: corresponds to an Excel cell
HSSFFont: corresponds to an Excel font
HSSFName: corresponds to an Excel name
HSSFDataFormat: corresponds to a cell data format, such as a date format
HSSFHeader: corresponds to a sheet header
HSSFFooter: corresponds to a sheet footer
HSSFCellStyle: corresponds to a cell style
Auxiliary operations include:
HSSFDateUtil: date handling
HSSFPrintSetup: print setup
HSSFErrorConstants: error code table
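The object types listed above can be seen working together in the following minimal sketch, which builds a small workbook in memory, writes it in .xls format, and reads it back to count its sheets. It assumes the POI jars are on the classpath. The class name HssfSheetDemo, its method names, and the sheet and cell contents are hypothetical and chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

// Hypothetical demo class; only the org.apache.poi types come from POI.
class HssfSheetDemo {

    // Builds a two-sheet workbook and returns it as .xls bytes.
    static byte[] buildWorkbook() {
        try {
            HSSFWorkbook workbook = new HSSFWorkbook();
            HSSFSheet summary = workbook.createSheet("Summary");
            workbook.createSheet("Details");

            HSSFRow row = summary.createRow(0);       // row indexes are 0-based
            row.createCell(0).setCellValue("Total");  // string cell
            row.createCell(1).setCellValue(42.5);     // numeric cell

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            workbook.write(out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens .xls bytes as a workbook.
    static HSSFWorkbook open(byte[] xls) {
        try {
            return new HSSFWorkbook(new ByteArrayInputStream(xls));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Counts the sheets in the workbook.
    static int countSheets(byte[] xls) {
        return open(xls).getNumberOfSheets();
    }

    // Reads the string in the first cell of the first sheet.
    static String readFirstCell(byte[] xls) {
        HSSFCell cell = open(xls).getSheetAt(0).getRow(0).getCell(0);
        return cell.getStringCellValue();
    }

    public static void main(String[] args) {
        byte[] xls = buildWorkbook();
        System.out.println("Sheets: " + countSheets(xls));
        System.out.println("First cell: " + readFirstCell(xls));
    }
}
```

To work with a file on disk instead, the same constructor and write method accept a FileInputStream and a FileOutputStream. The in-memory streams are used here only to keep the sketch self-contained.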