Java parses the Office documents

Source: Internet
Author: User

Refer:

Http://danadler.com/jacob/
Http://jakarta.apache.org/poi/
Http://www.onjava.com/pub/a/onjava/2003/01/22/poi.html
Http://www.csdn.net/develop/article/15/15311.shtm
Http://forum.java.sun.com/thread.jsp? Forum = 40 & Thread = 382666 & tstart = 0 & trange = 15

Java Excel API documentation

Http://www.andykhan.com/jexcelapi/

1. An example of Jacob's word operations. Other operations such as Excel and PDF are available in the sample.

Import java. Io. file;
Import com.jacb.com .*;
Import com. Jacob. ActiveX .*;
Public class wordtest {
  
Public static void main (string [] ARGs ){
Wordbean word = new wordbean ();
Word. openword (true );
Word. createnewdocument ();
Word. inserttext ("Hello word .");
}
}

Import com. Jacob. ActiveX .*;
Import com.jacb.com .*;
Public class wordbean extends java. AWT. Panel
{
Private activexcomponent mswordapp = NULL;
Private dispatch document = NULL;
Public wordbean ()
{
Super ();
}
Public void openword (Boolean makevisible)
{
// Open word if we 've not done it already
If (mswordapp = NULL)
{
Mswordapp = new activexcomponent ("word. application ");
}
// Set the visible property as required.
Dispatch. Put (mswordapp, "visible ",
New variant (makevisible ));
}
Public void createnewdocument ()
{
// Find the documents collection object maintained by word
Dispatch documents =
Dispatch. Get (mswordapp, "documents"). todispatch ();
// Call the add method of the documents collection to create
// A new document to edit
Document = Dispatch. Call (events, "add"). todispatch ();
}
Public void inserttext (string texttoinsert)
{
// Get the current selection within word at the moment. If
// A new document has just been created then this will be
// The top of the new doc
Dispatch selection =
Dispatch. Get (mswordapp, "selection"). todispatch ();
// Put the specified text at the insertion point
Dispatch. Put (selection, "text", texttoinsert );
}
Public void savefileas (string filename)
{
Dispatch. Call (document, "saveas", filename );
}
Public void printfile ()
{
// Just print the current document to the default printer
Dispatch. Call (document, "printout ");
}
Public void closedocument ()
{
// Close the document without saving changes
// 0 = wddonotsavechanges
//-1 = wdsavechanges
//-2 = wdprompttosavechanges
Dispatch. Call (document, "close", new variant (0 ));
Document = NULL;
}
Public void closeword ()
{
Dispatch. Call (mswordapp, "quit ");
Mswordapp = NULL;
Document = NULL;
}
}

 

2. Excel operations in Java Excel

Read data tables from Excel files

Java Excel API can read the Excel data table from the input stream either from a file (.xls) of the internal file system. The first step to read an Excel data table is to create a workbook. The following code snippet illustrates how to perform this operation: (for the complete code, see excelreading. Java)

Import java. Io .*;
Import jxl .*;
... ... ... ...
Try
{
// Construct a workbook object, read-only workbook object
// Create a workbook directly from a local file
// Create a workbook from the input stream
Inputstream is = new fileinputstream (sourcefile );
Jxl. Workbook RWB = Workbook. getworkbook (is );
}
Catch (exception E)
{
E. printstacktrace ();
}

Once a workbook is created, we can use it to access the Excel sheet (term: Worksheet ). Refer to the following code snippet:

// Obtain the first sheet
Sheet rs = RWB. getsheet (0 );

We can access it either by the sheet name or by subscript. If a subscript is used for access, note that the subscript starts from 0, just like an array.

Once the sheet is obtained, we can use it to access the Excel cell (term: Cell ). Refer to the following code snippet:

// Obtain the value of the first row and column
Cell c00 = Rs. getcell (0, 0 );
String strc00 = c00.getcontents ();

// Obtain the value of the first and second columns
Cell C10 = Rs. getcell (1, 0 );
String strc10 = c10.getcontents ();

// Obtain the value of the second row and the second column
Cell C11 = Rs. getcell (1, 1 );
String strc11 = c11.getcontents ();

System. Out. println ("cell (0, 0)" + "value:" + strc00 + "; Type:" + c00.gettype ());
System. Out. println ("cell (1, 0)" + "value:" + strc10 + "; Type:" + c10.gettype ());
System. Out. println ("cell (1, 1)" + "value:" + strc11 + "; Type:" + c11.gettype ());

If we only get the cell value, we can easily use the getcontents () method, which can return any type of cell value as a string. In the sample code, cell (0, 0) is text type, cell (1, 0) is numeric type, cell () is date type, through getcontents (), the return values of the three types are 'struct' type.

If you need to know the exact type of cell content, APIS also provide a series of methods. Refer to the following code snippet:

String strc00 = NULL;
Double strc10 = 0.00;
Date strc11 = NULL;

Cell c00 = Rs. getcell (0, 0 );
Cell C10 = Rs. getcell (1, 0 );
Cell C11 = Rs. getcell (1, 1 );

If (c00.gettype () = celltype. Label)
{
Labelcell labelc00 = (labelcell) c00;
Strc00 = labelc00.getstring ();
}
If (c10.gettype () = celltype. Number)
{
Nmbercell numc10 = (numbercell) C10;
Strc10 = numc10.getvalue ();
}
If (c11.gettype () = celltype. Date)
{
Datecell datec11 = (datecell) C11;
Strc11 = datec11.getdate ();
}

System. Out. println ("cell (0, 0)" + "value:" + strc00 + "; Type:" + c00.gettype ());
System. Out. println ("cell (1, 0)" + "value:" + strc10 + "; Type:" + c10.gettype ());
System. Out. println ("cell (1, 1)" + "value:" + strc11 + "; Type:" + c11.gettype ());

After obtaining the cell object, you can use the GetType () method to obtain the cell type. Then, it matches the basic type provided by the API and is forcibly converted to the corresponding type, finally, call the corresponding getxxx () method to obtain the value of the determined type. APIS provide the following basic types, which correspond to the Excel Data Format:

For more information about each type, see Java Excel API document.

 

After processing the data in an Excel worksheet, you must use the close () method to close the previously created object to release the memory space occupied by reading the data table, it is particularly important to read a large amount of data. See the following code snippet:

// When the operation is complete, close the object and release the occupied memory space
RWB. Close ();

Java Excel API provides many methods to access Excel Data Tables. Here I will briefly introduce several common methods. For other methods, see Java Excel API document in the appendix.

Workbook class provides methods

1. Int getnumberofsheets ()

Obtain the number of worksheets (sheet) in the workbook, for example:

Jxl. Workbook RWB = jxl. Workbook. getworkbook (new file (sourcefile ));
Int sheets = RWB. getnumberofsheets ();

2. Sheet [] getsheets ()

Returns an array of Worksheet (sheet) objects in the workbook, for example:

Jxl. Workbook RWB = jxl. Workbook. getworkbook (new file (sourcefile ));
Sheet [] sheets = RWB. getsheets ();

3. String getversion ()

Return the version number of the API in use.

Jxl. Workbook RWB = jxl. Workbook. getworkbook (new file (sourcefile ));
String apiversion = RWB. getversion ();

Methods provided by the sheet Interface

1) string getname ()

Obtain the sheet name, for example:

Jxl. Workbook RWB = jxl. Workbook. getworkbook (new file (sourcefile ));
Jxl. Sheet rs = RWB. getsheet (0 );
String sheetname = Rs. getname ();

2) int getcolumns ()

Obtain the total number of columns in a sheet. For example:

Jxl. Workbook RWB = jxl. Workbook. getworkbook (new file (sourcefile ));
Jxl. Sheet rs = RWB. getsheet (0 );
Int rscolumns = Rs. getcolumns ();

3) cell [] getcolumn (INT column)

Obtains all cells in a column and returns an array of cell objects. For example:

Jxl. Workbook RWB = jxl. Workbook. getworkbook (new file (sourcefile ));
Jxl. Sheet rs = RWB. getsheet (0 );
Cell [] Cell = Rs. getcolumn (0 );

4) int getrows ()

Obtain the total number of rows in a sheet. Example:

Jxl. Workbook RWB = jxl. Workbook. getworkbook (new file (sourcefile ));
Jxl. Sheet rs = RWB. getsheet (0 );
Int rsrows = Rs. getrows ();

5) cell [] getrow (INT row)

Obtain all cells in a row and return an array of cell objects. Example:

Jxl. Workbook RWB = jxl. Workbook. getworkbook (new file (sourcefile ));
Jxl. Sheet rs = RWB. getsheet (0 );
Cell [] Cell = Rs. getrow (0 );

6) cell getcell (INT column, int row)

To obtain the object reference of a specified cell, note the two parameters. The first parameter is the number of columns and the second parameter is the number of rows, which is different from the common combination of rows and columns.

Jxl. Workbook RWB = jxl. Workbook. getworkbook (new file (sourcefile ));
Jxl. Sheet rs = RWB. getsheet (0 );
Cell cell = Rs. getcell (0, 0 );

 

2. generate a new Excel worksheet

The following code is used to show you how to generate a simple Excel worksheet. Here, the content of a cell does not contain any modification (such as font, color, and so on ), all contents are written as strings. (For the complete code, see excelwriting. Java)

Similar to reading an Excel worksheet, you must first use the workbook class factory method to create a writeable workbook object. Note that, you can create a workbook only by using the factory method provided by the API, instead of using the writableworkbook constructor, because the constructor of the writableworkbook class is of the protected type. The sample code snippet is as follows:

Import java. Io .*;
Import jxl .*;
Import jxl. Write .*;
... ... ... ...
Try
{
// Construct a workbook object, read-only workbook object
// Method 1: Create an Excel worksheet that can be written
Jxl. Write. writableworkbook WWB = Workbook. createworkbook (new file (targetfile ));

// Method 2: Write writableworkbook directly to the output stream
/*
Outputstream OS = new fileoutputstream (targetfile );
Jxl. Write. writableworkbook WWB = Workbook. createworkbook (OS );
*/
}
Catch (exception E)
{
E. printstacktrace ();
}

The API provides two methods to process writable output streams. One is to generate a local file directly. If the file name does not contain a full path, the default file is located in the current directory, if the file name contains a full path, the generated Excel file will be located in the corresponding directory; the other is to directly write the EXCEL object to the output stream, for example: the user accesses the web server through a browser. If the HTTP header is set correctly, the browser automatically calls the Excel application of the client to display the dynamically generated Excel Workbook.

The next step is to create a worksheet. The method for creating a worksheet is almost the same as that for creating a worksheet. The corresponding object is also obtained through the factory mode method. This method requires two parameters, one is the worksheet name, and the other is the position of the worksheet in the workbook. See the following code snippet:

// Create an Excel worksheet
Jxl. Write. writablesheet Ws = WWB. createsheet ("test sheet 1", 0 );

"This pot has been supported, and the materials are ready. You can start the pot! ", Now you only need to instantiate the basic Excel data types provided by the API and add them to the worksheet. refer to the following code snippet:

// 1. Add a label object
Jxl. Write. Label labelc = new jxl. Write. Label (0, 0, "This is a label cell ");
WS. addcell (labelc );

// Add an object with a font formatting
Jxl. Write. writablefont WF = new jxl. Write. writablefont (writablefont. Times, 18, writablefont. Bold, true );
Jxl. Write. writablecellformat wcff = new jxl. Write. writablecellformat (WF );
Jxl. Write. Label labelcf = new jxl. Write. Label (1, 0, "This is a label cell", wcff );
WS. addcell (labelcf );

// Add an object with the font color formatting
Jxl. Write. writablefont WFC = new jxl. Write. writablefont (writablefont. Arial, 10, writablefont. no_bold, false,
Underlinestyle. no_underline, jxl. format. colour. Red );
Jxl. Write. writablecellformat wcffc = new jxl. Write. writablecellformat (WFC );
Jxl. Write. Label labelcfc = new jxl. Write. Label (1, 0, "This is a label cell", wcffc );
WS. addcell (labelcf );

// 2. Add a number object
Jxl. Write. Number labeln = new jxl. Write. Number (0, 1, 3.1415926 );
WS. addcell (labeln );

// Add a number object with formatting
Jxl. Write. numberformat NF = new jxl. Write. numberformat ("#.##");
Jxl. Write. writablecellformat wcfn = new jxl. Write. writablecellformat (NF );
Jxl. Write. Number labelnf = new jxl. Write. Number (1, 1, 3.1415926, wcfn );
WS. addcell (labelnf );

// 3. Add a Boolean object
Jxl. Write. boolean labelb = new jxl. Write. boolean (0, 2, false );
WS. addcell (labelb );

// 4. Add a datetime object
Jxl. Write. datetime labeldt = new jxl. Write. datetime (0, 3, new java. util. Date ());
WS. addcell (labeldt );

// Add a dateformat object with formatting
Jxl. Write. dateformat df = new jxl. Write. dateformat ("dd mm yyyy hh: mm: SS ");
Jxl. Write. writablecellformat wcfdf = new jxl. Write. writablecellformat (DF );
Jxl. Write. datetime labeldtf = new jxl. Write. datetime (1, 3, new java. util. Date (), wcfdf );
WS. addcell (labeldtf );

There are two points worth your attention. First, when constructing a cell, the position of the cell in the worksheet is determined. After a cell is created, the position of the cell cannot be changed, although the content of the cell can be changed. Second, the cell is located according to the following rule (column, row), and the subscript starts from 0. For example, A1 is stored in (0, 0 ), b1 is stored in (1, 0 ).

Finally, do not forget to close the opened Excel worksheet object to release the occupied memory. See the following code snippet:

// Write the Exel Worksheet
WWB. Write ();

// Close the Excel worksheet object
WWB. Close ();

This may be slightly different from reading an Excel file. Before closing an Excel object, you must call the write () method first, because the previous operations are stored in the cache, therefore, you need to save the operation content in the file. If you close the EXCEL object, you can only get an empty workbook.

 

3. Copy and update the Excel worksheet

Next, we will briefly introduce how to update an existing worksheet, mainly in the following two steps. The first step is to construct a read-only Excel worksheet, the second step is to create a new writable Excel worksheet using the created Excel worksheet. refer to the following code snippet: (for the complete code, see excelmodifying. java)

// Create a read-only Excel worksheet object
Jxl. Workbook RW = jxl. Workbook. getworkbook (new file (sourcefile ));

// Create a writable Excel worksheet object
Jxl. Write. writableworkbook WWB = Workbook. createworkbook (new file (targetfile), RW );

// Read the first Worksheet
Jxl. Write. writablesheet Ws = WWB. getsheet (0 );

// Obtain the first cell object
Jxl. Write. writablecell WC = ws. getwritablecell (0, 0 );

// Determine the cell type and convert it accordingly
If (WC. GetType () = celltype. Label)
{
Label L = (Label) WC;
L. setstring ("the value has been modified .");
}

// Write an Excel Object
WWB. Write ();

// Close the Excel objects that can be written
WWB. Close ();

// Close the read-only Excel Object
RW. Close ();

The reason for using this method to build an Excel object is efficiency, because the above example is the main application of the API. To improve performance, some data-related output information and all format information, such as fonts and colors, are not processed when reading a worksheet, because our goal is to obtain the value of the row data, without modification, it will not affect the value of the row data. The only disadvantage is that two identical worksheets will be saved in the memory at the same time, so that when the worksheet size is large, it will occupy a considerable amount of memory, but now it seems that the memory size is not a key factor.

Once a writable worksheet object is obtained, we can update the cell object. Here we do not have to call the add () method provided by the API, because the cell is already in the worksheet, we only need to call the corresponding setxxx () method to complete the update operation.

The original cell Formatting cannot be removed. We can still add the new cell modifier so that the cell content can be displayed in different forms.

Newly generated worksheet objects can be written. In addition to updating the original unit, we can also add new cells to the worksheet, which is exactly the same as the operation in example 2.

Finally, do not forget to call the write () method, write the updated content to the file, and then close the thin object. Here there are two thin objects to be closed, one is read-only, the other is writable.


Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.