Word
Use Java to convert Word documents to HTML or plain text. This problem gave me a headache some time ago; after reading up on it I finally solved it, and I am posting the program here for future reference.
//-------------------------------------------------------
// Copyright (C) Wave Group Commercial Systems Co., Ltd.
// All rights reserved
// File name: WordToHtml
// File version: 1.00.00
// Author: Guo
// Author e-mail: guozhu@langchao.com
// Completion date: 2004-10-20
// File description:
// Other description:
// Class list:
//   WordToHtml: converts all doc files under the specified directory
//   to HTML and stores them in the save directory
// Modification history:
//   #  Version  Modification date  Author       Changes
//   1  1.00.01  2004-10-14         Author name  Description of changes
//-------------------------------------------------------
import com.jacob.com.*;
import com.jacob.activeX.*;
import java.io.*;

public class WordToHtml {
    //-------------------------------------------------
    // Method prototype: change(String paths, String savepaths)
    // Description: converts all doc files under the directory "paths"
    //              to HTML and stores them in "savepaths"
    // Input parameters: paths (source directory), savepaths (output directory)
    // Output parameters: none
    // Return value: none
    // Other notes: recursive
    //-------------------------------------------------
    public static void change(String paths, String savepaths) {
        File d = new File(paths);
        // Get the list of all files and directories under the current folder
        File[] lists = d.listFiles();
        if (lists == null) {
            return; // not a directory, or it could not be read
        }
        // Go through every entry in the current directory
        for (int i = 0; i < lists.length; i++) {
            if (lists[i].isFile()) {
                String filename = lists[i].getName();
                // Check whether this is a doc file (case-insensitive)
                if (filename.toLowerCase().endsWith(".doc")) {
                    System.out.println("Converting ...");
                    // Print the current directory path
                    System.out.println(paths);
                    // Print the doc file name without its extension
                    System.out.println(filename.substring(0, filename.length() - 4));

                    ActiveXComponent app = new ActiveXComponent("Word.Application"); // start Word
                    String inFile = paths + filename; // the Word file to convert
                    String tpFile = savepaths + filename.substring(0, filename.length() - 4); // the HTML file
                    try {
                        app.setProperty("Visible", new Variant(false)); // keep Word invisible
                        Dispatch docs = app.getProperty("Documents").toDispatch();
                        // Open the Word file (no conversion prompt, read-only)
                        Dispatch doc = Dispatch.invoke(docs, "Open", Dispatch.Method,
                                new Object[]{inFile, new Variant(false), new Variant(true)},
                                new int[1]).toDispatch();
                        // Save in HTML format (8)
                        Dispatch.invoke(doc, "SaveAs", Dispatch.Method,
                                new Object[]{tpFile, new Variant(8)}, new int[1]);
                        // Close the document without saving changes
                        Dispatch.call(doc, "Close", new Variant(false));
                    } catch (Exception e) {
                        e.printStackTrace();
                    } finally {
                        app.invoke("Quit", new Variant[] {});
                    }
                    System.out.println("Conversion completed!");
                }
            } else {
                // Go down into the next-level directory and recurse
                String pathss = paths + lists[i].getName() + "\\";
                change(pathss, savepaths);
            }
        }
    }

    //-------------------------------------------------
    // Method prototype: main(String[] args)
    // Description: program entry point
    // Input parameters: none
    // Output parameters: none
    // Return value: none
    // Other notes: none
    //-------------------------------------------------
    public static void main(String[] args) {
        String paths = "D:\\work\\2004.10.8\\test system\\test01\\word\\";
        String savepaths = "D:\\work\\2004.10.8\\test system\\test01\\html\\";
        change(paths, savepaths);
    }
}
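The directory walk and file-name handling in the program above can be separated from the Word automation, which makes them easy to check without Word installed. The following is only a sketch: the class `DocFileFinder` and its methods are hypothetical names that do not appear in the original program, and it uses `java.nio.file` (Java 8 or later) rather than `java.io.File`. Like the original, it builds the output name without an extension.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the original program.
class DocFileFinder {

    // True when the file name ends in ".doc", ignoring case.
    static boolean isDocFile(String fileName) {
        return fileName.toLowerCase(Locale.ROOT).endsWith(".doc");
    }

    // Strips the final extension: "report.doc" becomes "report".
    static String baseName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    // Walks root recursively and returns every .doc file found, sorted by path.
    static List<Path> findDocFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> isDocFile(p.getFileName().toString()))
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    // Output path inside saveDir, named after the doc file but without an extension.
    static Path targetFor(Path docFile, Path saveDir) {
        return saveDir.resolve(baseName(docFile.getFileName().toString()));
    }
}
```

Each path returned by `findDocFiles` could then be passed, together with `targetFor(...)`, to the Open and SaveAs calls shown in the program.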
The imported jar package is JACOB, an open-source library; you can find it by searching online.
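Dispatch.invoke(doc, "SaveAs", Dispatch.Method, new Object[]{tpFile, new Variant(8)}, new int[1]);
Change the argument of new Variant(8) to convert the Word document to other formats. The value 8 selects HTML; other values select other file types, for example 2 for plain text.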
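To avoid scattering bare numbers through the code, the format codes can be collected in one place. This is a minimal sketch, assuming only the formats listed; the enum `WordSaveFormat` and its method `forExtension` are hypothetical names, while the numeric values are those of Word's WdSaveFormat enumeration (wdFormatText, wdFormatRTF, wdFormatHTML).

```java
import java.util.Locale;

// Hypothetical enum; the numeric codes come from Word's WdSaveFormat enumeration.
enum WordSaveFormat {
    TEXT(2, "txt"),   // wdFormatText
    RTF(6, "rtf"),    // wdFormatRTF
    HTML(8, "html");  // wdFormatHTML

    final int code;          // value to wrap in a Variant for SaveAs
    final String extension;  // conventional file extension, without the dot

    WordSaveFormat(int code, String extension) {
        this.code = code;
        this.extension = extension;
    }

    // Looks up a format by extension, ignoring case and an optional leading dot.
    static WordSaveFormat forExtension(String extension) {
        String wanted = extension.toLowerCase(Locale.ROOT);
        if (wanted.startsWith(".")) {
            wanted = wanted.substring(1);
        }
        for (WordSaveFormat format : values()) {
            if (format.extension.equals(wanted)) {
                return format;
            }
        }
        throw new IllegalArgumentException("No Word save format known for: " + extension);
    }
}

// With JACOB on the classpath, the SaveAs call would then read:
//   Dispatch.invoke(doc, "SaveAs", Dispatch.Method,
//           new Object[]{tpFile, new Variant(WordSaveFormat.TEXT.code)}, new int[1]);
```

Saving as plain text then only requires switching the enum constant, not remembering that 2 means text.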