Starting today, I am going to build the good habit of writing down the problems I run into during development and how I solved them.
An Android project I have been working on recently needs to display Word and PDF documents. Android has no built-in component that renders Word or PDF files directly; the only option is a WebView, which displays HTML pages. So I decided to convert the documents to HTML on the server side. After that, the result can be viewed directly, whether it is previewed online or downloaded to the phone.
Searching online, I found that Jacob can be used to convert Word to HTML. Apart from using a fair amount of CPU, it seems to work well (both .doc and .docx can be converted). Enough preamble, let's get to the point: this article covers converting Word to HTML first. PDF conversion is still under investigation, and I will post the results if I get anywhere.
"Jacob is a Java-COM middleware. You can invoke COM components and Win32 libraries in Java applications through this component."
PS: Jacob only works on Windows. If your system is not Windows, I recommend OpenOffice.org, which is cross-platform. I have not used it myself, but it should not be much trouble: install OpenOffice first, then use its service on port 8100. As for POI, to be honest I really do not like using it. You have to parse the Word document and then write the HTML yourself. Quite apart from the workload, the result is not worth the effort, because it is hard to guarantee that the converted HTML keeps the formatting of the original Word document, and .docx conversion is difficult as well.
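For the OpenOffice route, the "8100 service" usually means running OpenOffice headless and listening on a socket. Since I have not tried it myself, take the following only as a rough sketch: OpenOfficeService and listenCommand are names made up for illustration, the host 127.0.0.1 is an assumption, and the switches are the single-dash forms from OpenOffice.org 3.x (newer releases may expect a different spelling).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the command line for a headless OpenOffice listener.
final class OpenOfficeService {

    /** Builds the argument list that asks soffice to accept connections on the given port. */
    static List<String> listenCommand(String sofficePath, int port) {
        return Arrays.asList(
                sofficePath,
                "-headless",
                // Listen on a local socket using the UNO remote protocol (urp).
                "-accept=socket,host=127.0.0.1,port=" + port + ";urp;",
                "-nofirststartwizard");
    }

    public static void main(String[] args) {
        // Assumption: "soffice" is on the PATH; otherwise pass the full path as the first argument.
        String soffice = args.length > 0 ? args[0] : "soffice";
        // Only prints the command; it could be handed to ProcessBuilder to actually start the service.
        System.out.println(String.join(" ", listenCommand(soffice, 8100)));
    }
}
```

Printing the command first makes it easy to try it by hand in a terminal before wiring it into the server.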
1. Download Jacob from the official website. The latest version at the time of writing is 1.17. Link: http://sourceforge.net/projects/jacob-project/
2. Unzip the archive and add jacob.jar to the project's libraries (copy it into the project directory first, then right-click the jar and select Build Path -> Add to Build Path).
3. Place jacob.dll in the "jre\bin" directory of the JRE used by the current project (for example, the JRE my Eclipse uses is at D:\Java\jdk1.7.0_17\jre\bin).
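If you are not sure which JRE the project actually runs on, a small check like the one below can tell you where the DLL should go. It is only a minimal sketch: JreBinLocator and its methods are names made up for illustration and are not part of Jacob, and sun.arch.data.model is a property set by HotSpot-based JVMs that may be missing on others. The bitness is printed because Jacob downloads usually include separate 32-bit and 64-bit DLLs, and the DLL has to match the JVM.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: reports where the running JVM lives and whether it is 32 or 64 bit.
final class JreBinLocator {

    /** Returns the "bin" directory of the JRE that is executing this code. */
    static Path jreBinDir() {
        return Paths.get(System.getProperty("java.home"), "bin");
    }

    /** Returns "32" or "64" on HotSpot-based JVMs, or "unknown" if the property is not set. */
    static String jvmBitness() {
        return System.getProperty("sun.arch.data.model", "unknown");
    }

    public static void main(String[] args) {
        System.out.println("JRE bin directory: " + jreBinDir());
        System.out.println("JVM bitness: " + jvmBitness());
    }
}
```

Run it from the same project (so it uses the same JRE as your conversion code) and copy the DLL into the directory it prints.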
PS: I configured everything following the steps above and had no problems at all, but some people may hit an error such as java.lang.UnsatisfiedLinkError: no jacob in java.library.path. This means the JVM could not find and load jacob.dll. The fix suggested online is to put jacob.dll in "Windows\System32" (I have not tried it, because mine worked right away).
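When that error shows up, it can help to see which directories the JVM really searches and whether the DLL is in any of them. The following is a minimal diagnostic sketch using only the standard library; LibraryPathCheck and findOnPath are hypothetical names, and the default file name jacob.dll is taken from the steps above (the file in your download may be named differently, so it can be passed as an argument).

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: looks for a native library file in a path-separator-delimited search path.
final class LibraryPathCheck {

    /** Returns the full path of every copy of fileName found in the directories of searchPath. */
    static List<Path> findOnPath(String searchPath, String fileName) {
        List<Path> hits = new ArrayList<>();
        if (searchPath == null || searchPath.isEmpty()) {
            return hits;
        }
        for (String dir : searchPath.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            try {
                Path candidate = Paths.get(dir, fileName);
                if (Files.isRegularFile(candidate)) {
                    hits.add(candidate);
                }
            } catch (InvalidPathException e) {
                // Skip malformed entries instead of aborting the whole search.
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String dllName = args.length > 0 ? args[0] : "jacob.dll";
        String libraryPath = System.getProperty("java.library.path");
        System.out.println("java.library.path = " + libraryPath);
        List<Path> hits = findOnPath(libraryPath, dllName);
        System.out.println(hits.isEmpty()
                ? dllName + " was not found in any of these directories"
                : dllName + " found at: " + hits);
    }
}
```

If nothing is found, copying the DLL into one of the printed directories should have the same effect as the System32 suggestion.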
Java code:

    import com.jacob.activeX.ActiveXComponent;
    import com.jacob.com.Dispatch;
    import com.jacob.com.Variant;

    public class JacobUtil {
        // 8 is the Word "save as" format code for HTML
        public static final int WORD_HTML = 8;

        public static void main(String[] args) {
            String docfile = "C:\\users\\ \\Desktop\\xxx.doc";
            String htmlfile = "C:\\users\\ \\Desktop\\xxx.html";
            JacobUtil.wordToHtml(docfile, htmlfile);
        }

        /**
         * Converts a Word document to HTML.
         * @param docfile  full path of the Word file
         * @param htmlfile path where the converted HTML is stored
         */
        public static void wordToHtml(String docfile, String htmlfile) {
            // Start the Word application (Microsoft Office Word 2003)
            ActiveXComponent app = new ActiveXComponent("Word.Application");
            System.out.println("*** Converting... ***");
            try {
                // Keep the Word application window hidden
                app.setProperty("Visible", new Variant(false));
                // Documents represents all open document windows in Word
                // (Word is a multi-document application)
                Dispatch docs = app.getProperty("Documents").toDispatch();
                // Open the Word file to convert
                // (arguments: file name, ConfirmConversions = false, ReadOnly = true)
                Dispatch doc = Dispatch.invoke(
                        docs,
                        "Open",
                        Dispatch.Method,
                        new Object[] { docfile, new Variant(false), new Variant(true) },
                        new int[1]).toDispatch();
                // Save the document to the target file in HTML format
                Dispatch.invoke(doc, "SaveAs", Dispatch.Method,
                        new Object[] { htmlfile, new Variant(WORD_HTML) },
                        new int[1]);
                // Close the Word document