Using POI to extract images from different versions of Word documents and to create Word documents
As we all know, POI is probably the most common technology for working with Office files in Java. I will not explain here what POI is or how it is used in general. Let me start with my own problem. It differs from the usual tasks of writing data into Word or Excel documents, exporting data, or saving data to a database: because of a business requirement, I needed to use POI to read the images embedded in a Word document and save each image as a file. This post covers how to handle the images contained in such a file.
For more information, see the code:
First, handle the images in a Word document ending with .docx:
package poi;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.List;

import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;

public class GetPics {
    public static void main(String[] args) {
        String path = "F:\\xx.docx";
        File file = new File(path);
        // try-with-resources closes the input stream and the document, even on failure
        try (FileInputStream fis = new FileInputStream(file);
             XWPFDocument document = new XWPFDocument(fis)) {
            // Print the plain text of the document
            XWPFWordExtractor xwpfWordExtractor = new XWPFWordExtractor(document);
            String text = xwpfWordExtractor.getText();
            System.out.println(text);

            // Save every embedded picture as its own file
            List<XWPFPictureData> picList = document.getAllPictures();
            for (XWPFPictureData pic : picList) {
                System.out.println(pic.getPictureType() + File.separator
                        + pic.suggestFileExtension() + File.separator + pic.getFileName());
                byte[] bytev = pic.getData();
                // The original never closed this stream; close it per picture
                try (FileOutputStream fos = new FileOutputStream("d:\\" + pic.getFileName())) {
                    fos.write(bytev);
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
[Screenshots omitted: the image in the source Word file, the console output, and the image files generated at the target disk location.]
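To make it clearer where these pictures come from: a .docx file is a ZIP package, and embedded images are normally stored under the `word/media/` folder inside it. The following is only a minimal sketch of that idea using the JDK alone, not how POI implements `getAllPictures()`. The class name `DocxMediaExtractor` and the method `extractMedia` are hypothetical names chosen for illustration, and the sketch assumes the images are embedded rather than linked from outside the package.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper (not part of POI): reads embedded media from a .docx package. */
final class DocxMediaExtractor {

    // Assumption: embedded images live under this folder of the ZIP package.
    private static final String MEDIA_PREFIX = "word/media/";

    /** Returns a map of file name to bytes for every entry under word/media/. */
    static Map<String, byte[]> extractMedia(InputStream docx) throws IOException {
        Map<String, byte[]> media = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (entry.isDirectory() || !name.startsWith(MEDIA_PREFIX)) {
                    continue; // skip document XML, styles, relationships, etc.
                }
                // Copy the bytes of the current entry into memory
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int read;
                while ((read = zip.read(chunk)) != -1) {
                    buffer.write(chunk, 0, read);
                }
                media.put(name.substring(MEDIA_PREFIX.length()), buffer.toByteArray());
            }
        }
        return media;
    }
}
```

Calling `DocxMediaExtractor.extractMedia(new FileInputStream("F:\\xx.docx"))` would return entries such as `image1.png` mapped to their bytes, which can then be written out with a `FileOutputStream` as in the POI code above. In practice POI remains the more robust choice, because it also resolves the relationships between the document and its pictures.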
Next, the operations on images in Word documents ending with .doc are as follows:
Unlike the newer format, which is handled with the XWPF classes, the Word 2003 format is handled here with the HWPF classes:
package com.zjcx.read;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.util.UUID;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.model.PicturesTable;
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Picture;
import org.apache.poi.hwpf.usermodel.Range;

public class ReadImg {
    public static void main(String[] args) throws Exception {
        new ReadImg().readPicture("F://test//test.doc");
    }

    private void readPicture(String path) throws Exception {
        try (FileInputStream in = new FileInputStream(new File(path))) {
            HWPFDocument doc = new HWPFDocument(in);
            int length = doc.characterLength();
            PicturesTable pTable = doc.getPicturesTable();
            // Walk the document one character at a time and check each run for a picture
            for (int i = 0; i < length; i++) {
                Range range = new Range(i, i + 1, doc);
                CharacterRun cr = range.getCharacterRun(0);
                if (pTable.hasPicture(cr)) {
                    Picture pic = pTable.extractPicture(cr, false);
                    String afileName = pic.suggestFullFileName();
                    // Prefix a random UUID so that file names do not collide;
                    // the original never closed this stream, so close it per picture
                    try (OutputStream out = new FileOutputStream(
                            new File("F:\\test\\" + UUID.randomUUID() + afileName))) {
                        pic.writeImageContent(out);
                    }
                }
            }
        }
    }
}
The generated results are the same as those obtained above when reading images from the newer .docx format and writing them out as new image files.
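Because the two formats need different classes (HWPF for .doc, XWPF for .docx), it can help to check the real format of a file before choosing, since the file extension may be wrong. The sketch below inspects the leading bytes: OLE2 compound files (used by .doc) start with `D0 CF 11 E0 A1 B1 1A E1`, and ZIP packages (used by .docx) start with `PK\x03\x04`. The class `WordFormatSniffer` and its `detect` method are hypothetical names for illustration. Note that these signatures only identify the container, so an .xls file or an arbitrary ZIP archive would match as well.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical helper (not part of POI): guesses the container format from magic bytes. */
final class WordFormatSniffer {

    enum Format { DOC_OLE2, DOCX_ZIP, UNKNOWN }

    // Signature of an OLE2 compound file (Word 97-2003 .doc, but also .xls and .ppt)
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Signature of a ZIP local file header (.docx, but also any other ZIP archive)
    private static final byte[] ZIP_MAGIC = {'P', 'K', 0x03, 0x04};

    /** Reads up to 8 bytes from the stream and compares them with the known signatures. */
    static Format detect(InputStream in) throws IOException {
        byte[] header = new byte[8];
        int total = 0;
        while (total < header.length) {
            int n = in.read(header, total, header.length - total);
            if (n < 0) {
                break; // stream ended before 8 bytes were available
            }
            total += n;
        }
        if (total >= 8 && Arrays.equals(header, OLE2_MAGIC)) {
            return Format.DOC_OLE2;
        }
        if (total >= 4 && Arrays.equals(Arrays.copyOf(header, 4), ZIP_MAGIC)) {
            return Format.DOCX_ZIP;
        }
        return Format.UNKNOWN;
    }
}
```

A caller could open the file once with `detect` and then reopen it with `HWPFDocument` or `XWPFDocument` depending on the result. Recent POI releases ship their own detection utility (I believe it is called `FileMagic`), which is probably preferable when POI is already on the classpath.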
After finishing the image operations for the different Word versions, I also ran into a requirement to create Word documents. This differs from generating a txt file with a stream, and also from creating a File object and calling its createNewFile method. Let's look at how POI and some other code can be used to create a new Word file. (In fact, the result only imitates a Word file and must differ from one created manually. I am still not sure what the specific difference is; if you have read the following code, I hope you can give me some advice.) Without further ado, here is the code:
First, create a Word file ending with .doc. (I am not posting the generated file here; you can try it yourself.)
package poi;

import java.io.ByteArrayInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.poifs.filesystem.DirectoryEntry;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public class GenWord03 {
    public static void main(String[] args) throws IOException {
        String path = "F:/";
        String filename = "/123321.doc";
        String content = "";
        byte[] b = content.getBytes("UTF-8");
        // try-with-resources closes the streams and the file system in reverse order
        try (ByteArrayInputStream bais = new ByteArrayInputStream(b);
             POIFSFileSystem poifs = new POIFSFileSystem();
             FileOutputStream out = new FileOutputStream(path + filename)) {
            // Store the content as the "WordDocument" stream of an OLE2 container
            DirectoryEntry dirEntry = poifs.getRoot();
            dirEntry.createDocument("WordDocument", bais);
            poifs.writeFilesystem(out);
            out.flush();
        }
    }
}
Next, create a Word file ending with .docx.
package poi;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public class GenNewWord {
    public static void main(String[] args) throws IOException {
        String content = "content to be displayed";
        String path = "F:/";
        // XWPFDocument writes the newer format, so the extension is .docx (the original used .doc)
        String filename = "/xxx.docx";
        File file = new File(path + filename);
        try (XWPFDocument doc = new XWPFDocument();
             FileOutputStream out = new FileOutputStream(file)) {
            // One paragraph containing one run that holds the text
            XWPFParagraph para = doc.createParagraph();
            XWPFRun run = para.createRun();
            run.setText(content);
            doc.write(out);
        }
    }
}
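As for how a generated .docx relates to one created manually: the file is a ZIP package with a few required XML parts. The following sketch writes what I understand to be the smallest such package using only the JDK: a content-types part, a package relationships part, and `word/document.xml` with a single paragraph. It is an illustration of the file layout under that assumption, not what POI emits byte for byte, and the class name `MinimalDocxWriter` is hypothetical. A document saved by Word itself normally contains many more parts (styles, settings, fonts, and so on).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper (not part of POI): writes a bare-bones .docx package. */
final class MinimalDocxWriter {

    private static final String CONTENT_TYPES =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
        + "<Types xmlns=\"http://schemas.openxmlformats.org/package/2006/content-types\">"
        + "<Default Extension=\"rels\" "
        + "ContentType=\"application/vnd.openxmlformats-package.relationships+xml\"/>"
        + "<Default Extension=\"xml\" ContentType=\"application/xml\"/>"
        + "<Override PartName=\"/word/document.xml\" ContentType=\""
        + "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml\"/>"
        + "</Types>";

    private static final String ROOT_RELS =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
        + "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">"
        + "<Relationship Id=\"rId1\" Type=\""
        + "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument\" "
        + "Target=\"word/document.xml\"/>"
        + "</Relationships>";

    /** Writes a package whose body is a single paragraph containing the given text. */
    static void write(String text, OutputStream out) throws IOException {
        String document =
            "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
            + "<w:document xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">"
            + "<w:body><w:p><w:r><w:t xml:space=\"preserve\">" + escape(text)
            + "</w:t></w:r></w:p></w:body></w:document>";
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            put(zip, "[Content_Types].xml", CONTENT_TYPES);
            put(zip, "_rels/.rels", ROOT_RELS);
            put(zip, "word/document.xml", document);
        }
    }

    // Adds one UTF-8 encoded XML part to the package
    private static void put(ZipOutputStream zip, String name, String xml) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(xml.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    // Escapes the characters that are not allowed in XML text content
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

For example, `MinimalDocxWriter.write("content to be displayed", new FileOutputStream("F:/xxx.docx"))` would produce a file with the same paragraph, run, and text structure that `createParagraph`, `createRun`, and `setText` build in the POI code above.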
Note: Veterans who are familiar with POI may know that, for Word 2003 (.doc) documents, the operations Apache provides are limited and mostly target files that already exist. That is why the code for old Word documents found in blogs and forum posts usually reads an existing file. For Word 2007 and later documents ending with .docx, developers can use POI across the whole lifecycle of a Word file (that is, including creating it from scratch). This is just my own brief summary. I hope to see more comments and exchanges on this blog.
How can Java and POI be used to read a Word document and display it on a page in its original format?
You can read the elements of the document and convert them into HTML elements.
However, many Word effects cannot be reproduced in HTML. For specific practices, refer to the official POI documentation, which is very detailed.
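The element-by-element conversion mentioned in this answer can be sketched roughly as follows. To keep the example runnable without POI, it uses a hypothetical `Run` class holding text plus bold and italic flags; with POI these values would presumably come from each `XWPFRun` of a paragraph (for instance via `getText(0)`, `isBold()`, and `isItalic()`). Only bold and italic are mapped here, which illustrates why many Word effects are lost in such a conversion.

```java
import java.util.List;

/** Hypothetical sketch (not part of POI): turns simple text runs into an HTML paragraph. */
final class WordToHtmlSketch {

    /** Hypothetical stand-in for the formatting information of one Word run. */
    static final class Run {
        final String text;
        final boolean bold;
        final boolean italic;

        Run(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }
    }

    /** Wraps the runs of one paragraph in a p element, mapping bold and italic to b and i. */
    static String paragraphToHtml(List<Run> runs) {
        StringBuilder html = new StringBuilder("<p>");
        for (Run run : runs) {
            if (run.bold) html.append("<b>");
            if (run.italic) html.append("<i>");
            html.append(escape(run.text));
            if (run.italic) html.append("</i>");
            if (run.bold) html.append("</b>");
        }
        return html.append("</p>").toString();
    }

    // Escapes characters that have a special meaning in HTML text
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

A paragraph with a bold run "Hello" followed by an italic run " <world>" would be rendered as `<p><b>Hello</b><i> &lt;world&gt;</i></p>`. Tables, images, fonts, and page layout would each need their own mapping.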
How can images be inserted into Word documents with Java and POI?
See poi.apache.org/hwpf/quick-guide.html