Use POI to work with images in different versions of Word documents and to create Word documents

Source: Internet
Author: User

As we all know, POI is the most commonly used technology for working with Office documents in Java. I will not explain here what POI is or how it is used in general; let me describe my problem first. The usual POI tasks are writing data into Word and Excel documents, exporting data, and saving data to a database. My case was different: due to business needs, I had to use POI to read the images in a Word document and save each image as a file. This post covers how to handle the images contained in such a file.


For more information, see the code:

The first step is to read the images in a Word document ending with .docx:

package poi;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.List;

import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;

public class GetPics {
    public static void main(String[] args) {
        String path = "F:\\xx.docx";
        File file = new File(path);
        // try-with-resources closes the input stream even if reading fails
        try (FileInputStream fis = new FileInputStream(file)) {
            XWPFDocument document = new XWPFDocument(fis);

            // Print the text of the document
            XWPFWordExtractor xwpfWordExtractor = new XWPFWordExtractor(document);
            String text = xwpfWordExtractor.getText();
            System.out.println(text);

            // Save every embedded picture as its own file
            List<XWPFPictureData> picList = document.getAllPictures();
            for (XWPFPictureData pic : picList) {
                System.out.println(pic.getPictureType() + File.separator + pic.suggestFileExtension()
                        + File.separator + pic.getFileName());
                byte[] bytev = pic.getData();
                // Close each output stream so the image is fully written to disk
                try (FileOutputStream fos = new FileOutputStream("d:\\" + pic.getFileName())) {
                    fos.write(bytev);
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Run against a Word file that contains images, the program prints each picture's type, suggested extension, and file name to the console, and writes the image files to the specified disk location.
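As a cross-check that needs no POI at all, the same images can be pulled out with the JDK alone, because a .docx file is a ZIP archive. The sketch below assumes the usual layout in which embedded images are stored under word/media/ inside the archive. The class name DocxMediaExtractor, the method extractMedia, and the paths in main are made up for illustration; they are not part of POI.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration; not a POI class.
class DocxMediaExtractor {

    /** Copies every entry under word/media/ of the given .docx into outDir and returns the written paths. */
    static List<Path> extractMedia(Path docx, Path outDir) throws IOException {
        List<Path> written = new ArrayList<>();
        Files.createDirectories(outDir);
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(docx))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Assumption: images live under word/media/ in the package
                if (entry.isDirectory() || !name.startsWith("word/media/")) {
                    continue;
                }
                // Keep only the last path segment as the output file name
                String fileName = name.substring(name.lastIndexOf('/') + 1);
                if (fileName.isEmpty()) {
                    continue;
                }
                Path target = outDir.resolve(fileName);
                // Files.copy reads up to the end of the current ZIP entry and leaves the stream open
                Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING);
                written.add(target);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder paths; adjust to your own files
        for (Path p : extractMedia(Paths.get("F:/xx.docx"), Paths.get("D:/pics"))) {
            System.out.println(p);
        }
    }
}
```

This only works for the ZIP-based .docx format. The older .doc format is a different container and still needs the POI classes shown in the next section.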



Next, the images in Word documents ending with .doc are handled as follows:

Unlike the newer format, the Word 03 format requires a different set of classes:

package com.zjcx.read;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.util.UUID;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.model.PicturesTable;
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Picture;
import org.apache.poi.hwpf.usermodel.Range;

public class ReadImg {
    public static void main(String[] args) throws Exception {
        new ReadImg().readPicture("F://test//test.doc");
    }

    private void readPicture(String path) throws Exception {
        try (FileInputStream in = new FileInputStream(new File(path))) {
            HWPFDocument doc = new HWPFDocument(in);
            int length = doc.characterLength();
            PicturesTable pTable = doc.getPicturesTable();
            // int TitleLength = doc.getSummaryInformation().getTitle().length();
            // System.out.println(TitleLength);
            // System.out.println(length);

            // Walk the document one character at a time and check each run for a picture
            for (int i = 0; i < length; i++) {
                Range range = new Range(i, i + 1, doc);
                CharacterRun cr = range.getCharacterRun(0);
                if (pTable.hasPicture(cr)) {
                    Picture pic = pTable.extractPicture(cr, false);
                    String afileName = pic.suggestFullFileName();
                    // A random UUID prefix keeps output file names from colliding
                    try (OutputStream out = new FileOutputStream(
                            new File("F:\\test\\" + UUID.randomUUID() + afileName))) {
                        pic.writeImageContent(out);
                    }
                }
            }
        }
    }
}

The result is the same as for the newer format above: the images in the Word file are read and written out as new image files.
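Because the two formats need different classes, a program that accepts both has to tell them apart first, and the file extension is not always reliable. A minimal sketch of one way to do this is to look at the first bytes of the file: the older binary format is stored in an OLE2 container, and .docx is a ZIP archive. The class WordFormatSniffer and its Format enum are hypothetical names for illustration. Note that the check only identifies the container: other OLE2 files (such as .xls) and other ZIP files will match as well.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper for illustration; not a POI class.
class WordFormatSniffer {

    enum Format { OLE2, ZIP, UNKNOWN }

    // Signature at the start of every OLE2 compound file (used by .doc)
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0, (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Local file header signature at the start of a ZIP archive (used by .docx)
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 3, 4 };

    /** Classifies a file by the bytes found at its very beginning. */
    static Format detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Format.OLE2;   // candidate for HWPFDocument
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Format.ZIP;    // candidate for XWPFDocument
        }
        return Format.UNKNOWN;
    }

    /** Reads up to eight bytes from the file and classifies them. */
    static Format detect(Path file) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while (total < header.length
                    && (n = in.read(header, total, header.length - total)) > 0) {
                total += n;
            }
        }
        return detect(Arrays.copyOf(header, total));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With this in place, a caller could route Format.OLE2 files to the HWPFDocument code and Format.ZIP files to the XWPFDocument code, and reject everything else.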


After finishing the image operations for both Word formats, I also ran into a requirement to create Word documents. This is different from generating a TXT file with a stream, and also different from creating a File object and calling its createNewFile method. Let's look at how POI and related code can be used to create a new Word document. (In truth, the result may only be an imitation of a Word file, and it must differ somehow from one created by hand in Word. What exactly the difference is, I am still not sure. If you have read the code below, I would welcome your advice.) Without further ado, here is the code:

First, create a Word file ending with .doc. (I am not posting the generated file here; you can try it yourself.)

package poi;

import java.io.ByteArrayInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.poifs.filesystem.DirectoryEntry;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public class GenWord03 {
    public static void main(String[] args) throws IOException {
        String path = "F:/";
        String filename = "/123321.doc";
        String content = "";
        byte[] b = content.getBytes("UTF-8");
        ByteArrayInputStream bais = new ByteArrayInputStream(b);

        // Build an OLE2 container and store the raw bytes in a stream named "WordDocument"
        POIFSFileSystem poifs = new POIFSFileSystem();
        DirectoryEntry dirEntry = poifs.getRoot();
        dirEntry.createDocument("WordDocument", bais);

        FileOutputStream out = new FileOutputStream(path + filename);
        poifs.writeFilesystem(out);
        out.flush();
        out.close();
        bais.close();
    }
}

Next, create a Word file ending with .docx.

package poi;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public class GenNewWord {
    public static void main(String[] args) throws IOException {
        String content = "content to be displayed";
        String path = "F:/";
        // XWPFDocument writes the .docx format, so the file name uses that extension
        String filename = "/xxx.docx";

        // Create an empty document, add one paragraph, and put the text in a run
        XWPFDocument doc = new XWPFDocument();
        XWPFParagraph para = doc.createParagraph();
        XWPFRun run = para.createRun();
        run.setText(content);

        File file = new File(path + filename);
        try (FileOutputStream out = new FileOutputStream(file)) {
            doc.write(out);
        }
    }
}
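One way to look into the difference mentioned earlier, between a generated file and one created by hand in Word, is to list the parts inside each .docx and compare the two lists. This works because a .docx file is a ZIP archive. The sketch below uses only the JDK; the class DocxPartLister and its method names are hypothetical, and it compares part names only, not their contents.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration; not a POI class.
class DocxPartLister {

    /** Returns the sorted names of all entries (parts) inside a .docx package. */
    static List<String> listParts(Path docx) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(docx.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    /** Returns the names that appear in first but not in second. */
    static List<String> onlyInFirst(List<String> first, List<String> second) {
        List<String> result = new ArrayList<>(first);
        result.removeAll(second);
        return result;
    }
}
```

Calling onlyInFirst(listParts(handMade), listParts(generated)) would show which parts Word wrote that the generated file lacks, which is a reasonable starting point before comparing the XML inside the parts that both files share.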

Note: Those familiar with POI may know that Apache's support for Word 03 documents is limited and is aimed mostly at files that already exist. That is why the code for old-format Word documents found in blogs and forum posts almost always reads an existing file. For Word 2007 and later documents ending in .docx, developers can use POI across the entire lifecycle of a Word file, that is, starting from scratch. This is just my own brief summary; I hope to see more comments and discussion on this blog.


