java使用poi讀取ppt檔案和poi讀取excel、word樣本_java

來源:互聯網
上載者:User

Apache的POI項目可以用來處理MS Office文檔,codeplex上還有一個它的.net版本。POI項目可建立和維護操作各種基於OOXML和OLE2檔案格式的Java API。大多數MS Office都是OLE2格式的。POI通HSMF子項目來支援Outlook,通過HDGF子項目來支援Visio,通過HPBF子項目來支援Publisher。

使用POI抽取Word簡單樣本:

要引入poi-3.7.jat和poi-scratchpad-3.7.ajr這兩個包。

複製代碼 代碼如下:

package msoffice;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.hwpf.usermodel.Section;

public class Word {

    // 直接抽取全部內容
    public static String readDoc1(InputStream is) throws IOException {
        WordExtractor extractor = new WordExtractor(is);
        return extractor.getText();
    }

    //分章節Section、段落Paragraph、字串CharacterRun抽取
    public static void readDoc2(InputStream is) throws IOException {
        HWPFDocument doc=new HWPFDocument(is);
        Range r=doc.getRange();
        for(int x=0;x<r.numSections();x++){
            Section s=r.getSection(x);
            for(int y=0;y<s.numParagraphs();y++){
                Paragraph p=s.getParagraph(y);
                for(int z=0;z<p.numCharacterRuns();z++){
                    CharacterRun run=p.getCharacterRun(z);
                    String text=run.text();
                    System.out.print(text);
                }
            }
        }
    }

    public static void main(String[] args) {
        File file = new File("/home/orisun/1.doc");
        try {
            FileInputStream fin = new FileInputStream(file);
            String cont = readDoc1(fin);
            System.out.println(cont);
            fin.close();
            fin = new FileInputStream(file);
            readDoc2(fin);
            fin.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

POI抽取PPT樣本:

複製代碼 代碼如下:

package msoffice;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hslf.HSLFSlideShow;
import org.apache.poi.hslf.extractor.PowerPointExtractor;
import org.apache.poi.hslf.model.Slide;
import org.apache.poi.hslf.model.TextRun;
import org.apache.poi.hslf.usermodel.SlideShow;

public class PPT {

    //直接抽取投影片的全部內容
    public static String readDoc1(InputStream is) throws IOException{
        PowerPointExtractor extractor=new PowerPointExtractor(is);
        return extractor.getText();
    }

    //一張投影片一張投影片地讀取
    public static void readDoc2(InputStream is) throws IOException{
        SlideShow ss=new SlideShow(new HSLFSlideShow(is));
        Slide[] slides=ss.getSlides();
        for(int i=0;i<slides.length;i++){
            //讀取一張投影片的標題
            String title=slides[i].getTitle();
            System.out.println("標題:"+title);
            //讀取一張投影片的內容(包括標題)
            TextRun[] runs=slides[i].getTextRuns();
            for(int j=0;j<runs.length;j++){
                System.out.println(runs[j].getText());
            }
        }
    }

    public static void main(String[] args){
        File file = new File("/home/orisun/2.ppt");
        try{
            FileInputStream fin=new FileInputStream(file);
            String cont=readDoc1(fin);
            System.out.println(cont);
            fin.close();
            fin=new FileInputStream(file);
            readDoc2(fin);
            fin.close();
        }catch(IOException e){
            e.printStackTrace();
        }
    }
}

Excel檔案由多個Workbook組成,一個Workbook由多個Sheet組成。

POI抽取Excel簡單樣本:

複製代碼 代碼如下:

package msoffice;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;

import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.hssf.extractor.ExcelExtractor;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.ss.usermodel.Row;

public class Excel {

    //直接讀取Excel的全部內容
    public static String readDoc1(InputStream is)throws IOException{
        HSSFWorkbook wb=new HSSFWorkbook(new POIFSFileSystem(is));
        ExcelExtractor extractor=new ExcelExtractor(wb);
        extractor.setFormulasNotResults(false);
        extractor.setIncludeSheetNames(true);
        return extractor.getText();
    }

    //讀取時細化到Sheet、行甚至儲存格
    public static double getAvg(InputStream is)throws IOException{
        HSSFWorkbook wb=new HSSFWorkbook(new POIFSFileSystem(is));
        //擷取第一張sheet
        HSSFSheet sheet=wb.getSheetAt(0);
        double molecule=0.0;
        double denominator=0.0;
        //按行遍曆sheet
        Iterator<Row> riter=sheet.rowIterator();
        while(riter.hasNext()){
            HSSFRow row=(HSSFRow)riter.next();
            HSSFCell cell1=row.getCell(4);
            HSSFCell cell2=row.getCell(4);
            if(cell1.getCellType()!=HSSFCell.CELL_TYPE_NUMERIC){
                System.err.println("數字類型錯誤!");
                System.exit(-2);
            }
            if(cell2.getCellType()!=HSSFCell.CELL_TYPE_NUMERIC){
                System.err.println("數字類型錯誤!");
                System.exit(-2);
            }
            denominator+=Double.parseDouble(cell2.toString().trim());
            molecule+=Double.parseDouble(cell2.toString().trim())*Float.parseFloat(cell1.toString().trim());
        }
        return molecule/denominator;
    }

    public static void main(String[] args){
        File file = new File("/home/orisun/3.xls");
        try{
            FileInputStream fin=new FileInputStream(file);
            String cont=readDoc1(fin);
            System.out.println(cont);
            fin.close();
            fin=new FileInputStream(file);
            System.out.println("加權平均分"+getAvg(fin));
            fin.close();
        }catch(IOException e){
            e.printStackTrace();
        }
    }
}

相關文章

聯繫我們

該頁面正文內容均來源於網絡整理,並不代表阿里雲官方的觀點,該頁面所提到的產品和服務也與阿里云無關,如果該頁面內容對您造成了困擾,歡迎寫郵件給我們,收到郵件我們將在5個工作日內處理。

如果您發現本社區中有涉嫌抄襲的內容,歡迎發送郵件至: info-contact@alibabacloud.com 進行舉報並提供相關證據,工作人員會在 5 個工作天內聯絡您,一經查實,本站將立刻刪除涉嫌侵權內容。

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.