C#: Reading DOC, PDF, PPT, and TXT Files

Source: Internet
Author: User

Converting DOC, PDF, and PPT files to TXT:

A conversion component reads a file's content as text. It does not simply rename the file extension, so you have to extract the text first and then write it to a TXT file.

Adding the Office references

When you program against Word and PowerPoint from .NET, make sure the programmability components for Word and PowerPoint were installed along with Office (you can check this in a custom installation), or install the Microsoft Office 2003 Primary Interop Assemblies.

After installation, add the references in your project:

Add Reference > COM > Microsoft PowerPoint 11.0 Object Library / Microsoft Word 11.0 Object Library

You also have to add the Office components and import the namespaces:

using System.IO;
using System.Text;
using Microsoft.Office.Interop.Word;
using Microsoft.Office.Interop.PowerPoint;
using org.pdfbox.pdmodel;
using org.pdfbox.util;

public void pdf2txt(FileInfo file, FileInfo txtfile)
{
    // Load the PDF and extract its text with PDFBox.
    PDDocument doc = PDDocument.load(file.FullName);
    PDFTextStripper pdfStripper = new PDFTextStripper();
    string text = pdfStripper.getText(doc);
    doc.close();
    // Write the extracted text to the target file, encoded as GB2312.
    StreamWriter swPdfChange = new StreamWriter(txtfile.FullName, false, Encoding.GetEncoding("gb2312"));
    swPdfChange.Write(text);
    swPdfChange.Close();
}
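The conversion methods in this article each open a StreamWriter, write the text, and close the writer by hand. A minimal sketch of one way to factor that step out is shown here. The class and method names (TextOutput, WriteText) are hypothetical and not part of the original code; the only change in behavior is that the writer is disposed even if the write fails.

```csharp
using System.IO;
using System.Text;

public static class TextOutput
{
    // Hypothetical helper: writes text to a file and always releases the file handle.
    public static void WriteText(string path, string text, Encoding encoding)
    {
        // 'using' disposes the writer (flushing and closing it) even if Write throws.
        using (StreamWriter writer = new StreamWriter(path, false, encoding))
        {
            writer.Write(text);
        }
    }
}
```

A call such as TextOutput.WriteText(txtfile.FullName, text, Encoding.GetEncoding("gb2312")) would then replace the three StreamWriter lines. Note that on .NET Core and .NET 5 or later, "gb2312" is only available after calling Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) from the System.Text.Encoding.CodePages package; on the .NET Framework it works as written.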

For a table in a DOC file, the text is read row by row and the grid lines are dropped.

public void Word2text(FileInfo file, FileInfo txtfile)
{
    object readOnly = true;
    object missing = System.Reflection.Missing.Value;
    object fileName = file.FullName;
    // Start Word through COM interop and open the document read-only.
    Microsoft.Office.Interop.Word.ApplicationClass wordApp = new Microsoft.Office.Interop.Word.ApplicationClass();
    Document doc = wordApp.Documents.Open(ref fileName,
        ref missing, ref readOnly, ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing, ref missing, ref missing);
    // Content.Text returns the whole document body as plain text.
    string text = doc.Content.Text;
    doc.Close(ref missing, ref missing, ref missing);
    wordApp.Quit(ref missing, ref missing, ref missing);
    StreamWriter swWordChange = new StreamWriter(txtfile.FullName, false, Encoding.GetEncoding("gb2312"));
    swWordChange.Write(text);
    swWordChange.Close();
}
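Because tables come out of Word as plain rows of text, the string returned by Content.Text may still contain Word's internal markers. The sketch below assumes that each table cell ends with a carriage return followed by the BEL character (U+0007) and that paragraphs end with a bare carriage return; verify this against your Office version before relying on it. WordTextCleanup and Clean are hypothetical names, not part of the Word object model.

```csharp
using System;

public static class WordTextCleanup
{
    // Hypothetical helper, not part of the Word object model.
    // Assumes each table cell ends with "\r\a" (carriage return + U+0007).
    public static string Clean(string raw)
    {
        if (raw == null) return string.Empty;
        return raw
            .Replace("\r\n", "\r")               // normalize any existing CRLF pairs
            .Replace("\r\a", "\t")               // end-of-cell marker -> tab
            .Replace("\a", "")                   // drop any stray cell markers
            .Replace("\r", Environment.NewLine); // paragraph marks -> line breaks
    }
}
```

If the assumption holds, passing the text through WordTextCleanup.Clean before writing it gives tab-separated cells and platform line breaks in the TXT file.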

public void Ppt2txt(FileInfo file, FileInfo txtfile)
{
    Microsoft.Office.Interop.PowerPoint.Application pa = new Microsoft.Office.Interop.PowerPoint.ApplicationClass();
    // Open read-only, not as an untitled copy, and without a visible window.
    Microsoft.Office.Interop.PowerPoint.Presentation pp = pa.Presentations.Open(file.FullName,
        Microsoft.Office.Core.MsoTriState.msoTrue,
        Microsoft.Office.Core.MsoTriState.msoFalse,
        Microsoft.Office.Core.MsoTriState.msoFalse);
    string pps = "";
    StreamWriter swPptChange = new StreamWriter(txtfile.FullName, false, Encoding.GetEncoding("gb2312"));
    foreach (Microsoft.Office.Interop.PowerPoint.Slide slide in pp.Slides)
    {
        foreach (Microsoft.Office.Interop.PowerPoint.Shape shape in slide.Shapes)
        {
            // Shapes such as pictures have no text frame; skip them to avoid a COM exception.
            if (shape.HasTextFrame == Microsoft.Office.Core.MsoTriState.msoTrue)
                pps += shape.TextFrame.TextRange.Text;
        }
    }
    // Release the presentation and the PowerPoint process before writing the result.
    pp.Close();
    pa.Quit();
    swPptChange.Write(pps);
    swPptChange.Close();
}

Reading different file types

public StreamReader text2reader(FileInfo file)
{
    StreamReader st = null;
    // Choose the conversion by file extension; non-TXT files are first converted to a TXT file.
    switch (file.Extension.ToLower())
    {
        case ".txt":
            st = new StreamReader(file.FullName, Encoding.GetEncoding("gb2312"));
            break;
        case ".doc":
            FileInfo wordfile = new FileInfo(@"e:/my programs/200807program/filesearch/app_data/word2txt.txt"); // a relative path does not work here; needs improvement
            Word2text(file, wordfile);
            st = new StreamReader(wordfile.FullName, Encoding.GetEncoding("gb2312"));
            break;
        case ".pdf":
            FileInfo pdffile = new FileInfo(@"e:/my programs/200807program/filesearch/app_data/pdf2txt.txt");
            pdf2txt(file, pdffile);
            st = new StreamReader(pdffile.FullName, Encoding.GetEncoding("gb2312"));
            break;
        case ".ppt":
            FileInfo pptfile = new FileInfo(@"e:/my programs/200807program/filesearch/app_data/ppt2txt.txt");
            Ppt2txt(file, pptfile);
            st = new StreamReader(pptfile.FullName, Encoding.GetEncoding("gb2312"));
            break;
    }
    // st stays null for unsupported extensions, so callers should check the result.
    return st;
}
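The conversion cases in text2reader write their intermediate text to a hard-coded absolute path, and the inline comment notes that a relative path did not work. One possible improvement, sketched here under assumptions, is to build the path at run time under the current user's temp directory. ScratchFiles and ForName are hypothetical names, and the "filesearch" subfolder is only an illustrative choice.

```csharp
using System;
using System.IO;

public static class ScratchFiles
{
    // Hypothetical helper: builds an absolute path for an intermediate .txt file
    // under the current user's temp directory, so no drive letter is hard-coded.
    public static FileInfo ForName(string fileName)
    {
        string dir = Path.Combine(Path.GetTempPath(), "filesearch");
        Directory.CreateDirectory(dir); // no-op if the directory already exists
        return new FileInfo(Path.Combine(dir, fileName));
    }
}
```

With this helper, a case could use FileInfo wordfile = ScratchFiles.ForName("word2txt.txt"); in place of the literal path. In an ASP.NET application, a path resolved from the site's App_Data folder would be another option; which one fits depends on the permissions of the account that runs the code.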
