Extracting text and pictures from PPT slides in C#

Source: Internet
Author: User

In documents that combine text and images, we can extract the text or the images as needed. C# code can extract text and pictures from Word and PDF files, and in the same way we can extract the text and pictures from PPT slides. This article describes how to do that in C#. First, install the Spire.Presentation component, then add a reference to its DLL file in the project. The main code steps follow.
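The snippets in this article are fragments, so it may help to see which using directives they rely on. The list below is a sketch under assumptions: the Spire.Presentation namespace is taken from the component name mentioned above, and the rest are standard .NET namespaces for the types the later steps use.

```csharp
// Assumed using directives for the fragments in this article.
using System;               // Environment
using System.Diagnostics;   // Process
using System.Drawing;       // Image
using System.IO;            // File
using System.Text;          // StringBuilder
using Spire.Presentation;   // Presentation, ISlide, IShape, IAutoShape, TextParagraph (assumed namespace)
```

If a type is still unresolved after adding these, check the component's documentation for the namespace it lives in.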

Original document: [image: 21.png]


1. Extracting text

Step one: Create a Presentation instance and load the document

Presentation presentation = new Presentation(@"C:\Users\Administrator\Desktop\sample.pptx", FileFormat.Pptx2010);

Step two: Create a StringBuilder object

StringBuilder sb = new StringBuilder();

Step three: Traverse the slides and the shapes on each slide to extract the text content

foreach (ISlide slide in presentation.Slides)
{
    foreach (IShape shape in slide.Shapes)
    {
        // Auto shapes (such as text boxes) expose their text through TextFrame
        if (shape is IAutoShape)
        {
            foreach (TextParagraph tp in (shape as IAutoShape).TextFrame.Paragraphs)
            {
                // Collect each paragraph on its own line
                sb.Append(tp.Text + Environment.NewLine);
            }
        }
    }
}

Step four: Write the text to a TXT document and open it

File.WriteAllText("Target.txt", sb.ToString());
Process.Start("Target.txt");
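One thing to watch in step four: Process.Start with a bare document name relies on shell execution, which is the default on .NET Framework but not on .NET Core and later, where UseShellExecute defaults to false. The following is a minimal sketch of a more explicit version, using only standard .NET calls. The class name TextOutput and its method names are hypothetical, not part of Spire.Presentation.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only.
public static class TextOutput
{
    // Writes the collected text to disk and returns the absolute path of the file.
    public static string Save(StringBuilder sb, string fileName)
    {
        string fullPath = Path.GetFullPath(fileName);
        File.WriteAllText(fullPath, sb.ToString());
        return fullPath;
    }

    // Opens the file with the default application registered for its extension.
    // UseShellExecute is set explicitly so the call does not depend on the runtime default.
    public static void OpenInDefaultApp(string fullPath)
    {
        var startInfo = new ProcessStartInfo(fullPath) { UseShellExecute = true };
        Process.Start(startInfo);
    }
}
```

With the StringBuilder from step two, the call would be TextOutput.OpenInDefaultApp(TextOutput.Save(sb, "Target.txt")).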


[Image: 22.png]


2. Extracting pictures

There are two cases here: extracting all the pictures in the entire document, and extracting only the pictures on a particular slide.

2.1 Extracting all pictures

Step one: Initialize an instance of the Presentation class and load the document

Presentation ppt = new Presentation();
ppt.LoadFromFile(@"C:\Users\Administrator\Desktop\sample.pptx");

Step two: Traverse the pictures in the document, then extract and save each one

for (int i = 0; i < ppt.Images.Count; i++)
{
    // Each entry in ppt.Images wraps a System.Drawing.Image
    Image image = ppt.Images[i].Image;
    image.Save(string.Format(@"..\..\Images{0}.png", i));
}

The extracted pictures are saved to the project folder.
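A note on the file format: as far as I know, Image.Save(string) chooses the encoder from the image's own raw format rather than from the file extension, so a picture embedded as JPEG can end up as JPEG data in a file named .png. The sketch below passes ImageFormat.Png explicitly and builds the output path with Path.Combine. The class name ImageOutput and the directory argument are hypothetical, and the code assumes System.Drawing is available (on .NET Core and later, that means the System.Drawing.Common package on Windows).

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

// Hypothetical helper for illustration only.
public static class ImageOutput
{
    // Builds "<directory>/Images<index>.png", creating the directory if it does not exist.
    public static string BuildPath(string directory, int index)
    {
        Directory.CreateDirectory(directory);
        return Path.Combine(directory, string.Format("Images{0}.png", index));
    }

    // Saves with an explicit PNG encoder so the file content matches the ".png" extension.
    public static string SaveAsPng(Image image, string directory, int index)
    {
        string path = BuildPath(directory, index);
        image.Save(path, ImageFormat.Png);
        return path;
    }
}
```

Inside the loop from step two, the save call would become ImageOutput.SaveAsPng(ppt.Images[i].Image, "Images", i).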


[Image: 23.png]

2.2 Extracting pictures from a specific slide

Step one: Create an instance of the Presentation class and load the document

Presentation ppt = new Presentation();
ppt.LoadFromFile(@"C:\Users\Administrator\Desktop\sample.pptx");

Step two: Get the third slide (index 2), then extract and save its pictures

int i = 0;
foreach (IShape s in ppt.Slides[2].Shapes)
{
    // A picture on a slide can be either a SlidePicture or a PictureShape
    if (s is SlidePicture)
    {
        SlidePicture ps = s as SlidePicture;
        ps.PictureFill.Picture.EmbedImage.Image.Save(string.Format("{0}.png", i));
        i++;
    }
    if (s is PictureShape)
    {
        PictureShape ps = s as PictureShape;
        ps.EmbedImage.Image.Save(string.Format("{0}.png", i));
        i++;
    }
}

The pictures extracted from the third slide are saved to the specified location.
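The slide collection is zero-based, which is why the third slide is Slides[2]. If the slide number comes from user input, a small guard avoids an out-of-range index on short presentations. This is only a sketch: the class SlideIndex and its method are hypothetical helpers, and the slide count is assumed to come from ppt.Slides.Count.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class SlideIndex
{
    // Converts a 1-based slide number (as shown in PowerPoint) to a 0-based index,
    // and rejects numbers outside the range 1..slideCount.
    public static int FromSlideNumber(int slideNumber, int slideCount)
    {
        if (slideNumber < 1 || slideNumber > slideCount)
        {
            throw new ArgumentOutOfRangeException(
                nameof(slideNumber),
                string.Format("Expected a value from 1 to {0}.", slideCount));
        }
        return slideNumber - 1;
    }
}
```

For the example above, ppt.Slides[SlideIndex.FromSlideNumber(3, ppt.Slides.Count)] selects the same slide as ppt.Slides[2].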


[Image: 24.png]


That covers how to extract text and pictures from a PPT document. The steps are simple and practical. I hope this helps, and thank you for reading!

(If you reprint this article, please credit the source and author.)

This article is from the "E-iceblue" blog; please keep this source link: http://eiceblue.blog.51cto.com/13438008/1978495
