In a document that combines text and images, we can extract the text or the images as needed. C# code can extract the text and pictures from Word and PDF files, and in the same way it can extract the text and pictures from PPT slides. This article describes how to use C# to extract text and pictures from a PPT document. First, install the Spire.Presentation component, then add a reference to its DLL file in the project. The main code steps follow.
Original document: [screenshot 21.png]
1. Extract text
Step one: Create a Presentation instance and load the document
Presentation presentation = new Presentation(@"C:\Users\Administrator\Desktop\sample.pptx", FileFormat.Pptx2010);
Step two: Create a StringBuilder object
StringBuilder sb = new StringBuilder();
Step three: Traverse the slides and the shapes on each slide to extract the text content
foreach (ISlide slide in presentation.Slides)
{
    foreach (IShape shape in slide.Shapes)
    {
        // Only shapes that are IAutoShape are read here
        if (shape is IAutoShape)
        {
            foreach (TextParagraph tp in (shape as IAutoShape).TextFrame.Paragraphs)
            {
                // Append each paragraph on its own line
                sb.Append(tp.Text + Environment.NewLine);
            }
        }
    }
}
Step four: Write the text to a TXT document
File.WriteAllText("Target.txt", sb.ToString());
// Open the resulting file
Process.Start("Target.txt");
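Put together, the four steps above might look like the console program below. This is a minimal sketch rather than the author's exact program: the `using Spire.Presentation;` namespace and the class name `ExtractText` are assumptions, the null check on `TextFrame` is a defensive addition, and the final `Process.Start` call is replaced by printing the output path.

```csharp
using System;
using System.IO;
using System.Text;
using Spire.Presentation; // assumed namespace exposed by the referenced DLL

class ExtractText // hypothetical class name
{
    static void Main()
    {
        // Step one: load the presentation (same path as in the steps above).
        Presentation presentation = new Presentation(
            @"C:\Users\Administrator\Desktop\sample.pptx", FileFormat.Pptx2010);

        // Step two: collect the extracted text here.
        StringBuilder sb = new StringBuilder();

        // Step three: visit every shape on every slide.
        foreach (ISlide slide in presentation.Slides)
        {
            foreach (IShape shape in slide.Shapes)
            {
                // As in the steps above, only IAutoShape shapes are read.
                IAutoShape autoShape = shape as IAutoShape;
                if (autoShape == null || autoShape.TextFrame == null)
                {
                    continue; // defensive check, not in the original steps
                }

                foreach (TextParagraph tp in autoShape.TextFrame.Paragraphs)
                {
                    sb.AppendLine(tp.Text);
                }
            }
        }

        // Step four: write the collected text to a TXT file.
        File.WriteAllText("Target.txt", sb.ToString());
        Console.WriteLine("Wrote " + Path.GetFullPath("Target.txt"));
    }
}
```

Because the loop only looks at `IAutoShape` shapes, text held in other shape types (tables, for example) would presumably need separate handling.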
2. Extract pictures
There are two cases for extracting pictures: extracting all the pictures in the entire document, or extracting only the pictures from a particular slide.
2.1 Extract all pictures
Step one: Initialize an instance of the Presentation class and load the document
Presentation ppt = new Presentation();
ppt.LoadFromFile(@"C:\Users\Administrator\Desktop\sample.pptx");
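Step two: Traverse the pictures in the document, then extract and save each one
for (int i = 0; i < ppt.Images.Count; i++)
{
    // Get the i-th embedded image of the document
    Image image = ppt.Images[i].Image;
    // Save it two levels above the working directory, in the project folder
    image.Save(string.Format(@"..\..\Images{0}.png", i));
}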
The extracted pictures are saved to the project folder.
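Two details of the save step may be worth guarding against. As far as the .NET documentation indicates, `Image.Save(string)` chooses the encoder from the image's own raw format rather than from the file extension, so a picture embedded as JPEG could end up as JPEG data in a `.png` file. Saving also fails if the target folder does not exist. The helper below is a sketch of one way to handle both. `ImageSaver`, `SaveAsPng`, and the `Image{0}.png` naming inside a folder are hypothetical names introduced for illustration and are not part of Spire.Presentation.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

// Hypothetical helper, not part of Spire.Presentation.
static class ImageSaver
{
    // Saves the image as a real PNG file named Image<index>.png inside
    // the given directory and returns the path that was written.
    public static string SaveAsPng(Image image, string directory, int index)
    {
        // Create the target folder if it is missing (no-op when it exists).
        Directory.CreateDirectory(directory);

        string path = Path.Combine(directory, string.Format("Image{0}.png", index));

        // Name the PNG encoder explicitly so the content matches the extension.
        image.Save(path, ImageFormat.Png);
        return path;
    }
}
```

Inside the loop above, the call would then read `ImageSaver.SaveAsPng(ppt.Images[i].Image, "Images", i);`, where `"Images"` is an assumed output folder.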
2.2 Extract pictures from a specific slide
Step one: Create an instance of the Presentation class and load the document
Presentation ppt = new Presentation();
ppt.LoadFromFile(@"C:\Users\Administrator\Desktop\sample.pptx");
Step two: Get the third slide, then extract and save its pictures
int i = 0;
// Slides are indexed from zero, so Slides[2] is the third slide
foreach (IShape s in ppt.Slides[2].Shapes)
{
    if (s is SlidePicture)
    {
        SlidePicture ps = s as SlidePicture;
        ps.PictureFill.Picture.EmbedImage.Image.Save(string.Format("{0}.png", i));
        i++;
    }
    if (s is PictureShape)
    {
        PictureShape ps = s as PictureShape;
        ps.EmbedImage.Image.Save(string.Format("{0}.png", i));
        i++;
    }
}
The pictures extracted from the third slide are saved to the specified location.
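The slide index in this step is hard-coded, so a document with fewer than three slides would fail at `Slides[2]`. The sketch below shows one way to make the index explicit and check it first. It assumes the `Spire.Presentation` namespace and that the slide collection exposes a `Count` property; the class name `ExtractSlidePictures` is hypothetical. For brevity it handles only the `SlidePicture` case from the code above.

```csharp
using System;
using Spire.Presentation; // assumed namespace exposed by the referenced DLL

class ExtractSlidePictures // hypothetical class name
{
    static void Main()
    {
        const int slideIndex = 2; // zero-based, so 2 means the third slide

        Presentation ppt = new Presentation();
        ppt.LoadFromFile(@"C:\Users\Administrator\Desktop\sample.pptx");

        // Guard against documents that have fewer slides than expected.
        if (slideIndex >= ppt.Slides.Count)
        {
            Console.WriteLine("The document has only " + ppt.Slides.Count + " slide(s).");
            return;
        }

        int saved = 0;
        foreach (IShape s in ppt.Slides[slideIndex].Shapes)
        {
            // Skip every shape that is not a picture.
            SlidePicture picture = s as SlidePicture;
            if (picture == null)
            {
                continue;
            }

            // Same save call as in the step above, numbered from 0.
            picture.PictureFill.Picture.EmbedImage.Image.Save(
                string.Format("{0}.png", saved));
            saved++;
        }

        Console.WriteLine("Saved " + saved + " picture(s).");
    }
}
```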
The above shows how to extract text and pictures from a PPT document. The steps are simple and practical. I hope this helps, and thank you for reading!
(If you reprint this article, please credit the source and author.)
This article is from the "E-iceblue" blog. Please keep this source link: http://eiceblue.blog.51cto.com/13438008/1978495
C# implementation for extracting PPT text and pictures