Extracting text and pictures from PPT slides in C#

Source: Internet
Author: User

When a document mixes text and pictures, we sometimes need to extract only the text or only the pictures. This can be done for Word and PDF files with C# code, and the same is possible for PPT slides. This article describes how to extract text and pictures from a PPT document in C#. First, install the Spire.Presentation component and add a reference to its DLL in your project. The main code steps follow.

1. Extracting text

Step one: Create a Presentation instance and load the document

    Presentation presentation = new Presentation(@"C:\Users\Administrator\Desktop\sample.pptx", FileFormat.Pptx2010);

Step two: Create a StringBuilder object

    StringBuilder sb = new StringBuilder();

Step three: Traverse the slides and the shapes on each slide to extract the text content

    foreach (ISlide slide in presentation.Slides)
    {
        foreach (IShape shape in slide.Shapes)
        {
            // Only shapes that implement IAutoShape are read for text
            if (shape is IAutoShape)
            {
                foreach (TextParagraph tp in (shape as IAutoShape).TextFrame.Paragraphs)
                {
                    sb.Append(tp.Text + Environment.NewLine);
                }
            }
        }
    }

Step four: Write the text to a TXT file and open it

    File.WriteAllText("target.txt", sb.ToString());
    Process.Start("target.txt");
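Putting the four steps together, a complete console program might look like the sketch below. It assumes the Spire.Presentation DLL is already referenced. The using directives, the names ExtractTextExample and ExtractText, the optional command-line argument, and the null check on TextFrame are additions for illustration and are not part of the original steps. The sketch prints the output path instead of calling Process.Start.

```csharp
using System;
using System.IO;
using System.Text;
using Spire.Presentation;

public static class ExtractTextExample
{
    // Collects the text of every paragraph found in IAutoShape shapes,
    // one paragraph per line. Shapes that are not IAutoShape are skipped,
    // exactly as in the steps above.
    public static string ExtractText(Presentation presentation)
    {
        StringBuilder sb = new StringBuilder();
        foreach (ISlide slide in presentation.Slides)
        {
            foreach (IShape shape in slide.Shapes)
            {
                IAutoShape autoShape = shape as IAutoShape;
                // Defensive null check (an assumption, not in the original steps)
                if (autoShape == null || autoShape.TextFrame == null)
                {
                    continue;
                }
                foreach (TextParagraph tp in autoShape.TextFrame.Paragraphs)
                {
                    sb.AppendLine(tp.Text);
                }
            }
        }
        return sb.ToString();
    }

    public static void Main(string[] args)
    {
        // Default path is the one used in the article; pass another path as the first argument.
        string input = args.Length > 0
            ? args[0]
            : @"C:\Users\Administrator\Desktop\sample.pptx";

        Presentation presentation = new Presentation(input, FileFormat.Pptx2010);
        File.WriteAllText("target.txt", ExtractText(presentation));
        Console.WriteLine("Text written to " + Path.GetFullPath("target.txt"));
    }
}
```

Keeping the loop in its own method makes it easy to reuse the same extraction for several files or to send the text somewhere other than a TXT file.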

2. Extracting pictures

There are two cases here: extracting all the pictures in the entire document, and extracting only the pictures on a particular slide.

2.1 Extracting all pictures

Step one: Create an instance of the Presentation class and load the document

    Presentation ppt = new Presentation();
    ppt.LoadFromFile(@"C:\Users\Administrator\Desktop\sample.pptx");

Step two: Traverse the pictures in the document, then extract and save each one

    for (int i = 0; i < ppt.Images.Count; i++)
    {
        Image image = ppt.Images[i].Image;
        image.Save(string.Format(@"..\..\images{0}.png", i));
    }

The extracted pictures are saved to the project folder.
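The two steps for extracting all pictures can be combined into one small program, sketched below under a few assumptions. It assumes the Spire.Presentation build in use exposes each picture as a System.Drawing.Image, as the article's code implies. The names ExtractAllImagesExample and ExtractAllImages, the output folder "images", and the explicit PNG format argument are hypothetical choices made for this example, not part of the original steps.

```csharp
using System;
using System.IO;
using Spire.Presentation;

public static class ExtractAllImagesExample
{
    // Saves every picture stored in the presentation to outputDir as PNG
    // and returns the number of files written.
    public static int ExtractAllImages(Presentation ppt, string outputDir)
    {
        Directory.CreateDirectory(outputDir); // no-op if the folder already exists

        for (int i = 0; i < ppt.Images.Count; i++)
        {
            // Fully qualified to avoid any clash with similarly named types
            System.Drawing.Image image = ppt.Images[i].Image;
            string path = Path.Combine(outputDir, string.Format("image{0}.png", i));

            // Passing the format explicitly re-encodes the picture as PNG
            image.Save(path, System.Drawing.Imaging.ImageFormat.Png);
        }
        return ppt.Images.Count;
    }

    public static void Main(string[] args)
    {
        // Default path is the one used in the article; pass another path as the first argument.
        string input = args.Length > 0
            ? args[0]
            : @"C:\Users\Administrator\Desktop\sample.pptx";

        Presentation ppt = new Presentation();
        ppt.LoadFromFile(input);

        int count = ExtractAllImages(ppt, "images");
        Console.WriteLine(count + " picture(s) saved to " + Path.GetFullPath("images"));
    }
}
```

The explicit format is a deliberate choice: when Image.Save is called with only a file name, System.Drawing writes the picture in its own raw format if an encoder for it exists, so a JPEG picture could end up as JPEG data under a .png name.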

2.2 Extracting pictures from a specific slide

Step one: Create an instance of the Presentation class and load the document

    Presentation ppt = new Presentation();
    ppt.LoadFromFile(@"C:\Users\Administrator\Desktop\sample.pptx");

Step two: Get the third slide (index 2), then extract and save its pictures

    int i = 0;
    foreach (IShape s in ppt.Slides[2].Shapes)
    {
        // Picture placed on the slide
        if (s is SlidePicture)
        {
            SlidePicture ps = s as SlidePicture;
            ps.PictureFill.Picture.EmbedImage.Image.Save(string.Format("{0}.png", i));
            i++;
        }
        // Picture shape
        if (s is PictureShape)
        {
            PictureShape ps = s as PictureShape;
            ps.EmbedImage.Image.Save(string.Format("{0}.png", i));
            i++;
        }
    }

The pictures extracted from the third slide are saved as 0.png, 1.png, and so on in the program's working directory.
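As a sketch of how the slide-specific case could be wrapped into a reusable method, the program below takes the slide index as a parameter and checks it against the slide count first. The names ExtractSlideImagesExample and ExtractSlideImages, the output folder "slide3", the range check, and the null checks are hypothetical additions. The using directive for Spire.Presentation.Drawing is an assumption about where the picture types live, and the type names SlidePicture and PictureShape are taken from the article's own code.

```csharp
using System;
using System.IO;
using Spire.Presentation;
using Spire.Presentation.Drawing; // assumed namespace for picture-related types

public static class ExtractSlideImagesExample
{
    // Saves the pictures found on one slide (zero-based index) to outputDir
    // and returns how many were written. Returns 0 if the index is out of range.
    public static int ExtractSlideImages(Presentation ppt, int slideIndex, string outputDir)
    {
        if (slideIndex < 0 || slideIndex >= ppt.Slides.Count)
        {
            return 0;
        }
        Directory.CreateDirectory(outputDir);

        int saved = 0;
        foreach (IShape s in ppt.Slides[slideIndex].Shapes)
        {
            System.Drawing.Image image = null;
            SlidePicture slidePicture = s as SlidePicture;
            PictureShape pictureShape = s as PictureShape;

            // Same two property paths as in the article, with defensive null checks
            if (slidePicture != null)
            {
                image = slidePicture.PictureFill.Picture.EmbedImage?.Image;
            }
            else if (pictureShape != null)
            {
                image = pictureShape.EmbedImage?.Image;
            }
            if (image == null)
            {
                continue; // not a picture, or no embedded image data
            }

            string path = Path.Combine(outputDir, string.Format("{0}.png", saved));
            image.Save(path, System.Drawing.Imaging.ImageFormat.Png);
            saved++;
        }
        return saved;
    }

    public static void Main(string[] args)
    {
        // Default path is the one used in the article; pass another path as the first argument.
        string input = args.Length > 0
            ? args[0]
            : @"C:\Users\Administrator\Desktop\sample.pptx";

        Presentation ppt = new Presentation();
        ppt.LoadFromFile(input);

        int count = ExtractSlideImages(ppt, 2, "slide3"); // index 2 is the third slide
        Console.WriteLine(count + " picture(s) saved from the third slide");
    }
}
```

Checking the index up front avoids an exception when the document has fewer than three slides, which the inline version does not guard against.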

That covers how to extract text and pictures from a PPT document. The steps are simple and practical, and I hope they help. Thank you for reading!

Please cite the source if you reprint this article.
