Summary of reading PDF text in .NET

Source: Internet
Author: User

Two class libraries are mainly used to read PDF text in .NET: PDFBox and iTextSharp.

Let's start with PDFBox. This class library is said to be very powerful; only a brief introduction is given here:

1. Download PDFBox

Download address: http://sourceforge.net/projects/pdfbox/

2. Reference Dynamic Link Library

Decompress the downloaded PDFBox package and find the bin directory. Add references to the following DLL files in the project:
IKVM.GNU.Classpath.dll
PDFBox-0.7.3.dll
FontBox-0.1.0-dev.dll
IKVM.Runtime.dll
After referencing these four files in the project, import the following two namespaces in the source file:
using org.pdfbox.pdmodel;
using org.pdfbox.util;

3. The following code shows how the API is used:

    using System.IO;
    using System.Text;
    using org.pdfbox.pdmodel;
    using org.pdfbox.util;

    public void pdf2txt(FileInfo file, FileInfo txtfile)
    {
        // Load the PDF and extract all of its text in one call.
        PDDocument doc = PDDocument.load(file.FullName);
        PDFTextStripper pdfStripper = new PDFTextStripper();
        string text = pdfStripper.getText(doc);
        doc.close();

        // Write the extracted text to the target file, encoded as GB2312.
        StreamWriter swPdfChange = new StreamWriter(txtfile.FullName, false, Encoding.GetEncoding("gb2312"));
        swPdfChange.Write(text);
        swPdfChange.Close();
    }

iTextSharp is mostly used to generate PDF files, but it is also reasonably good at reading them. It is used as follows:

1. Download iTextSharp

Download address: http://sourceforge.net/projects/itextsharp/

2. Reference Dynamic Link Library

Decompress the itextsharp-dll-core.zip file in the downloaded package to obtain itextsharp.dll, then add a reference to itextsharp.dll in the project.
Import the following three namespaces in the source file:
using iTextSharp;
using iTextSharp.text;
using iTextSharp.text.pdf;

3. The following code shows how the API is used:

    // Also requires: using System; using System.IO;
    private string OnCreated(string filepath)
    {
        try
        {
            string pdffilename = filepath;
            PdfReader pdfReader = new PdfReader(pdffilename);
            int numberOfPages = pdfReader.NumberOfPages;
            string text = string.Empty;
            for (int i = 1; i <= numberOfPages; ++i)
            {
                // GetPageContent returns the raw content bytes of page i.
                byte[] bufferOfPageContent = pdfReader.GetPageContent(i);
                text += System.Text.Encoding.UTF8.GetString(bufferOfPageContent);
            }
            pdfReader.Close();
            return text;
        }
        catch (Exception ex)
        {
            // Append the failing file and the exception to mylog.log in the application directory.
            StreamWriter wlog = File.AppendText(System.AppDomain.CurrentDomain.SetupInformation.ApplicationBase + "\\mylog.log");
            wlog.WriteLine("error file: " + filepath + " cause: " + ex.ToString());
            wlog.Flush();
            wlog.Close();
            return null;
        }
    }
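
One caveat about the code above: GetPageContent returns the page's raw content stream, so the decoded string usually contains PDF drawing operators mixed in with the text. The sketch below shows one possible alternative, assuming iTextSharp 5.x, where the iTextSharp.text.pdf.parser namespace provides PdfTextExtractor. The class name PdfTextSketch and the method name ExtractText are hypothetical and are not part of the library.

```csharp
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

public static class PdfTextSketch
{
    // Hypothetical helper: returns the text of every page, one page per line group.
    // Assumes iTextSharp 5.x, which ships PdfTextExtractor.
    public static string ExtractText(string filepath)
    {
        PdfReader reader = new PdfReader(filepath);
        try
        {
            StringBuilder text = new StringBuilder();
            // Page numbers in iTextSharp start at 1.
            for (int page = 1; page <= reader.NumberOfPages; ++page)
            {
                text.AppendLine(PdfTextExtractor.GetTextFromPage(reader, page));
            }
            return text.ToString();
        }
        finally
        {
            // Close the reader even if extraction throws.
            reader.Close();
        }
    }
}
```

It could be called as string text = PdfTextSketch.ExtractText(filepath); in place of the page loop. How well the text comes out still depends on how the PDF encodes its fonts.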