C#: Convert a web page to PDF

Source: Internet
Author: User
Recently, I was given the task of converting HTML to PDF. This is a useful feature, but unfortunately there is no ready-to-use solution on the Internet (taking into account open-source/free licensing, ease of use, and maintainability). Since there is no ready-made solution, I had to solve it myself.
Generating a PDF file from HTML can be done in two steps. The first step is to parse the HTML file, that is, to turn the text of the HTML source into the graphical result that the browser finally presents to us. This is still an unfinished task, because so far no software giant in the industry has done a good job of HTML parsing; comparing how IE, Firefox, and other browsers display the same page makes that clear. Since even the industry finds this hard, I will not take on the technical difficulties myself. I will skip this step for now and consider the next one.
Step 2: draw the PDF file. This part is simple, and there is plenty of information on the Internet. If you are interested, you can study the PDF file format and assemble the binary PDF file yourself. I am interested, but I don't have the time. I think software practitioners should always focus on what is most valuable, and the first way for them to improve efficiency is reuse. There is a free, open-source library called iTextSharp that is used to draw PDF files.
Download iTextSharp and try using it to draw the HTML to see the effect. As expected, what comes out is the HTML source code. Because the first step has not been solved, we have to go back and solve it after all.
I remember seeing a web page snapshot tool written in .NET a long time ago. The general idea is to use the DrawToBitmap method of WebBrowser to output what IE displays to a System.Drawing.Bitmap object. The code is roughly as follows:
// WebBrowser wb = null;
System.Drawing.Bitmap bmp = new System.Drawing.Bitmap(w, h);
wb.DrawToBitmap(bmp, new System.Drawing.Rectangle(0, 0, w, h));

OK, so the HTML can be parsed. Now, reorganizing the code from just now, the approach is:
Use WebBrowser to parse the HTML and convert it into an image, then use iTextSharp to draw the image into a PDF.
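The two-step idea can be sketched as follows. This is only a minimal sketch under several assumptions, not the author's tool: it assumes a Windows Forms environment and iTextSharp 5.x, the names PageSnapshot, Capture, SaveAsPdf, and Convert are hypothetical, and the fixed 1024-pixel width and the "1 pixel = 1 point" page sizing are arbitrary choices. WebBrowser needs an STA thread, and DrawToBitmap on WebBrowser is not officially supported by the control, so results may vary from page to page.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Threading;
using System.Windows.Forms;

public static class PageSnapshot
{
    // Step 1: let the WebBrowser control (IE engine) render the page into a bitmap.
    static Bitmap Capture(string url, int width)
    {
        using (var wb = new WebBrowser())
        {
            wb.ScrollBarsEnabled = false;
            wb.ScriptErrorsSuppressed = true;
            wb.Width = width;
            wb.Height = 768; // initial height; enlarged once the page has loaded
            wb.Navigate(url);

            // Pump messages until the document has finished loading.
            while (wb.ReadyState != WebBrowserReadyState.Complete)
            {
                Application.DoEvents();
                Thread.Sleep(10);
            }

            if (wb.Document == null || wb.Document.Body == null)
                throw new InvalidOperationException("Page has no body to render.");

            // Resize the control to the full page height so nothing is cut off.
            int height = wb.Document.Body.ScrollRectangle.Height;
            wb.Height = height;

            var bmp = new Bitmap(width, height);
            wb.DrawToBitmap(bmp, new Rectangle(0, 0, width, height));
            return bmp;
        }
    }

    // Step 2: place the bitmap on a single PDF page of the same size (no margins).
    static void SaveAsPdf(Bitmap bmp, string pdfPath)
    {
        var pageSize = new iTextSharp.text.Rectangle(bmp.Width, bmp.Height);
        var doc = new iTextSharp.text.Document(pageSize, 0, 0, 0, 0);
        using (var fs = new FileStream(pdfPath, FileMode.Create))
        {
            iTextSharp.text.pdf.PdfWriter.GetInstance(doc, fs);
            doc.Open();
            var img = iTextSharp.text.Image.GetInstance(bmp, ImageFormat.Png);
            img.ScaleAbsolute(bmp.Width, bmp.Height);
            img.SetAbsolutePosition(0, 0);
            doc.Add(img);
            doc.Close();
        }
    }

    // Runs both steps on a dedicated STA thread, as WebBrowser requires.
    public static void Convert(string url, string pdfPath)
    {
        var worker = new Thread(() =>
        {
            using (Bitmap bmp = Capture(url, 1024))
                SaveAsPdf(bmp, pdfPath);
        });
        worker.SetApartmentState(ApartmentState.STA);
        worker.Start();
        worker.Join();
    }
}
```

A call such as PageSnapshot.Convert("http://www.g.cn/", "google.pdf") would then produce a one-page PDF containing the page image. Because the page is stored as a picture, the text in the resulting PDF cannot be selected or searched.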
This is a feature developed for my company, so it is inconvenient to publish the source code for the moment. A compiled tool is provided for download and use. You can also build your own based on the ideas above:
Usage:
1. Convert a single URL to PDF: pageto0000.exe "http://www.g.cn/" google.jpg
2. Convert multiple URLs to PDF: pagetow..exe task.txt "C:\20.dir\"
Task.txt is the task list. It contains one URL per line, and each URL is suffixed with # followed by the output file name, for example, http://www.baidu.com/? B = http://www.baidu.com/ B (the system appends the file extension itself).
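To make the task-file format concrete, here is a small sketch of how one line of such a file might be split into a URL and an output file name. It assumes the last # on the line is the separator, which is an interpretation of the description above rather than the tool's documented behavior; the names TaskLine and TryParse are hypothetical.

```csharp
using System;

public static class TaskLine
{
    // Splits "URL#fileName" at the last '#'. Returns false for blank lines
    // or lines that lack either the URL part or the file name part.
    public static bool TryParse(string line, out string url, out string fileName)
    {
        url = null;
        fileName = null;
        if (string.IsNullOrWhiteSpace(line))
            return false;

        line = line.Trim();
        int hash = line.LastIndexOf('#');
        if (hash <= 0 || hash == line.Length - 1)
            return false;

        url = line.Substring(0, hash).Trim();
        fileName = line.Substring(hash + 1).Trim(); // no extension; the tool appends it
        return true;
    }
}
```

For an illustrative line such as http://www.baidu.com/#b, this yields the URL http://www.baidu.com/ and the file name b.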
Use in an ASP.NET environment
Upload pagetopdf to the website and set the directory permissions. Sample code:


// Requires: using System; using System.Diagnostics;
public static bool Createppdf(string url, string path)
{
    try
    {
        if (string.IsNullOrEmpty(url) || string.IsNullOrEmpty(path))
            return false;
        Process p = new Process();
        // Map the virtual path of the converter to a physical path on the server.
        string str = System.Web.HttpContext.Current.Server.MapPath("~/Afafasf/pageto0000.exe");
        if (!System.IO.File.Exists(str))
            return false;
        p.StartInfo.FileName = str;
        // The URL is quoted; the output path is passed as-is.
        p.StartInfo.Arguments = "\"" + url + "\" " + path;
        p.StartInfo.UseShellExecute = false;
        p.StartInfo.RedirectStandardInput = true;
        p.StartInfo.RedirectStandardOutput = true;
        p.StartInfo.RedirectStandardError = true;
        p.StartInfo.CreateNoWindow = true;
        p.Start();
        // Returns after a short pause; the conversion may still be running.
        System.Threading.Thread.Sleep(500);
        return true;
    }
    catch (Exception ex)
    {
        // SYS.log is the project's own logging helper (not shown here).
        SYS.log.Error("PDF create err.", ex);
    }
    return false;
}
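The sample above passes the output path unquoted and returns after a fixed 500 ms pause, so a path containing spaces, or a slow page, could cause trouble. The following is one possible variation, offered as a sketch only: the names ConverterRunner, Quote, BuildArguments, and Run are hypothetical, the quoting assumes the tool parses its command line with the usual Windows rules, and treating exit code 0 as success is an assumption about the tool that the article does not confirm.

```csharp
using System;
using System.Diagnostics;

public static class ConverterRunner
{
    // Wraps a value in double quotes. Trailing backslashes are doubled so that
    // the closing quote is not read as an escaped quote (e.g. "C:\out dir\").
    public static string Quote(string value)
    {
        if (value.IndexOf('"') >= 0)
            throw new ArgumentException("Embedded quotes are not handled in this sketch.", "value");

        int trailing = 0;
        for (int i = value.Length - 1; i >= 0 && value[i] == '\\'; i--)
            trailing++;

        return "\"" + value + new string('\\', trailing) + "\"";
    }

    // Builds the argument string: quoted URL, a space, quoted output path.
    public static string BuildArguments(string url, string outputPath)
    {
        return Quote(url) + " " + Quote(outputPath);
    }

    // Starts the converter and waits up to timeoutMs for it to finish.
    public static bool Run(string exePath, string url, string outputPath, int timeoutMs)
    {
        var psi = new ProcessStartInfo(exePath, BuildArguments(url, outputPath));
        psi.UseShellExecute = false;
        psi.CreateNoWindow = true;

        using (Process p = Process.Start(psi))
        {
            if (p == null)
                return false;

            if (!p.WaitForExit(timeoutMs))
            {
                // Timed out: stop the converter so it does not linger on the server.
                try { p.Kill(); } catch (InvalidOperationException) { }
                return false;
            }
            return p.ExitCode == 0; // assumption: 0 means the PDF was written
        }
    }
}
```

Waiting for the process to exit ties up the calling request thread for the duration of the conversion, so a timeout in the range of a few seconds to a minute is a trade-off to be tuned for the site.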

Feature
Worker processes are started by the system scheduler to speed up the processing of a task. The number of processes is controlled by the scheduler and cannot exceed 10.
pagetopdf
