How long Weibo images are generated (converting HTML into images)

Source: Internet
Author: User
Tags imagemagick
My daily work sometimes requires turning a piece of content into an image. Simple content can be handled in Photoshop, but content such as tables takes too long to redo in Photoshop every time. There are many long Weibo generators on the internet, and they are fine for simple images. Generating images from rich text usually costs money, however, so I looked into a PHP-based implementation.

Requirements and principles

Using PHP, generate images (PNG, JPEG, etc.) from HTML content.

Implementation method

1. Generate directly through graphics functions

You can use PHP's bundled GD library or imagick to draw text content directly onto an image. This works well for plain text, but rich text is very hard to handle this way. One open-source project of this kind, painty, supports only simple HTML tags such as p and img.

2. HTML -> PDF -> PNG

With this method, the HTML content is first rendered to a PDF document, and the PDF document is then converted into an image.

HTML to PDF: mature solutions include tcpdf and HTML2PDF. HTML2PDF is in fact built on the tcpdf core.

PDF to PNG: this can be done with the imagick PHP extension.

Currently, the open-source code based on this method is html to image, as shown in the figure.

The core code is:

// Fetch the content of a URL
$html_content = file_get_contents('http://');
// Convert the content to a PDF file; the same file name is read back below
$html2pdf = new HTML2PDF('P', 'A4');
$html2pdf->writeHTML($html_content);
$file = $html2pdf->Output('temp.pdf', 'F');
// Convert the PDF file to an image
$im = new Imagick('temp.pdf');
$im->setImageFormat('jpg');
$img_name = time() . '.jpg';
$im->setSize(800, 600);
$im->writeImage($img_name);
$im->clear();
$im->destroy();

This code uses HTML2PDF, but I personally suggest using tcpdf directly, since it is updated more often and is more capable. In my tests, tcpdf handled Chinese text and HTML formatting better. HTML2PDF output looked worse by comparison, and it made basic mistakes such as failing to wrap long runs of Chinese characters.

This method also has a major defect. When an image or other media item does not fit in the space left on a page, it is pushed to the next page, which leaves a large blank area in the generated image. Any page that is not completely filled leaves a similar blank area, and the result looks unattractive.

Therefore, this method is not recommended.

3. Render through a browser engine

This method works much like a browser: it renders the content of a URL directly. Compared with the previous two methods it has three advantages. First, rich-text HTML is easier to render, because the HTML code can be used as it is. Second, the layout is more sensible, with none of the blank areas produced by PDF pagination. Third, Chinese text is better supported.

Currently, major open-source projects include:

khtml2png: runs on Linux and converts HTML into an image format. It requires:

g++
KDE 3.x
kdelibs for KDE 3.x (kdelibs4-dev)
zlib (zlib1g-dev)
cmake

For servers, especially a VPS with tight resources, installing KDE is rather expensive.

CutyCapt and its sibling IECapt: CutyCapt runs on Linux and Windows, and IECapt runs on Windows. They support output formats including svg, ps, pdf, itext, html, rtree, png, jpeg, mng, tiff, gif, bmp, ppm, xbm, and xpm, and they are easy to use: just run the commands below.

Note: the name of the CutyCapt executable is case-sensitive on Linux, though not on Windows, so type it exactly as it is installed.

./CutyCapt --url= --out=example.png
IECapt --url=# --out=localfile.png

Its deployment requirements are:

CutyCapt depends on Qt 4.4.0+.

Even so, it is better than khtml2png because it does not need a full X server installation. The lightweight Xvfb is enough, used like this:

xvfb-run --server-args="-screen 0, 1024x768x24" ./CutyCapt --url=... --out=...

By comparing various implementation methods, I prefer the CutyCapt method.

Implementation process

1. Embed a rich text editor to provide rich text editing, along with customization of author information, copyright tags, and image size and format.

2. Filter the submitted content, generate an htm/html document, and style the generated document with CSS.

3. Run the CutyCapt command from PHP to generate an image of the web page.

At this point the HTML-to-image function is fully implemented, but the image generated by CutyCapt is relatively large, so it can be optimized further.

4. Use imagick to optimize the generated image

Imagick has powerful image processing functions. It can optimize the quality and size of the images generated by CutyCapt, and it makes watermarking and similar operations easy.
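Steps 3 and 4 can be sketched as a small shell script that PHP could call through exec(). This is only a sketch under stated assumptions: the function names, the DRY_RUN switch, the CUTYCAPT variable, and the default width of 800 pixels and quality of 85 are illustrative and do not come from the original project. It also uses ImageMagick's convert command in place of the imagick PHP extension, and assumes xvfb-run, CutyCapt, and convert are on the PATH.

```shell
#!/bin/sh
# The binary name is configurable because it differs between installs (assumption).
CUTYCAPT=${CUTYCAPT:-CutyCapt}

# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Step 3: render the page at URL $1 into image file $2 under a virtual X server.
capture_page() {
    run xvfb-run --server-args="-screen 0, 1024x768x24" \
        "$CUTYCAPT" --url="$1" --out="$2"
}

# Step 4: strip metadata from $1, shrink it to at most $3 pixels wide
# (default 800), and save it as $2 with quality $4 (default 85).
optimize_image() {
    run convert "$1" -strip -resize "${3:-800}x>" -quality "${4:-85}" "$2"
}
```

For example, `capture_page "http://example.com/" page.png && optimize_image page.png page.jpg` would capture a page and write a JPEG no wider than 800 pixels; the URL here is a placeholder. Setting DRY_RUN=1 prints the commands instead of running them, which is handy when checking what PHP would execute.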

Development Experience Sharing

I ran into various problems during development. Here are some of them.

1. Operating System Selection

CutyCapt and imagick both have Linux and Windows versions. Developing and running on Windows poses no major problems: just follow the normal steps to install and configure them.

For more information about how to install CutyCapt on Linux, see

Install CutyCapt on CentOS:

(1) Install qt47

Add the repository that provides qt47:

vim /etc/yum.repos.d/atrpms.repo

Add the following content:

[atrpms]
name=CentOS $releasever - $basearch - ATrpms
baseurl=...
# ATrpms testing repository
name=CentOS $releasever - $basearch - ATrpms testing
baseurl=...

Then update and install the packages:

yum update
yum install qt47
yum install qt47-devel
yum install qt47-webkit
yum install qt47-webkit-devel

(2) Install CutyCapt

yum install svn
svn co cutycapt/CutyCapt /usr/local/cutycapt
cd /usr/local/cutycapt
qmake
qmake-qt47
make

(3) Install Xvfb

yum install Xvfb

(4) Test CutyCapt

xvfb-run --server-args="-screen 0, 1024x768x24" CutyCapt --url= --out=php.png

(5) Run Xvfb in the background

Xvfb -fp /usr/share/fonts :0 -screen 0 1024x768x24 &
DISPLAY=:0 ./CutyCapt --url= --out=php.png

Install CutyCapt on Ubuntu:

(1) Install the two packages

apt-get install cutycapt
apt-get install xvfb

(2) Test

The Ubuntu package installs the executable under the lowercase name cutycapt:

xvfb-run --server-args="-screen 0, 1024x768x24" cutycapt --url= --out=php.png

Garbled Chinese characters:

Copy the Chinese fonts from Windows into the /usr/share/fonts directory and run the fc-cache command.
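The font step can be sketched as a shell function. This is a minimal sketch, not the author's procedure: the function name install_fonts and the choice of file extensions are assumptions, and the destination defaults to the /usr/share/fonts directory mentioned above.

```shell
#!/bin/sh
# Copy font files from directory $1 into font directory $2 (default
# /usr/share/fonts), refresh the fontconfig cache, and print how many
# files were copied.
install_fonts() {
    src=$1
    dest=${2:-/usr/share/fonts}
    count=0
    mkdir -p "$dest" || return 1
    # Match .ttf, .ttc and .otf in either letter case.
    for f in "$src"/*.[tT][tT][fFcC] "$src"/*.[oO][tT][fF]; do
        [ -f "$f" ] || continue        # skip patterns that matched nothing
        cp "$f" "$dest"/ || return 1
        count=$((count + 1))
    done
    # fc-cache rebuilds the font cache; skip it quietly if it is not installed.
    if command -v fc-cache >/dev/null 2>&1; then
        fc-cache -f "$dest" >/dev/null 2>&1 || true
    fi
    echo "$count"
}
```

Run as root, `install_fonts /path/to/windows-fonts` (a placeholder path) copies the fonts into place. Afterwards `fc-list :lang=zh` should list the fonts that fontconfig considers usable for Chinese text.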

I strongly recommend installing on Ubuntu. CentOS produces all sorts of problems, such as "CutyCapt: cannot connect to X server :99", which are very frustrating. I even went as far as installing a fresh operating system with the GNOME and KDE desktop environments. On Ubuntu there were almost no problems.

2. Web server selection

Because PHP has to execute the CutyCapt command on the operating system, you can use the system() or exec() function.

I tried both the Apache and Nginx web servers. Under Nginx, PHP scripts that call CutyCapt may fail to run, apparently because of troublesome permission issues. Under Apache everything went smoothly and the problem did not occur.

I therefore suggest the combination Ubuntu + Apache rather than CentOS + Nginx, which has too many problems to solve and may also introduce security risks.

The installation commands are as follows:

apt-get install apache2
apt-get install php5 libapache2-mod-php5

3. Install imagick on Ubuntu

Install the supporting environment:

apt-get install php5-dev php5-cli php-pear

Install ImageMagick. The packaged version may not be the latest; if you need the latest version, build it from source:

apt-get install imagemagick
wget ...
tar xzvf ImageMagick-6.8.7-0.tar.gz
cd ...
./configure && make install

Install the imagick extension and restart Apache:

apt-get install graphicsmagick-libmagick-dev-compat
pecl install imagick
echo extension=imagick.so > /etc/php5/conf.d/imagick.ini
service apache2 restart

Common errors:

Running pecl install imagick may display the following error message:

checking if ImageMagick version is at least 6.2.4... configure: error: no. You need at least Imagemagick version 6.2.4 to use Imagick.
ERROR: '/tmp/pear/temp/imagick/configure --with-imagick=hjw' failed

This means that ImageMagick is not installed or that the installed version is too old. Install the latest ImageMagick version from source.
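Before running pecl install imagick, a quick check of the installed ImageMagick version can save a failed build. The sketch below is an illustration only: the function names are hypothetical, it reads the version from convert -version, and it relies on the -V (version sort) option of GNU sort. The PECL build looks for ImageMagick's development files rather than the convert binary, so passing this check does not guarantee that the build succeeds.

```shell
#!/bin/sh
# Extract "6.8.7" from a line such as "Version: ImageMagick 6.8.7-0 ...".
parse_im_version() {
    sed -n 's/^Version: ImageMagick \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Report whether the installed ImageMagick meets the minimum version
# ($1, default 6.2.4, the version named in the error message).
check_imagemagick() {
    min=${1:-6.2.4}
    installed=$(convert -version 2>/dev/null | parse_im_version)
    if [ -z "$installed" ]; then
        echo "ImageMagick not found"
        return 1
    fi
    if version_ge "$installed" "$min"; then
        echo "ImageMagick $installed is new enough"
    else
        echo "ImageMagick $installed is older than $min"
        return 1
    fi
}
```

Calling `check_imagemagick` with no argument compares against 6.2.4; a different minimum can be passed as the first argument.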

4. Font rendering on Linux

You can copy the common Chinese fonts from Windows to the Ubuntu system. This avoids poorly rendered text and also lets the fonts offered in the rich text editor render correctly.
