Using PhantomJS to implement a web page capture service

Source: Internet
Author: User

2015-12-12 Source: Java Tutorials

This was a small requirement I ran into in the first half of the year: capture a web page and save it as an image. I looked at many tools, but none were ideal. Either the rendering was too poor (Canvas, Html2Image, Cobra) or the performance was bad (such as SWT's Browser). After finding that a headless browser can meet this requirement, I took a brief look at PhantomJS and CutyCapt. Both are built on the WebKit engine, and PhantomJS is the more convenient of the two, especially on Windows. On Linux, from version 2.0 onward you have to compile it on the machine yourself. The build took about 3 hours for me; g++ is much slower than VC on the same project, but it is a free, open-source compiler after all. The following is a page capture implementation built on PhantomJS and Java code:

First, environment preparation

1. PhantomJS location: d:/xxx/phantomjs-2.0.0-windows/bin/phantomjs

2. Script: d:/xxx/phantomjs-2.0.0-windows/bin/rasterize.js

The script is available on the official website, but I need to explain how it handles width and height:

page.viewportSize = { width: 600, height: 600 };

This is the default size, 600x600. I suggest setting a smaller height; on my side I use width: 800, height: 200. The reason is that when width and height are not both specified, and the real page is larger than the configured value, the image automatically expands until the entire page is shown. If the default is too large and you capture a small page, the image ends up with a lot of empty space. If you do specify both width and height, the following code is executed and only that part of the page is captured:

page.clipRect = { top: 0, left: 0, width: pageWidth, height: pageHeight };
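The sizing rule described above can be summarized in a small Java model. This is only a sketch of the behavior as stated in this article, not PhantomJS code; the class CaptureSize, its method of, and the assumption that the real page size is known in advance are all hypothetical and exist purely for illustration.

```java
/** Hypothetical model of the sizing rule described above; not part of PhantomJS. */
final class CaptureSize {
    final int width;
    final int height;

    CaptureSize(int width, int height) {
        this.width = width;
        this.height = height;
    }

    /**
     * @param viewportWidth  width given to page.viewportSize
     * @param viewportHeight height given to page.viewportSize
     * @param pageWidth      real width of the rendered page (assumed known here)
     * @param pageHeight     real height of the rendered page (assumed known here)
     * @param clipped        true when page.clipRect is also set to the viewport size
     */
    static CaptureSize of(int viewportWidth, int viewportHeight,
                          int pageWidth, int pageHeight, boolean clipped) {
        if (clipped) {
            // With a clip rectangle, only that region is captured.
            return new CaptureSize(viewportWidth, viewportHeight);
        }
        // Without one, the image grows until the whole page fits,
        // but it never shrinks below the viewport.
        return new CaptureSize(Math.max(viewportWidth, pageWidth),
                               Math.max(viewportHeight, pageHeight));
    }
}
```

Under this model, a short page keeps the viewport height, which is why a small default height avoids empty space at the bottom of the image.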

3. First, test it from the command line:

d:/xxx/phantomjs-2.0.0-windows/bin/phantomjs d:/xxx/phantomjs-2.0.0-windows/bin/rasterize.js http://www.QQ.com D:/test.png

If everything is configured correctly, you should see the resulting image. You can also pass a size argument by appending "1000px" or "1000px*400px" to the command above; both forms work.
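To make the two size formats concrete, here is a minimal Java sketch of how such an argument could be interpreted. It is an illustration, not the script's actual code: the class SizeArg and its methods are hypothetical, and the 9/16 ratio used for the width-only form is taken from the comment in the author's Java class later in this article, so other copies of rasterize.js may use a different ratio.

```java
/** Hypothetical helper showing how a size argument such as "1000px*400px" could be read. */
final class SizeArg {
    final int width;
    final int height;
    final boolean clip; // true when both width and height were given

    private SizeArg(int width, int height, boolean clip) {
        this.width = width;
        this.height = height;
        this.clip = clip;
    }

    static SizeArg parse(String arg) {
        // The -1 limit keeps trailing empty parts, so "1000px*" is rejected below.
        String[] parts = arg.split("\\*", -1);
        if (parts.length == 2) {
            // Width and height: the capture is clipped to exactly this size.
            return new SizeArg(px(parts[0]), px(parts[1]), true);
        }
        if (parts.length == 1) {
            int w = px(parts[0]);
            // Assumption: minimum height = width * 9 / 16; the height is not clipped.
            return new SizeArg(w, w * 9 / 16, false);
        }
        throw new IllegalArgumentException("unexpected size argument: " + arg);
    }

    /** Parses a value such as "800px" into 800. */
    private static int px(String s) {
        String t = s.trim();
        if (!t.endsWith("px")) {
            throw new IllegalArgumentException("expected a value ending in px: " + s);
        }
        return Integer.parseInt(t.substring(0, t.length() - 2));
    }
}
```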

Second, the server code

Since this is a web service, this part of the code runs on the server. You do not have to copy all of it; use whatever fits your own needs:

package lekkoli.test;

import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import org.apache.log4j.Logger;

/**
 * Page-to-image processing class, using an external command
 * @author lekkoli
 */
public class PhantomTools {

    private static final Logger _logger = Logger.getLogger(PhantomTools.class);

    // Settings under Linux:
    // private static final String _tempPath = "/data/temp/phantom_";
    // private static final String _shellCommand = "/usr/local/xxx/phantomjs /usr/local/xxx/rasterize.js ";
    private static final String _tempPath = "D:/data/temp/phantom_";
    private static final String _shellCommand = "d:/xxx/phantomjs-2.0.0-windows/bin/phantomjs d:/xxx/phantomjs-2.0.0-windows/bin/rasterize.js ";

    private String _file;
    private String _size;

    /**
     * Constructor
     * @param hash value that makes the temporary file name unique
     */
    public PhantomTools(int hash) {
        _file = _tempPath + hash + ".png";
    }

    /**
     * Constructor
     * @param hash value that makes the temporary file name unique
     * @param size image size, such as 800px*600px (the height is cropped),
     *             or 800px (minimum height = width * 9/16, the height is not cropped)
     */
    public PhantomTools(int hash, String size) {
        this(hash);
        if (size != null)
            _size = " " + size;
    }

    /**
     * Convert the destination page to an image byte stream
     * @param url destination page address
     * @return image bytes; empty if the command failed, null if no file was produced
     */
    public byte[] getByteImg(String url) throws IOException {
        BufferedInputStream in = null;
        ByteArrayOutputStream out = null;
        File file = null;
        byte[] ret = null;
        try {
            if (exeCmd(_shellCommand + url + " " + _file + (_size != null ? _size : ""))) {
                file = new File(_file);
                if (file.exists()) {
                    out = new ByteArrayOutputStream();
                    byte[] b = new byte[5120];
                    in = new BufferedInputStream(new FileInputStream(file));
                    int n;
                    while ((n = in.read(b, 0, 5120)) != -1) {
                        out.write(b, 0, n);
                    }
                    ret = out.toByteArray();
                }
            } else {
                ret = new byte[] {};
            }
        } finally {
            try {
                if (out != null) {
                    out.close();
                }
            } catch (IOException e) {
                _logger.error(e);
            }
            try {
                if (in != null) {
                    in.close();
                }
            } catch (IOException e) {
                _logger.error(e);
            }
            // Remove the temporary file once the stream on it is closed
            if (file != null && file.exists()) {
                file.delete();
            }
        }
        return ret;
    }

    /**
     * Execute an external command
     * @return false if the process exits with code 1 or cannot be run
     */
    private static boolean exeCmd(String commandStr) {
        try {
            Process p = Runtime.getRuntime().exec(commandStr);
            if (p.waitFor() != 0 && p.exitValue() == 1) {
                return false;
            }
        } catch (Exception e) {
            _logger.error(e);
            // The command could not be started or was interrupted
            return false;
        }
        return true;
    }
}
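The command-execution method above hands one concatenated string to Runtime.exec, which splits it on whitespace and then waits without any time limit. One possible variation, sketched below, passes the arguments as a list through ProcessBuilder and adds a timeout. PhantomRunner, buildCommand, run, and the timeout parameter are hypothetical names for illustration and are not part of the original class; Process.waitFor with a timeout needs Java 8 or later.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical alternative to the string-based command call; not the author's code. */
final class PhantomRunner {

    /** Builds the argument list; size may be null when no size argument is wanted. */
    static List<String> buildCommand(String phantomjs, String script,
                                     String url, String outFile, String size) {
        List<String> cmd = new ArrayList<>();
        cmd.add(phantomjs);
        cmd.add(script);
        cmd.add(url);     // passed as a single argument, even if it contains spaces
        cmd.add(outFile);
        if (size != null) {
            cmd.add(size);
        }
        return cmd;
    }

    /** Runs the command and reports whether it exited with code 0 within the timeout. */
    static boolean run(List<String> cmd, long timeoutSeconds)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        // Send the child's output to this process's console so the child
        // cannot block on a full, unread pipe.
        pb.inheritIO();
        Process p = pb.start();
        if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            // The page took too long to render; stop the process.
            p.destroyForcibly();
            return false;
        }
        return p.exitValue() == 0;
    }
}
```

Note that this sketch treats any non-zero exit code as a failure, which is stricter than the original check for exit code 1; whether that is appropriate depends on how the script reports errors.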

With the PhantomTools class above, you can simply call the getByteImg method to generate the image and retrieve its contents.
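Because the class above can hand back an empty or missing byte array when the capture fails, a caller may want to check the result before serving it. The sketch below tests for the standard 8-byte PNG file signature, since the temporary file is written with a .png extension. PngCheck and looksLikePng are hypothetical names, and this check is an optional addition rather than something the original code does.

```java
/** Hypothetical helper for validating the bytes returned by the capture class. */
final class PngCheck {

    // The fixed 8-byte signature that starts every PNG file.
    private static final byte[] SIGNATURE = {
        (byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A
    };

    /** Returns true if the data is non-null and begins with the PNG signature. */
    static boolean looksLikePng(byte[] data) {
        if (data == null || data.length < SIGNATURE.length) {
            return false;
        }
        for (int i = 0; i < SIGNATURE.length; i++) {
            if (data[i] != SIGNATURE[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller could, for example, return an error response when looksLikePng reports false instead of sending an empty image to the client.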

Attached is my configured script, rasterize.js. As for PhantomJS itself, download it from the official website.

When reprinting, please cite the original source: http://www.cnblogs.com/lekko/p/4796062.html
