This is a small requirement I ran into in the first half of the year: capture a web page and save it as an image. I studied many tools, and none gave ideal results: either the rendering was too poor (Canvas, Html2Image, Cobra) or the performance was poor (such as SWT's Browser). After finding that headless browsers can meet this requirement, I took a rough look at PhantomJS and CutyCapt. Both use the WebKit engine, and PhantomJS is the more convenient of the two, especially on Windows. On Linux, from version 2.0 onward you have to compile it on the machine yourself (about 3 hours; I have to say g++ is really slow, since the same project compiles quickly under VC, but never mind, it is a free open-source compiler after all). The following is Java code that implements web page capture with PhantomJS:
I. Environment preparation
1. PhantomJS executable: D:/xxx/phantomjs-2.0.0-windows/bin/phantomjs
2. Script: D:/xxx/phantomjs-2.0.0-windows/bin/rasterize.js
The script is available on the official website, but I need to explain how it handles width and height:
page.viewportSize = { width: 600, height: 600 };
This is the default size, 600x600. I suggest setting a smaller height; mine is width: 800, height: 200. The reason is that when width and height are not both specified, and the real page is taller than the configured value, the image automatically grows until the whole page is shown (if you want to capture a small image, a default that is too large leaves a lot of blank space in it). If you specify both width and height, the following line is executed and only that part of the page is captured:
page.clipRect = { top: 0, left: 0, width: pageWidth, height: pageHeight };
3. First, test from the command line:
D:/xxx/phantomjs-2.0.0-windows/bin/phantomjs D:/xxx/phantomjs-2.0.0-windows/bin/rasterize.js http://www.qq.com D:/test.png
If everything is configured correctly, you should see the generated image. You can also pass a size argument by appending "1000px" or "1000px*400px" to the command above; both work.
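To make the size rules concrete, here is a rough Java sketch of how a size argument such as "1000px" or "1000px*400px" could be interpreted. It only mirrors the viewportSize / clipRect behaviour described above; the real parsing happens inside rasterize.js. The SizeArg class, its fields and the default-size parameters are hypothetical names introduced for illustration, and using the default height when only a width is given is an assumption.

```java
/** Hypothetical holder for the result of interpreting a size argument. */
final class SizeArg {
    final int width;        // viewport width in pixels
    final int height;       // viewport height in pixels
    final boolean clipped;  // true only when both width and height were given

    private SizeArg(int width, int height, boolean clipped) {
        this.width = width;
        this.height = height;
        this.clipped = clipped;
    }

    /**
     * Interprets "WIDTHpx" or "WIDTHpx*HEIGHTpx".
     * With no argument, the default viewport is used and nothing is clipped.
     */
    static SizeArg parse(String arg, int defaultWidth, int defaultHeight) {
        if (arg == null || arg.trim().isEmpty()) {
            return new SizeArg(defaultWidth, defaultHeight, false);
        }
        String[] parts = arg.trim().split("\\*");
        if (parts.length > 2) {
            throw new IllegalArgumentException("Unexpected size: " + arg);
        }
        int w = parsePx(parts[0]);
        if (parts.length == 2) {
            // Both dimensions given: the capture is clipped to exactly this box.
            return new SizeArg(w, parsePx(parts[1]), true);
        }
        // Width only (assumption): start from the default height and let the
        // capture grow to the full page height.
        return new SizeArg(w, defaultHeight, false);
    }

    /** Strips an optional "px" suffix and parses the number. */
    private static int parsePx(String s) {
        String t = s.trim();
        if (t.endsWith("px")) {
            t = t.substring(0, t.length() - 2);
        }
        return Integer.parseInt(t);
    }
}
```

With the 800x200 default mentioned above, SizeArg.parse("1000px*400px", 800, 200) describes a clipped 1000x400 capture, while SizeArg.parse("1000px", 800, 200) describes an unclipped capture that is 1000 wide and at least 200 high.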
II. Server code
Since this runs as a web service, this part of the code belongs on the server. Of course, you do not have to copy all of it; take what you need:
package lekkoli.test;

import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import org.apache.log4j.Logger;

/**
 * Converts a web page to an image by calling an external command.
 * @author Lekkoli
 */
public class PhantomTools {

    private static final Logger _logger = Logger.getLogger(PhantomTools.class);

    // private static final String _tempPath = "/data/temp/phantom_";
    // private static final String _shellCommand = "/usr/local/xxx/phantomjs /usr/local/xxx/rasterize.js "; // command under Linux
    private static final String _tempPath = "D:/data/temp/phantom_";
    private static final String _shellCommand = "D:/xxx/phantomjs-2.0.0-windows/bin/phantomjs D:/xxx/phantomjs-2.0.0-windows/bin/rasterize.js ";

    private String _file;
    private String _size;

    /**
     * Constructor.
     * @param hash used to make the temporary file name unique
     */
    public PhantomTools(int hash) {
        _file = _tempPath + hash + ".png";
    }

    /**
     * Constructor.
     * @param hash used to make the temporary file name unique
     * @param size image size, such as 800px*600px (clipped to this height), or 800px (minimum height = width * 9/16, height not clipped)
     */
    public PhantomTools(int hash, String size) {
        this(hash);
        if (size != null)
            _size = " " + size;
    }

    /**
     * Converts the target page to image bytes.
     * @param url target page address
     * @return image bytes; an empty array if the command failed, null if no file was produced
     */
    public byte[] getByteImg(String url) throws IOException {
        BufferedInputStream in = null;
        ByteArrayOutputStream out = null;
        File file = null;
        byte[] ret = null;
        try {
            if (exeCmd(_shellCommand + url + " " + _file + (_size != null ? _size : ""))) {
                file = new File(_file);
                if (file.exists()) {
                    out = new ByteArrayOutputStream();
                    byte[] b = new byte[5120];
                    in = new BufferedInputStream(new FileInputStream(file));
                    int n;
                    while ((n = in.read(b, 0, 5120)) != -1) {
                        out.write(b, 0, n);
                    }
                    ret = out.toByteArray();
                }
            } else {
                ret = new byte[] {};
            }
        } finally {
            try {
                if (out != null) {
                    out.close();
                }
            } catch (IOException e) {
                _logger.error(e);
            }
            try {
                if (in != null) {
                    in.close();
                }
            } catch (IOException e) {
                _logger.error(e);
            }
            // Delete the temporary file only after the stream is closed
            // (an open file cannot be deleted on Windows).
            if (file != null && file.exists()) {
                file.delete();
            }
        }
        return ret;
    }

    /**
     * Executes a command-line command.
     * @return true only if the process exits with code 0
     */
    private static boolean exeCmd(String commandStr) {
        try {
            Process p = Runtime.getRuntime().exec(commandStr);
            return p.waitFor() == 0;
        } catch (Exception e) {
            _logger.error(e);
            return false; // a command that could not run is a failure
        }
    }
}
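One thing to watch in the class above: the command is built by string concatenation and handed to Runtime.exec, which splits it on whitespace, so a URL or path containing a space would break the call. As a possible alternative, here is a minimal sketch that passes the arguments as a list through ProcessBuilder and adds a timeout. It assumes Java 8 or later; PhantomCommand, build, run and the timeout parameter are hypothetical names that are not part of the original class.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: builds and runs the PhantomJS command as an argument list. */
final class PhantomCommand {

    /** Builds the argument list: executable, script, url, output file, optional size. */
    static List<String> build(String phantomjs, String script,
                              String url, String outFile, String size) {
        List<String> cmd = new ArrayList<>();
        cmd.add(phantomjs);
        cmd.add(script);
        cmd.add(url);
        cmd.add(outFile);
        if (size != null && !size.trim().isEmpty()) {
            cmd.add(size.trim());
        }
        return cmd;
    }

    /** Runs the command; returns true only if it exits with code 0 within the timeout. */
    static boolean run(List<String> cmd, long timeoutSeconds)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        // Let the child write to this process's console so that a full
        // output pipe can never block it.
        pb.inheritIO();
        Process p = pb.start();
        if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            p.destroyForcibly(); // the page took too long to render
            return false;
        }
        return p.exitValue() == 0;
    }
}
```

The list form also means the URL is never re-parsed as part of a command string, which matters when the URL comes from a web request.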
With the PhantomTools class above, you can simply call the getByteImg method to generate the image and retrieve its contents.
Attached is my configured script, rasterize.js. As for PhantomJS itself, download it from the official website.
When reprinting, please cite the original: http://www.cnblogs.com/lekko/p/4796062.html
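To show how the resulting bytes might be exposed as a web service, here is a small sketch based on the JDK's built-in com.sun.net.httpserver package. It is only one possible arrangement, not the original deployment: the /snapshot path, the url query parameter, the PageRenderer interface and the SnapshotServer class are all hypothetical names chosen for illustration.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.InetSocketAddress;
import java.net.URLDecoder;

/** Hypothetical abstraction; PhantomTools.getByteImg has this shape. */
interface PageRenderer {
    byte[] render(String url) throws IOException;
}

final class SnapshotServer {

    /** Starts a server whose /snapshot?url=... endpoint answers with PNG bytes. */
    static HttpServer start(int port, PageRenderer renderer) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/snapshot", exchange -> handle(exchange, renderer));
        server.start();
        return server;
    }

    private static void handle(HttpExchange exchange, PageRenderer renderer)
            throws IOException {
        try {
            String target = extractUrl(exchange.getRequestURI().getRawQuery());
            byte[] png = (target == null) ? null : renderer.render(target);
            if (png == null || png.length == 0) {
                // 400: no url parameter; 502: the renderer produced no image.
                exchange.sendResponseHeaders(target == null ? 400 : 502, -1);
                return;
            }
            exchange.getResponseHeaders().set("Content-Type", "image/png");
            exchange.sendResponseHeaders(200, png.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(png);
            }
        } finally {
            exchange.close();
        }
    }

    /** Returns the decoded value of the "url" query parameter, or null if absent. */
    static String extractUrl(String rawQuery) {
        if (rawQuery == null) {
            return null;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.startsWith("url=")) {
                try {
                    return URLDecoder.decode(pair.substring(4), "UTF-8");
                } catch (UnsupportedEncodingException e) {
                    throw new IllegalStateException(e); // UTF-8 is always available
                }
            }
        }
        return null;
    }
}
```

Assuming port 8080 is free, it could be wired to the class above with something like SnapshotServer.start(8080, url -> new PhantomTools(url.hashCode()).getByteImg(url)); the empty or null results of getByteImg then map to an error status instead of an empty image.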