PhantomJS Quick Start

Source: Internet
Author: User
Tags jquery library

This article briefly introduces the basics of PhantomJS: what it is, how to download and install it, a HelloWorld program, and its core modules. My knowledge is limited, so there are bound to be omissions; corrections and discussion are welcome.

1. What is PhantomJS?

PhantomJS is a headless WebKit browser that is scripted through a JavaScript API. It uses QtWebKit as its core browser engine and relies on WebKit to parse and execute JavaScript code. Anything you can do in a WebKit-based browser, PhantomJS can do too. Besides being an invisible browser, it supports CSS selectors, web standards, DOM manipulation, JSON, HTML5, Canvas, SVG, and so on, and it also handles file I/O, so you can read and write files on the operating system. PhantomJS is useful for a wide range of tasks, such as network monitoring, web page screenshots, web testing without a visible browser, and page automation.

PhantomJS official site: http://phantomjs.org/

PhantomJS official API: http://phantomjs.org/api/

PhantomJS official examples: http://phantomjs.org/examples/

PhantomJS on GitHub: https://github.com/ariya/phantomjs/

2. PhantomJS download and installation

Download page: http://phantomjs.org/download.html. PhantomJS currently supports the three mainstream operating systems: Windows, Mac OS, and Linux. Choose the package that matches your environment; mine is Windows 7.

After the download completes, unzip the file. For convenience, I recommend putting it in a folder of its own; I used D:\workspace\phantomjs.

At this point PhantomJS is downloaded and installed. Next, open the D:\workspace\phantomjs\bin folder and double-click phantomjs.exe. A PhantomJS console window opens, and you can run JS code in it.

Since nobody wants to go to the D:\workspace\phantomjs\bin folder every time to launch phantomjs.exe, you can add that folder to the PATH environment variable. Open My Computer, right-click and choose Properties, then Advanced system settings, then Environment Variables on the Advanced tab. Find Path under System variables and append your PhantomJS bin folder. For example, I appended ";D:\workspace\phantomjs\bin". Do not forget the leading semicolon.

3. The first PhantomJS program: HelloWorld

Now we can write our first PhantomJS program. Open your working directory, create a new file named hello.js, type the following code, and save it (Ctrl+S):

    // A PhantomJS example
    var page = require('webpage').create();
    phantom.outputEncoding = "GBK";
    page.open("http://www.cnblogs.com/front-Thinking", function (status) {
        if (status === "success") {
            console.log(page.title);
        } else {
            console.log("Page failed to load.");
        }
        phantom.exit(0);
    });

Then open the cmd command-line tool, switch to that directory, and type phantomjs hello.js. The title of the page should be printed to the console.

If it is, congratulations: you have run your first PhantomJS program. Here is a brief walk through the code. Line 2 loads webpage, one of the core modules of PhantomJS, which gives the user an interface for accessing, manipulating, and selecting web documents. Line 3 sets the output encoding; without it, the output may be garbled. Line 4 calls page.open, whose first parameter is the URL to visit and whose second parameter is a callback function. In the callback we check the returned status: if it is success, we print the title of the document we loaded; otherwise we print a load-failure message. The final statement in the callback, phantom.exit(0), exits the PhantomJS execution environment.

4. PhantomJS core API

webpage: We already saw this module at work in the example above. Its main role is to provide a set of core methods for accessing and manipulating web documents, including DOM manipulation, event capture, and user event simulation.

system: This module provides operating-system-related interfaces tied to program execution, such as reading operating system information, reading environment variables, and accepting command-line arguments.

fs: Short for filesystem. Readers familiar with Node.js will know that it has a built-in core module of the same kind. fs provides a standard interface for file I/O, such as reading, writing, and deleting files. It makes it easy to persist files such as logs.

webserver: As the name suggests, you can build your own web server on top of it to handle requests and execute PhantomJS code.
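As a minimal sketch of the system module described above, the script below reads a target URL from the command line. The helper name targetUrl and the fallback URL are illustrative assumptions, not part of PhantomJS; system.args and system.os are the module's own properties, with the script name at index 0 of system.args. The PhantomJS-specific part is guarded so that the helper can also be exercised outside PhantomJS.

```javascript
// Hypothetical helper: pick the target URL out of an argument list shaped
// like PhantomJS's system.args, where index 0 is the script name.
function targetUrl(args, fallback) {
    return args.length > 1 ? args[1] : fallback;
}

// This part only runs under PhantomJS, where the global `phantom` exists.
if (typeof phantom !== 'undefined') {
    var system = require('system');
    var url = targetUrl(system.args, 'http://phantomjs.org/');
    console.log('Script: ' + system.args[0]);
    console.log('Target: ' + url);
    console.log('OS: ' + system.os.name);
    phantom.exit(0);
}
```

Saved as, say, args.js, it could be run as phantomjs args.js followed by a URL of your choice.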

As for other configuration, the command format for running PhantomJS is as follows:

    phantomjs [switches] [options] [script] [argument [argument [...]]]

All of these parameters are optional. For example, the command for our first program is:

    phantomjs hello.js

Turn on debug mode (intended for development; it prints additional diagnostic information):

    phantomjs --debug=yes hello.js

Set the cookie file path:

    phantomjs --cookies-file=cookie.txt hello.js

5. Manipulating page content

In HelloWorld we saw how to open a URL and read its title. Now let's look at how to select and manipulate DOM elements.

Commonly used DOM selectors: getElementById, getElementsByClassName, getElementsByName, getElementsByTagName, and querySelector (CSS selectors).

Let's look at an example using querySelector:

    var content = page.evaluate(function () {
        var element = document.querySelector('#elem');
        return element.textContent;
    });
    console.log(content);

The evaluate function is new here, but it is simple: it executes the callback you pass in inside the context of the web page. Because the callback runs in the page rather than in the PhantomJS script, the web page cannot spy on phantom-related settings. The rest of the code is straightforward and needs no further explanation.
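Because the callback runs inside the page, it cannot see variables from the surrounding PhantomJS script. A hedged sketch of one way around this is to pass values in as extra arguments to page.evaluate and get simple values back through the return value. The function name textOf, the 'h1' selector, and the URL are illustrative assumptions.

```javascript
// Runs inside the web page when handed to page.evaluate, so it may only use
// its own arguments and page globals such as `document`.
function textOf(selector) {
    var element = document.querySelector(selector);
    return element ? element.textContent : null;
}

// This part only runs under PhantomJS, where the global `phantom` exists.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.open('http://phantomjs.org/', function (status) {
        if (status === 'success') {
            // 'h1' is passed into the page as the `selector` argument.
            console.log(page.evaluate(textOf, 'h1'));
        } else {
            console.log('Page failed to load.');
        }
        phantom.exit(0);
    });
}
```

Returning null for a missing element avoids an exception inside the page when the selector matches nothing.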

Simulating a user click event:

PhantomJS provides two ways to simulate click events: sendEvent, which triggers events at the PhantomJS level, and DOM event dispatch inside the page.

Let's look at the first one, with the following syntax:

    sendEvent(eventType, pointX, pointY, button='left')
    eventType: mouseup, mousedown, mousemove, click, doubleclick
    pointX: the x-coordinate of the triggering event
    pointY: the y-coordinate of the triggering event
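To make the coordinate parameters concrete, here is a rough sketch that clicks the middle of the first link on a page. The helper centerOf, the 'a' selector, and the URL are illustrative assumptions; the rectangle is measured in the page with getBoundingClientRect and copied out as plain numbers before page.sendEvent is called.

```javascript
// Hypothetical helper: centre point of a rectangle shaped like the result
// of getBoundingClientRect().
function centerOf(rect) {
    return {
        x: rect.left + rect.width / 2,
        y: rect.top + rect.height / 2
    };
}

// This part only runs under PhantomJS, where the global `phantom` exists.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.open('http://phantomjs.org/', function (status) {
        if (status === 'success') {
            // Measure the first link inside the page and return plain numbers.
            var rect = page.evaluate(function () {
                var el = document.querySelector('a');
                if (!el) { return null; }
                var r = el.getBoundingClientRect();
                return { left: r.left, top: r.top, width: r.width, height: r.height };
            });
            if (rect) {
                var point = centerOf(rect);
                page.sendEvent('click', point.x, point.y);
            }
        }
        phantom.exit(0);
    });
}
```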

The second should be more familiar:

    var evt = document.createEvent("MouseEvents");
    evt.initMouseEvent(
        "click",    // event type
        true,       // can bubble
        true,       // cancelable
        window,     // view
        1,          // detail (click count)
        1, 1, 1, 1, // event coordinates
        false,      // Ctrl key flag
        false,      // Alt key flag
        false,      // Shift key flag
        false,      // Meta key flag
        0,          // left mouse button
        element     // target element
    );
    element.dispatchEvent(evt);

6. Event Handling

In a real browser you can see events as they happen, but in PhantomJS nothing is visible. What we can do in PhantomJS is capture these events and handle them accordingly. There are many kinds of events, so today we will use only one particularly useful group as an example. With it you can monitor a page and analyze how it loads:

    var startTime = null;
    page.onLoadStarted = function () {
        startTime = new Date().getTime();
    };

This listens for the page's load-started event to record when loading began.

    var resources = [];
    page.onResourceRequested = function (request) {
        var resource = {
            "startTime": request.time,
            "url": request.url
        };
        resources[request.id] = resource;
    };

This listens for resource request events to record when each resource was requested.

    page.onResourceReceived = function (response) {
        if (response.stage == "start") {
            resources[response.id].size = response.bodySize;
        } else if (response.stage == "end") {
            resources[response.id].endTime = response.time;
        }
    };

This listens for resource received events to record when each resource finished loading.

    page.onLoadFinished = function () {
        var endTime = new Date();
        var timeInSeconds = (endTime - startTime) / 1000;
        console.log("Loading takes " + timeInSeconds + " seconds.");
        resources.forEach(function (resource) {
            var st = new Date(resource.startTime).getTime();
            var et = new Date(resource.endTime).getTime();
            var timeSpent = (et - st) / 1000;
            console.log(timeSpent + " seconds: " + resource.url);
        });
        phantom.exit(0);
    };

This listens for the document's load-finished event, records the completion time, and prints the time taken by every resource.

The on + event-name handlers above do four things: they listen for the resource request and resource received events, and for the document's load-started and load-finished events, recording the time of each. With these events we can analyze the performance of the page.
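The per-resource arithmetic in the handlers above can also be pulled into a small helper, which makes it easier to sort the slowest resources to the top. This is only a sketch: the name summarize is hypothetical, and it assumes each record has the url, startTime, and endTime fields collected above. Records that never received an end time are skipped.

```javascript
// Hypothetical helper: turn the collected records into per-resource
// durations in seconds, slowest first. Gaps in the array (ids start at 1)
// and records without both timestamps are skipped.
function summarize(resources) {
    return resources
        .filter(function (r) { return r && r.startTime && r.endTime; })
        .map(function (r) {
            var ms = new Date(r.endTime).getTime() - new Date(r.startTime).getTime();
            return { url: r.url, seconds: ms / 1000 };
        })
        .sort(function (a, b) { return b.seconds - a.seconds; });
}
```

Inside page.onLoadFinished, the forEach loop could then iterate over summarize(resources) instead of the raw array.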

7. Page capture

Saving a page you visit as an image or a PDF file is very simple in PhantomJS. Here is one example of each.

Save as an image:

    // A PhantomJS example, saved as img
    var page = require('webpage').create();
    page.open("http://www.cnblogs.com/front-Thinking/", function (status) {
        if (status === "success") {
            console.log(page.title);
            page.render("front-Thinking.png");
        } else {
            console.log("Page failed to load.");
        }
        phantom.exit(0);
    });

Note: render takes one parameter, the name of the file to save to.

Save as PDF:

    // A PhantomJS example, saved as PDF file
    var page = require('webpage').create();
    page.open("http://www.baidu.com", function (status) {
        if (status === "success") {
            console.log(page.title);
            page.paperSize = {
                format: 'A4',
                orientation: 'portrait',
                border: '1cm'
            };
            page.render("front-Thinking.pdf");
        } else {
            console.log("Page failed to load.");
        }
        phantom.exit(0);
    });

Note: The format of the PDF is set through paperSize.

With these features, you could build a crawler that captures other people's websites.

8. File operations

File operations come up often in practice. For example, you can keep configuration in a file and read it while the program runs, or save useful information produced during execution to a file. Here is a simple example that reads a file:

    var fs = require('fs');
    var filePath = '/workspace/file1.js'; // file path

    // Check that the path exists and is a file rather than a folder
    if (fs.exists(filePath) && fs.isFile(filePath)) {
        var ins = fs.open(filePath, 'r'); // open the file
        while (!ins.atEnd()) { // loop over the file contents
            var buffer = ins.readLine(); // read one line
            console.log(buffer);
        }
        ins.close();
    }

This reads the file and prints it line by line. Files can be opened in the following modes:

    r   // read file
    w   // write file, overwriting existing content
    a   // write file, appending
    rb  // read binary stream
    wb  // write binary stream
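To show the difference between the w and a modes, here is a small sketch that writes a log file with fs.write and reads it back with fs.read. The helper logLine and the file name phantom.log are illustrative assumptions.

```javascript
// Hypothetical helper: format one log line with an ISO timestamp.
function logLine(date, message) {
    return date.toISOString() + ' ' + message + '\n';
}

// This part only runs under PhantomJS, where the global `phantom` exists.
if (typeof phantom !== 'undefined') {
    var fs = require('fs');
    var path = 'phantom.log'; // hypothetical file name

    // 'w' replaces any existing content; 'a' appends to it.
    fs.write(path, logLine(new Date(), 'run started'), 'w');
    fs.write(path, logLine(new Date(), 'run finished'), 'a');

    console.log(fs.read(path));
    phantom.exit(0);
}
```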

9. Modularity

Modularity is not specific to PhantomJS, so it is not covered in detail here. For details, see Ruan Yifeng's blog: http://www.ruanyifeng.com/blog/2012/10/javascript_module.html

10. Integration with jQuery and other third-party libraries

There are many excellent third-party libraries. Here we use the popular jQuery library to show how PhantomJS works with them. The code is as follows:

    var page = require('webpage').create();
    page.open("http://www.cnblogs.com/front-Thinking/", function (status) {
        if (status === "success") {
            page.render("before.png");
            page.includeJs("http://code.jquery.com/jquery-1.10.1.min.js",
                function () {
                    page.evaluate(function () {
                        $('#Header1_HeaderTitle').html('My PhantomJS');
                    });
                    page.render("after.png");
                    phantom.exit();
                });
        }
    });

The code above visits my blog and takes a screenshot (before.png), then loads jQuery, changes the title of the blog, and takes a second screenshot (after.png).

11. Other

PhantomJS can do so much that I have probably covered only one n-th of it, with n tending to infinity. This is only an introductory post, so I will not go deeper here; besides, I am still a beginner and know only the more obvious parts. PhantomJS can also be combined with Jasmine for testing, which can save a lot of manpower and time. The open source community also has many tools and applications built on PhantomJS, such as front-end crawlers, which are worth reading if you are interested.
