Mainstream page-loading modes on today's websites:
There are two: synchronous loading and asynchronous loading, the latter being what we usually call Ajax. Sites that load synchronously are easy to crawl with an ordinary crawler, because the data is already in the HTML the server returns. Sites that request their data asynchronously are usually crawled with the Selenium + PhantomJS combination, which runs the page's JavaScript before the content is read.
(1) Selenium: a web automation testing tool, originally developed for automated testing of websites. It can drive a browser and manipulate the elements of a web page. Selenium supports most mainstream browsers, including headless (no-interface) browsers such as PhantomJS.
(2) PhantomJS: a WebKit-based headless browser. Apart from having no interface, it behaves like a normal browser. Because it does not render an interface, it runs more efficiently than an ordinary browser.
(3) CasperJS: an open-source navigation scripting and testing tool built on top of PhantomJS (a front-end automated testing tool). CasperJS simplifies defining a complete navigation scenario and provides useful high-level functions, methods, and syntax for common tasks.
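To make the difference between the two loading modes concrete, here is a minimal sketch using only the Python standard library. The HTML string is a made-up example (not taken from any real site) of what a server might return for an Ajax-driven page: the list container is present, but it is empty until the page's script runs. A plain crawler that only parses this response finds no items, which is why a browser engine such as PhantomJS is needed for such pages.

```python
from html.parser import HTMLParser

# Hypothetical server response for an Ajax-driven page: the list is empty
# because the items are fetched and inserted later by the page's script.
STATIC_HTML = """
<html><body>
  <h1>Products</h1>
  <ul id="items"></ul>
  <script>
    // In a real page, code here would request the data and append list items.
  </script>
</body></html>
"""


class ItemCounter(HTMLParser):
    """Counts <li> start tags found in the HTML it is fed."""

    def __init__(self):
        super().__init__()
        self.li_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.li_count += 1


def count_items(html):
    """Return the number of <li> elements present in the given HTML text."""
    parser = ItemCounter()
    parser.feed(html)
    return parser.li_count


# The raw response contains no list items, so a plain crawler sees nothing.
print(count_items(STATIC_HTML))  # prints 0
```

For a synchronously loaded page, the same function would see the items directly, because they are already part of the returned HTML.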
1. Download:
(1) PhantomJS: http://phantomjs.org/download.html
(2) CasperJS: http://casperjs.org/
2. Install (unzip, configure environment variables):
Unzipping needs no explanation. After extracting, add each tool's bin directory to the PATH environment variable. I am using Windows 10.
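As a rough illustration of what "adding the bin directory to PATH" means, the sketch below builds a new PATH value with the two bin directories placed in front. The directory names are hypothetical placeholders, not the real install locations, and the change applies only to the current Python process and its child processes. A permanent change still has to be made in the Windows environment variable settings.

```python
import os


def prepend_to_path(bin_dirs, current_path, sep=os.pathsep):
    """Return a PATH string with bin_dirs in front, skipping entries already present."""
    existing = [p for p in current_path.split(sep) if p]
    new_dirs = [d for d in bin_dirs if d not in existing]
    return sep.join(new_dirs + existing)


# Hypothetical install locations; replace with where the archives were unzipped.
BIN_DIRS = [r"C:\tools\phantomjs\bin", r"C:\tools\casperjs\bin"]

# Affects only this process and its children, not the system-wide setting.
os.environ["PATH"] = prepend_to_path(BIN_DIRS, os.environ.get("PATH", ""))
```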
3. Verify that the configuration is successful:
Run the following commands in CMD:
phantomjs --version
casperjs --version
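The same check can be scripted. The following is a minimal sketch, assuming both tools accept the --version flag as used above. The helper name tool_version is made up for this example. It looks each command up on PATH first, so a missing or misconfigured tool is reported instead of raising an error.

```python
import shutil
import subprocess


def tool_version(command):
    """Return the output of `<command> --version`, or None if it is not on PATH."""
    executable = shutil.which(command)
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        timeout=30,
    )
    # Some tools print the version to stderr, so fall back to it.
    return result.stdout.strip() or result.stderr.strip()


for name in ("phantomjs", "casperjs"):
    version = tool_version(name)
    print(f"{name}: {version if version is not None else 'not found on PATH'}")
```

If either line reports "not found on PATH", the bin directory for that tool was probably not added to the environment variable, or the terminal was opened before the change was saved.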