Using Selenium + PhantomJS in C# to crawl data

Source: Internet
Author: User

My current project needs to fetch data from a website whose content is rendered with JavaScript. Fetching the page with the usual HttpClient returns the HTML without any of the data. After some searching on Baidu, the commonly recommended approach turned out to be PhantomJS. PhantomJS is a headless WebKit browser (it has no user interface) that runs a page's JavaScript and renders it the same way a regular browser does. Selenium is a web testing framework, and it can be used to drive PhantomJS. Most of the examples online are in Python, though. I reluctantly installed Python and followed a tutorial, but got stuck on a problem importing selenium. So I gave up and went back to my usual C#, refusing to believe that C# could not do it. After half an hour of fiddling it worked (I had already spent an hour on Python). I am writing this post so that C# newcomers like me can use PhantomJS too.

Step 1: Create a new console project in Visual Studio 2017 and open the NuGet Package Manager.

Step 2: Search for Selenium and install Selenium.WebDriver. Note: if you want to use a proxy, it is best to install version 3.0.0.

Step 3: Write the code shown below. Running it at this point throws an error, because phantomjs.exe cannot be found. You can download it yourself, or continue to step 4.

    using OpenQA.Selenium;
    using OpenQA.Selenium.PhantomJS;
    using System;

    namespace ConsoleApp1
    {
        class Program
        {
            static void Main(string[] args)
            {
                var url = "http://www.baidu.com";
                // Start PhantomJS through Selenium and load the page.
                IWebDriver driver = new PhantomJSDriver(GetPhantomJSDriverService());
                driver.Navigate().GoToUrl(url);
                // PageSource holds the HTML after PhantomJS has processed the page.
                Console.WriteLine(driver.PageSource);
                Console.Read();
                // Shut down the PhantomJS process when done.
                driver.Quit();
            }

            private static PhantomJSDriverService GetPhantomJSDriverService()
            {
                PhantomJSDriverService pds = PhantomJSDriverService.CreateDefaultService();
                // Set the proxy server address
                // pds.Proxy = $"{ip}:{port}";
                // Set the proxy server authentication information
                // pds.ProxyAuthentication = GetProxyAuthorization();
                return pds;
            }
        }
    }
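The commented-out proxy lines in the listing refer to `ip`, `port`, and `GetProxyAuthorization()` without defining them. The following is only a minimal sketch of what they might look like, assuming the `host:port` and `user:password` string formats that PhantomJS accepts for its proxy options. The class name `ProxySettings`, the method `GetProxyAddress()`, and all values are hypothetical placeholders, not part of the original post.

```csharp
namespace ConsoleApp1
{
    // Hypothetical helper: these names and values are illustrative only.
    static class ProxySettings
    {
        // Placeholder values; replace them with your own proxy details.
        const string Ip = "127.0.0.1";
        const int Port = 8080;
        const string User = "user";
        const string Password = "password";

        // Intended for PhantomJSDriverService.Proxy ("host:port").
        public static string GetProxyAddress() => $"{Ip}:{Port}";

        // Intended for PhantomJSDriverService.ProxyAuthentication ("user:password").
        public static string GetProxyAuthorization() => $"{User}:{Password}";
    }
}
```

With such a helper in place, the two commented lines could be enabled as `pds.Proxy = ProxySettings.GetProxyAddress();` and `pds.ProxyAuthentication = ProxySettings.GetProxyAuthorization();`.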

Step 4: Open NuGet and install the Selenium.PhantomJS.WebDriver package.

Step 5: Run the program. You can see that phantomjs.exe is now downloaded automatically.

That's it. You can now get started on your own data crawling.
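If you would rather use the manual download mentioned in step 3 than the NuGet package, the driver service can be told where the executable lives. This is a sketch under assumptions: the folder `C:\tools\phantomjs\bin` and the class name `ManualDriverPathExample` are hypothetical, and it relies on the `CreateDefaultService(string driverPath)` overload of `PhantomJSDriverService`, which takes the directory that contains phantomjs.exe.

```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.PhantomJS;
using System;

namespace ConsoleApp1
{
    class ManualDriverPathExample
    {
        static void Main(string[] args)
        {
            // Assumption: phantomjs.exe was downloaded and unpacked into this folder by hand.
            var driverDirectory = @"C:\tools\phantomjs\bin";

            // Point the service at the folder instead of relying on the default search locations.
            PhantomJSDriverService pds = PhantomJSDriverService.CreateDefaultService(driverDirectory);

            // Disposing the driver also shuts down the PhantomJS process.
            using (IWebDriver driver = new PhantomJSDriver(pds))
            {
                driver.Navigate().GoToUrl("http://www.baidu.com");
                Console.WriteLine(driver.Title);
            }
        }
    }
}
```

The NuGet route in step 4 is less work because nothing has to be configured; the explicit path is mainly useful when the executable has to live in a fixed location.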
