This article describes how to use Selenium with PhantomJS to fetch data in C#. It should be a useful reference, so let's walk through it.
The project at hand needs to fetch data from a website whose pages are rendered with JavaScript. Fetching a page with the usual HttpClient returns HTML with no data in it. After a quick search on Baidu, the commonly recommended approach is PhantomJS. PhantomJS is a headless WebKit browser (it has no user interface) that runs JavaScript and renders pages the same way a regular browser does. Selenium is a web testing framework, and it can be used to drive PhantomJS. Most of the examples online are in Python, though. With no better option, I downloaded Python and followed a tutorial, but got stuck on a Selenium import problem. So I gave up and went back to my usual C#, refusing to believe C# could not do this. After half an hour of fiddling it worked (the Python attempt had taken an hour). I am recording the steps in this post so that other C# newcomers like me can use PhantomJS.
First step: Create a new console project in Visual Studio 2017 and open the NuGet Package Manager.
Second step: Search for Selenium and install Selenium.WebDriver. Note: if you want to use a proxy, it is best to install version 3.0.0.
Third step: Write the code shown below. Running it at this point fails, because PhantomJS.exe cannot be found. You can either download it yourself or continue to the fourth step.
using OpenQA.Selenium;
using OpenQA.Selenium.PhantomJS;
using System;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            var url = "http://www.baidu.com";
            // Start PhantomJS through Selenium and load the page
            IWebDriver driver = new PhantomJSDriver(GetPhantomJSDriverService());
            driver.Navigate().GoToUrl(url);
            // PageSource holds the HTML after JavaScript has run
            Console.WriteLine(driver.PageSource);
            Console.Read();
        }

        private static PhantomJSDriverService GetPhantomJSDriverService()
        {
            PhantomJSDriverService pds = PhantomJSDriverService.CreateDefaultService();
            // Set the proxy server address
            // pds.Proxy = $"{ip}:{port}";
            // Set the proxy server authentication information
            // pds.ProxyAuthentication = GetProxyAuthorization();
            return pds;
        }
    }
}
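If you do need a proxy, the two commented-out lines above could be filled in roughly as follows. This is only a sketch: the address, port and credentials are made-up placeholders, and since the post does not show GetProxyAuthorization, it is assumed here to return a plain "user:password" string.

```csharp
using OpenQA.Selenium.PhantomJS;

static class ProxyServiceSketch
{
    // Hypothetical placeholder values; replace them with your own proxy details.
    private const string Ip = "127.0.0.1";
    private const string Port = "8080";

    public static PhantomJSDriverService Create()
    {
        PhantomJSDriverService pds = PhantomJSDriverService.CreateDefaultService();
        // Proxy server address in host:port form
        pds.Proxy = $"{Ip}:{Port}";
        // Proxy protocol; "http" is an assumption about the proxy being used
        pds.ProxyType = "http";
        // Credentials for the proxy, if it requires them
        pds.ProxyAuthentication = GetProxyAuthorization();
        return pds;
    }

    // Assumed helper (not shown in the post): returns credentials as "user:password".
    private static string GetProxyAuthorization()
    {
        return "user:password";
    }
}
```

The returned service can be passed to new PhantomJSDriver(...) exactly as in the program above.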
Fourth step: Open NuGet again and install the Selenium.PhantomJS.WebDriver package.
Fifth step: Run the program. You can see that PhantomJS.exe is now downloaded automatically.
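Once the page loads, you will usually want specific elements rather than the whole page source. The following is a minimal sketch of one way to do that, assuming the two packages from the steps above are installed; the class name and the choice to list links are purely illustrative.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.PhantomJS;

namespace ConsoleApp1
{
    class LinkDump
    {
        static void Main(string[] args)
        {
            // IWebDriver is disposable; the using block shuts PhantomJS down when done.
            using (IWebDriver driver = new PhantomJSDriver())
            {
                driver.Navigate().GoToUrl("http://www.baidu.com");

                // Walk every <a> element in the rendered page and print its text and target.
                foreach (IWebElement link in driver.FindElements(By.TagName("a")))
                {
                    Console.WriteLine("{0} -> {1}", link.Text, link.GetAttribute("href"));
                }
            }
        }
    }
}
```

By.CssSelector or By.Id can be swapped in for By.TagName when the data you need sits in a more specific element.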
That's it. You can now get started on your data scraping.