C#: Using Selenium + PhantomJS to scrape data
My current project needs to scrape data from a website that is rendered with JavaScript. Fetching the page with the usual HttpClient returns none of the data. After searching Baidu, I found PhantomJS recommended. PhantomJS is a headless WebKit browser: it has no interface, but it runs JavaScript and renders pages the same way a normal browser does. Selenium is a web testing framework, and it can drive PhantomJS out of the box. Most of the examples online are in Python, though. I downloaded Python and followed a tutorial, but could not get past a Selenium import problem, so I gave up and went back to my usual C#. I refused to believe that C# had no way to do this, and it took about half an hour to get working (after an hour of wrestling with Python). I am writing this post so that newcomers working in C# can use PhantomJS too.
Step 1: Open Visual Studio 2017, create a console project, and open the NuGet Package Manager.
Step 2: Search for Selenium and install Selenium.WebDriver. Note: if you want to use a proxy, you had better install version 3.0.0.
Step 3: Write the code shown below. Running it reports an error, though, because phantomjs.exe cannot be found. At this point you can either download phantomjs.exe yourself or continue to step 4.
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.PhantomJS;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            var url = "http://www.baidu.com";
            IWebDriver driver = new PhantomJSDriver(GetPhantomJSDriverService());
            driver.Navigate().GoToUrl(url);
            // Print the page HTML after PhantomJS has rendered it
            Console.WriteLine(driver.PageSource);
            Console.Read();
        }

        private static PhantomJSDriverService GetPhantomJSDriverService()
        {
            PhantomJSDriverService pds = PhantomJSDriverService.CreateDefaultService();
            // Set the proxy server address
            // pds.Proxy = $"{ip}:{port}";
            // Set the proxy server authentication information
            // pds.ProxyAuthentication = GetProxyAuthorization();
            return pds;
        }
    }
}
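The commented-out proxy lines in the code above rely on values and a helper that this post does not show. As a rough sketch only, assuming the proxy address takes the form host:port and the authentication string takes the form username:password (the format PhantomJS's --proxy-auth option expects), two helpers could build those strings. The names ProxySettings, BuildProxyAddress and BuildProxyAuthorization are hypothetical and are not part of Selenium or PhantomJS.

```csharp
using System;

// Hypothetical helpers for illustration; not part of Selenium or PhantomJS.
public static class ProxySettings
{
    // Builds the "host:port" string intended for PhantomJSDriverService.Proxy.
    public static string BuildProxyAddress(string host, int port)
    {
        if (string.IsNullOrWhiteSpace(host))
            throw new ArgumentException("Proxy host must not be empty.", nameof(host));
        if (port < 1 || port > 65535)
            throw new ArgumentOutOfRangeException(nameof(port), "Port must be between 1 and 65535.");
        return $"{host}:{port}";
    }

    // Builds the "username:password" string (assumed format) intended for
    // PhantomJSDriverService.ProxyAuthentication.
    public static string BuildProxyAuthorization(string user, string password)
    {
        if (string.IsNullOrEmpty(user))
            throw new ArgumentException("User name must not be empty.", nameof(user));
        return $"{user}:{password}";
    }
}
```

With helpers like these, the two commented lines might become pds.Proxy = ProxySettings.BuildProxyAddress(ip, port); and pds.ProxyAuthentication = ProxySettings.BuildProxyAuthorization(user, password);, where ip, port, user and password are your own proxy details.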
Step 4: Open NuGet again and install the Selenium.PhantomJS.WebDriver package.
Step 5: Run the program. You can see that phantomjs.exe has been downloaded automatically.
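If the "phantomjs.exe not found" error from step 3 comes back, it can help to check whether the executable really sits next to the program. This is a minimal sketch, assuming the NuGet package places phantomjs.exe in the build output folder and that the driver service looks there. The class PhantomJsCheck and its methods are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of Selenium.
public static class PhantomJsCheck
{
    // Returns the path where phantomjs.exe is expected inside the given directory.
    public static string ExpectedPath(string baseDirectory)
    {
        return Path.Combine(baseDirectory, "phantomjs.exe");
    }

    // Returns true when phantomjs.exe exists in the given directory.
    public static bool IsAvailable(string baseDirectory)
    {
        return File.Exists(ExpectedPath(baseDirectory));
    }
}
```

Calling PhantomJsCheck.IsAvailable(AppDomain.CurrentDomain.BaseDirectory) before creating the driver gives a quick yes or no, so a missing file can be reported with a clearer message than the driver's own exception.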
Now you can get started on your data scraping.