The previous article used Java regular expressions to extract each image's link and the URL it jumps to. This article uses Selenium's own API, getAttribute, to read the specified content from the page instead.
Goal: get the link address and the jump address of every image shown below, then send a GET request to each address to check whether it is a dead link.
Page content
The page source. We need the href address of each link, as well as the src address of the image that follows it:
Code implementation. The source shows that all the images sit inside one div, so the idea is: get the collection of controls, then locate the element under each li, then read the value of the named attribute from that element.
// op, pub and log are the author's own helper objects; their definitions are not shown in this article.
public void new_classification() throws Exception {
    op.loopGet(home, 40, 3, 60);
    op.loopClickElement("Swimmer", 3, 10, explicitWaitTimeoutLoop); // go to the target page
    if (driver.getCurrentUrl().contains("Swimwear")) {
        // the collection of controls, one li per picture
        List<WebElement> newImage = driver.findElements(By.xpath("//*[@id='js_prolist']/ul/li"));
        for (int i = 0; i < newImage.size(); i++) {
            // the jump address of the picture
            String contentURL = newImage.get(i).findElement(By.xpath("p[1]/a[1]")).getAttribute("href");
            // the link address of the picture
            String imageURL = newImage.get(i).findElement(By.xpath("p[1]/a[1]/img")).getAttribute("src");
            pub.get(contentURL); // GET request to the jump address
            System.out.println("**********************");
            pub.get(imageURL); // GET request to the image address
        }
    } else {
        log.logError("No access to the new page");
    }
}
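The GET helper called in the loop above is not shown in this article. As a rough sketch only, a dead-link check with the JDK's HttpURLConnection might look like the following. The class name DeadLinkCheck, the methods fetchStatus and isDeadStatus, the 5-second timeouts, and the rule "status 400 or above, or a failed request, means dead" are all assumptions made for illustration, not the author's actual implementation.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class DeadLinkCheck {

    // Assumption: 4xx/5xx responses, or a failed request (reported as -1), count as dead.
    static boolean isDeadStatus(int status) {
        return status < 0 || status >= 400;
    }

    // Sends a GET request and returns the HTTP status code, or -1 if the request fails.
    static int fetchStatus(String link) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(link).openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(5000); // hypothetical timeout values
            conn.setReadTimeout(5000);
            return conn.getResponseCode();
        } catch (IOException | ClassCastException e) {
            // malformed URL, non-HTTP URL, unreachable host, timeout, ...
            return -1;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // Pass the href/src values collected by Selenium as arguments.
        for (String link : args) {
            int status = fetchStatus(link);
            System.out.println(link + " -> " + status + (isDeadStatus(status) ? " (dead)" : " (ok)"));
        }
    }
}
```

In the loop above, contentURL and imageURL could each be passed to fetchStatus, and any address for which isDeadStatus returns true could be logged as a dead link.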
Results show
If you need the regular-expression approach, see the previous article: http://www.cnblogs.com/chongyou/p/7286447.html
Use Selenium to get the image links and page links in a web page and check whether they are dead links (II)