Use Python + Selenium to take a screenshot of a specified page element (WebElement)

Last Update:2017-11-25 Source: Internet

Author: User


The built-in screenshot method of webdriver.Chrome captures only the current window and cannot target a specific element. To capture a specific element, or a page taller than one screen, you need another approach.

The built-in screenshot method of webdriver.PhantomJS can capture the entire web page.

Here are a few ideas.

Method One

For webdriver.Chrome

Take an indirect route through WebDriver's JavaScript injection.

Inject a third-party HTML-to-canvas JS library (see the recommendations below)
Get the element's HTML
Convert the HTML to a canvas
Download the canvas

Pros: Tall images are easy to capture

Cons: Loading the third-party library takes time. For how the conversion works, see this article:

Drawing DOM objects into the canvas
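
The html2canvas-based approach described in this section could look roughly like the sketch below. It is not code from the original article. The CDN URL, the function names and the 30-second timeout are assumptions made for illustration, and the driver and element are expected to come from an existing Selenium session. The script is run through execute_async_script so that Python waits until the canvas has been rendered.

```python
import base64

# Assumed location of the html2canvas build; a local copy works as well.
HTML2CANVAS_URL = "https://html2canvas.hertzen.com/dist/html2canvas.min.js"

# arguments[0] is the element, arguments[1] the library URL, and the last
# argument is the callback Selenium supplies to execute_async_script.
RENDER_JS = """
var element = arguments[0], libUrl = arguments[1];
var done = arguments[arguments.length - 1];
function render() {
    html2canvas(element).then(function (canvas) {
        done(canvas.toDataURL('image/png'));
    }).catch(function (err) { done('error:' + err); });
}
if (window.html2canvas) {
    render();
} else {
    var s = document.createElement('script');
    s.src = libUrl;
    s.onload = render;
    s.onerror = function () { done('error:could not load ' + libUrl); };
    document.head.appendChild(s);
}
"""


def data_url_to_png(data_url):
    """Decode a 'data:image/png;base64,...' URL into raw PNG bytes."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:image/png;base64"):
        raise ValueError("unexpected canvas result: %r" % data_url[:60])
    return base64.b64decode(payload)


def element_to_png(driver, element, path, timeout=30):
    """Render one element with html2canvas and write the result to path."""
    driver.set_script_timeout(timeout)  # allow time for the library to load
    result = driver.execute_async_script(RENDER_JS, element, HTML2CANVAS_URL)
    with open(path, "wb") as fh:
        fh.write(data_url_to_png(result))
```

A call would look like element_to_png(driver, some_element, "element.png"). Pages with a strict content security policy may refuse to load the external script, in which case the library source would have to be injected another way.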

Method Two

For webdriver.Chrome

Capture window-sized screenshots, then crop and stitch them yourself

Get the element's position and size
Get the window size
Capture the window(s) that contain the element
Crop and stitch accordingly
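
The arithmetic behind the steps above can be sketched as a small planning function. This is an illustration and not the code from the repository the article links to. The function name and parameters are hypothetical, and only the vertical direction is handled. For an element that may be taller than the window, it returns the scroll offset for each window screenshot and the vertical slice to keep from it.

```python
def plan_tiles(elem_top, elem_height, viewport_height, page_height):
    """Plan the window screenshots needed to cover one element vertically.

    All values are in CSS pixels measured from the top of the page.
    Returns a list of (scroll_y, crop_top, crop_bottom) tuples, where
    crop_top and crop_bottom are relative to the screenshot taken after
    scrolling the window to scroll_y.
    """
    if viewport_height <= 0:
        raise ValueError("viewport_height must be positive")
    bottom = elem_top + elem_height
    if elem_top < 0 or bottom > page_height:
        raise ValueError("element does not fit inside the page")

    # The browser cannot scroll further than this offset.
    max_scroll = max(page_height - viewport_height, 0)
    tiles = []
    y = elem_top
    while y < bottom:
        scroll_y = min(y, max_scroll)
        crop_top = y - scroll_y
        crop_bottom = min(bottom - scroll_y, viewport_height)
        tiles.append((scroll_y, crop_top, crop_bottom))
        y = scroll_y + crop_bottom  # first page row not yet covered
    return tiles


# An element 1500 px tall starting at y=1000, with a 900 px window:
print(plan_tiles(1000, 1500, 900, 5000))
# [(1000, 0, 900), (1900, 0, 600)]
```

Each tuple maps to one window.scroll(0, scroll_y) call, one save_screenshot call and one crop. The cropped slices are then pasted top to bottom into a new image. On high-DPI displays the screenshot may be larger than the CSS pixel dimensions, so the crop values may need to be scaled by the device pixel ratio.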

The algorithm is simple in concept but demands attention to detail, so it is not repeated here. For example code, see:

[GitHub] PythonSpiderLibs

Advantages: Little JS work is needed; Python plus a small amount of JS code is enough.

Disadvantages: Stitching is affected by differences between WebDriver implementations, image loading speed and other factors, so it needs extra care. Getting reliable quality makes it relatively slow.

Method Three

For webdriver.PhantomJS

Because of differences in the interface implementations, PhantomJS, unlike Chrome, can capture the entire page. That makes it much easier to get a specified element.

Capture a full-page image of the web page
Crop out the specified element

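from selenium import webdriver
from PIL import Image

# PhantomJS captures the whole page; webdriver.Chrome(), used in the
# original listing, would capture only the visible window.
driver = webdriver.PhantomJS()
driver.get('http://stackoverflow.com/')
# Placeholder lookup: the original listing never defines element.
element = driver.find_element_by_tag_name('h1')
driver.save_screenshot('Screenshot.png')

# Bounding box of the element in page coordinates
left = element.location['x']
top = element.location['y']
right = element.location['x'] + element.size['width']
bottom = element.location['y'] + element.size['height']

im = Image.open('Screenshot.png')
im = im.crop((left, top, right, bottom))  # crop() takes a single 4-tuple
im.save('Screenshot.png')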

Advantages: Simple to implement

Disadvantages: A very tall page produces a very large image file, which can be hard to process. The largest image in my tests was 12.8 MB.

Solving the problem of images that are not fully loaded

Reference: Automate fast with Python + Selenium

First run a JavaScript snippet on the page that scrolls to the bottom and then back to the top, and only then take the screenshot. This solves the problem of images that are loaded on demand as the page scrolls.

# -*- coding: utf-8 -*-
from selenium import webdriver
import time


def take_screenshot(url, save_fn="capture.png"):
    # browser = webdriver.Firefox()  # Get local session of Firefox
    # Chrome captures only the current window
    chromedriver = r"C:\soft\chromedriver2.31_win32\chromedriver.exe"
    browser = webdriver.Chrome(chromedriver)
    # PhantomJS captures the entire page
    # browser = webdriver.PhantomJS()
    browser.set_window_size(1200, 900)
    browser.get(url)  # Load page

    # Scroll to the bottom of the page step by step, then back to the top.
    # The script appends "scroll-done" to the page title when it finishes.
    browser.execute_script("""
        (function () {
            var y = 0;
            var step = 100;
            window.scroll(0, 0);

            function f() {
                if (y < document.body.scrollHeight) {
                    y += step;
                    window.scroll(0, y);
                    setTimeout(f, 100);
                } else {
                    window.scroll(0, 0);
                    document.title += "scroll-done";
                }
            }

            setTimeout(f, 1000);
        })();
    """)

    # Poll the title until the scroll script reports completion
    for i in range(30):
        if "scroll-done" in browser.title:
            break
        time.sleep(10)

    browser.save_screenshot(save_fn)
    browser.close()


if __name__ == "__main__":
    take_screenshot("http://codingpy.com")

How to capture a single page element

Sometimes we only want an image of one page element, for example a dynamically changing verification code. Selenium supports this as well: call the screenshot() method on the selected element.
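
A minimal sketch of that element-level call is shown below. It assumes a Selenium version and driver that implement WebElement.screenshot(filename). The helper name is hypothetical. Errors are caught broadly so that the sketch does not need a Selenium import; real code would more likely catch selenium.common.exceptions.WebDriverException.

```python
def save_element_png(element, path):
    """Save a screenshot of one WebElement; return True on success."""
    try:
        # WebElement.screenshot returns False if the file cannot be written.
        return bool(element.screenshot(path))
    except Exception as exc:
        # Drivers that do not implement the element screenshot command
        # raise an error here instead of returning a value.
        print("element screenshot not available: %s" % exc)
        return False
```

A False return value is the signal to fall back to one of the window-based techniques.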

In actual use, however, this call raised an "unrecognized command" exception, and searching for a while turned up no solution. So I took a roundabout route: use Selenium to run JS code that removes the unwanted elements from the page one by one, leaving only the element we want, and then use the window screenshot function shown above.

For example, to capture only the QR code on the right side of the codingpy.com site, we can run a snippet of jQuery:

$('#main').siblings().remove();
$('#aside__wrapper').siblings().remove();
$('.ui.sticky').siblings().remove();
$('.follow-me').siblings().remove();
$('img.ui.image').siblings().remove();

After the code runs, only the QR code image is left on the page, and then we take the screenshot. One drawback is that the screenshot contains a lot of blank space below the image.

Code

# -*- coding: utf-8 -*-
from selenium import webdriver


def take_screenshot(url, save_fn="capture.png"):
    # browser = webdriver.Firefox()  # Get local session of Firefox
    chromedriver = r"C:\soft\chromedriver2.31_win32\chromedriver.exe"
    browser = webdriver.Chrome(chromedriver)
    # browser = webdriver.PhantomJS()
    browser.set_window_size(1200, 900)
    browser.get(url)  # Load page

    # The scroll-to-bottom-and-back step from the earlier listing is
    # commented out in this version.

    # Capture only the QR code on the right side of the site:
    # siblings().remove() deletes the sibling elements of each match.
    browser.execute_script("""
        $('#main').siblings().remove();
        $('#aside__wrapper').siblings().remove();
        $('.ui.sticky').siblings().remove();
        $('.follow-me').siblings().remove();
        $('img.ui.image').siblings().remove();
    """)

    browser.save_screenshot(save_fn)
    browser.close()


if __name__ == "__main__":
    take_screenshot("http://codingpy.com/article/take-screenshot-of-web-page-using-selenium/")

Different WebDriver implementations handle some methods differently

Interface differences between Chrome and PhantomJS

Pitfalls I ran into while scraping:

With Chrome, WebElement.text returns the value; with PhantomJS, only WebElement.get_attribute('innerHTML') works.
webdriver.Chrome can capture only the current screen area. webdriver.PhantomJS can capture a tall image of the entire page.
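
One way to smooth over the first difference is a small helper that prefers .text and falls back to innerHTML when .text is empty. This is only a sketch and the helper name is hypothetical. Note that innerHTML returns markup rather than rendered text, so the result may still contain tags.

```python
def element_text(element):
    """Return element.text, or the element's innerHTML if .text is empty."""
    text = element.text
    if text:
        return text
    # get_attribute may return None; normalise that to an empty string.
    return element.get_attribute("innerHTML") or ""
```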

There are other pitfalls still waiting to be found.

Recommended

html2canvas library
Drawing DOM objects into the canvas
Automate fast with Python + Selenium

This article is an English version of an article originally published in Chinese on aliyun.com.
