Python crawler tutorial 28: Controlling Chrome with Selenium
PhantomJS is a "ghost" browser: a headless browser with no interface that never renders the page on screen. Selenium + PhantomJS used to be a perfect match. Then in 2017 Google announced that Chrome would also support running without a visible window, so fewer and fewer people use PhantomJS, which is a pity. This article introduces Selenium + Chrome.
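Chrome's headless mode is switched on through command-line arguments. The following is a minimal sketch, assuming Selenium's `ChromeOptions` and Chrome's `--headless` and `--window-size` switches; the helper names `headless_arguments` and `make_headless_driver` and the default window size are illustrative and do not come from the original article.

```python
def headless_arguments(width=1280, height=800):
    """Return Chrome command-line switches for running without a visible window."""
    return ["--headless", f"--window-size={width},{height}"]


def make_headless_driver(width=1280, height=800):
    """Start Chrome in headless mode (needs Selenium, Chrome and ChromeDriver)."""
    # Imported lazily so that headless_arguments() works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in headless_arguments(width, height):
        options.add_argument(argument)
    return webdriver.Chrome(options=options)


if __name__ == "__main__":
    # Only prints the switches; call make_headless_driver() to launch Chrome.
    print(headless_arguments())
```

With a driver created this way, the rest of the tutorial's calls (`get`, `save_screenshot`, and so on) should behave the same, only without a window appearing.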
Install the Chrome browser and ChromeDriver
- Installing the Chrome browser itself is not covered here
- Install ChromeDriver:
- Note: download the ChromeDriver release that matches your installed version of Chrome
- All ChromeDriver versions: http://npm.taobao.org/mirrors/chromedriver/
- For compatible versions, see: Chrome version and ChromeDriver compatible version comparison
- Download and unzip it. If you extract it to a directory of your own choosing, you need to configure the environment: open the environment variables and add the ChromeDriver directory to Path
- If configuring an environment variable is too much trouble, simply put it in a directory that has already been added to the environment variables, such as C:\Program Files (x86)
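Both installation checks above can be scripted. This is a rough sketch using only the standard library: `find_chromedriver` looks the executable up on the path, and `versions_match` compares major version numbers. The major-number rule is an assumption that holds for Chrome 73 and later, where ChromeDriver shares Chrome's version numbering; for older releases the compatibility table is still the reference. The function names and the version strings are illustrative.

```python
import shutil


def find_chromedriver(name="chromedriver"):
    """Return the full path of the driver if it is on the path, else None."""
    return shutil.which(name)


def major_version(version):
    """Extract the major number from a dotted version such as '114.0.5735.90'."""
    return int(version.strip().split(".")[0])


def versions_match(chrome_version, driver_version):
    """True when browser and driver share a major version (assumes Chrome 73+)."""
    return major_version(chrome_version) == major_version(driver_version)


if __name__ == "__main__":
    location = find_chromedriver()
    if location is None:
        print("chromedriver was not found on the path")
    else:
        print("chromedriver found at", location)
    # Illustrative version strings, not read from a real installation.
    print(versions_match("114.0.5735.198", "114.0.5735.90"))
```

If `find_chromedriver()` returns `None`, the directory holding the extracted driver has probably not been added to the environment variables yet.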
Installing the chromedriver-binary package
- "PyCharm" > "File" > "Settings" > "Project Interpreter" > "+" > search for "chromedriver-binary" > "Install"
Once the package is installed, it is ready to use.
Selenium operations
- Selenium operations fall into two main categories:
- Getting UI elements
  - find_element_by_id
  - find_elements_by_name
  - find_elements_by_xpath
  - find_elements_by_link_text
  - find_elements_by_partial_link_text
  - find_elements_by_tag_name
  - find_elements_by_class_name
  - find_elements_by_css_selector
- Simulating user actions on those UI elements
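The lookup methods listed above belong to the Selenium 3 API; they were removed in Selenium 4.3, where the same lookups are written as `find_element(by, value)` and `find_elements(by, value)`. The sketch below shows how the old names map onto the new form. The dictionary values are the plain strings behind Selenium's `By.ID`, `By.NAME` and related constants, so the block does not need Selenium to run; `find_one` and `find_all` are hypothetical helpers, not part of Selenium.

```python
# Keys are the suffixes of the legacy method names; values are the plain
# strings behind Selenium's By.ID, By.NAME, ... constants.
LOCATOR_STRATEGIES = {
    "id": "id",
    "name": "name",
    "xpath": "xpath",
    "link_text": "link text",
    "partial_link_text": "partial link text",
    "tag_name": "tag name",
    "class_name": "class name",
    "css_selector": "css selector",
}


def find_one(driver, suffix, value):
    """Stand-in for the legacy driver.find_element_by_<suffix>(value)."""
    return driver.find_element(LOCATOR_STRATEGIES[suffix], value)


def find_all(driver, suffix, value):
    """Stand-in for the legacy driver.find_elements_by_<suffix>(value)."""
    return driver.find_elements(LOCATOR_STRATEGIES[suffix], value)
```

For example, `find_one(driver, "id", "kw")` performs the same lookup as `driver.find_element_by_id('kw')` did in Selenium 3, given a real `driver` object.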
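- Case py29chromedriver.py code file:
https://xpwi.github.io/py/py%E7%88%AC%E8%99%AB/py29chromedriver.py
# Selenium + Chrome, case 1
from selenium import webdriver

# If chromedriver is not on the path, pass the location where you extracted it
driver = webdriver.Chrome()
url = "http://www.baidu.com"
driver.get(url)

# Look the element up by id; appending .text returns its visible text
# (find_element_by_id is the Selenium 3 API; Selenium 4.3+ uses find_element(By.ID, ...))
text = driver.find_element_by_id('wrapper').text
print(text)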
Run results
1. Console: prints the text we wanted to see
2. The program automatically opens a Chrome browser window, which shows a notice that Chrome is being controlled by automated test software
Now that we control the browser, we can do much more.
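Important: case py29chromedriver2.py
- Case py29chromedriver2.py code file:
https://xpwi.github.io/py/py%E7%88%AC%E8%99%AB/py29chromedriver2.py
# Selenium + Chrome, case 2
# The opened browser may show a pop-up; click "Cancel" or simply ignore it
import time

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

# No path is needed by default; pass one if chromedriver is not on the path
driver = webdriver.Chrome()
url = "http://www.baidu.com"
driver.get(url)

# Look the element up by id; appending .text returns its visible text
text = driver.find_element_by_id('wrapper').text
print(driver.title)

# Take a screenshot of the page and save it as py29baidu.png
driver.save_screenshot('py29baidu.png')

# Have Chrome type 大熊猫 (giant panda) into the search box
driver.find_element_by_id('kw').send_keys(u"大熊猫")
# Click the search button, id = 'su'
driver.find_element_by_id('su').click()

# Pause for 5 seconds so the page can load images and other content
time.sleep(5)
# Take another screenshot and save it
driver.save_screenshot("py29daxiongmao.png")

# Get the cookies of the current page, often needed for pages that require login
# (get_cookies() returns all of them; get_cookie(name) returns only the named one)
print(driver.get_cookies())

# Simulate pressing two keys together: Ctrl + A
driver.find_element_by_id('kw').send_keys(Keys.CONTROL, 'a')
# Simulate pressing two keys together: Ctrl + C
driver.find_element_by_id('kw').send_keys(Keys.CONTROL, 'c')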
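The fixed five-second pause in case 2 either wastes time or is too short on a slow connection. Selenium provides `WebDriverWait` for waiting on a condition; the sketch below is a dependency-free version of the same polling idea so that it runs without a browser. The name `wait_until` and its default values are illustrative, and the title check in the final comment is an assumption about how the results page titles itself.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Possible use in place of time.sleep(5), given a real driver object and
# assuming the results page puts the query in its title:
#     wait_until(lambda: "大熊猫" in driver.title)
```

The helper returns as soon as the condition holds, so a fast page costs only one or two polls instead of the full five seconds.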
Run results
Running the code automatically opens the browser, types 大熊猫 (giant panda) into the search box, takes screenshots and saves them, then selects all the text in the input box and copies it.
Isn't that magical? The screenshots are saved in the same directory as the code.
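Case 2 also prints cookie data, which is mostly useful for pages that require login. Selenium's `get_cookies()` returns a list of dictionaries, each with at least a `name` and a `value` key. As a small sketch of how that list might be reused elsewhere, for example in a plain HTTP client, the hypothetical helpers below turn it into a name-to-value mapping and a `Cookie` header value. The sample data is made up.

```python
def cookies_to_dict(cookies):
    """Map cookie names to values from a list shaped like driver.get_cookies()."""
    return {cookie["name"]: cookie["value"] for cookie in cookies}


def cookie_header(cookies):
    """Build the value of an HTTP Cookie header from the same list."""
    pairs = cookies_to_dict(cookies).items()
    return "; ".join(f"{name}={value}" for name, value in pairs)


if __name__ == "__main__":
    # Made-up sample in the shape Selenium returns; real cookies will differ.
    sample = [
        {"name": "session", "value": "abc123", "domain": ".example.com", "path": "/"},
        {"name": "lang", "value": "zh", "domain": ".example.com", "path": "/"},
    ]
    print(cookies_to_dict(sample))
    print(cookie_header(sample))
```

With a live browser the input would be `driver.get_cookies()` instead of the sample list.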
Bye
- This note may not be reprinted by any person or organization