One comment per line of Python code: bulk-download large image sets. Beginners, give it a try!

Source: Internet
Author: User

This template is mainly meant for multi-threaded downloading of image sets, but it can also be adapted for general-purpose crawlers. The attachment contains a directory of examples.

The details differ for every site and URL you use it on. Everything marked (*) is something you have to customize yourself, which requires some basic HTML knowledge, so adapt it as needed.
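Most of the (*) lines in the source are plain string slicing on URLs, so it helps to try them on one sample address before pointing the crawler at a real site. The sketch below reproduces those slices; the two URLs and the function names `category_folder`, `set_name` and `page_urls` are hypothetical and only serve as an illustration. It assumes the first image of a set is named `1.jpg` (five characters), which is what the `[:-5]` slice relies on.

```python
def category_folder(category_url):
    """Folder name: the last path segment of a URL that ends in '/'."""
    return category_url.split('/')[-2]


def set_name(first_image_url):
    """Set name: the directory that contains the first image."""
    return first_image_url.split('/')[-2]


def page_urls(first_image_url, page_count):
    """Build every image URL of a set from the URL of its first image."""
    base = first_image_url[:-5]  # drop the trailing "1.jpg" (5 characters)
    return [base + str(page) + ".jpg" for page in range(1, page_count + 1)]


# Hypothetical addresses, used only to show what each slice returns.
category = "http://example.com/gallery/nature/"
first_image = "http://img.example.com/2017/set42/1.jpg"

print(category_folder(category))   # nature
print(set_name(first_image))       # set42
print(page_urls(first_image, 3))
```

If a site names its files differently (for example `01.jpg`), the slice lengths have to change accordingly.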

Update: added resume support for interrupted downloads, and the folder name is now taken from the image set's address.
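The source listing does not show how the resume feature is implemented. One simple way to get that behavior, offered here as an assumption rather than the author's method, is to skip any file that already exists and to write new files under a temporary name until they are complete. The helper name `download_if_missing`, the `fetch` callback and the `.part` suffix are all hypothetical.

```python
import os


def download_if_missing(path, fetch):
    """Write the bytes returned by fetch() to path, unless a non-empty
    file is already there.

    Returns True if the file was downloaded, False if it was skipped.
    """
    if os.path.exists(path) and os.path.getsize(path) > 0:
        return False  # already finished on an earlier run
    part = path + ".part"  # hypothetical suffix for unfinished files
    with open(part, "wb") as f:
        f.write(fetch())
    os.replace(part, path)  # rename only after the write has completed
    return True
```

In the crawler, `fetch` could be something like `lambda: requests.get(page_url, headers=headers).content`, so the network request is only made for files that are still missing.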

Every line of the Python source is followed by a detailed comment:

import requests  # see http://docs.python-requests.org/zh_CN/latest/user/quickstart.html
from bs4 import BeautifulSoup  # see http://beautifulsoup.readthedocs.io/zh_CN/v4.4.0/#id55
import os  # write data to local disk
import urllib.request  # some image URLs return 403 Forbidden unless the related page is opened first, so the page is opened too; can be omitted
import re  # regular expressions, for matching formats
from multiprocessing import Pool  # pool of worker processes for parallel downloads

headers = {'User-Agent': "Mozilla/5.0", "Referer": "Gallery Home"}  # browser request headers; hotlink protection sometimes rejects direct requests from Python, so we pretend to be a browser


def run(url):  # (*) URL of one category page of the gallery
    start_html = requests.get(url, headers=headers)  # request the HTML of this URL
    soup = BeautifulSoup(start_html.text, 'lxml')  # parse the page with BeautifulSoup ('lxml' is the chosen parser; see the official documentation)
    all_a = soup.find('div', class_="The subject's class name").find_all('a')  # (*) find all image sets in the body of the page
    path = url.split('/')[-2]  # (*) the last part of the URL is usually the category name and can serve as the folder name
    if not os.path.exists("Storage Total directory" + "/" + path):  # if the folder does not exist yet
        os.makedirs("Storage Total directory" + "/" + path)  # create the storage folder
    os.chdir("Storage Total directory" + "/" + path)  # switch to that folder
    for a in all_a:
        href = a["href"]  # (*) URL of the image set's page; can be omitted
        elem = a.img['src']  # (*) address of this image
        folder = elem.split('/')[-2]  # (*) name of the image set
        length = a.next_sibling.next_sibling.get_text()
        max_span = int(length[-17:-14])  # (*) number of pages in the image set
        html = requests.get(href, headers=headers, allow_redirects=False)  # visit the set's page and block redirects (another hotlink protection measure)
        u = urllib.request.urlopen(href)  # actually open the page; can be omitted
        for page in range(1, max_span + 1):
            page_url = elem[:-5] + str(page) + ".jpg"  # (*) image address format; you need to work this out for each site
            print(page_url)  # (*) print the address; can be omitted
            img_html = requests.get(page_url, headers=headers, allow_redirects=False)  # request the image
            name = folder + '-' + str(page)  # (*) image name format: set name + image number
            with open(name + '.jpg', 'wb') as f:  # write the image to disk
                f.write(img_html.content)  # binary files must be written from .content


if __name__ == '__main__':  # required when worker processes are started by spawning
    urls = {'url1', 'url2', 'url3'}  # the URL of each category
    pool = Pool()  # number of worker processes (defaults to the CPU count)
    for url in urls:
        pool.apply_async(run, args=(url,))  # args must be a tuple
    pool.close()
    pool.join()
    print('All pictures are finished')
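Two details of the pool dispatch are easy to get wrong. `args` must be a tuple, so a single URL is written `(url,)`; and `apply_async` only reports a worker's exception when `get()` is called on its result, so a failed category otherwise disappears silently. The sketch below shows one way to handle both. It assumes threads are sufficient because downloading is I/O-bound, and therefore uses `multiprocessing.pool.ThreadPool`, which offers the same `apply_async` interface as `Pool`. The names `run_all` and `fake_worker` are hypothetical; `fake_worker` stands in for the real download function.

```python
from multiprocessing.pool import ThreadPool


def run_all(worker, urls, threads=4):
    """Call worker(url) for every URL in a thread pool.

    Returns a dict mapping each URL to ("ok", result) or ("failed", error).
    """
    results = {}
    with ThreadPool(threads) as pool:
        # args must be a tuple: (url,) and not (url)
        pending = {url: pool.apply_async(worker, args=(url,)) for url in urls}
        for url, task in pending.items():
            try:
                results[url] = ("ok", task.get())  # get() re-raises worker errors
            except Exception as exc:
                results[url] = ("failed", repr(exc))
    return results


def fake_worker(url):
    """Stand-in for the real download function; fails on purpose for 'bad'."""
    if "bad" in url:
        raise ValueError("boom")
    return len(url)


for url, outcome in run_all(fake_worker, ["url1", "bad-url"]).items():
    print(url, outcome)
```

Collecting the outcome per URL makes it easy to retry only the categories that failed.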
