An image-grabbing program written in Python, version 1.0.00



#coding=gbk
import urllib
import urllib2
import re
import os
import time
# import readline

def getHtml(url):
    # Some sites reject non-browser clients, so send browser-like headers.
    # Host is not set here: urllib2 fills it in from the URL.
    heads = {'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Charset':'GB2312,utf-8;q=0.7,*;q=0.7',
            'Accept-Language':'zh-cn,zh;q=0.5',
            'Cache-Control':'max-age=0',
            'Connection':'keep-alive',
            'Keep-Alive':'115',
            'Referer':url,
            'User-Agent':'Mozilla/5.0 (X11; U; Linux x86_64; zh-CN; rv:1.9.2.14) Gecko/20110221 Ubuntu/10.10 (maverick) Firefox/3.6.14'}
 
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
    urllib2.install_opener(opener)
    req = urllib2.Request(url)
    opener.addheaders = heads.items()
    respHtml = opener.open(req).read()
    # return respHtml.decode('gbk').encode('utf-8')
    return respHtml

def getImg(html):
#     reg = r'input src=\'*(.*?\.jpg)'
    reg = r'src="(.+?\.jpg)"'
    imgre = re.compile(reg)
    imglist = re.findall(imgre,html)
    # print(imglist)
    # return 1
    x = 0
    # Create a timestamped directory to hold the downloads
    createDir = 'getpic'+time.strftime('%Y%m%d%H%M%S')
    if not os.path.isdir(createDir) and not os.path.isfile(createDir):
        os.mkdir(createDir)
    # print(os.getcwd())
    os.chdir(createDir)
    # print(os.getcwd())
    for imgurl in imglist:
        print("Fetching image: "+imgurl)
        urllib.urlretrieve(imgurl,'%s.jpg' % x)
        x+=1
       
# readline.parse_and_bind("control-v: paste")
website = raw_input("please input website:")

html = getHtml(website)
# print(html)
getImg(html)
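One limitation of getImg is that it passes whatever appears in the src attribute straight to urlretrieve, so a relative path such as /pics/a.jpg is not a complete URL that can be fetched. The following is a minimal sketch of one way to resolve such paths against the page address. It is written for Python 3 (in Python 2, urljoin lives in the urlparse module), and the function name absolute_image_urls and the example.com addresses are assumptions made for illustration, not part of the original program.

```python
import re
from urllib.parse import urljoin

# Same pattern as in getImg above.
IMG_SRC = re.compile(r'src="(.+?\.jpg)"')


def absolute_image_urls(page_url, html):
    """Return the .jpg URLs found in html, resolved against page_url.

    Hypothetical helper: the original getImg uses the matches as-is.
    """
    return [urljoin(page_url, src) for src in IMG_SRC.findall(html)]


if __name__ == "__main__":
    sample = '<img src="/pics/a.jpg"><img src="http://cdn.example.com/b.jpg">'
    page = "http://example.com/gallery/index.html"
    for url in absolute_image_urls(page, sample):
        print(url)
```

URLs that are already absolute pass through urljoin unchanged, so the helper could be applied to every match before downloading.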

 

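I have been learning Python recently and wrote this image grabber with the help of material found online. The following points still need work:

1. How to support interactive input, for example several options and several URLs at once

2. Support a list of match patterns; some sites use different markup, so a single rule does not work everywhere

3. WinPython does not support the readline module, so the cmd window of the exe built by py2exe needs a settings change before a URL can be pasted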
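Points 1 and 2 could be approached roughly as follows. This is only a sketch under assumptions: the pattern list, the --pattern option, and the function names find_images and parse_args are hypothetical and do not exist in the program above. It uses argparse from the standard library to accept several URLs, and tries each regular expression in turn while dropping duplicate matches.

```python
import argparse
import re

# Hypothetical starter patterns; real sites may need different ones.
DEFAULT_PATTERNS = [
    r'src="(.+?\.jpg)"',    # the rule used by getImg
    r"src='(.+?\.jpg)'",    # single-quoted attributes
    r'href="(.+?\.jpg)"',   # links that point directly at an image
]


def find_images(html, patterns=None):
    """Apply every pattern and return unique matches in first-seen order."""
    if patterns is None:
        patterns = DEFAULT_PATTERNS
    found = []
    for pattern in patterns:
        for match in re.findall(pattern, html):
            if match not in found:
                found.append(match)
    return found


def parse_args(argv=None):
    """Accept one or more page URLs plus optional extra patterns."""
    parser = argparse.ArgumentParser(description="Grab pictures from web pages")
    parser.add_argument("urls", nargs="+", help="one or more page URLs")
    parser.add_argument("--pattern", action="append",
                        help="extra regex with one capture group; repeatable")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    patterns = DEFAULT_PATTERNS + (args.pattern or [])
    for url in args.urls:
        # Fetching is left out here; getHtml(url) would supply the HTML
        # that find_images(html, patterns) then scans.
        print("would scan %s with %d patterns" % (url, len(patterns)))
```

Taking the URLs as command-line arguments also avoids pasting into the console window, which is the difficulty described in point 3.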

 

How to build a standalone exe file with py2exe

Create a file named create.py in the working directory with the following content:

from distutils.core import setup
import py2exe  # importing py2exe registers the "py2exe" command with distutils
import os, sys
import shutil
 
if len(sys.argv) == 1:
    sys.argv.append("py2exe")
     
includes = ["encodings", "encodings.*"]
options = {"py2exe": 
             {   "compressed": 1, 
                 "optimize": 2, 
                 "includes": includes, 
                 "dist_dir": "bin",
                 "bundle_files": 1 
             } 
           } 
setup(    
     version = "1.0", 
     description = u'To grab pictures',
     name = "grabpic1.0.00", 
     options = options, 
     zipfile = None, 
     console=[{"script": "grabpic1.0.00.py"}],   
     data_files=[]
     )
# w9xpopen.exe is only needed on Windows 9x, so remove it from the output
os.remove("bin/w9xpopen.exe")
# shutil.rmtree("build")

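Run the command python create.py py2exe. When the build finishes, the executable is at bin/grabpic1.0.00.exe

Run grabpic1.0.00.exe and right-click to open its settings. In the middle there is an Insert Mode option; tick it, and you can then paste into the cmd window.

PS: This is my first post and I could not work out how to attach screenshots, sorry.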

 

References:

http://blog.csdn.net/txg703003659/article/details/30459475

http://blog.csdn.net/linda1000/article/details/12909439

 
