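Over the past few days I needed to write a web crawler. Crawlers immediately made me think of Python, which also seems to have a wealth of crawler-related material, so I decided to write it in Python. I then found Scrapy, an open-source Python library built as a crawling framework, and settled on it without hesitation. The first step is installing Scrapy, and I chose to do that on Windows.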
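Introduction to Scrapy
Scrapy is a fast, efficient web-crawling framework for Python. It is mainly used to crawl web pages, extract information from them, and format the resulting data. It is commonly applied to data mining, monitoring, and testing.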
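Installing the required software
Installation steps
1. Install Python
Download Python from the official site (http://www.python.org/ftp/python/2.7.5/python-2.7.5.msi) and double-click the msi file to install it. Then add the Python paths (D:\Python27;D:\Python27\Scripts;) to the PATH environment variable.
Verify that the installation succeeded: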
C:\Users\admin>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
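If the python command is not found, the usual cause is a missing PATH entry. The following is a minimal sketch for checking this, assuming the D:\Python27 install location used in this post. The helper name missing_path_entries is made up for illustration and is not part of Python or Scrapy.

```python
import ntpath
import os


def missing_path_entries(path_value, required_dirs):
    """Return the entries of required_dirs that do not appear in path_value.

    path_value is a Windows-style PATH string with entries separated by ';'.
    The comparison ignores case and trailing backslashes.
    """
    present = set()
    for entry in path_value.split(";"):
        entry = entry.strip()
        if entry:
            present.add(ntpath.normcase(ntpath.normpath(entry)))
    return [d for d in required_dirs
            if ntpath.normcase(ntpath.normpath(d)) not in present]


if __name__ == "__main__":
    # Assumption: Python was installed to D:\Python27 as in this post.
    required = [r"D:\Python27", r"D:\Python27\Scripts"]
    missing = missing_path_entries(os.environ.get("PATH", ""), required)
    print("Missing from PATH: %s" % (missing or "nothing"))
```

Run it with the full interpreter path (for example D:\Python27\python.exe followed by the script name) so that the check works even when PATH is not yet set up.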
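2. Install setuptools
Download setuptools from the official site (http://pypi.python.org/pypi/setuptools). You can download the ez_setup.py file and run it directly; it completes the installation automatically:
python ez_setup.py
3. Install Zope.Interface
Download the msi installer that matches your Python version from the official site (http://pypi.python.org/pypi/zope.interface/) and double-click it to install. Verify that the installation succeeded: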
C:\Users\admin>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import zope.interface
>>>
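4. Install Twisted
Download the msi file for your Python version from the official site (http://twistedmatrix.com/trac/wiki/Downloads) and double-click it to install.
5. Install w3lib
Download the w3lib-1.9.0.tar.gz file from the official site (http://pypi.python.org/pypi/w3lib) and unpack it.
# Change into the plugin directory and run the install command
D:\python-plugin\w3lib-1.3>python setup.py install
Verify: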
D:\python-plugin\w3lib-1.3>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import w3lib
>>>
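Packages distributed as .tar.gz source archives, such as w3lib here, have to be unpacked before setup.py can be run, and Windows has no built-in tool for that format. As one option, Python's standard tarfile module can do it. This is only a sketch: the function name extract_sdist is hypothetical, and it assumes the destination is a directory dedicated to this one archive.

```python
import os
import tarfile


def extract_sdist(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir.

    Returns the first directory under dest_dir that contains a setup.py,
    or None if no setup.py was found.
    """
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    with tarfile.open(archive_path, "r:gz") as archive:
        # Only unpack archives from a source you trust.
        archive.extractall(dest_dir)
    for root, _dirs, files in os.walk(dest_dir):
        if "setup.py" in files:
            return root
    return None
```

The returned directory is the one to change into before running python setup.py install.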
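6. Install libxml2
Download the exe installer for your Python version from the official site (http://users.skynet.be/sbi/libxml-python/) and double-click it.
7. Install pyOpenSSL
Download pyOpenSSL from the official site (https://pypi.python.org/pypi/pyOpenSSL). The installer is a simple click-through.
8. Install Scrapy
Download Scrapy from the official site (https://pypi.python.org/pypi/Scrapy) and install it.
# Change into the Scrapy directory and run the installer
D:\python-plugin\Scrapy-0.16.5>python setup.py install
Verify: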
D:\python-plugin\Scrapy-0.16.5>scrapy
Scrapy 0.16.5 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command

D:\python-plugin\Scrapy-0.16.5>
Installation complete.
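The individual import checks above can also be run in one pass. The script below is a rough sketch rather than an official Scrapy tool: the helper name check_modules is invented for illustration, and the module names are the import names I would expect for the packages installed in this post (pyOpenSSL is imported as OpenSSL, and the libxml2 bindings as libxml2).

```python
import importlib


def check_modules(names):
    """Try to import each module name and return a list of (name, ok) pairs.

    Any exception raised during the import counts as a failure.
    """
    results = []
    for name in names:
        try:
            importlib.import_module(name)
            results.append((name, True))
        except Exception:
            results.append((name, False))
    return results


if __name__ == "__main__":
    # Assumed import names for the packages installed in the steps above.
    wanted = ["zope.interface", "twisted", "w3lib",
              "libxml2", "OpenSSL", "scrapy"]
    for name, ok in check_modules(wanted):
        print("%-15s %s" % (name, "OK" if ok else "MISSING"))
```

Any line reported as MISSING points to the installation step that should be repeated.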