Scrapy Installation Introduction
1. Scrapy Introduction
Scrapy is a fast, high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
Official homepage: http://www.scrapy.org/
2. Install Python 2.7
Official homepage: http://www.python.org/
Download: http://www.python.org/ftp/python/2.7.3/python-2.7.3.msi
1) Install Python
Installation directory: D:\Python27
2) Add environment variables
Brief steps: System Properties -> Advanced -> Environment Variables -> System Variables -> Path -> Edit, then append D:\Python27 and D:\Python27\Scripts.
3) Verify environment variables
T:\>set Path
Path=C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;D:\Rational\common;D:\Rational\ClearCase\bin;D:\Python27;D:\Python27\Scripts
PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH
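The same check can be scripted instead of read by eye. The following is a minimal sketch, assuming the installation directory used in this guide (D:\Python27); the helper name missing_path_entries is hypothetical and is not part of Python or Scrapy.

```python
import os


def missing_path_entries(path_value, required_dirs):
    """Return the directories from required_dirs that are absent from path_value."""
    # Windows separates Path entries with ';' and treats paths case-insensitively,
    # so normalise case and trailing backslashes before comparing.
    entries = [
        entry.strip().rstrip("\\").lower()
        for entry in path_value.split(";")
        if entry.strip()
    ]
    return [d for d in required_dirs if d.rstrip("\\").lower() not in entries]


if __name__ == "__main__":
    # Assumed directories: the install location chosen earlier in this guide.
    required = [r"D:\Python27", r"D:\Python27\Scripts"]
    missing = missing_path_entries(os.environ.get("PATH", ""), required)
    print("Missing from Path: %s" % (missing or "nothing"))
```

If the script reports a missing directory, repeat the Path edit and open a new command prompt so that the change takes effect.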
4) Verify Python
T:\>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
T:\>
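The interpreter banner above can also be checked from a script, which is handy when several Python versions are installed side by side. This is only a sketch; the function name is_python_27 is hypothetical.

```python
import sys


def is_python_27(version_info=sys.version_info):
    """Return True when the given version tuple is a Python 2.7.x release."""
    return tuple(version_info[:2]) == (2, 7)


if __name__ == "__main__":
    print("Running Python %d.%d.%d" % tuple(sys.version_info[:3]))
    if not is_python_27():
        # The installers used in this guide are all built for Python 2.7.
        print("Warning: this guide was written for Python 2.7")
```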
3. Install Twisted
Twisted is an event-driven networking engine written in Python and licensed under the open source MIT license.
1) Install setuptools
Download, build, install, upgrade, and uninstall Python packages -- easily!
Official homepage: http://pypi.python.org/pypi/setuptools
Download: http://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11.win32-py2.7.exe
Installation process: omitted
2) Install zope.interface
Official homepage: http://pypi.python.org/pypi/zope.interface/
Download: http://pypi.python.org/packages/2.7/z/zope.interface/zope.interface-4.0.1-py2.7-win32.egg
Installation process:
T:\>d:
D:\>cd D:\Python27\Scripts
D:\Python27\Scripts>easy_install.exe zope.interface-4.0.1-py2.7-win32.egg
Processing zope.interface-4.0.1-py2.7-win32.egg
creating d:\python27\lib\site-packages\zope.interface-4.0.1-py2.7-win32.egg
Extracting zope.interface-4.0.1-py2.7-win32.egg to d:\python27\lib\site-packages
Adding zope.interface 4.0.1 to easy-install.pth file
Installed d:\python27\lib\site-packages\zope.interface-4.0.1-py2.7-win32.egg
Processing dependencies for zope.interface==4.0.1
Finished processing dependencies for zope.interface==4.0.1
D:\Python27\Scripts>
Verify installation:
D:\Python27\Scripts>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import zope.interface
>>>
3) Install Twisted
Official homepage: http://twistedmatrix.com/trac/wiki/TwistedProject
Download: http://pypi.python.org/packages/2.7/T/Twisted/Twisted-12.1.0.win32-py2.7.msi
Installation Process: omitted
4. Install w3lib
Official homepage: http://pypi.python.org/pypi/w3lib
Download: http://pypi.python.org/packages/source/w/w3lib/w3lib-1.2.tar.gz
Decompression process: omitted
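Windows of this era has no built-in tool for .tar.gz archives, so one option is to unpack the download with Python's standard tarfile module. The sketch below is an assumption about how the omitted step could be done, not the method the original author used; extract_tarball is a hypothetical helper name, and the archive should come from a trusted source such as the download link above.

```python
import tarfile


def extract_tarball(archive_path, dest_dir="."):
    """Extract a gzip-compressed tar archive into dest_dir and return its member names."""
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        # Only extract archives you trust: members can carry arbitrary paths.
        archive.extractall(dest_dir)
    return names
```

For example, calling extract_tarball("w3lib-1.2.tar.gz", "T:\\") from the download directory should leave the sources in T:\w3lib-1.2, the directory used in the installation step. The same helper applies to the Scrapy source archive later in this guide.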
Installation Process:
T:\w3lib-1.2>python setup.py install
running install
running build
running build_py
creating build
creating build\lib
creating build\lib\w3lib
copying w3lib\encoding.py -> build\lib\w3lib
copying w3lib\form.py -> build\lib\w3lib
copying w3lib\html.py -> build\lib\w3lib
copying w3lib\http.py -> build\lib\w3lib
copying w3lib\url.py -> build\lib\w3lib
copying w3lib\util.py -> build\lib\w3lib
copying w3lib\__init__.py -> build\lib\w3lib
running install_lib
creating D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\encoding.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\form.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\html.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\http.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\url.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\util.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\__init__.py -> D:\Python27\Lib\site-packages\w3lib
byte-compiling D:\Python27\Lib\site-packages\w3lib\encoding.py to encoding.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\form.py to form.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\html.py to html.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\http.py to http.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\url.py to url.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\util.py to util.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\__init__.py to __init__.pyc
running install_egg_info
Writing D:\Python27\Lib\site-packages\w3lib-1.2-py2.7.egg-info
T:\w3lib-1.2>
Verify installation:
T:\>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import w3lib
>>>
5. Install libxml2
Official homepage: http://users.skynet.be/sbi/libxml-python/
Download: http://users.skynet.be/sbi/libxml-python/binaries/libxml2-python-2.7.7.win32-py2.7.exe
Installation Process: omitted
Verify installation:
T:\>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import libxml2
>>>
6. Install pyOpenSSL
Official homepage: http://pypi.python.org/pypi/pyOpenSSL
Download: http://pypi.python.org/packages/2.7/p/pyOpenSSL/pyOpenSSL-0.13.winxp32-py2.7.msi
Installation process: omitted
Verify installation:
T:\>python
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import OpenSSL
>>>
7. Install Scrapy
Official homepage: http://scrapy.org/
Download: http://pypi.python.org/packages/source/S/Scrapy/Scrapy-0.14.4.tar.gz
Decompression process: omitted
Installation Process:
T:\Scrapy-0.14.4>python setup.py install
……
Installing easy_install-2.7-script.py script to D:\Python27\Scripts
Installing easy_install-2.7.exe script to D:\Python27\Scripts
Installing easy_install-2.7.exe.manifest script to D:\Python27\Scripts
Using d:\python27\lib\site-packages
Finished processing dependencies for Scrapy==0.14.4
T:\Scrapy-0.14.4>
Verify installation:
T:\>scrapy
Scrapy 0.14.4 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

Use "scrapy <command> -h" to see more info about a command

T:\>
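The individual import checks performed after each step can be collected into a single script. This is a minimal sketch; check_imports is a hypothetical helper name, and the module list simply mirrors the packages installed in this guide (zope.interface, Twisted, w3lib, libxml2, pyOpenSSL and Scrapy).

```python
import importlib


def check_imports(module_names):
    """Map each module name to True if it can be imported, otherwise False."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


if __name__ == "__main__":
    # Import names of the packages installed above; pyOpenSSL is imported as OpenSSL.
    modules = ["zope.interface", "twisted", "w3lib", "libxml2", "OpenSSL", "scrapy"]
    for name, ok in sorted(check_imports(modules).items()):
        print("%-16s %s" % (name, "OK" if ok else "MISSING"))
```

Any module reported as MISSING points back to the section whose installation step needs to be repeated.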