I. Introduction to Scrapy
Scrapy is a fast, high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
Official homepage: http://www.scrapy.org/
II. Installing Python 2.7
Official homepage: http://www.python.org/
Download: http://www.python.org/ftp/python/2.7.3/python-2.7.3.msi
1) Install Python
Installation directory: D:\Python27
2) Add environment variables
System Properties -> Advanced -> Environment Variables -> System Variables -> Path -> Edit, then append d:\python27 and d:\python27\scripts to the existing value.
3) Verify the environment variables
T:\>path
PATH=C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\system32\wbem;d:\rational\common;d:\rational\clearcase\bin;d:\python27;d:\python27\scripts
PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH
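The same check can be scripted instead of reading the PATH value by eye. The following is a minimal sketch; the helper name `missing_path_entries` and the `REQUIRED` list are illustrative assumptions, and the directories are the ones used in this walkthrough.

```python
import os


def missing_path_entries(path_value, required_dirs, sep=";"):
    """Return the required directories that are absent from a PATH string.

    The comparison is case-insensitive and ignores surrounding spaces and
    trailing backslashes, so minor formatting differences are tolerated.
    """
    present = set()
    for entry in path_value.split(sep):
        entry = entry.strip().rstrip("\\").lower()
        if entry:
            present.add(entry)
    return [d for d in required_dirs
            if d.strip().rstrip("\\").lower() not in present]


# Directories from this walkthrough; adjust if Python is installed elsewhere.
REQUIRED = [r"D:\Python27", r"D:\Python27\Scripts"]

if __name__ == "__main__":
    missing = missing_path_entries(os.environ.get("PATH", ""), REQUIRED)
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("PATH contains both Python directories")
```

Open a new command prompt before running it, since a prompt that was already open keeps the old PATH value.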
4) Verify Python
T:\>python
... on win32
>>> exit()
T:\>
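The interactive check only shows that some interpreter starts. If several Python versions are installed, a short script can confirm that the one found on PATH is a 2.7 release. This is a sketch under that assumption; `is_python_27` is a hypothetical helper name.

```python
import sys


def is_python_27(version_info=None):
    """Return True when the given (or running) interpreter is Python 2.7.x."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == (2, 7)


if __name__ == "__main__":
    major, minor, micro = tuple(sys.version_info[:3])
    print("Running Python %d.%d.%d on %s" % (major, minor, micro, sys.platform))
    if not is_python_27():
        print("Warning: the packages in this walkthrough are built for Python 2.7")
```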
III. Installing Twisted
Twisted is an event-driven networking engine written in Python and licensed under the open source MIT license.
1) Install setuptools
Download, build, install, upgrade, and uninstall Python packages--easily!
Official homepage: http://pypi.python.org/pypi/setuptools
Download: http://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11.win32-py2.7.exe
Installation process: omitted
2) Install zope.interface
Official homepage: http://pypi.python.org/pypi/zope.interface/
Download: http://pypi.python.org/packages/2.7/z/zope.interface/zope.interface-4.0.1-py2.7-win32.egg
Installation process:
T:\>D:
D:\>cd D:\Python27\Scripts
D:\Python27\Scripts>easy_install.exe zope.interface-4.0.1-py2.7-win32.egg
Processing zope.interface-4.0.1-py2.7-win32.egg
creating d:\python27\lib\site-packages\zope.interface-4.0.1-py2.7-win32.egg
Extracting zope.interface-4.0.1-py2.7-win32.egg to d:\python27\lib\site-packages
Adding zope.interface 4.0.1 to easy-install.pth file
Installed d:\python27\lib\site-packages\zope.interface-4.0.1-py2.7-win32.egg
Processing dependencies for zope.interface==4.0.1
Finished processing dependencies for zope.interface==4.0.1
D:\Python27\Scripts>
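Binary eggs are built for one Python version and platform, and the filename records both (here `py2.7` and `win32`). As a rough illustration, the sketch below splits an egg filename into its parts so the download can be compared with the installed interpreter. The pattern is a simplified assumption rather than the one setuptools uses internally, and `parse_egg_filename` is a hypothetical helper.

```python
import re

# Simplified pattern: name-version[-pyX.Y][-platform].egg
# It assumes the project name and version contain no hyphens.
EGG_RE = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)"
    r"(?:-py(?P<pyver>\d+\.\d+))?"
    r"(?:-(?P<platform>.+))?\.egg$"
)


def parse_egg_filename(filename):
    """Return a dict with name, version, pyver and platform, or None."""
    match = EGG_RE.match(filename)
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    info = parse_egg_filename("zope.interface-4.0.1-py2.7-win32.egg")
    print(info["name"], info["version"], info["pyver"], info["platform"])
```

If `pyver` does not match the interpreter installed in section II, a different build of the package is needed.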
To verify the installation:
D:\Python27\Scripts>python
... on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import zope.interface
>>>
3) Install Twisted
Official homepage: http://twistedmatrix.com/trac/wiki/TwistedProject
Download: http://pypi.python.org/packages/2.7/T/Twisted/Twisted-12.1.0.win32-py2.7.msi
Installation process: omitted
IV. Installing w3lib
Official homepage: http://pypi.python.org/pypi/w3lib
Download: http://pypi.python.org/packages/source/w/w3lib/w3lib-1.2.tar.gz
Extraction process: omitted
Installation process:
T:\w3lib-1.2>python setup.py install
running install
running build
running build_py
creating build
creating build\lib
creating build\lib\w3lib
copying w3lib\encoding.py -> build\lib\w3lib
copying w3lib\form.py -> build\lib\w3lib
copying w3lib\html.py -> build\lib\w3lib
copying w3lib\http.py -> build\lib\w3lib
copying w3lib\url.py -> build\lib\w3lib
copying w3lib\util.py -> build\lib\w3lib
copying w3lib\__init__.py -> build\lib\w3lib
running install_lib
creating D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\encoding.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\form.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\html.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\http.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\url.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\util.py -> D:\Python27\Lib\site-packages\w3lib
copying build\lib\w3lib\__init__.py -> D:\Python27\Lib\site-packages\w3lib
byte-compiling D:\Python27\Lib\site-packages\w3lib\encoding.py to encoding.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\form.py to form.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\html.py to html.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\http.py to http.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\url.py to url.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\util.py to util.pyc
byte-compiling D:\Python27\Lib\site-packages\w3lib\__init__.py to __init__.pyc
running install_egg_info
Writing D:\Python27\Lib\site-packages\w3lib-1.2-py2.7.egg-info
T:\w3lib-1.2>
To verify the installation:
T:\>python
... on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import w3lib
>>>
V. Installing libxml2
Official homepage: http://users.skynet.be/sbi/libxml-python/
Download: http://users.skynet.be/sbi/libxml-python/binaries/libxml2-python-2.7.7.win32-py2.7.exe
Installation process: omitted
To verify the installation:
T:\>python
... on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import libxml2
>>>
VI. Installing pyOpenSSL
Official homepage: http://pypi.python.org/pypi/pyOpenSSL
Download: http://pypi.python.org/packages/2.7/p/pyOpenSSL/pyOpenSSL-0.13.winxp32-py2.7.msi
Installation process: omitted
To verify the installation:
T:\>python
... on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import OpenSSL
>>>
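Rather than starting the interpreter once per package, the import checks for all dependencies can be run in one pass. This is a minimal sketch: `check_imports` and the `DEPENDENCIES` list are illustrative names, and the list assumes the usual import names for the packages installed in sections III to VI.

```python
import importlib

# Import names for the dependencies installed so far.
# Module names are case-sensitive: pyOpenSSL is imported as "OpenSSL".
DEPENDENCIES = ["zope.interface", "twisted", "w3lib", "libxml2", "OpenSSL"]


def check_imports(module_names):
    """Try to import each module; map each name to None or an error message."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:
            # A broken install can fail with errors other than ImportError,
            # so any exception is reported as a failed check.
            results[name] = "%s: %s" % (type(exc).__name__, exc)
    return results


if __name__ == "__main__":
    for name, error in sorted(check_imports(DEPENDENCIES).items()):
        status = "OK" if error is None else "FAILED (%s)" % error
        print("%-16s %s" % (name, status))
```

Any line reported as FAILED points at the section whose installation step should be repeated before moving on to Scrapy itself.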
VII. Installing Scrapy
Official homepage: http://scrapy.org/
Download: http://pypi.python.org/packages/source/S/Scrapy/Scrapy-0.14.4.tar.gz
Extraction process: omitted
Installation process:
T:\Scrapy-0.14.4>python setup.py install
...
Installing easy_install-2.7-script.py script to D:\Python27\Scripts
Installing easy_install-2.7.exe script to D:\Python27\Scripts
Installing easy_install-2.7.exe.manifest script to D:\Python27\Scripts
Using d:\python27\lib\...
... for Scrapy==0.14.4
T:\Scrapy-0.14.4>
To verify the installation:
T:\>scrapy
Scrapy 0.14.4 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

Use "scrapy <command> -h" to see more info about a command
T:\>
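When the installation is verified from a script (for example on several machines), the first line of the `scrapy` output can be checked for the expected version. The sketch below only parses a banner string that has already been captured; `parse_scrapy_banner` is a hypothetical helper, and it assumes the banner keeps the "Scrapy X.Y.Z - ..." shape shown above.

```python
import re


def parse_scrapy_banner(text):
    """Return the version as a tuple of ints, or None if no banner is found."""
    match = re.match(r"\s*Scrapy\s+(\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


if __name__ == "__main__":
    # Banner line copied from the command output in this walkthrough.
    banner = "Scrapy 0.14.4 - no active project"
    version = parse_scrapy_banner(banner)
    print("Detected Scrapy version: %s" % (version,))
    print("At least 0.14: %s" % (version is not None and version >= (0, 14)))
```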
Scrapy Installation Introduction