I. Installing the Scrapy environment
1. Installation of supporting components
The development environment is Visual Studio 2015 Community with Python 3.5 on Windows 8.1. The supporting components that need to be installed are listed below:
For any component that cannot be installed with pip, easy_install, or a standalone .exe installer, use the per-component instructions below. (A consolidated import check appears after this list.)
(1) pywin32 (Win32 API access)
Download the matching installer from http://www.lfd.uci.edu/~gohlke/pythonlibs/ and check the installation with import win32api. If import win32con runs but import win32api reports a missing DLL, copy all the files under Python\Lib\site-packages\pywin32_system32\ into C:\Windows\System32; after that the import should succeed.
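A minimal check sketch for the scenario above (the GetVersionEx call is just one convenient way to confirm the DLLs actually load):

    # verify that pywin32 is importable and its DLLs load
    import win32api
    import win32con
    print(win32api.GetVersionEx())  # prints Windows version info on success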
(2) Twisted (asynchronous networking)
Download the matching installer from http://www.lfd.uci.edu/~gohlke/pythonlibs/ and check the installation with import twisted (import OpenSSL tests pyOpenSSL, not Twisted).
(3) zope.interface
Download the matching installer from http://www.lfd.uci.edu/~gohlke/pythonlibs/ and check the installation with import zope.interface (a bare import zope is not a reliable test, since zope is a namespace package).
(4) PyYAML
Download the corresponding executable installer from http://pyyaml.org/wiki/PyYAML.
(5) Requests
Use the command pip install requests==2.2.1
(6) ProgressBar
Download the matching installer from http://www.lfd.uci.edu/~gohlke/pythonlibs/.
(7) pyOpenSSL (SSL/TLS support)
Download the matching installer from http://www.lfd.uci.edu/~gohlke/pythonlibs/ and check the installation with import OpenSSL.
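As referenced above, a single script can confirm that every supporting component is importable. This is a minimal sketch; the module names follow the checks described in this list, except progressbar, which is my assumption for the ProgressBar package:

    # check that each supporting component can be imported
    import win32api        # pywin32
    import twisted         # Twisted
    import zope.interface  # zope.interface
    import yaml            # PyYAML
    import requests        # Requests
    import progressbar     # ProgressBar (assumed module name)
    import OpenSSL         # pyOpenSSL
    print("All supporting components imported successfully.")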
2. Installation of Scrapy
pip install scrapy
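To confirm the installation worked, run Scrapy's built-in version command, which prints the installed version:

    scrapy version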
II. Some Scrapy command-line directives
1. Create a new crawler project: scrapy startproject <project_name>
2. Create a new spider: scrapy genspider <spider_name> <domain>
3. Run a spider: scrapy crawl <spider_name>
4. Check a spider's contracts: scrapy check [-l] <spider_name>
5. List the spiders in the project: scrapy list
6. Edit a spider: scrapy edit <spider_name>
7. Download a page with the Scrapy downloader and print it to stdout: scrapy fetch <url>
8. Open a page in the browser as Scrapy sees it: scrapy view <url>
9. Fetch a URL and parse it with the spider: scrapy parse <url>
10. Run a quick benchmark (stress test): scrapy bench
11. Register custom commands by setting COMMANDS_MODULE = 'yourproject.commands' in settings.py (see the sketch after this list)
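As referenced in item 11, a custom command is a ScrapyCommand subclass placed in the module named by COMMANDS_MODULE. A minimal sketch, assuming a hypothetical project named myproject with a myproject/commands/hello.py module (all names here are illustrative):

    # myproject/commands/hello.py -- invoked as: scrapy hello
    from scrapy.commands import ScrapyCommand

    class Command(ScrapyCommand):
        requires_project = True

        def syntax(self):
            return "[options]"

        def short_desc(self):
            return "Print the names of the spiders in this project"

        def run(self, args, opts):
            # self.crawler_process exposes the project's spider loader
            for name in sorted(self.crawler_process.spider_loader.list()):
                print(name)

With COMMANDS_MODULE = 'myproject.commands' set in settings.py, the command name is taken from the module name, so this one runs as scrapy hello.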