I. Installing the Scrapy environment
1. Installation of supporting components
The development environment is Visual Studio 2015 Community with Python 3.5 on Windows 8.1. The supporting components that need to be installed are listed below:
For any component that cannot be installed with pip, easy_install, or a standalone .exe installer, use the per-component instructions below. (A consolidated import check appears after this list.)
(1) pywin32 (Win32 API access)
Download the matching installer from http://www.lfd.uci.edu/~gohlke/pythonlibs/ and check the installation with import win32api. If import win32con runs but import win32api reports a missing DLL, copy all the files under Python\Lib\site-packages\pywin32_system32\ into C:\Windows\System32; after that the import should succeed.
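A minimal check sketch for the scenario above (the GetVersionEx call is just one convenient way to confirm the DLLs actually load):

    # verify that pywin32 is importable and its DLLs load
    import win32api
    import win32con
    print(win32api.GetVersionEx())  # prints Windows version info on success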
(2) Twisted (asynchronous networking)
Download the matching installer from http://www.lfd.uci.edu/~gohlke/pythonlibs/ and check the installation with import twisted (import OpenSSL tests pyOpenSSL, not Twisted).
(3) zope.interface
Download the matching installer from http://www.lfd.uci.edu/~gohlke/pythonlibs/ and check the installation with import zope.interface (a bare import zope is not a reliable test, since zope is a namespace package).
(4) PyYAML
Download the corresponding executable installer from http://pyyaml.org/wiki/PyYAML.
(5) Requests
Use the command pip install requests==2.2.1
(6) ProgressBar
Download the matching installer from http://www.lfd.uci.edu/~gohlke/pythonlibs/.
(7) pyOpenSSL (SSL/TLS support)
Download the matching installer from http://www.lfd.uci.edu/~gohlke/pythonlibs/ and check the installation with import OpenSSL.
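As referenced above, a single script can confirm that every supporting component is importable. This is a minimal sketch; the module names follow the checks described in this list, except progressbar, which is my assumption for the ProgressBar package:

    # check that each supporting component can be imported
    import win32api        # pywin32
    import twisted         # Twisted
    import zope.interface  # zope.interface
    import yaml            # PyYAML
    import requests        # Requests
    import progressbar     # ProgressBar (assumed module name)
    import OpenSSL         # pyOpenSSL
    print("All supporting components imported successfully.")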
2. Installation of Scrapy
pip install scrapy
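To confirm the installation worked, run Scrapy's built-in version command, which prints the installed version:

    scrapy version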
II. Some Scrapy command-line directives
1. Create a new crawler project: scrapy startproject <project_name>
2. Create a new spider: scrapy genspider <spider_name> <domain>
3. Run a spider: scrapy crawl <spider_name>
4. Check a spider's contracts: scrapy check [-l] <spider_name>
5. List the spiders in the project: scrapy list
6. Edit a spider: scrapy edit <spider_name>
7. Download a page with the Scrapy downloader and print it to stdout: scrapy fetch <url>
8. Open a page in the browser as Scrapy sees it: scrapy view <url>
9. Fetch a URL and parse it with the spider: scrapy parse <url>
10. Run a quick benchmark (stress test): scrapy bench
11. Register custom commands by setting COMMANDS_MODULE = 'yourproject.commands' in settings.py (see the sketch after this list)
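As referenced in item 11, a custom command is a ScrapyCommand subclass placed in the module named by COMMANDS_MODULE. A minimal sketch, assuming a hypothetical project named myproject with a myproject/commands/hello.py module (all names here are illustrative):

    # myproject/commands/hello.py -- invoked as: scrapy hello
    from scrapy.commands import ScrapyCommand

    class Command(ScrapyCommand):
        requires_project = True

        def syntax(self):
            return "[options]"

        def short_desc(self):
            return "Print the names of the spiders in this project"

        def run(self, args, opts):
            # self.crawler_process exposes the project's spider loader
            for name in sorted(self.crawler_process.spider_loader.list()):
                print(name)

With COMMANDS_MODULE = 'myproject.commands' set in settings.py, the command name is taken from the module name, so this one runs as scrapy hello.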