Scrapy Documentation
Scrapy is a fast, high-level screen scraping and web crawling framework written in Python, used to crawl websites and extract structured data from their pages. It is widely used for data mining, monitoring, and automated testing.
The appeal of Scrapy is that it is a framework that anyone can easily modify to suit their own needs. It also provides several spider base classes, such as BaseSpider and sitemap spiders, and the latest version adds support for Web 2.0 crawling.
Installation Dependencies
Installing Scrapy depends on the following Python libraries:
* lxml, an efficient XML and HTML parser
* parsel, an HTML/XML data extraction library written on top of lxml
* w3lib, a multi-purpose helper for dealing with URLs and web page encodings
* Twisted, an asynchronous networking framework
* cryptography and pyOpenSSL, to deal with various network-level security needs
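Before installing anything, it can help to see which of these libraries are already present. The following is only a minimal sketch and not part of Scrapy: the helper name `missing_modules` is made up for illustration, and the list assumes the usual import names (in particular, that pyOpenSSL is imported as `OpenSSL`).

```python
import importlib.util

# Import names for the dependencies listed above (assumed, not taken from the
# original text). Note that the pyOpenSSL package is imported as "OpenSSL".
SCRAPY_DEPENDENCIES = ["lxml", "parsel", "w3lib", "twisted", "cryptography", "OpenSSL"]


def missing_modules(names):
    """Return the names from `names` that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(SCRAPY_DEPENDENCIES)
    if missing:
        print("Not installed yet:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

Anything reported as missing would still need to be installed with pip.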
I chose to install these dependencies manually:
pip install lxml
pip install parsel
pip install w3lib
pip install twisted
pip install cryptography
pip install pyopenssl
The other packages installed without problems, but installing Twisted failed with this error:
Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": http://landinghub.visualstudio.com/visual-cpp-build-tools
So I downloaded a prebuilt wheel for offline installation instead (choose the file that matches your own Python version):
https://www.lfd.uci.edu/~gohlke/pythonlibs/#twisted
Save the file to a directory, then install it from that directory with:
pip install Twisted-17.9.0-cp36-cp36m-win32.whl
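The wheel has to match the interpreter it is installed into: in the filename above, `cp36` is the Python tag (CPython 3.6), `cp36m` is the ABI tag, and `win32` is the platform tag (32-bit Windows). As a rough sketch for checking this, the hypothetical helpers below split a wheel filename into its fields and report the tag and bitness of the running interpreter. They assume CPython and the standard wheel naming scheme, and they are not part of pip.

```python
import struct
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into its fields (an optional build tag is ignored)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: " + filename)
    # The last three fields are always Python tag, ABI tag, and platform tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def current_python_tag():
    """CPython tag of the running interpreter, e.g. 'cp36' for CPython 3.6."""
    return "cp{}{}".format(sys.version_info.major, sys.version_info.minor)


def interpreter_bits():
    """Return 32 or 64, based on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    info = parse_wheel_filename("Twisted-17.9.0-cp36-cp36m-win32.whl")
    print("Wheel:", info)
    print("This interpreter:", current_python_tag(), "{}-bit".format(interpreter_bits()))
```

If the Python tag or the bitness printed for the interpreter differs from the wheel's tags, pip will normally refuse the file as unsupported on this platform, so a different file from the download page is needed.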
Once Twisted is installed, install Scrapy itself with the following command:
pip install Scrapy
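To confirm that the installation worked, one option is to import the package and print its version. This is just a small sketch; the function name `scrapy_version` is invented here, and it returns None instead of failing when Scrapy is not importable.

```python
def scrapy_version():
    """Return Scrapy's version string, or None if Scrapy cannot be imported."""
    try:
        import scrapy
    except ImportError:
        return None
    return scrapy.__version__


if __name__ == "__main__":
    version = scrapy_version()
    if version is None:
        print("Scrapy is not installed in this environment.")
    else:
        print("Scrapy", version, "is installed.")
```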