Python Utility Kit: Scrapy Installation Tutorial

Source: Internet
Author: User

For any developer who wants to write a web crawler in Python, Scrapy is undoubtedly an excellent open source tool. Having installed it today, I found that the installation is not easy, so I am writing this post to save others a few detours.

If you do not know what Scrapy is, take a look at its official website, http://scrapy.org/; I will not repeat the introduction here.

Scrapy is not widely used in China. Besides being relatively new, it has real drawbacks: it needs many supporting packages, and those packages depend on one another, so installing it is a frustrating experience, and even after all that effort the result is not necessarily correct. I struggled with it for half a day today before the clouds finally parted.

The system environment used for this installation is Windows 7. The detailed procedure follows. If you work through it step by step, you should succeed.

1. Install Python 2.7. Why this version? The Scrapy website states explicitly: "Requirements: Python 2.5, 2.6, 2.7 (3.x is not yet supported)". In other words, Scrapy currently supports only Python 2.5, 2.6 and 2.7; versions 3.x and later are not supported. ActiveState's ActivePython is a Python suite for Windows that contains a full Python release, an IDE for Python programming, and Windows extensions for Python that provide full access to the Windows APIs and to the Windows Registry. Although ActivePython is not open source software, it is free to download, so I recommend installing it: http://www.activestate.com/products/activepython/

For beginners, here are some good learning materials that I hope will help. When resources are shared, everyone makes progress faster.

"Dive into Python" is a good tutorial: http://woodpecker.org.cn/diveintopython/

I also recommend a Python video learning site, http://www.csvt.net/video#, which is very good for beginners.

PyCharm is a good Python development environment. See http://www.jetbrains.com/pycharm/ for an overview; the download page is http://www.jetbrains.com/pycharm/download/

I have uploaded a PyCharm registration tool to my space; you are welcome to download it. Address: http://download.csdn.net/detail/wukaibo1986/4751339

Choose whichever IDE you like, then start the installation. Once Python is installed, set the environment variable first. The steps are: My Computer -> Advanced -> Environment Variables, then add C:\Python27 to the Path environment variable.

That completes the Python installation. Run python in a cmd window; if the Python interpreter starts and shows its version banner and prompt, Python was installed successfully.
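The Path setting can also be checked from Python itself. The following is only a minimal sketch and not part of the original procedure: the helper name dir_on_path is made up for illustration, and C:\Python27 is the install location assumed in this tutorial.

```python
import os
import sys


def dir_on_path(directory, path_value, sep=";"):
    """Return True if `directory` is one of the entries of a PATH-style string.

    Windows paths are case-insensitive and may end with a backslash,
    so both sides are normalised before they are compared.
    """
    def normalise(p):
        return p.strip().strip('"').rstrip("\\/").lower()

    wanted = normalise(directory)
    return any(normalise(entry) == wanted
               for entry in path_value.split(sep) if entry.strip())


if __name__ == "__main__":
    # Show which interpreter is running and whether its folder is on PATH.
    print(sys.version)
    print(dir_on_path(r"C:\Python27", os.environ.get("PATH", "")))
```

If the second line printed is False, the Python directory is probably missing from the environment variable; remember that a cmd window opened before the change will not see the new value.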

2. Install Twisted, following the instructions on the official website.

Before installing Twisted you need two third-party packages: zope.interface and pyOpenSSL. The Twisted website shows that the downloads for zope.interface, pyOpenSSL and so on are egg files, so we first need the setuptools tool.

1. Install setuptools. Download it here: http://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11.win32-py2.7.exe

Double-click the downloaded file to run it. Afterwards, the Scripts folder in the Python root directory will contain easy_install.py and other files, all with easy_install in their names. The easy_install tool is now installed.
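To confirm that the easy_install files really are in the Scripts folder, something like the following can be run. It is a rough sketch; the function name find_easy_install is my own, and it assumes the Scripts folder sits directly under the Python root directory as described above.

```python
import os
import sys


def find_easy_install(scripts_dir):
    """Return the sorted names of files in scripts_dir containing 'easy_install'."""
    if not os.path.isdir(scripts_dir):
        return []
    return sorted(name for name in os.listdir(scripts_dir)
                  if "easy_install" in name.lower())


if __name__ == "__main__":
    # sys.prefix is the Python root directory (C:\Python27 in this tutorial).
    print(find_easy_install(os.path.join(sys.prefix, "Scripts")))
```

An empty list suggests that the setuptools installer did not run against this Python installation.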

2. Install zope.interface. The Twisted download page is http://twistedmatrix.com/trac/wiki/Downloads

Click zope.interface there to go to http://pypi.python.org/pypi/zope.interface#download and select the egg that matches your current environment.

Download the egg file you selected and copy it into the Scripts directory under the Python root mentioned above, the same directory that holds the easy_install files. Then open cmd, change to that Scripts directory, and run easy_install.py followed by the egg file name to install the egg.

To check whether zope.interface was installed successfully, run import zope.interface in the Python interpreter. If no error is reported, zope.interface is installed correctly.

3. Install pyOpenSSL in the same way. http://pypi.python.org/pypi/pyOpenSSL lists the versions of pyOpenSSL available to choose from. Download the one you need, copy it into the Scripts folder, open cmd, change to the Scripts folder, and run easy_install.exe followed by the file name, for example easy_install.exe Pyopenssl-0.12-py2.6-win-amd64.egg, to install it.

To verify that the installation was successful, run import OpenSSL in the Python interpreter and check that the import works. If no error is reported, the installation is correct.
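The two import checks so far (zope.interface and OpenSSL) can be done in one go. This is just a small sketch of the idea; check_imports is a name I made up, and it only confirms that the modules can be imported, not that they work correctly.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; map its name to None on success or to the error text."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # a broken binary package may fail in other ways too
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    checked = check_imports(["zope.interface", "OpenSSL"])
    for name, error in sorted(checked.items()):
        print("%-16s %s" % (name, "OK" if error is None else "FAILED: " + error))
```

Any line marked FAILED points to the package that needs to be reinstalled before going on.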

4. Install Twisted. Go back to the Twisted download page, http://twistedmatrix.com/trac/wiki/Downloads. What is needed here is the Twisted build for Python 2.6, so we selected the second EXE version; choose the build that matches the Python version you actually installed. After downloading, double-click to install. The installation runs automatically, so it needs little explanation. The most likely error is a version mismatch, which means you did not select the Twisted build for your Python version.

Twisted is now installed, but we cannot yet conclude that everything is fine. So far there are four supporting packages: setuptools, zope.interface, pyOpenSSL and Twisted. The Twisted page also lists PyCrypto 2.0.1 for Python 2.5, which we have not touched. Because I am using Python 2.6, I am ignoring it for now, but can it be ignored completely? We do not know what this package does, whether a PyCrypto 2.0.1 build exists for Python 2.6, or whether another package takes over its role. So for now we can only keep it in mind in case problems appear during actual development.
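Since a mismatch between the package build and the Python version is the most likely cause of failure, it can help to compare a downloaded file name with the running interpreter before installing it. The sketch below assumes the file name carries a pyX.Y tag, as the setuptools and pyOpenSSL file names in this tutorial do; python_tag and matches_interpreter are hypothetical helper names of my own.

```python
import re
import sys


def python_tag(filename):
    """Extract the (major, minor) Python version from a name such as '...-py2.6-...'."""
    match = re.search(r"py(\d)\.(\d+)", filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def matches_interpreter(filename, version=None):
    """Return True if the file's pyX.Y tag equals the given (or the running) version."""
    if version is None:
        version = sys.version_info[:2]
    return python_tag(filename) == tuple(version[:2])


if __name__ == "__main__":
    for name in ("setuptools-0.6c11.win32-py2.7.exe",
                 "Pyopenssl-0.12-py2.6-win-amd64.egg"):
        print("%s -> %s" % (name, matches_interpreter(name)))
```

A file without such a tag yields None from python_tag, so matches_interpreter reports False for it; in that case check the download page by hand.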

3. Install lxml, following the Scrapy website. The bottom section of http://doc.scrapy.org/intro/install.html#intro-install covers installation on Windows. Click the lxml option there, which leads to http://users.skynet.be/sbi/libxml-python/, and choose the second download, the one whose name contains keywords such as libxml and python2.6. After installation, run import libxml2 in the Python interpreter; if no error is reported, the installation is correct. Note that this package provides the libxml2 bindings (imported as libxml2), not the lxml package itself.

4. Install Scrapy. Go to the download page on the Scrapy website, http://scrapy.org/download/, and click "Scrapy 0.12 on PyPI". Note the parenthetical after it, "(include Windows installers)", which means the link also offers installers for Windows. On the page http://pypi.python.org/pypi/Scrapy, download the file in exe format, then double-click it to run the installation. Afterwards, check that there is a scrapy folder in Python's third-party package directory (that is, site-packages). Then enter scrapy in cmd from any directory. If this reports an error, add the Scripts directory under the Python root directory to the environment variable, open a new cmd window, and run the scrapy command again from anywhere. If Scrapy prints its usage information instead of an error, the environment is configured successfully.
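Checking for the scrapy folder in site-packages can also be scripted. The following is a minimal sketch under the assumption that the installer placed Scrapy in the site-packages directory of the interpreter you run it with; has_package_dir is a hypothetical helper name.

```python
import os
import sysconfig


def has_package_dir(package, site_packages=None):
    """Return True if site_packages contains a folder named `package` (case-insensitive)."""
    if site_packages is None:
        # 'purelib' is the site-packages directory of the running interpreter.
        site_packages = sysconfig.get_paths()["purelib"]
    if not os.path.isdir(site_packages):
        return False
    return any(entry.lower() == package.lower()
               and os.path.isdir(os.path.join(site_packages, entry))
               for entry in os.listdir(site_packages))


if __name__ == "__main__":
    print(has_package_dir("scrapy"))
```

If this prints True but the scrapy command is still not found in cmd, the Scripts directory is most likely missing from the environment variable.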

However, I ran into a problem: the scrapy shell command produced no output. Going back over the installation files, I found that lxml is required.

With help from other Internet users, I found the download at http://pypi.python.org/pypi/lxml/2.3#downloads and installed it. After restarting and testing again, the problem was finally fixed.

At this point the installation of Scrapy is complete. I hope this is useful to everyone.

This installation mainly followed http://www.cnblogs.com/CLTANG/archive/2011/07/05/2098531.html, but I also ran into quite a few problems of my own along the way, and I hope the answers here are helpful.

Original address: http://blog.csdn.net/wukaibo1986/article/details/8167590
