Operating environment: CentOS 7.3 + Python 2.7 + Scrapy 1.3 + MongoDB 3.4 + BeautifulSoup 4.6
Programming tools: PyCharm + Robomongo + Xshell
Make sure that your Python version is 2.7.5 or later.
It is strongly recommended to run the installation through a proxy or VPN that gets around the firewall, since that makes the downloads much easier:
    yum install gcc libffi-devel python-devel openssl-devel
    pip install scrapy
If you are prompted with the following error:
    AttributeError: 'module' object has no attribute 'OP_NO_TLSv1_1'
it indicates that your Twisted version is too high. In that case, run:
    pip install Twisted==16.4.1
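The Twisted version can also be checked from Python before reinstalling. The following is a minimal sketch, not part of the original setup: the helper names `parse_version` and `exceeds_pin` are hypothetical, and it assumes that any Twisted release newer than 16.4.1 may trigger the error in this environment, which is how the advice above reads.

```python
import re

PINNED = (16, 4, 1)  # the Twisted version recommended above


def parse_version(text):
    """Turn a version string such as '17.1.0' into a tuple of ints."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component with no leading digits
        parts.append(int(match.group()))
    return tuple(parts)


def exceeds_pin(installed, pinned=PINNED):
    """Return True when the installed version is newer than the pinned one."""
    return parse_version(installed) > pinned


if __name__ == "__main__":
    try:
        import twisted
    except ImportError:
        print("Twisted is not installed")
    else:
        if exceeds_pin(twisted.__version__):
            print("Twisted %s is newer than the pin; consider: "
                  "pip install Twisted==16.4.1" % twisted.__version__)
        else:
            print("Twisted %s is at or below the pin" % twisted.__version__)
```

The comparison relies on Python's tuple ordering, so 16.10.0 is correctly treated as newer than 16.4.1, which a plain string comparison would get wrong.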
Then install the following:
    pip install "scrapy-mongodb"
    pip install beautifulsoup4
    pip install pymongo
Then run scrapy startproject funsion to create a project named funsion.

Appendix A: Scrapy shell debugging
On the Linux command line, enter the following (taking this site as an example):
    scrapy shell 'http://www.cnblogs.com/funsion/'
This opens the interactive shell. Enter the following:
    >>> from bs4 import BeautifulSoup
    >>> soup = BeautifulSoup(response.body, 'lxml')
    >>> print soup.title
If this outputs <title>funsion Wu-Blog Park </title>, then Scrapy and BeautifulSoup are working.

Appendix B: Reference documents
Scrapy official Chinese documentation: http://scrapy-chs.readthedocs.org/zh_CN/latest/index.html
BeautifulSoup Chinese manual: http://www.crummy.com/software/BeautifulSoup/bs4/doc/index.zh.html
scrapy-mongodb documentation: https://github.com/noplay/scrapy-mongodb

Appendix C: MongoDB installation method
    tar -zxf /usr/local/src/mongodb-linux-x86_64-rhel62-3.4.4.tgz -C /usr/local/src
    cd /usr/local/src/mongodb-linux-x86_64-rhel62-3.4.4
    mkdir -p /data/{mongodb_data,mongodb_log}
    /usr/local/src/mongodb-linux-x86_64-rhel62-3.4.4/bin/mongod --dbpath=/data/mongodb_data --logpath=/data/mongodb_log/mongodb.log --logappend --fork &
    ln -s /usr/local/src/mongodb-linux-x86_64-rhel62-3.4.4/bin/mongo /usr/local/bin/mongo
Edit /etc/rc.local, add the following line so that MongoDB starts at boot, and save:
    /usr/local/src/mongodb-linux-x86_64-rhel62-3.4.4/bin/mongod --dbpath=/data/mongodb_data --logpath=/data/mongodb_log/mongodb.log --logappend --fork &
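After starting mongod as described in Appendix C, one quick way to confirm that it is accepting connections is to try opening a TCP connection to it. This is a minimal sketch under assumptions the original does not state: that mongod listens on localhost at its default port 27017 (no --port option appears in the command above), and the helper name `is_listening` is hypothetical.

```python
import socket


def is_listening(host="127.0.0.1", port=27017, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return True
    except (socket.error, socket.timeout):
        # Connection refused, unreachable host, or timeout
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    if is_listening():
        print("mongod appears to be accepting connections on port 27017")
    else:
        print("nothing is listening on port 27017; "
              "check /data/mongodb_log/mongodb.log")
```

A successful connection only shows that something is listening on the port. Opening the mongo shell through the /usr/local/bin/mongo link is still the more direct test.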
Scrapy + BeautifulSoup + MongoDB High-Performance Data Acquisition Solution (Chapter 1)