This article shows how to implement a custom Scrapy pipeline class that saves scraped items to MongoDB. The example is shared for your reference:
# Standard Python library imports

# 3rd party modules
import pymongo

from scrapy import log
from scrapy.conf import settings
from scrapy.exceptions import DropItem

class MongoDBPipeline(object):
    def __init__(self):
        self.server = settings['MONGODB_SERVER']
        self.port = settings['MONGODB_PORT']
        self.db = settings['MONGODB_DB']
        self.col = settings['MONGODB_COLLECTION']
        connection = pymongo.Connection(self.server, self.port)
        db = connection[self.db]
        self.collection = db[self.col]

    def process_item(self, item, spider):
        err_msg = ''
        for field, data in item.items():
            if not data:
                err_msg += 'Missing %s of poem from %s\n' % (field, item['url'])
        if err_msg:
            raise DropItem(err_msg)
        self.collection.insert(dict(item))
        log.msg('Item written to MongoDB database %s/%s' % (self.db, self.col),
                level=log.DEBUG, spider=spider)
        return item
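For the pipeline to run, Scrapy must be told about it in the project's settings file, which is also where the `MONGODB_*` values read in `__init__` are defined. A minimal sketch, assuming a project module named `myproject` (the module path, database name, and collection name here are illustrative, not from the original article):

```python
# settings.py -- illustrative values; adjust the module path and
# connection details to match your own project.

# Register the pipeline; the number (0-1000) sets its run order
# relative to any other pipelines in the project.
ITEM_PIPELINES = {
    'myproject.pipelines.MongoDBPipeline': 300,  # hypothetical path
}

# Values the pipeline reads via scrapy.conf.settings
MONGODB_SERVER = 'localhost'
MONGODB_PORT = 27017
MONGODB_DB = 'scrapy_db'
MONGODB_COLLECTION = 'items'
```

Note that the article's code targets older library versions: `pymongo.Connection` and `collection.insert` were removed in pymongo 3.x (replaced by `MongoClient` and `insert_one`), and `scrapy.log` and `scrapy.conf.settings` are likewise legacy APIs in current Scrapy, which favors `spider.logger` and passing settings in through `from_crawler`. If you are on recent versions, substitute those equivalents.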
I hope this article helps you with your Python programming.