http://scrapy-chs.readthedocs.org/zh_CN/latest/intro/overview.html
The link above is a good place to learn the basics of Scrapy. Thanks to Marchtea for the translation.
While learning, I ran into a very troublesome problem: displaying and storing Chinese text. (In the console, Chinese shows up as escape sequences such as \u77e5\u540d..., and it looks the same when saved to a file.)
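The symptom can be reproduced without Scrapy. The following is a minimal sketch using only the standard json module; the dictionary and its "name" key are assumed sample data built from the escapes quoted above, not taken from a real crawl.

```python
import json

# Assumed sample item; the value is the two characters behind \u77e5\u540d.
item = {"name": "\u77e5\u540d"}

# json.dumps defaults to ensure_ascii=True, so every non-ASCII
# character is written as a \uXXXX escape.
dumped = json.dumps(item)
print(dumped)

# The escapes are only a representation: loading the text again
# gives back the original Chinese string, so no data is lost.
restored = json.loads(dumped)
print(restored == item)
```

So the text is not corrupted, it is just serialized in an ASCII-only form that is hard to read.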
After searching the internet for a long time, I found the following link to be the most relevant.
http://stackoverflow.com/questions/9181214/scrapy-text-encoding
Excerpt as follows:
pipelines.py:
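import json
import codecs


class JsonWithEncodingPipeline(object):

    def __init__(self):
        # Open the output file as UTF-8 so Chinese characters are written directly.
        self.file = codecs.open('scraped_data_utf8.json', 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # ensure_ascii=False keeps non-ASCII characters instead of \uXXXX escapes.
        line = json.dumps(dict(item), ensure_ascii=False) + "\n"
        self.file.write(line)
        return item

    # Note: Scrapy calls close_spider() on pipelines automatically; a method
    # named spider_closed only runs if it is connected to the spider_closed signal.
    def spider_closed(self, spider):
        self.file.close()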
With this approach, the file contains readable Chinese instead of \uXXXX escapes.
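The effect can be checked outside Scrapy as well. The sketch below is a stand-in, not the pipeline itself: the class name JsonLinesUtf8Writer, the temporary directory, and the sample item are all assumptions made for illustration. It writes one item with ensure_ascii=False to a UTF-8 file and reads the line back.

```python
import json
import os
import tempfile


class JsonLinesUtf8Writer:
    """Hypothetical stand-in for the pipeline, with no Scrapy dependency."""

    def __init__(self, path):
        # Explicit UTF-8 so the result does not depend on the platform default.
        self.file = open(path, "w", encoding="utf-8")

    def process_item(self, item):
        # Same serialization as the pipeline: one JSON object per line.
        self.file.write(json.dumps(dict(item), ensure_ascii=False) + "\n")
        return item

    def close(self):
        self.file.close()


# Assumed output location and sample data.
path = os.path.join(tempfile.mkdtemp(), "scraped_data_utf8.json")
writer = JsonLinesUtf8Writer(path)
writer.process_item({"name": "\u77e5\u540d"})
writer.close()

with open(path, encoding="utf-8") as f:
    first_line = f.readline()

# The stored line holds the characters themselves, not \uXXXX escapes.
print("\\u" in first_line)
```

The file still parses as JSON; only the on-disk representation of the non-ASCII characters changes.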
My test code
Search keywords and links:
jsonitemexporter ensure_ascii=false
jsonitemexporter uxxx
python output JSON file \uxxx, how to convert to Chinese
decode and encode in Python [http://yangpengg.github.io/blog/2012/12/13/decode-and-encode-in-python/]
-- Python's print output is in Chinese, but the output written to the file is \uxxx
http://wklken.me/posts/2013/08/31/python-extra-coding-intro.html
Scrapy: storing the data
http://stackoverflow.com/questions/14073442/scrapy-storing-the-data
Scrapy uses the item exporter to export Chinese to a JSON file and the content is Unicode escapes; how to output Chinese?
http://www.lefern.com/question/15837/scrapy-shi-yong-item-exportshu-chu-zhong-wen-dao-jsonwen-jian-nei-rong-wei-unicodema-ru-he-shu-chu-wei-zhong-wen/
How to put in JSON utf-8 symbols, not their codes?
https://groups.google.com/forum/#!MSG/SCRAPY-USERS/RJCFSFVZ3O4/ZYSD7CMOCKMJ
Scrapy text encoding
http://stackoverflow.com/questions/9181214/scrapy-text-encoding
Scrapy crawls Chinese text but saves it to the JSON file as Unicode escapes: how to resolve it.