Reading "Shu Shan Jian Xia Chuan" [Python screen scraping]

Source: Internet
Author: User

If you want to read the story, you can download an e-book of it, for example the file "Shu Shan Jian Xia Chuan.txt".

However, once I started reading, I found that the file was quite large and loaded slowly in the e-book reader. I did not have a version that was already split into parts, so I thought about how to split it into separate files myself.

That is nothing more than reading the file, matching the chapter headings with a regular expression, and writing each piece to its own file.
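The three steps just described (read, match, split) can be sketched roughly as follows. This is a minimal Python 3 illustration, not the author's code: the heading pattern "Chapter <number>", the function names split_chapters and write_chapters, and the numbered output file names are all assumptions, and the regular expression would have to be adapted to the headings actually used in the e-book.

```python
import os
import re

# Assumed heading format: a line that starts with "Chapter <number>".
# The real e-book's headings may look different; adjust the regex to match.
HEADING_RE = re.compile(r'^Chapter \d+.*$', re.MULTILINE)


def split_chapters(text, heading_re=HEADING_RE):
    """Return a list of (heading, body) pairs, one per chapter."""
    matches = list(heading_re.finditer(text))
    chapters = []
    for i, m in enumerate(matches):
        # A chapter body runs from the end of its heading to the next heading,
        # or to the end of the text for the last chapter.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chapters.append((m.group().strip(), text[m.end():end].strip()))
    return chapters


def write_chapters(chapters, out_dir):
    """Write each chapter to its own numbered .txt file and return the paths."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for n, (heading, body) in enumerate(chapters, start=1):
        path = os.path.join(out_dir, '%03d.txt' % n)
        with open(path, 'w', encoding='utf-8') as f:
            f.write(heading + '\n' + body)
        paths.append(path)
    return paths
```

Any text before the first heading (a preface, for instance) is dropped by this sketch; keeping it would need one extra slice from the start of the text to the first match.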

But I figured this approach would be slow, and that it would be better to scrape the text from an online reading site. So I found the "Shu Shan Jian Xia Chuan --- Huanzhu Louzhu --- Tianya online library" page and turned the file-splitting problem into a screen-scraping problem.

Code:

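# Python 2 script (urllib.urlopen, xrange, file() and the print statement
# do not exist in Python 3).
from urllib import urlopen
import re

# Lookbehind/lookahead pairs capture only the text between the markers.
# The marker strings depend on the site's HTML.
titleRe = re.compile('(?<="biaoti">).+?(?=</span>)')
contentRe = re.compile("(?<='content'>).+?(?=</td>)", re.DOTALL)

dirPath = 'f:\\shushanjianxiazhuan\\'
urlPath = 'http://www.tiany1_k.com/wuxia/huanzhulouzhu/shushanjianxiazhuan/'

# Pages are numbered 1.htm through 309.htm, one chapter per page.
for x in xrange(1, 310):
    x = str(x)
    url = urlPath + x + '.htm'
    page = urlopen(url).read()
    title = titleRe.search(page).group()
    content = contentRe.search(page).group()
    # Turn HTML line breaks into real newlines.
    content = content.replace('<BR>', '\n')
    # One output file per chapter: page number followed by the title.
    f = file(dirPath + x + title + '.txt', 'w')
    f.write(title + '\n' + content)
    f.close()
    print title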
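
For readers on Python 3, the extraction step of the script above might look roughly like this. It is only a sketch under stated assumptions: the marker strings "biaoti" and 'content' are guesses at the page's HTML, the names extract_chapter and fetch are hypothetical, and the 'gbk' encoding is an assumption about the site rather than something the original states.

```python
import re
from urllib.request import urlopen

# Same lookbehind/lookahead idea as the script above. The marker strings
# ("biaoti", 'content') are assumptions about the page's HTML.
TITLE_RE = re.compile(r'(?<="biaoti">).+?(?=</span>)')
CONTENT_RE = re.compile(r"(?<='content'>).+?(?=</td>)", re.DOTALL)


def extract_chapter(page):
    """Return (title, content) from one page's HTML, or None if no match."""
    title = TITLE_RE.search(page)
    content = CONTENT_RE.search(page)
    if title is None or content is None:
        # The page layout differs from what the patterns expect.
        return None
    # Turn HTML line breaks into real newlines.
    return title.group(), content.group().replace('<BR>', '\n')


def fetch(url, encoding='gbk'):
    """Download a page and decode it (the encoding is an assumption)."""
    with urlopen(url) as resp:
        return resp.read().decode(encoding, errors='replace')
```

Returning None instead of calling .group() directly means one page with an unexpected layout can be skipped or logged rather than stopping the whole run with an AttributeError.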

The Master, standing by the river, said: "Shu Shan" is a super, super romantic and magnificent work, but unfortunately I was born two thousand years too early.
