Formatting a novel txt file in Python

Source: Internet
Author: User

I downloaded the txt version of "No Survivors" to my phone and found that the reader app did not recognize its format well.

The original format is as follows:

Mr. First Zhang Yiwogrev, who had recently retired, was sitting in a smoking room in a first-class compartment, sipping cigar smoke while reading the political news of the Times in a cheerful manner. Wargrave down the newspaper and looked out of the WINDOW. The train was running in the Somerset Wilderness on the southwest Coast. he looked at the watch and was two hours away. Judge Wargrave a letter from his pocket First. Although the handwriting is not true, but the wording of the entire text is very clear: "dear Lawrence ... For many years ... Please visit Hindi Andøya ... The view is amazing ... parting, how many! ...... The past mist ... The Harmony of the ... Sunny music ... 12:40 depart from the Parthenon station ... The Oak Bridge awaits ... Named is a female, the flower body signature is: Constance Calmington. Judge Wargrave the last time to see Mrs. Constance Calmington's specific date, It must be seven years, no, eight years! At that time, she was going to Italy to enjoy the sun, and the nature of the old and the wild of the Hetian. later, I heard that she went to syria, where the sun is more prosperous, more willing to be more dense, more in harmony with nature and the Arab Herdsmen. He's got it, Constance. but Scalmington is such a woman, a person to buy a small island to live under, it seems how mysterious! Mr. Wargrave felt that he had reasoned so well that he could not help but start. Just a little bit of it. He fell asleep ...

I want it to turn out this way:

Mr. First Zhang Yiwogrev, who had recently retired, was sitting in a smoking room in a first-class compartment, sipping cigar smoke while reading the political news of the Times in a cheerful manner. Wargrave down the newspaper and looked out of the WINDOW. The train was running in the Somerset Wilderness on the southwest Coast. he looked at the watch and was two hours away. Judge Wargrave a letter from his pocket First. Although the handwriting is not true, but the wording of the entire text is very clear: "dear Lawrence ... For many years ... Please visit Hindi Andøya ... The view is amazing ... parting, how many! ...... The past mist ... The Harmony of the ... Sunny music ... 12:40 depart from the Parthenon station ... The Oak Bridge awaits ... Named is a female, the flower body signature is: Constance Calmington. Judge Wargrave the last time to see Mrs. Constance Calmington's specific date, It must be seven years, no, eight years! At that time, she was going to Italy to enjoy the sun, and the nature of the old and the wild of the Hetian. later, I heard that she went to syria, where the sun is more prosperous, more willing to be more dense, more in harmony with nature and the Arab Herdsmen. He's got it, Constance. but Scalmington is such a woman, a person to buy a small island to live under, it seems how mysterious! Mr. Wargrave felt that he had reasoned so well that he could not help but start. Just a little bit of it. He fell asleep ...

My text editor does not seem capable of such an advanced replacement, so I decided to write a small program. Fine, Python it is.

I had forgotten almost everything about Python: hello world, number-to-string conversion, string slicing, searching within a string, iterating over a string, reading and writing files, iterating over a file, and so on. I had to look up the usage of all of it on the internet.

Here is what I wrote, bumps and all:

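# coding=utf8

def is_section(s):
    # A chapter heading starts with '第' and ends with '章'.
    # Both characters are reconstructed; the translated listing had
    # replaced them with English words that could never match one character.
    return (s != '') and (s[0] == '第') and (s[-1] == '章')

def is_hz_number(s):
    # True if s is non-empty and made up only of Chinese numerals.
    # The exact character set was lost in translation; these ten are assumed.
    r = (s != '')
    for ch in s:
        if '一二三四五六七八九十'.find(ch) == -1:
            r = False
    return r

def main():
    filename = 'Nobody.txt'
    output = 'output.txt'
    fp = open(filename, 'r')
    fp2 = open(output, 'w')
    output = ''
    line_add = ''  # text accumulated for the current paragraph

    count = 0
    for line in fp:
        line = line.strip('\n')
        if line == '':
            last_ch = ''
        else:
            last_ch = line[-1]
        # print(last_ch)  # get the last character; works with Chinese too
        line_add = line_add + line
        # Full-width closing punctuation is assumed here; the translated
        # listing showed ASCII look-alikes.
        if (last_ch in ['。', '”', '？', '！', '：']) or is_section(line_add) or is_hz_number(line_add):
            output = line_add
            line_add = ''
            print(output)
            fp2.write(output + '\n')
            count = count + 1
        # if count > 10:
        #     break
    fp2.write(line_add)  # flush whatever is left after the last line
    fp.close()
    fp2.close()

main()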
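The merging rule in the script can also be written as a function over a list of strings, which makes it easy to try without any file. This is only a sketch, not the author's code: `join_wrapped_lines`, `is_heading`, `is_numeral_line` and `CLOSERS` are hypothetical names, the punctuation and numeral sets are assumptions, and the sample lines are invented for illustration.

```python
CLOSERS = ("。", "”", "？", "！", "：")  # assumed full-width closing punctuation
NUMERALS = "一二三四五六七八九十"         # assumed set of Chinese numerals


def is_heading(text):
    """True for a chapter heading such as '第一章' (assumed heading form)."""
    return text.startswith("第") and text.endswith("章")


def is_numeral_line(text):
    """True if text is non-empty and contains only Chinese numerals."""
    return text != "" and all(ch in NUMERALS for ch in text)


def join_wrapped_lines(lines):
    """Merge hard-wrapped lines into one string per paragraph."""
    paragraphs = []
    pending = ""
    for raw in lines:
        line = raw.rstrip("\r\n")
        pending += line
        # Close the paragraph when the line ends a sentence, or when the
        # accumulated text is a heading or a bare numeral line.
        if line.endswith(CLOSERS) or is_heading(pending) or is_numeral_line(pending):
            paragraphs.append(pending)
            pending = ""
    if pending:  # text left over after the last line
        paragraphs.append(pending)
    return paragraphs


sample = ["第一章\n", "法官先生坐在\n", "车厢里。\n", "他睡着了\n"]
for paragraph in join_wrapped_lines(sample):
    print(paragraph)
```

Keeping the rule separate from the file handling means the punctuation set can be adjusted and re-tested on a few lines before running it over the whole novel.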

I ran into this error along the way:

Traceback (most recent call last):
  File "a.py", line $, in <module>
    main()
  File "a.py", line, in main
    for line in fp:
UnicodeDecodeError: 'gbk' codec can't decode byte 0xaa in position 14: illegal multibyte sequence

Baffled, I had to go through the source txt file section by section to narrow it down. I finally found the problem: some garbled characters were mixed into the file, like this:

。 safety, insurance! Cutting die about mi Hangyi Lie magpie bao *

Deleting them fixed the problem.
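Searching by hand can be avoided by reading the file as bytes and trying to decode each line separately. The following is a minimal sketch under the assumption that the file is meant to be GBK; `find_undecodable_lines` is a hypothetical helper, and the byte string used to try it out is made up.

```python
def find_undecodable_lines(data, encoding="gbk"):
    """Return (line_number, byte_offset_in_line) for each line that fails to decode."""
    problems = []
    for number, raw in enumerate(data.split(b"\n"), start=1):
        try:
            raw.decode(encoding)
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first offending byte in this line
            problems.append((number, exc.start))
    return problems


good = "第一章".encode("gbk")
bad = b"\xaa"  # a lone lead byte, not a complete GBK character
data = good + b"\n" + bad + b"\n" + good
print(find_undecodable_lines(data))
```

For a real file, the bytes would come from `open(path, "rb").read()`. If the damaged characters do not matter, passing `errors="replace"` to `open()` is another option: undecodable bytes are then replaced with U+FFFD instead of raising an exception.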

One more thing I do not understand: I declared UTF-8 encoding in the header comment of the .py file, yet reading the file line by line, taking the last character with line[-1], and the other string functions all handled the GBK text file perfectly well. I suppose this is one of the features of Python 3; I got it right by doing it wrong.

(Note: this article has no real technical content; it is only a record of a practice exercise.)
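A likely explanation: the coding comment at the top of a .py file only tells Python how to decode the source file itself. The encoding used for data is the `encoding` argument of `open()`, which, when omitted, has traditionally defaulted to the locale's preferred encoding. On a Chinese-language Windows system that is GBK, which would account for both the success and the 'gbk' in the traceback. Passing the encoding explicitly removes the guesswork. The sketch below assumes the source is GBK and the desired output is UTF-8; `convert_encoding` and the file names are hypothetical.

```python
import os
import tempfile


def convert_encoding(src_path, dst_path, src_encoding="gbk", dst_encoding="utf-8"):
    """Copy a text file, decoding with src_encoding and re-encoding with dst_encoding."""
    # newline="" keeps the original line endings untouched
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        with open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
            for line in src:
                dst.write(line)


# Demonstration on throwaway files
with tempfile.TemporaryDirectory() as tmp:
    src_file = os.path.join(tmp, "novel_gbk.txt")
    dst_file = os.path.join(tmp, "novel_utf8.txt")
    with open(src_file, "wb") as fh:
        fh.write("第一章\n".encode("gbk"))
    convert_encoding(src_file, dst_file)
    with open(dst_file, "rb") as fh:
        converted = fh.read()

print(converted.decode("utf-8"))
```

With explicit encodings the script behaves the same regardless of the machine's locale.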
