Using a Python splitter script to split a long article

Source: Internet
Author: User

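A Python splitter script is handy when a long article has to be split into sections, because a long article kept in a single file is a headache to read. The script below converts a TXT novel into one HTML file per section, plus an index page. After reading it, you should be able to adapt it to split your own articles.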
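
Before the full script, here is a minimal sketch of the core idea: walk through the lines and start a new section whenever a heading line appears. The `HEADING_RE` pattern, the `split_sections` helper, and the sample text are hypothetical names and data made up for this illustration; they are not part of the original script, which also writes the HTML files.

```python
import re

# Hypothetical heading pattern for this sketch: lines such as "Volume 1 The Beginning".
HEADING_RE = re.compile(r'^\s*Volume\s+\S+.*$')


def split_sections(lines, preface_title='Preface'):
    """Group lines into (title, body_lines) pairs, starting a new pair at each heading."""
    sections = [(preface_title, [])]
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if HEADING_RE.match(line):
            sections.append((line.strip(), []))
        else:
            sections[-1][1].append(line.strip())
    return sections


sample = [
    'An opening note.',
    '',
    'Volume 1 The Beginning',
    'First paragraph.',
    'Second paragraph.',
    'Volume 2 The Journey',
    'Third paragraph.',
]
for title, body in split_sections(sample):
    print(title, len(body))
```

Text that appears before the first heading ends up in a preface section, which is also how the full script treats it.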

 
 
    # Converting TXT novels into multiple HTML files
    # @Author: GreatGhoul
    # @Email: greatghoul@gmail.com
    # @Blog: http://greatghoul.javaeye.com
    import os
    import re

    # Regex for the section title. The words "volume" and "chapter" stand in
    # for the heading markers of the novel; adjust them to match your text.
    # sec_re = re.compile(r'th.+volume\s+.+\s+th.+chapter\s+.+')
    # Path of the TXT book. The file name before ".txt" is missing in the
    # source; point this at your own file.
    source_path = 'f:\\.txt'
    path_pieces = os.path.split(source_path)
    # Strip the extension to get the novel's title
    novel_title = re.sub(r'(\..*$)|($)', '', path_pieces[1])
    target_path = '%s%s_html' % (path_pieces[0], novel_title)
    section_re = re.compile(r'^\s*.+volume\s+.*$')
    # Page header; the two %s slots take the page title and the heading.
    # The font names are missing in the source, and the end of the template
    # (from </h2> on) is reconstructed because the source is cut off there.
    section_head = '''
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=GBK"/>
    <title>%s</title>
    </head>
    <body style="font-family:,; font-size:16px; margin:0;
    padding:20px; background:#FAFAD2; color:#2B4B86; text-align:center;">
    <h2>%s</h2>
    '''

    # Escape XML/HTML special characters
    def escape_xml(code):
        text = code
        # Escape '&' first so the entities produced below are not escaped again
        text = re.sub(r'&', '&amp;', text)
        text = re.sub(r'<', '&lt;', text)
        text = re.sub(r'>', '&gt;', text)
        text = re.sub(r'\t', '&nbsp;', text)
        text = re.sub(r'\s', '&nbsp;', text)
        return text

    # Entry of the script
    def main():
        # Create the output folder
        if not os.path.exists(target_path):
            os.mkdir(target_path)
        # Open the source file (GBK is assumed, matching the page header)
        source_file = open(source_path, 'r', encoding='gbk')
        sec_count = 0
        sec_cache = []
        idx_cache = []
        output = open(os.path.join(target_path, '%d.html' % sec_count),
                      'w', encoding='gbk')
        preface_title = '%s preface' % novel_title
        output.writelines([section_head % (preface_title, preface_title)])
        idx_cache.append('<li><a href="%d.html">%s</a></li>'
                         % (sec_count, novel_title))
        for line in source_file:
            # Is a chapter's title?
            if line.strip() == '':
                pass
            elif section_re.match(line):
                line = re.sub(r'\s+', '', line)
                print('converting %s...' % line)
                # Write the section footer. The opening tag is cut off in the
                # source; '<p>' is assumed, to pair with the '</p>' below.
                sec_cache.append('<p>')
                if sec_count == 0:
                    sec_cache.append('<a href="index.html">directory</a>&nbsp;|&nbsp;')
                    sec_cache.append('<a href="%d.html">next</a>&nbsp;|&nbsp;'
                                     % (sec_count + 1))
                else:
                    sec_cache.append('<a href="%d.html">previous article</a>&nbsp;|&nbsp;'
                                     % (sec_count - 1))
                    sec_cache.append('<a href="index.html">directory</a>&nbsp;|&nbsp;')
                    sec_cache.append('<a href="%d.html">next</a>&nbsp;|&nbsp;'
                                     % (sec_count + 1))
                sec_cache.append('<a name="bottom" href="#">back to the top</a></p>')
                # Closing tags are cut off in the source; '</html>' is assumed
                sec_cache.append('</body></html>')
                output.writelines(sec_cache)
                output.flush()
                output.close()
                sec_cache = []
                sec_count += 1
                # Create a new section
                output = open(os.path.join(target_path, '%d.html' % sec_count),
                              'w', encoding='gbk')
                output.writelines([section_head % (line, line)])
                idx_cache.append('<li><a href="%d.html">%s</a></li>'
                                 % (sec_count, line))
            else:
                sec_cache.append('<p style="text-align: left;">%s</p>'
                                 % escape_xml(line))
        source_file.close()
        # Write rest lines: the last section has no "next" link
        sec_cache.append('<p>')
        if sec_count > 0:
            sec_cache.append('<a href="%d.html">previous article</a>&nbsp;|&nbsp;'
                             % (sec_count - 1))
        sec_cache.append('<a href="index.html">directory</a>&nbsp;|&nbsp;')
        sec_cache.append('<a name="bottom" href="#">Back to Top</a></p></body></html>')
        output.writelines(sec_cache)
        output.flush()
        output.close()
        sec_cache = []
        # Write the menu
        output = open(os.path.join(target_path, 'index.html'), 'w', encoding='gbk')
        menu_head = '%s directory' % novel_title
        output.writelines([section_head % (menu_head, menu_head),
                           '<ul style="text-align: left">'])
        output.writelines(idx_cache)
        output.writelines(['</ul></body></html>'])
        output.flush()
        output.close()
        idx_cache = []
        print('completed. %d chapter(s) in total.' % sec_count)


    if __name__ == '__main__':
        main()
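
One detail in the escaping step deserves a closer look: the order of the replacements matters. If `&` is escaped after `<` and `>`, the ampersands inside the freshly produced `&lt;` and `&gt;` are escaped a second time. The sketch below shows the problem and one possible alternative using `html.escape` from the Python 3 standard library. The helper names `escape_line` and `escape_wrong_order` are hypothetical, and the whitespace handling only approximates what the script does.

```python
import html


def escape_line(line):
    """Escape HTML special characters, then turn tabs and spaces into &nbsp;."""
    text = html.escape(line.rstrip('\n'), quote=False)
    return text.replace('\t', '&nbsp;').replace(' ', '&nbsp;')


def escape_wrong_order(line):
    """Escaping '&' last re-escapes the entities made for '<' and '>'."""
    return line.replace('<', '&lt;').replace('>', '&gt;').replace('&', '&amp;')


print(escape_line('a < b & c'))      # a&nbsp;&lt;&nbsp;b&nbsp;&amp;&nbsp;c
print(escape_wrong_order('a < b'))   # a &amp;lt; b
```

`html.escape` replaces `&` before `<` and `>`, so each character is escaped exactly once. Passing `quote=False` leaves quotation marks alone, which is enough for text placed inside a `<p>` element.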

That concludes this introduction to the Python splitter. I hope you find it useful.
