This example shows how to process each word of a file in Python 3. It is shared here for your reference. The implementation is as follows:
"'" Created on Dec 21, 2012 handles each word in the file @author: Liury_lab ' import codecs the_file = Codecs.open (' d:/text.txt ', ' RU ', ' UTF-8 ') for the the_file:for Word in Line.split (): Print (word, end = "|") The_file.close () # If the definition of the word changes, you can use regular expressions # If the word is defined as a numeric letter, a hyphen or single quotation mark consists of a sequence of import re the_file = Codecs.open (' d:/text.txt ', ' RU ', ' UTF-8 ') print (' ***************** ') Re_word = Re.compile (' [\w\ '-]+ ') for line in The_file:for wo Rd in Re_word.finditer (line): Print (Word.group (0), end = "|") The_file.close () # encapsulated as iterator def words_of_file (File_path, L Ine_to_words = str.split): The_file = Codecs.open (' d:/text.txt ', ' RU ', ' UTF-8 ') for line in The_file:for word in Line_to_words (line): Yield word the_file.close () print () print (' ************************************************** ') for word in words_of_file (' D:/text.txt '): print (word, end = ' | ') def words_by_re (File_path, Rep Attern = ' [\w\ '-]+ '): the_file = Codecs.open (' d:/text.txt ', ' RU ', ' UTF-8 ') Re_word = Re.compile (' [\w\ '-]+ ') def line_to_words (line): for M O in Re_word.finditer (line): Yield Mo.group (0) # The original book is return, found the result is not correct, instead of yield return words_of_file (File_path, Line_to_ words) print () print (' ************************************************************************ ') for word in Words_ By_re (' D:/text.txt '): print (word, end = ' | ')
Hopefully this article will help you with Python programming.