Processing each word in a file with Python 3
This article describes how to process each word in a file with Python 3, shared here for your reference. The implementation is as follows:
'''
Created on Dec 21, 2012
Process each word in a file
@author: liury_lab
'''

import codecs
import re

# Split each line on whitespace and print every word
the_file = codecs.open('d:/text.txt', 'r', 'utf-8')
for line in the_file:
    for word in line.split():
        print(word, end="|")
the_file.close()

# If the definition of a word changes, you can use a regular expression.
# Here a word is a run of letters, digits, underscores, apostrophes, or hyphens.
the_file = codecs.open('d:/text.txt', 'r', 'utf-8')
print()
print('************************************************************************')
re_word = re.compile(r"[\w'-]+")
for line in the_file:
    for word in re_word.finditer(line):
        print(word.group(0), end="|")
the_file.close()

# Encapsulate it into an iterator (a generator function)
def words_of_file(file_path, line_to_words=str.split):
    # Open the path that was passed in, not a hard-coded file
    the_file = codecs.open(file_path, 'r', 'utf-8')
    for line in the_file:
        for word in line_to_words(line):
            yield word
    the_file.close()

print()
print('************************************************************************')
for word in words_of_file('d:/text.txt'):
    print(word, end='|')

def words_by_re(file_path, repattern=r"[\w'-]+"):
    # Compile the pattern that was passed in; words_of_file opens the file
    re_word = re.compile(repattern)
    def line_to_words(line):
        for mo in re_word.finditer(line):
            # The original book uses return here; that gives an incorrect result, so use yield
            yield mo.group(0)
    return words_of_file(file_path, line_to_words)

print()
print('************************************************************************')
for word in words_by_re('d:/text.txt'):
    print(word, end='|')
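One detail of the generator approach is worth noting: the_file.close() after the loop only runs if the caller iterates all the way to the end of the file. A minimal sketch of an alternative is shown below, assuming the built-in open() with an encoding argument is acceptable in place of codecs.open. The names iter_words and iter_words_by_re are hypothetical and not part of the original code.

```python
import re
from typing import Callable, Iterable, Iterator


def iter_words(file_path: str,
               line_to_words: Callable[[str], Iterable[str]] = str.split,
               encoding: str = "utf-8") -> Iterator[str]:
    """Yield the words of a text file one at a time (hypothetical helper)."""
    # The with-statement closes the file when the loop finishes, and also
    # when the generator is closed or discarded before reaching the end.
    with open(file_path, "r", encoding=encoding) as the_file:
        for line in the_file:
            yield from line_to_words(line)


def iter_words_by_re(file_path: str, pattern: str = r"[\w'-]+") -> Iterator[str]:
    """Yield words as defined by a regular expression (hypothetical helper)."""
    re_word = re.compile(pattern)
    # group(0) is the whole match, regardless of any groups in the pattern
    return iter_words(
        file_path,
        lambda line: (mo.group(0) for mo in re_word.finditer(line)),
    )
```

Usage mirrors the original functions, for example: for word in iter_words_by_re('d:/text.txt'): print(word, end='|'). Passing the splitting function as a parameter keeps the file handling in one place, as in the original words_of_file.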
I hope this article will help you with Python programming.