2. Use Python to count how often each word occurs in an English article and return the 10 most frequent words with their counts, then answer the following questions. (Punctuation can be ignored.)
(1) After creating a file object f, explain the difference between f's readlines and xreadlines methods.
(2) Additional requirement: text inside quotation marks must be counted as a single word. How would you implement this?
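For question (1), the difference can be sketched as follows. In Python 2, readlines() reads the whole file into a list, while xreadlines() returned a lazy iterator equivalent to iter(f); xreadlines was deprecated in Python 2.3 and removed in Python 3, where iterating over the file object itself does the same job. The temporary file and its contents below are hypothetical demo data, not the article's file.

```python
import os
import tempfile

# Hypothetical sample data written to a temporary file for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello world\nhello kitty\n")
    path = tmp.name

with open(path) as f:
    # readlines() loads every line into memory and returns a list.
    all_lines = f.readlines()

with open(path) as f:
    # Iterating the file object is lazy; Python 2's f.xreadlines()
    # behaved like this iterator.
    lazy_lines = iter(f)
    first_line = next(lazy_lines)

os.remove(path)
print(type(all_lines).__name__, len(all_lines))
print(repr(first_line))
```

For large files the lazy form is usually preferable, because only the lines actually consumed have to be held in memory.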
cat /root/text.txt
hello world 2018 xiaowei,good luck
hello kitty wangleai,ha he
hello kitty, hasd he
hello kitty, hasaad hedsfds
# My script
#!/usr/bin/python
# get ['a', 'b', 'c']
import re

with open('/root/text.txt') as f:
    openfile = f.read()

def get_list_dict():
    # Split on digits and non-word characters so only the words remain.
    word_list = re.split(r'[0-9\W]+', openfile)
    list_no_repeat = set(word_list)
    dict_word = {}
    for each_word in list_no_repeat:
        dict_word[each_word] = word_list.count(each_word)
    # Drop the empty string produced when the text starts or ends with a separator.
    dict_word.pop('', None)
    return dict_word

# {'a': 2, 'c': 5, 'b': 1} => [('c', 5), ('a', 2), ('b', 1)]
def sort_dict_get_ten(dict_word):
    list_after_sorted = sorted(dict_word.items(), key=lambda x: x[1], reverse=True)
    print(list_after_sorted)
    # Only the top 3 are printed here; use range(10) for the top 10.
    for i in range(3):
        print('%s %s' % (list_after_sorted[i][0], list_after_sorted[i][1]))

def main():
    dict_word = get_list_dict()
    sort_dict_get_ten(dict_word)

if __name__ == '__main__':
    main()
[('hello', 4), ('kitty', 3), ('he', 2), ('good', 1), ('hasd', 1), ('wangleai', 1), ('hasaad', 1), ('xiaowei', 1), ('hedsfds', 1), ('luck', 1), ('world', 1), ('ha', 1)]
hello 4
kitty 3
he 2
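The same count can be sketched more compactly with collections.Counter from the standard library. This is an alternative to the script above rather than the author's method. The function name top_words and the inline sample string are illustrative, and lower-casing the text before counting is an added assumption.

```python
import re
from collections import Counter

def top_words(text, n=10):
    """Return the n most frequent words as (word, count) pairs."""
    # Assumption: words are runs of ASCII letters, compared case-insensitively.
    words = re.findall(r"[A-Za-z]+", text.lower())
    return Counter(words).most_common(n)

# Illustrative sample based on the four lines of the text file.
sample = (
    "hello world 2018 xiaowei,good luck\n"
    "hello kitty wangleai,ha he\n"
    "hello kitty, hasd he\n"
    "hello kitty, hasaad hedsfds\n"
)

top3 = top_words(sample, 3)
print(top3)  # [('hello', 4), ('kitty', 3), ('he', 2)]
```

Counter makes a single pass over the words, whereas calling word_list.count() once per distinct word rescans the whole list each time.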
Python: sorted(), count(), set(list) for de-duplication
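For question (2), one possible approach is to let the regular expression try a quoted phrase first and fall back to a single word. This is a minimal sketch under the assumption that phrases are delimited by straight double quotes; the names TOKEN and tokenize, and the sample sentence, are hypothetical.

```python
import re
from collections import Counter

# First alternative: a whole "quoted phrase"; second: a single word.
TOKEN = re.compile(r'"([^"]+)"|([A-Za-z]+)')

def tokenize(text):
    """Split text into words, keeping each quoted phrase as one token."""
    tokens = []
    for quoted, word in TOKEN.findall(text):
        # Exactly one of the two groups is non-empty for every match.
        tokens.append(quoted if quoted else word)
    return tokens

sample = 'hello "hello kitty" he said "hello kitty" hello'
counts = Counter(tokenize(sample))
print(counts.most_common(2))
```

Because the quoted alternative is tried first and consumes the whole phrase, the words inside the quotation marks are not counted a second time on their own.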