When reposting, please cite the source: http://blog.csdn.net/cxsydjn/article/details/70991846
Problem
(From the preview course of Udacity's Machine Learning Engineer Nanodegree.)
Use Python to implement the function count_words(), which takes a string s and a number n and returns the n most frequently occurring words in s. The return value is a list of tuples containing the n most frequent words and their occurrence counts, i.e. [(<word 1>, <count 1>), (<word 2>, <count 2>), ...], sorted in descending order of occurrences.
You can assume that all input is lowercase and contains no punctuation or other characters (only letters and single spaces). Words that occur the same number of times are sorted alphabetically.
For example:
print(count_words("betty bought a bit of butter but the butter was bitter", 3))
Output:
[('butter', 2), ('a', 1), ('betty', 1)]
Solution
"""Count words."""


def count_words(s, n):
    """Return the n most frequently occurring words in s."""
    w = {}
    sp = s.split()
    # TODO: Count the number of occurrences of each word in s
    for i in sp:
        if i not in w:
            w[i] = 1
        else:
            w[i] += 1
    # TODO: Sort the occurrences in descending order (alphabetically in case of ties)
    top = sorted(w.items(), key=lambda item: (-item[1], item[0]))
    top_n = top[:n]
    # TODO: Return the top n most frequent words.
    return top_n


def test_run():
    """Test count_words() with some inputs."""
    print(count_words("cat bat mat cat bat cat", 3))
    print(count_words("betty bought a bit of butter but the butter was bitter", 3))


if __name__ == '__main__':
    test_run()
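As a possible alternative, the manual counting loop could be replaced by collections.Counter from the standard library. The sketch below is not part of the original solution; the name count_words_counter is hypothetical and used only for illustration. The explicit sort is kept because Counter.most_common() does not order ties alphabetically.

```python
from collections import Counter


def count_words_counter(s, n):
    """Hypothetical variant of count_words() built on collections.Counter."""
    # Counter tallies how often each word appears in the split string.
    counts = Counter(s.split())
    # Sort by descending count, then alphabetically by word for ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


print(count_words_counter("cat bat mat cat bat cat", 3))
# prints [('cat', 3), ('bat', 2), ('mat', 1)]
```

The result matches the dictionary-based version; the only difference is that the counting step is delegated to Counter.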
Summary
There are two main tips: split the input string on whitespace with split(), and call sorted() on the dictionary's items to sort first by value and then by key. In particular, the key lambda item: (-item[1], item[0]) sorts in descending order of the item's second element (note the minus sign in front of item[1]) and then in ascending order of its first element. The same approach extends to tuples with more elements.
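To see the sort key in isolation, here is a minimal sketch on a small dictionary. The counts and the helper name by_count_then_word are made up for illustration and do not come from the original exercise. Python compares the key tuples element by element, so the negated count decides the order first and the word breaks ties.

```python
# Hypothetical word counts, chosen only to illustrate the tie-breaking.
counts = {"cat": 3, "bat": 2, "mat": 1, "ant": 2}


def by_count_then_word(item):
    """Build the sort key: negated count first, then the word itself."""
    word, count = item
    return (-count, word)


ranked = sorted(counts.items(), key=by_count_then_word)
print(ranked)
# prints [('cat', 3), ('ant', 2), ('bat', 2), ('mat', 1)]
```

Here "ant" and "bat" both occur twice, so they share the first key element and are ordered alphabetically by the second.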