Use Python to count high-frequency words

Source: Internet
Author: User

When reprinting, please cite the source: http://blog.csdn.net/cxsydjn/article/details/70991846

Problem

(From the Udacity Machine Learning Engineer Nanodegree preview course)

Implement the function count_words() in Python. It takes a string s and a number n, and returns the n most frequently occurring words in s. The return value is a list of tuples holding the n most frequent words and their counts, that is, [(<word 1>, <count 1>), (<word 2>, <count 2>), ...], sorted in descending order of count.

You can assume that all input is lowercase and contains no punctuation or other characters (only letters and single spaces). Words that occur the same number of times are sorted alphabetically.

For example:

print(count_words("betty bought a bit of butter but the butter was bitter", 3))

Output

[('butter', 2), ('a', 1), ('betty', 1)]

Solution
"""Count words."""


def count_words(s, n):
    """Return the n most frequently occurring words in s."""
    w = {}
    sp = s.split()
    # TODO: Count the number of occurrences of each word in s
    for i in sp:
        if i not in w:
            w[i] = 1
        else:
            w[i] += 1

    # TODO: Sort the occurrences in descending order (alphabetically in case of ties)
    top = sorted(w.items(), key=lambda item: (-item[1], item[0]))
    top_n = top[:n]
    # TODO: Return the top n most frequent words.
    return top_n


def test_run():
    """Test count_words() with some inputs."""
    print(count_words("cat bat mat cat bat cat", 3))
    print(count_words("betty bought a bit of butter but the butter was bitter", 3))


if __name__ == '__main__':
    test_run()
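
The counting loop can also be handed off to the standard library. The following is a minimal sketch of that alternative, not part of the original solution; the name count_words_counter is made up for illustration. It uses collections.Counter for the tally and keeps the same sort key, because Counter.most_common() on its own does not break ties alphabetically.

```python
from collections import Counter


def count_words_counter(s, n):
    """Return the n most frequent words in s as (word, count) tuples."""
    # Counter tallies every whitespace-separated word in a single pass.
    counts = Counter(s.split())
    # Sort by count descending, then alphabetically to break ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


print(count_words_counter("cat bat mat cat bat cat", 3))
# prints [('cat', 3), ('bat', 2), ('mat', 1)]
```

The explicit sorted() call is kept so that the alphabetical tie-break required by the problem is applied.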
Summary

There are two main tips. First, use split() to break the input string on spaces. Second, use sorted() on the dictionary's items to sort by value first and then by key. In particular, the key lambda item: (-item[1], item[0]) sorts by the second element of each item in descending order (hence the minus sign before item[1]) and then by the first element in ascending order. The same approach works for tuples with more elements.
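
To illustrate how the key extends to longer tuples, here is a small sketch that sorts three-element records. The sample data and the names records and ordered are made up for this example. Note that negating a field to reverse its order only works for numeric fields.

```python
# Hypothetical sample data: (word, count, length) records.
records = [("bat", 2, 3), ("cat", 3, 3), ("mat", 1, 3), ("ant", 2, 3), ("a", 2, 1)]

# Sort by count descending, then length ascending, then word ascending.
# Python compares the key tuples element by element, left to right.
ordered = sorted(records, key=lambda r: (-r[1], r[2], r[0]))

print(ordered)
# prints [('cat', 3, 3), ('a', 2, 1), ('ant', 2, 3), ('bat', 2, 3), ('mat', 1, 3)]
```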
