Count the number of duplicate lines in a text file in Python

Source: Internet
Author: User
Tags: python

For example, suppose a file contains the following lines:

2
3
1
2

The expected output is each distinct line followed by the number of times it occurs:

2,2
3,1
1,1
The way to solve the problem:
Use the text of each line as a dictionary key and its number of occurrences as the value, then sort by value before output.
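
The idea described above can be sketched as follows. This is a minimal illustration in Python 3 syntax, with the sample lines held in a list instead of being read from a file; the names lines, counts, and ordered are chosen here for illustration only.

```python
# Sample data from the example above, held in memory instead of a file.
lines = ["2", "3", "1", "2"]

# The line text is the key, the number of occurrences is the value.
counts = {}
for line in lines:
    counts[line] = counts.get(line, 0) + 1

# Sort the (text, count) pairs by count, largest first.
ordered = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)

for text, count in ordered:
    print("%s,%d" % (text, count))
```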

It is best to output in descending order of value. For sorting a dictionary by value, you can refer to the following:


Python 2.7 added the OrderedDict type, which remembers the order in which items were added.

>>> d = {"Third": 3, "First": 1, "Fourth": 4, "Second": 2}

>>> for k, v in d.items():
...     print '%s:%s' % (k, v)
...
Second:2
Fourth:4
Third:3
First:1

>>> d
{'Second': 2, 'Fourth': 4, 'Third': 3, 'First': 1}

To make a new ordered dictionary from the original, sorted by value:

>>> from collections import OrderedDict
>>> d_sorted_by_value = OrderedDict(sorted(d.items(), key=lambda x: x[1]))

The OrderedDict behaves like a normal dict:

>>> for k, v in d_sorted_by_value.items():
...     print '%s:%s' % (k, v)
...
First:1
Second:2
Third:3
Fourth:4

>>> d_sorted_by_value
OrderedDict([('First', 1), ('Second', 2), ('Third', 3), ('Fourth', 4)])
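
Since the counting problem calls for largest-first output, the same technique can be run in descending order by passing reverse=True to sorted. The following is a minimal sketch in Python 3 syntax (print is a function there); the name d_desc is made up for this example.

```python
from collections import OrderedDict

d = {"Third": 3, "First": 1, "Fourth": 4, "Second": 2}

# Build a new OrderedDict whose insertion order is descending by value.
d_desc = OrderedDict(sorted(d.items(), key=lambda kv: kv[1], reverse=True))

for k, v in d_desc.items():
    print("%s:%s" % (k, v))
```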

The code is as follows:


# coding=utf-8
import operator

# Map each distinct line to the number of times it occurs.
f = open("f.txt")
count_dict = {}

for line in f.readlines():
    line = line.strip()
    count = count_dict.setdefault(line, 0)
    count += 1
    count_dict[line] = count

f.close()

# Sort the (line, count) pairs by count, largest first.
sorted_count_dict = sorted(count_dict.iteritems(), key=operator.itemgetter(1), reverse=True)

for item in sorted_count_dict:
    print "%s,%d" % (item[0], item[1])
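
As an alternative, and not the method used in the script above, the standard library's collections.Counter can handle both the counting and the descending sort. The sketch below assumes Python 3; the function name count_duplicate_lines and the in-memory sample data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def count_duplicate_lines(lines):
    """Return (text, count) pairs, most frequent first."""
    # Strip the trailing newline and surrounding whitespace from each line.
    counter = Counter(line.strip() for line in lines)
    # most_common() returns the pairs sorted by count, largest first.
    return counter.most_common()


# In-memory stand-in for the lines of the sample file.
sample = ["2\n", "3\n", "1\n", "2\n"]
result = count_duplicate_lines(sample)

for text, count in result:
    print("%s,%d" % (text, count))
```

To count the lines of a real file, an open file object can be passed in place of sample, because iterating over a file yields its lines.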

Supplementary notes:
1. Two methods of Python's dict object:
The items method returns all dictionary entries as a list, in which each entry is a (key, value) tuple.
The iteritems method works roughly the same as items, but returns an iterator instead of a list.
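
The two methods described here belong to Python 2. In Python 3, which the following sketch assumes, iteritems no longer exists and items returns a view object that can be iterated directly or converted to a list. The variable names are chosen for illustration.

```python
count_dict = {"2": 2, "3": 1, "1": 1}

# In Python 3, items() returns a view of (key, value) pairs, not a list.
pairs = count_dict.items()

# Wrap the view in list() when an actual list is needed.
pairs_list = list(pairs)

# The view reflects later changes to the dictionary; the list copy does not.
count_dict["4"] = 5
view_length = len(pairs)
list_length = len(pairs_list)
```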

2. Python's built-in function sorted

>>> help(sorted)

Help on built-in function sorted in module __builtin__:

sorted(...)
    sorted(iterable, cmp=None, key=None, reverse=False) --> new sorted list
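
The key and reverse parameters can be illustrated with a short sketch. It assumes Python 3, where the cmp parameter shown in the Python 2 signature was removed; functools.cmp_to_key adapts an old-style comparison function so that it can be passed as key. The sample data and the function by_count_then_text are hypothetical.

```python
from functools import cmp_to_key
from operator import itemgetter

pairs = [("3", 1), ("2", 2), ("1", 1)]

# key picks the value to compare; reverse=True sorts largest first.
by_count_desc = sorted(pairs, key=itemgetter(1), reverse=True)


def by_count_then_text(a, b):
    """Old-style comparison: higher count first, then text ascending."""
    if a[1] != b[1]:
        return b[1] - a[1]
    return (a[0] > b[0]) - (a[0] < b[0])


# cmp_to_key wraps a comparison function so it can be passed as key.
by_cmp = sorted(pairs, key=cmp_to_key(by_count_then_text))
```

Because sorted is stable, pairs with equal counts keep their original relative order in by_count_desc, while by_cmp breaks those ties by text.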
