Sort by Dictionary Value: Find the 10 Largest Files

Source: Internet
Author: User
Tags: iterable

Problem Analysis:

    1. Determine the size of every file under a path

    2. Sort the files by size and take the 10 largest

    3. Store the data in a dictionary


Background Knowledge:

    1. The operator module:

fun = operator.itemgetter(1) makes fun a function returned by operator.itemgetter(1). When fun is applied to an object, it returns the element at index 1 of that object. For example:

lis = (6, 56)

fun(lis)  # returns lis[1], i.e. 56

Note: operator.itemgetter(1) does not fetch a value itself. It returns a function, and that function returns the element at the given index when it is applied to an object.
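A minimal sketch of this behaviour, written for Python 3. The names get_second, get_both and pair are illustrative and do not come from the original article:

```python
import operator

# itemgetter(1) only builds a callable; nothing is looked up yet.
get_second = operator.itemgetter(1)

pair = (6, 56)
second = get_second(pair)      # same as pair[1]

# itemgetter also accepts several indexes and then returns a tuple.
get_both = operator.itemgetter(1, 0)
swapped = get_both(pair)       # same as (pair[1], pair[0])

print(second, swapped)
```

The lookup happens only when the returned callable is applied, which is why the same callable can be reused on every member of a list.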


2. The sorted() function

sorted(iterable[, cmp[, key[, reverse]]])

The iterable parameter is the list or other iterable object to be sorted.

The cmp and key parameters are both functions that control how members are compared during sorting. cmp takes two arguments (the two members being compared); key takes one argument (a single member). This is the Python 2 signature; Python 3 removed the cmp parameter.

The reverse parameter, if True, sorts in descending order.


cmp and key serve the same purpose: both tell sorted() what to compare. With cmp, you can pass a lambda function, such as:

lis = [(1, 2), (2, 4), (5, 3)]
fcmp = lambda x, y: cmp(x[1], y[1])
sorted_lis = sorted(lis, cmp=fcmp)
print sorted_lis

The output is [(1, 2), (5, 3), (2, 4)], sorted by the second element of each member. The x[1] and y[1] in fcmp specify that the comparison uses the second element of each member of lis. Defining fcmp compares nothing by itself; sorted() calls it on pairs of members of lis, and each call returns a negative number, zero, or a positive number that tells sorted() how to order the pair.
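The cmp argument and the built-in cmp() function exist only in Python 2. As a hedged sketch of how the same two-argument comparison could be expressed on Python 3, functools.cmp_to_key can wrap it into a key function. The name compare_second is hypothetical:

```python
from functools import cmp_to_key

def compare_second(x, y):
    # Return a negative number, zero, or a positive number,
    # mirroring what Python 2's cmp(x[1], y[1]) would return.
    return (x[1] > y[1]) - (x[1] < y[1])

lis = [(1, 2), (2, 4), (5, 3)]
sorted_lis = sorted(lis, key=cmp_to_key(compare_second))
print(sorted_lis)
```

This is mainly useful for porting old comparison functions; when a single value per member is enough, a plain key function is usually simpler.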


You can also specify the comparison with key. In that case key specifies which part of each object is used for the comparison. For example:

import operator

lis = [(1, 2), (2, 4), (5, 3)]
sorted_lis = sorted(lis, key=operator.itemgetter(1))
print sorted_lis

The output is again [(1, 2), (5, 3), (2, 4)]. operator.itemgetter(1) is a function that returns the second element of the object it is applied to; sorted() applies it to each member of lis and compares the returned values. Here it selects the second element of each member of lis as the sort key.
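A short Python 3 sketch comparing equivalent key functions and the reverse flag. The variable names are illustrative only:

```python
import operator

lis = [(1, 2), (2, 4), (5, 3)]

# Two equivalent ways to sort by the second element of each member.
by_itemgetter = sorted(lis, key=operator.itemgetter(1))
by_lambda = sorted(lis, key=lambda member: member[1])

# reverse=True flips the order, so the largest second element comes first.
descending = sorted(lis, key=operator.itemgetter(1), reverse=True)

print(by_itemgetter)
print(descending)
```

The descending form is the one needed later, since the goal is to list the largest files first.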


3. os.walk(topdir):

The os.walk function recursively walks through all subdirectories and files under a directory. It returns a generator; each iteration yields a three-element tuple holding the path of the directory currently being visited, a list of the subdirectories in that directory, and a list of the files in that directory. For example:

This is the directory layout of the /root/devops path:

[Image: "pwd", http://s3.51cto.com/wyfs02/M00/88/F2/wKiom1gBsUHCu_lNAABYdWuTt8I375.png]


Use os.walk() to traverse the entire path:

[Image: "os.walk", http://s5.51cto.com/wyfs02/M02/88/F2/wKiom1gBr57TVrtIAABbn15So0U673.png]

By iterating over the generator returned by os.walk('/root/devops'), you can visit every file under /root/devops as the top-level directory. The third element of each yielded tuple is the list of file names; joining a file name with the first element gives the full path to the file (an absolute path here, because the top-level directory was given as an absolute path).
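The traversal can be sketched as follows in Python 3. The helper name list_files and the throwaway directory tree are assumptions made so the example runs anywhere; they are not part of the original article:

```python
import os
import tempfile

def list_files(topdir):
    """Return the full path of every file under topdir."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(topdir):
        for name in filenames:
            # First tuple element + file name = full path to the file.
            paths.append(os.path.join(dirpath, name))
    return paths

# Build a small temporary tree: one file at the top, one in a subdirectory.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(top, rel), "w") as fh:
            fh.write("x")
    found = sorted(os.path.relpath(p, top) for p in list_files(top))

print(found)
```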


4. What data structure should hold the results?

Because you need to know both the name and the size of each file, a dictionary that maps file name to file size is a natural choice. The dictionary's items are then sorted by value.
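As a minimal Python 3 sketch of sorting a dictionary by value, assuming a made-up dictionary of file sizes (the file names and sizes below are hypothetical):

```python
import operator

# Hypothetical file sizes in bytes.
sizes = {"a.log": 300, "b.log": 1200, "c.log": 45}

# dict.items() yields (key, value) pairs, so index 1 is the value.
largest_first = sorted(sizes.items(), key=operator.itemgetter(1), reverse=True)

# Slicing the sorted list keeps only the largest entries.
top_two = largest_first[:2]
print(top_two)
```

The result is a list of (name, size) tuples rather than a dictionary, which is what makes slicing off the first few entries possible.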


With the above knowledge, you can traverse the files, get their sizes, and sort them.

#!/usr/bin/env python
import os
import sys
import operator


def gen_dic(topdir):
    # Walk topdir and map each file's full path to its size in bytes
    dic = {}
    a = os.walk(topdir)
    for p, d, f in a:
        for i in f:
            f_name = os.path.join(p, i)
            f_size = os.path.getsize(f_name)
            dic[f_name] = f_size
    return dic


if __name__ == "__main__":
    try:
        dic = gen_dic(sys.argv[1])  # Traverse the directory, save all files and file sizes in the dictionary
    except IndexError:
        print "%s follow a dir" % __file__
        sys.exit()
    sorted_dic = sorted(dic.iteritems(), key=operator.itemgetter(1), reverse=True)  # Sort by value with sorted(), largest first
    for k, v in sorted_dic[:10]:
        print k, '---> ', v
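The script in this article targets Python 2 (print statement, dict.iteritems()). The following is a hedged sketch of the same idea for Python 3, not the author's code. It swaps the full sort for heapq.nlargest, which returns the same top entries without sorting everything. The function name largest_files, the OSError handling, and the temporary demo files are assumptions added for illustration:

```python
import heapq
import operator
import os
import tempfile

def gen_dic(topdir):
    """Map each file path under topdir to its size in bytes."""
    dic = {}
    for dirpath, dirnames, filenames in os.walk(topdir):
        for name in filenames:
            f_name = os.path.join(dirpath, name)
            try:
                dic[f_name] = os.path.getsize(f_name)
            except OSError:
                # Assumption: skip broken links or unreadable files.
                continue
    return dic

def largest_files(topdir, count=10):
    """Return up to count (path, size) pairs, largest first."""
    return heapq.nlargest(count, gen_dic(topdir).items(),
                          key=operator.itemgetter(1))

# Demo on a throwaway directory with three files of known sizes.
with tempfile.TemporaryDirectory() as top:
    for name, size in (("small", 1), ("big", 30), ("medium", 20)):
        with open(os.path.join(top, name), "wb") as fh:
            fh.write(b"x" * size)
    result = [(os.path.basename(p), s) for p, s in largest_files(top, count=2)]

print(result)
```

For a directory tree with many files, heapq.nlargest avoids sorting entries that will never be printed; for small trees the difference from sorted(...)[:10] is negligible.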





This article is from the "dayandnight" blog, make sure to keep this source http://hellocjq.blog.51cto.com/11336969/1862148
