Problem Analysis:
Determine the size of every file under a path
Sort the files by size and pick out the 10 largest
Store the data in a dictionary
Background knowledge:
1. The operator module:
fun = operator.itemgetter(1): fun is the function returned by operator.itemgetter(1). When fun is applied to an object, it returns the item at index 1 of that object, that is, its second element. For example:
lis = (6, 56)
fun(lis)  # returns lis[1], which is 56
Note: operator.itemgetter(1) does not return a value from any object. It returns a function, and that function returns the item at the given index when it is applied to an object.
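A minimal sketch of the behaviour described above, written for Python 3 (an assumption; the article's own snippets target Python 2). The names get_second and record are illustrative only.

```python
import operator

# itemgetter(1) builds a callable; nothing is looked up yet
get_second = operator.itemgetter(1)

record = (6, 56)  # illustrative data, same shape as lis above
value = get_second(record)  # equivalent to record[1]

# The same callable works on any indexable object
from_list = get_second(["a", "b", "c"])

print(value)      # 56
print(from_list)  # b
```

Because get_second is an ordinary function object, it can be passed to anything that expects a callable, which is what makes it useful as a sort key.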
2. The sorted() function
sorted(iterable[, cmp[, key[, reverse]]])
Parameter iterable: the list or other iterable object to sort.
Parameters cmp and key: functions that control how items are compared during sorting. cmp is a function of two arguments (the two items being compared); key is a function of one argument.
Parameter reverse: if True, the result is sorted in descending order.
In fact, cmp and key serve the same purpose. Note that cmp exists only in Python 2; Python 3 removed it. If you use cmp, you can pass a lambda function, such as:
lis = [(1, 2), (2, 4), (5, 3)]
fcmp = lambda x, y: cmp(x[1], y[1])
sorted_lis = sorted(lis, cmp=fcmp)
print(sorted_lis)
The output is [(1, 2), (5, 3), (2, 4)]: the list is sorted by the second element of each member. In fcmp, x[1] and y[1] specify that the comparison uses the second element of each member of lis. fcmp does not produce a value on its own; it only returns a comparison result when sorted() calls it on two members of lis.
You can also control the comparison with key. In that case, key specifies which element of each object is used for the comparison. For example:
import operator
lis = [(1, 2), (2, 4), (5, 3)]
sorted_lis = sorted(lis, key=operator.itemgetter(1))
print(sorted_lis)
The output is again [(1, 2), (5, 3), (2, 4)], because operator.itemgetter(1) is a function that returns the second element of the object it is applied to. sorted() applies it to each member of lis in turn, so the second element of each member is what gets compared.
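The two approaches can be put side by side. The sketch below assumes Python 3, where sorted() no longer accepts cmp; functools.cmp_to_key from the standard library wraps an old-style comparison function so it can be passed as key. The helper name compare_second is hypothetical.

```python
import functools
import operator

lis = [(1, 2), (2, 4), (5, 3)]


def compare_second(x, y):
    # Old-style comparison: negative, zero or positive, like Python 2's cmp()
    return (x[1] > y[1]) - (x[1] < y[1])


# cmp-style sorting, adapted for Python 3
by_cmp = sorted(lis, key=functools.cmp_to_key(compare_second))

# key-style sorting: itemgetter(1) extracts the second element of each tuple
by_key = sorted(lis, key=operator.itemgetter(1))

# reverse=True flips the order (largest second element first)
by_key_desc = sorted(lis, key=operator.itemgetter(1), reverse=True)

print(by_cmp)       # [(1, 2), (5, 3), (2, 4)]
print(by_key)       # [(1, 2), (5, 3), (2, 4)]
print(by_key_desc)  # [(2, 4), (5, 3), (1, 2)]
```

The key form is usually the simpler of the two, since the function only has to say what to compare, not how.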
3. os.walk(topdir):
The os.walk function recursively walks through all subdirectories and files under a directory. It returns an iterator, and each iteration yields a 3-tuple: the directory currently being visited, the list of subdirectories in that directory, and the list of files in that directory. For example:
This is the directory tree of the /root/devops path:
[Screenshot "pwd": http://s3.51cto.com/wyfs02/M00/88/F2/wKiom1gBsUHCu_lNAABYdWuTt8I375.png]
Use os.walk() to traverse the entire path:
[Screenshot "os.walk": http://s5.51cto.com/wyfs02/M02/88/F2/wKiom1gBr57TVrtIAABbn15So0U673.png]
By iterating over the iterator returned by os.walk('/root/devops'), you can visit every file under /root/devops, with /root/devops as the top-level directory. The third element of each yielded tuple is the list of file names; joining a file name with the first element gives the full path to that file.
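A small sketch of that traversal, assuming Python 3. It builds a throwaway directory tree with tempfile so that it can run anywhere; the helper names list_files and build_demo_tree and the file names are made up for illustration.

```python
import os
import tempfile


def list_files(topdir):
    """Return the full path of every file under topdir (hypothetical helper)."""
    paths = []
    # Each iteration yields (current directory, subdirectory names, file names)
    for dirpath, dirnames, filenames in os.walk(topdir):
        for name in filenames:
            # Joining the first and third elements gives the path to the file
            paths.append(os.path.join(dirpath, name))
    return paths


def build_demo_tree(root):
    """Create root/a.txt and root/sub/b.txt for the demonstration."""
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(root, rel), "w") as handle:
            handle.write("demo")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        build_demo_tree(root)
        for path in sorted(list_files(root)):
            print(path)
```

The paths are absolute only when the directory passed to os.walk is absolute, as /root/devops is in the article.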
4. What data structure should hold the results?
Because you need to know both the file name and the file size, a dictionary is a natural way to store the file information: the file path is the key and the size is the value. Finally, the dictionary's items are sorted by value.
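As a sketch of that last step, assuming Python 3 and made-up file names and sizes: sorting a dictionary's items by value yields a list of (key, value) pairs, which can then be sliced.

```python
import operator

# Hypothetical sizes in bytes, keyed by file path
sizes = {"/tmp/a.log": 300, "/tmp/b.log": 1200, "/tmp/c.log": 50}

# items() yields (path, size) pairs; itemgetter(1) selects the size
largest_first = sorted(sizes.items(), key=operator.itemgetter(1), reverse=True)

top_two = largest_first[:2]  # slicing is safe even if fewer items exist

print(top_two)  # [('/tmp/b.log', 1200), ('/tmp/a.log', 300)]
```

The dictionary itself is not reordered; sorted() returns a new list.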
With the above knowledge, you can walk through the files, record the size of each one, and then sort the results.
#!/usr/bin/env python
import os
import sys
import operator


def gen_dic(topdir):
    # Map the full path of each file under topdir to its size in bytes
    dic = {}
    a = os.walk(topdir)
    for p, d, f in a:
        for i in f:
            f_name = os.path.join(p, i)
            f_size = os.path.getsize(f_name)
            dic[f_name] = f_size
    return dic


if __name__ == "__main__":
    try:
        # Walk the directory; save all files and file sizes in the dictionary
        dic = gen_dic(sys.argv[1])
    except IndexError:
        print("Usage: %s <dir>" % __file__)
        sys.exit()
    # Sort the (path, size) pairs by size with sorted(), largest first
    sorted_dic = sorted(dic.items(), key=operator.itemgetter(1), reverse=True)
    for k, v in sorted_dic[:10]:
        print("%s ---> %s" % (k, v))
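For comparison, here is a sketch of the same task for Python 3. It is not the article's script: it swaps the full sort for heapq.nlargest, which only keeps track of the top entries, and it skips files whose size cannot be read (for example broken symbolic links), which is an added assumption about the desired behaviour. The function names file_sizes and largest_files are hypothetical, and the demonstration runs on a temporary directory rather than a real path.

```python
import heapq
import operator
import os
import tempfile


def file_sizes(topdir):
    """Map the path of each file under topdir to its size in bytes."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(topdir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes[path] = os.path.getsize(path)
            except OSError:
                # Unreadable entry (e.g. a broken symlink): skip it
                continue
    return sizes


def largest_files(topdir, count=10):
    """Return up to count (path, size) pairs, largest size first."""
    return heapq.nlargest(count, file_sizes(topdir).items(),
                          key=operator.itemgetter(1))


if __name__ == "__main__":
    # Demonstration on a throwaway directory with files of known sizes
    with tempfile.TemporaryDirectory() as root:
        for name, size in (("small.bin", 10), ("big.bin", 300), ("mid.bin", 40)):
            with open(os.path.join(root, name), "wb") as handle:
                handle.write(b"x" * size)
        for path, size in largest_files(root, count=2):
            print("%s ---> %d" % (os.path.basename(path), size))
```

Calling largest_files with a real directory such as the article's /root/devops would return at most ten (path, size) pairs by default.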
This article is from the "dayandnight" blog; please keep this source link when reposting: http://hellocjq.blog.51cto.com/11336969/1862148
Sort by dictionary value: find the 10 largest files