Use Python to traverse (Linux) subdirectories and find a specified string, with extension whitelist and blacklist support

Source: Internet
Author: User

From a software and web architect's notes: making up for the gaps in Linux commands with a Python script, part two:

Traverse (Linux) subdirectories (folders) with a Python script, find the specified string, and display:

1. the directory location;

2. the name of the file that contains the string;

3. the line number where the string occurs in the file;

4. the content of that line, up to its first 256 characters (counted from the start of the line).
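
As a rough illustration of the four items above, here is a minimal sketch. The helper name `first_match` and the sample path are hypothetical and not part of the original script. It works on a list of lines instead of a real file, so it can be run without touching the disk.

```python
import os


def first_match(path, lines, needle, width=256):
    """Return (directory, file name, line number, snippet) for the first
    line that contains needle, or None when no line matches."""
    for line_no, line in enumerate(lines, start=1):
        if needle in line:
            directory, name = os.path.split(path)
            # Keep at most `width` characters, counted from the start of the line.
            return directory, name, line_no, line[:width]
    return None


# Hypothetical sample data, for illustration only.
report = first_match("/www/app/site.conf",
                     ["# config\n", "host = cknow.net\n"],
                     "cknow.net")
print(report)
```

Line numbers are 1-based here, which matches what an editor would show.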


The script also supports whitelist and blacklist extension lists, stored as arrays inside the script:

First pass: search only files whose extension is in the whitelist, such as *.txt, *.log, *.cnf, *.conf, *.php and similar files.

Second pass: traverse again and search every file whose extension is not in the blacklist, that is, skipping *.myd, *.jpg, *.bmp, *.rar, *.class and similar files.
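
The two extension tests can be sketched on their own as follows. This is only an illustration: the names `WHITELIST`, `BLACKLIST`, `extension`, `in_whitelist` and `not_in_blacklist` are hypothetical, the lists are shortened to the extensions mentioned above, and `os.path.splitext` is used here instead of splitting the path on dots.

```python
import os

# Shortened, illustrative lists (lower case, without the leading dot).
WHITELIST = {"txt", "log", "cnf", "conf", "php"}
BLACKLIST = {"myd", "jpg", "bmp", "rar", "class"}


def extension(path):
    """Lower-cased extension without the dot ('' when there is none)."""
    return os.path.splitext(path)[1].lstrip(".").lower()


def in_whitelist(path):
    # First pass: only whitelisted extensions are searched.
    return extension(path) in WHITELIST


def not_in_blacklist(path):
    # Second pass: everything except blacklisted extensions is searched.
    return extension(path) not in BLACKLIST


for name in ["my.cnf", "photo.JPG", "README", "archive.RAR"]:
    print(name, in_whitelist(name), not_in_blacklist(name))
```

Note that a file without an extension, such as README, fails the whitelist test but passes the blacklist test, so it is only searched in the second pass.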


The script (code) is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Written for Python 3.

import os
import sys

# Whitelist: only files with these extensions are searched in the first pass.
Filte1type = ['conf', 'cnf', 'php', 'txt', 'log']

# Blacklist: files with these extensions are skipped in the second pass.
FilterType = ['myd', 'gif', 'png', 'bmp', 'jpg', 'jpeg', 'rar', 'zip',
              'ico', 'apk', 'ipa', 'doc', 'docx', 'xls', 'jar',
              'xlsx', 'ppt', 'pptx', 'pdf', 'gz', 'pyc', 'class']
num = 0  # number of files in which the string was found


def search(path=None, cont=None):
    if not path or not cont:
        print('Path or search string is empty')
        return
    global num
    print("\r\n[Searching files with whitelisted extensions]:")
    _loopfolde1r(path, cont)
    print("%s file(s) found" % num)
    print("\r[Found in whitelisted files]: %s\r\n" % num)

    num = 0  # reset so that the second count is not cumulative
    print("\r\n")
    print("\r\n[Searching files whose extension is not blacklisted]:")
    _loopfolde2r(path, cont)
    print("\r\n[Found in non-blacklisted files]: %s" % num)


def _loopfolde1r(path, cont):
    """Recurse through path, checking only files with a whitelisted extension."""
    arr = path.split('/')
    if not arr[-1].startswith('.'):  # skip hidden folders (and hidden files)
        if os.path.isdir(path):
            folderlist = os.listdir(path)
            for x in folderlist:
                _loopfolde1r(path + "/" + x, cont)
        elif os.path.isfile(path):
            if path.split('.')[-1].lower() in Filte1type:
                _verifycontent(path, cont)


def _loopfolde2r(path, cont):
    """Recurse through path, checking every file whose extension is not blacklisted."""
    arr = path.split('/')
    if not arr[-1].startswith('.'):  # skip hidden folders (and hidden files)
        if os.path.isdir(path):
            folderlist = os.listdir(path)
            for x in folderlist:
                _loopfolde2r(path + "/" + x, cont)
        elif os.path.isfile(path):
            if not (path.split('.')[-1].lower() in FilterType):
                _verifycontent(path, cont)


def _verifycontent(path, cont):
    """Print the path, line number and line text of the first match in a file."""
    global num
    # errors='ignore' keeps undecodable bytes in non-text files from raising.
    with open(path, 'r', errors='ignore') as fh:
        fhcontent = fh.readlines()
    for index, x in enumerate(fhcontent):
        if cont in x:
            num += 1
            print("%s:%s" % (path, index + 1))  # file path and line number
            print("%s" % x[:256])               # first 256 characters of the line
            break  # report only the first match in each file


if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Invalid parameters")
        # The demo call below is only for convenient testing. It can be deleted
        # completely so that the user simply re-enters the parameters.
        print("This demo data is just for testing convenience and can be ignored. "
              "Please re-enter the parameters.")
        search("/www", "cknow.net")
    else:
        search(sys.argv[1], sys.argv[2])
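
The two traversal functions differ only in the extension test they apply. As a sketch of an alternative, and not the author's code, the traversal could be written once with the standard library's `os.walk` and a caller-supplied test. The names `iter_files` and `accept` are hypothetical, and the demo builds a throwaway directory tree instead of scanning a real one.

```python
import os
import tempfile


def iter_files(root, accept):
    """Yield paths under root whose file name passes accept(),
    skipping hidden directories and hidden files."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            if not name.startswith(".") and accept(name):
                yield os.path.join(dirpath, name)


# Demo on a temporary directory tree that is removed afterwards.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "conf"))
    os.makedirs(os.path.join(root, ".git"))
    for rel in ["conf/a.conf", "conf/b.jpg", ".git/c.conf"]:
        with open(os.path.join(root, *rel.split("/")), "w") as fh:
            fh.write("demo\n")
    found = sorted(
        os.path.relpath(p, root)
        for p in iter_files(root, lambda n: n.lower().endswith(".conf"))
    )
    print(found)
```

Passing a whitelist test or a "not in blacklist" test as `accept` would cover both passes with one function.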


----------------------------------------------

Postscript:

--

Scripts written in Python really are concise!

However, posting the code on CSDN scrambled all of the Python indentation, which I think is the one really annoying thing about Python.

I am still looking for a solution.

To show the indentation, part of the code was also posted as a screenshot.

The code in the article is complete, but when the article was published its indentation was "eaten" by CSDN's article publishing system.



