Python: Finding Spaces and Chinese Characters in File Names

Source: Internet
Author: User

Objective

Picture files and folders often have non-standard names that contain Chinese characters or spaces. This script finds such names in bulk and writes them to a txt file so they are easy to correct. It can also be extended to remove the spaces directly, and so on. So far it has only been used on Windows. It is untested on Mac, so I don't know whether it works there; modify it yourself if necessary. Experts, please go easy on me.

Code

The Python code is as follows:

# coding=utf-8
# 2015.12.14 Windows version
# Find all names that contain spaces or Chinese characters.
# If you want to remove the spaces, you can use replace(" ", "").
# Works on the current path: double-click the script, or run it from cmd in the target directory.
import os
import re

rootdir = os.getcwd()
zhpattern = re.compile(u'[\u4e00-\u9fa5]+')


def start(rootdir):
    # Walk the directory tree recursively and check every name.
    for f in os.listdir(rootdir):
        sourcef = os.path.join(rootdir, f)
        if os.path.isfile(sourcef):
            a, b = os.path.splitext(f)  # remove the extension
            checkName(a)
        if os.path.isdir(sourcef):
            checkName(f)
            start(sourcef)


# Note the encoding here: under Python 2, Windows file names are GBK byte strings.
def checkName(f):
    # Decode byte strings (Python 2); Python 3 already returns text.
    ff = f.decode('GBK') if isinstance(f, bytes) else f
    if zhpattern.search(ff):  # match Chinese
        print(ff)
        chinese.append(f)
    if any(i.isspace() for i in ff):  # check spaces (record each name once)
        print(ff)
        name.append(f)


# Output to txt
def write():
    out = open(rootdir + "/checkreslut.txt", "w+")
    out.write("space:\n")
    for i in name:
        out.write(i + "\n")
    out.write("\nchinese:\n")
    for i in chinese:
        out.write(i + "\n")
    out.close()


if __name__ == "__main__":
    name = []
    chinese = []
    start(rootdir)
    write()
    os.system("pause")  # Windows only: keep the console window open
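As a variation on the traversal, here is a minimal sketch that uses os.walk instead of manual recursion. The function name find_bad_names and its return shape are illustrative assumptions, not part of the original script, and it assumes Python 3, where file names are already text.

```python
import os
import re

# Same character range as the script: common CJK unified ideographs.
ZH_PATTERN = re.compile(u'[\u4e00-\u9fa5]+')


def find_bad_names(root):
    """Return (with_spaces, with_chinese): two lists of paths under root.

    Hypothetical helper for illustration only.
    """
    with_spaces, with_chinese = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        # Directories are checked by full name, files without their extension.
        entries = list(dirnames) + list(filenames)
        stems = list(dirnames) + [os.path.splitext(f)[0] for f in filenames]
        for entry, stem in zip(entries, stems):
            path = os.path.join(dirpath, entry)
            if any(ch.isspace() for ch in stem):
                with_spaces.append(path)
            if ZH_PATTERN.search(stem):
                with_chinese.append(path)
    return with_spaces, with_chinese
```

Calling find_bad_names(os.getcwd()) would check the same tree as the script, and the two lists can then be written to a file in the same way.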
Explanation

1. os.getcwd()

Gets the current path. The function takes no arguments and returns the current directory. Note that the current directory is not necessarily the directory in which the script resides; it is the directory from which the script is run (the working directory of the process).
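The difference between the two directories can be sketched as follows. The helper name describe_paths is an assumption made for illustration; it is not part of the original script.

```python
import os


def describe_paths(script_file):
    """Return (working_dir, script_dir) for a given script path.

    Hypothetical helper: working_dir is where the process was started,
    script_dir is the folder that contains the script file.
    """
    working_dir = os.getcwd()
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return working_dir, script_dir
```

Inside a script, calling describe_paths(__file__) returns two different values whenever the script is launched from a directory other than its own.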

If you have set the Python environment variable, you can run the script directly by double-clicking it. The second method is to cd to the target directory in cmd, which is more troublesome. A shortcut is to hold down the Shift key and right-click in the folder; the menu then offers an option to open cmd at the current location.

If that is still too much trouble, you can modify the registry directly by right-clicking the

2. Chinese string matching

import re
zhpattern = re.compile(u'[\u4e00-\u9fa5]+')
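A minimal sketch of how this pattern behaves on sample names. The range \u4e00-\u9fa5 covers the commonly used CJK unified ideographs; the helper name contains_chinese is an assumption for illustration.

```python
import re

ZH_PATTERN = re.compile(u'[\u4e00-\u9fa5]+')


def contains_chinese(text):
    """Return True if text has at least one character in the range."""
    return ZH_PATTERN.search(text) is not None


def chinese_runs(text):
    """Return every consecutive run of Chinese characters in text."""
    return ZH_PATTERN.findall(text)
```

search() stops at the first match, which is enough for a yes/no check; findall() is useful when you want to report exactly which parts of a name need renaming.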

3. if i.isspace():

Checks whether a character is whitespace (a space, tab, newline, and so on). Other methods work too, for example testing whether " " occurs in the name.
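A short sketch of the whitespace check, together with the replace(" ", "") idea mentioned in the script's comments. The function names are assumptions for illustration.

```python
def has_whitespace(name):
    """Return True if any character in name is whitespace."""
    return any(ch.isspace() for ch in name)


def remove_spaces(name):
    """Remove ordinary space characters only."""
    return name.replace(" ", "")


def remove_all_whitespace(name):
    """Remove every kind of whitespace (spaces, tabs, newlines)."""
    return "".join(name.split())
```

To rename files on disk, the cleaned name could be passed to os.rename; it is worth checking first that the new name does not collide with an existing file.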

4. Python file operations

f = open(...) opens a file. The second parameter is the open mode.

# open(path + file name, mode)
# Modes: r read-only, r+ read/write, w new file (overwrites the original file), a append, b binary.

Common modes:

rU or Ua: opens for reading with universal newline support (PEP 278); this is a Python 2 feature, and the U flag was removed in Python 3.11
w: opens for writing
a: opens for appending (starts at EOF and creates a new file if necessary)
r+: opens for reading and writing
w+: opens for reading and writing (see w)
a+: opens for reading and writing (see a)
rb: opens for binary reading
wb: opens for binary writing (see w)
ab: opens for binary appending (see a)
rb+: opens for binary reading and writing (see r+)
wb+: opens for binary reading and writing (see w+)
ab+: opens for binary reading and writing (see a+)

Note:

1. With 'w', if the file already exists it is first emptied and then (re)created.

2. With 'a', all data written to the file is appended to the end of the file, even if you use seek() to point somewhere else in the file. If the file does not exist, it is created automatically.
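The two notes above can be checked with a small sketch. The function name demo_modes and the file contents are assumptions for illustration.

```python
def demo_modes(path):
    """Write to path with 'w' twice, then with 'a', and return the content."""
    with open(path, "w") as f:
        f.write("first\n")
    with open(path, "w") as f:   # 'w' empties the existing file first
        f.write("second\n")
    with open(path, "a") as f:
        f.seek(0)                # moving the pointer does not change where 'a' writes
        f.write("third\n")
    with open(path, "r") as f:
        return f.read()
```

The returned text contains "second" followed by "third": the first write was discarded by the second 'w', and the append landed at the end despite the seek(0).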



f.read([size]): if size is unspecified, returns the entire file, which is a problem if the file is larger than about twice the available memory. f.read() returns "" (an empty string) once the end of the file has been reached.

f.readline() returns one line.

f.readlines([size]) returns a list of lines. If size is unspecified, it returns all lines; if given, size is a hint for the approximate amount of data to read, not an exact line count.

for line in f: print(line)  # access the lines through the iterator

f.write("hello\n")  # to write data other than a string, first convert it to a string

f.tell() returns an integer that represents the current position of the file pointer (that is, the number of bytes from the start of the file).

f.seek(offset, [start position])

Used to move the file pointer.

offset: measured in bytes; can be negative.

Start position: 0 is the start of the file (the default), 1 is the current position, 2 is the end of the file.

f.close() closes the file.
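The calls listed above can be tried together in a short sketch. It opens the file in binary mode, because in Python 3 a text-mode file does not allow seeking by a nonzero offset relative to the current position or the end. The function name demo_read_seek and the sample data are assumptions for illustration.

```python
def demo_read_seek(path):
    """Exercise readline, tell, readlines, read, iteration and seek."""
    with open(path, "wb") as f:
        f.write(b"line one\nline two\nline three\n")

    results = {}
    with open(path, "rb") as f:
        results["first_line"] = f.readline()    # one line, newline included
        results["position"] = f.tell()          # bytes consumed so far
        results["rest"] = f.readlines()         # the remaining lines as a list
        results["at_eof"] = f.read()            # empty at end of file
        f.seek(0)                               # back to the start of the file
        results["iterated"] = [line for line in f]
        f.seek(-6, 2)                           # 6 bytes before the end
        results["tail"] = f.read()
    return results
```

Using a with block closes the file automatically, so no explicit f.close() call is needed.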

It is worth comparing these file read and write operations with those in other languages to find what they have in common and where they differ.

Summary

Scripts are an efficient, time-saving way to handle folder traversal, file search, batch renaming, and similar tasks.

Do not stop at getting something to work. Also optimize and compare approaches to see which is faster and more efficient.

Make a little progress every day, and look back a year later at how many steps you have taken.

Please credit the source when reprinting.
