If you download a folder from SkyDrive, all of its files are packaged into a zip archive. After extracting the archive, you find that every file with a Chinese name has been renamed to something like file1.txt. An additional file named "Encoding Errors.txt" is also generated; it records that the SkyDrive server renamed the downloaded files. The first three lines of this file are:
Due to the limitations of the supported zip file format, the following file(s) had to be renamed.

Original File Name -> New File Name
Each line after that lists one renamed file in the format of the third line, such as "file example -> file1.pdf". Fortunately, the file extensions are left unchanged. The renamed files are obviously inconvenient to use, so it is best to restore the original names. Doing that by hand would make learning so many programming languages pointless; mechanical, repetitive work like this should of course be done in code, for example in Python. That SkyDrive cannot handle Chinese file names does not reflect well on the Microsoft name. Fortunately, it at least provides the file "Encoding Errors.txt".
The basic idea for restoring the file names is to read the names before and after renaming from Encoding Errors.txt, and then rename the files one by one. Not much is involved. The first part is opening and reading the file (Encoding Errors.txt): you can use the built-in function open and the readlines method of the file object, and then process the strings that were read. The simplest way is to split each string, because the format of each line in Encoding Errors.txt is quite simple. The second part is the file operation itself, for which you can use functions in the os module, for example some of those in os.path. I use os.system, that is, I call a system command; on Windows this is "rename oldname.extension newname.extension". The os module also provides a more direct function, os.rename. The advantage of calling the system command is that you do not have to think about exception handling; the disadvantage is that the script only runs on Windows.
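The same idea can also be sketched with os.rename instead of a system command, which keeps it portable. This is only an illustration written for Python 3, not the script used in this post; the helper names parse_rename_line and restore_names are made up for the example, and it assumes every data line has the form "original -> new" after three header lines.

```python
import os

HEADER_LINES = 3  # per the post, the first three lines are a notice, not data


def parse_rename_line(line):
    """Split 'original -> new' into (original, new); return None if malformed."""
    # rpartition splits on the last arrow, since the new name (fileN.ext)
    # is not expected to contain one.
    original, sep, new = line.strip().rpartition("->")
    if not sep:
        return None
    original, new = original.strip(), new.strip()
    if not original or not new:
        return None
    return original, new


def restore_names(lines, directory="."):
    """Rename each 'new' file back to its original name; return restored names."""
    restored = []
    for line in lines[HEADER_LINES:]:
        pair = parse_rename_line(line)
        if pair is None:
            continue  # skip blank or malformed lines
        original, new = pair
        src = os.path.join(directory, new)
        dst = os.path.join(directory, original)
        # Only rename when the source exists and nothing would be overwritten.
        if os.path.exists(src) and not os.path.exists(dst):
            os.rename(src, dst)
            restored.append(original)
    return restored
```

Unlike os.system, os.rename raises OSError on failure, so the sketch checks that the source exists and the target does not before renaming.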
Following this idea you can write working Python code, but in practice there are a few unexpected problems to handle, mainly to do with encoding. The file "Encoding Errors.txt" is encoded as UTF-8 with a BOM. When it is read with open and readlines, the Chinese content does not display correctly until it has been decoded. Each line that is read also needs its line break "\r\n" handled; I simply replace it with an empty string. This is tied to how the line is split, so it may not always be necessary. Using os.system to run a Windows command in effect hands the command to the Windows command interpreter (cmd), much like running a line from a .bat file, so the command has to be converted to an input format the Windows command line supports; in other words, you need to encode it.
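The encoding points above can be illustrated in isolation. The following is a small Python 3 sketch, not part of the script in this post; the sample bytes are invented to stand in for one line of the file, and strip_newline is a hypothetical helper.

```python
import codecs

# Invented sample: a UTF-8 BOM followed by one CRLF-terminated line.
raw = codecs.BOM_UTF8 + "示例.pdf -> file1.pdf\r\n".encode("utf-8")

plain = raw.decode("utf-8")      # the BOM survives as U+FEFF at the front
clean = raw.decode("utf-8-sig")  # the BOM is stripped during decoding


def strip_newline(line):
    """Remove a trailing line break, whether it is \\n or \\r\\n."""
    return line.rstrip("\r\n")


# cp936 is the code page of the Simplified Chinese Windows command line,
# which is why the command string is encoded with it before os.system.
encoded = strip_newline(clean).encode("cp936")
```

When reading a file directly in Python 3, passing encoding="utf-8-sig" to open has the same BOM-stripping effect.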
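The code is as follows:

    # -*- coding: utf-8 -*-
    # Python 2 script; run it in the folder that holds the extracted files.
    import os

    enc = open("Encoding Errors.txt", "r").readlines()
    for i in range(len(enc)):
        if i < 3:
            continue  # skip the three header lines
        # Drop the line break, then decode the UTF-8 bytes to unicode.
        enc[i] = enc[i].replace('\n', '').replace('\r', '')
        enc[i] = enc[i].decode("utf-8", 'ignore')
        l = [name.strip() for name in enc[i].split("->")]  # [original, new]
        if len(l) != 2:
            continue  # skip blank or malformed lines
        # Quote both names so that names containing spaces still work.
        command = 'rename "%s" "%s"' % (l[1], l[0])
        print command.encode("cp936")
        if os.system(command.encode("cp936")) == 0:
            print 'Done!'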
Although the code is short, it took some time to get it fully working. Most of the effort went into character encoding, which is something Chinese programmers cannot avoid. I keep running into more and more problems in this area, so it is worth studying systematically.