Python Applications: Data Processing (2): Changing a Specific Data Format
The current situation is as follows:
1. My folder contains many data files with names like these:
part-m-0000
part-m-0001
part-m-0002
part-m-0003
...
2. The data in each file has the following format:
"460030730101160", "3", "0", "0", "0", "0:21:42"
"460036745672363", "3", "0", "0", "0", "0:21:31"
"460030250931114", "3", "1307", "1", "0", "0:21:40"
"460030250942643", "3", "0", "0", "0", "0:21:40"
"460036650411006", "3", "1021", "1", "0", "0:21:39"
"000000000009674", "8", "0", "0", "0", "0:12:28"
"000000000005661", "8", "0", "0", "0", "0:12:29"
"460030731390121", "3", "0", "0", "0", "21:54:00"
"460030256111396", "3", "0", "0", "0", "21:54:00"
"460030207447762", "3", "0", "0", "0", "21:53:58"
"460030250939916", "3", "0", "0", "0", "21:53:58"
"460030957972011", "3", "1613", "0", "0", "21:53:51"
"460030237206739", "3", "0", "0", "0", "21:53:59"
...
Now we need to remove the quotation marks around every field and keep only the hour from the time in the last column. The processing in Python goes as follows:
1. Traverse all files in the current folder whose names start with 'part'.
2. Read each row of each file and split it on commas.
3. Strip the quotation marks from each field and, for the last field, take only the hour of the time between the quotation marks. Here we need to check whether the hour has 1 digit or 2.
4. Write one output row for each row read.
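Steps 2 to 4 can be sketched for a single row as follows. This is only a minimal sketch: the helper name convert_line is hypothetical, and it assumes the last field looks like "H:MM:SS" or "HH:MM:SS", as in the sample rows above, so the hour is whatever comes before the first colon.

```python
def convert_line(line):
    """Strip the quotation marks from every field and keep only the hour.

    convert_line is a hypothetical helper name used for illustration.
    Assumption: the last field looks like "H:MM:SS" or "HH:MM:SS".
    """
    # Step 2: split the row on commas, then drop surrounding spaces and quotes.
    fields = [f.strip().strip('"') for f in line.strip().split(",")]
    # Step 3: the hour is the text before the first colon, so a 1-digit
    # and a 2-digit hour are handled the same way.
    hour = fields[-1].split(":")[0]
    if not hour.isdigit():
        hour = "-1"  # marker for a row whose time cannot be read
    # Step 4: one output row per input row.
    return ",".join(fields[:-1] + [hour])


print(convert_line('"460030730101160", "3", "0", "0", "0", "0:21:42"'))
# 460030730101160,3,0,0,0,0
print(convert_line('"460030731390121", "3", "0", "0", "0", "21:54:00"'))
# 460030731390121,3,0,0,0,21
```

Splitting on the colon avoids counting character positions, so the check for a 1-digit or 2-digit hour needs no separate branch.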
The specific code is as follows:
# coding: utf-8
import os

out_dir = "./data_handled"
if not os.path.isdir(out_dir):
    os.makedirs(out_dir)  # the output folder must exist before we write to it

for name in os.listdir("."):
    if not name.startswith("part"):
        continue
    filepath = os.path.join(".", name)  # the current input file
    print(filepath)
    newfilepath = os.path.join(out_dir, name[7:])  # the file to write into
    with open(filepath) as infile, open(newfilepath, "w") as newfile:
        for line in infile:
            # Split on commas and delete the " " around every field
            fields = [f.strip().strip('"') for f in line.strip().split(",")]
            # The hour is the 1 or 2 digits before the first colon of the
            # last field; any text before a space (such as a date) is skipped
            hour = fields[-1].split(" ")[-1].split(":")[0]
            if not hour.isdigit():
                hour = "-1"  # no readable time in this row
            newfile.write(",".join(fields[:-1] + [hour]) + "\n")
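As an alternative, the csv module in the standard library can remove the quotation marks for us. The following is only a sketch under a few assumptions: it targets Python 3, the function names convert_file and convert_all are hypothetical, and "-1" is kept as the marker for a row whose time cannot be read, as in the script above.

```python
import csv
import glob
import os


def convert_file(src, dst):
    """Copy src to dst without quotation marks, keeping only the hour
    of the last column. convert_file is a hypothetical name."""
    with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
        # skipinitialspace lets the reader accept '", "' as well as '","'.
        reader = csv.reader(fin, skipinitialspace=True)
        for row in reader:
            if not row:
                continue  # skip blank lines
            hour = row[-1].split(":")[0]
            if not hour.isdigit():
                hour = "-1"  # the time in this row cannot be read
            fout.write(",".join(row[:-1] + [hour]) + "\n")


def convert_all(folder=".", out_dir="data_handled"):
    """Convert every file in folder whose name starts with 'part'."""
    out_path = os.path.join(folder, out_dir)
    os.makedirs(out_path, exist_ok=True)
    for src in sorted(glob.glob(os.path.join(folder, "part*"))):
        if not os.path.isfile(src):
            continue
        # Same naming as the script above: drop the first 7 characters.
        dst = os.path.join(out_path, os.path.basename(src)[7:])
        convert_file(src, dst)
```

Calling convert_all() from the folder that holds the part files writes the converted copies into data_handled. Letting csv.reader handle the quoting also keeps a field intact if it ever contains a comma inside the quotation marks, which a plain split(',') would break apart.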