2.2 Filtering specific rows. A row is kept when one of the following holds:
- A value in a row satisfies a condition
- A value in a row belongs to a collection
- A value in a row matches a pattern (that is, a regular expression)
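The three kinds of test listed above can be sketched as small predicate functions applied to each row. This is only an illustrative sketch: the sample data, the column order, and the helper names (`meets_condition`, `in_set`, `matches_pattern`, `filter_rows`) are assumptions made for the example, not part of the original scripts.

```python
import csv
import io
import re

# Hypothetical sample data, invented for illustration only.
SAMPLE_CSV = """Supplier Name,Invoice Number,Part Number,Cost,Purchase Date
Supplier X,001-1001,2341,$500.00,1/20/2014
Supplier Y,50-9501,7009,$250.00,1/30/2014
Supplier Z,920-4803,3321,"$6,015.00",2/3/2014
"""

PATTERN = re.compile(r'^001-.*', re.I)


def meets_condition(row):
    """Condition test: supplier is 'Supplier Z' or cost is above 600.0."""
    cost = float(row[3].strip('$').replace(',', ''))
    return row[0].strip() == 'Supplier Z' or cost > 600.0


def in_set(row, important_dates=('1/20/2014', '1/30/2014')):
    """Membership test: purchase date is one of the listed dates."""
    return row[4] in important_dates


def matches_pattern(row):
    """Pattern test: invoice number starts with '001-'."""
    return PATTERN.search(row[1]) is not None


def filter_rows(text, keep):
    """Return the header and the rows for which keep(row) is true."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    return header, [row for row in reader if keep(row)]
```

Calling `filter_rows(SAMPLE_CSV, in_set)` returns the header plus the rows that pass the chosen test, so the same loop can be reused with any of the three predicates.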
2.2.1: A value in a row satisfies a condition:
#!/usr/bin/env python3
import csv
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

with open(input_file, 'r', newline='') as csv_in_file:
    with open(output_file, 'w', newline='') as csv_out_file:
        filereader = csv.reader(csv_in_file)
        filewriter = csv.writer(csv_out_file)
        header = next(filereader)  # read the header row with the built-in next() function
        filewriter.writerow(header)  # write the header row to the output file
        for row_list in filereader:
            supplier = str(row_list[0]).strip()  # supplier name is the first column
            cost = str(row_list[3]).strip('$').replace(',', '')  # cost is the fourth column (index 3); drop '$' and ','
            if supplier == 'Supplier Z' or float(cost) > 600.0:
                filewriter.writerow(row_list)
Pandas version:
#!/usr/bin/env python3
import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_csv(input_file)
# Drop '$' and ',' (as the csv version does) before converting the cost to float
data_frame['Cost'] = data_frame['Cost'].str.strip('$').str.replace(',', '', regex=False).astype(float)
data_frame_value_meets_condition = data_frame.loc[(data_frame['Supplier Name'].str.contains('Z')) | (data_frame['Cost'] > 600.0), :]
data_frame_value_meets_condition.to_csv(output_file, index=False)
2.2.2: A value in a row belongs to a collection:
#!/usr/bin/env python3
# Purpose: keep the rows whose purchase date is in ['1/20/2014', '1/30/2014']
import csv
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

important_dates = ['1/20/2014', '1/30/2014']  # the collection a row's date must belong to

with open(input_file, 'r', newline='') as csv_in_file:
    with open(output_file, 'w', newline='') as csv_out_file:
        filereader = csv.reader(csv_in_file)  # reader object used to iterate over the input rows
        filewriter = csv.writer(csv_out_file)  # writer object used to write rows to the output file
        header = next(filereader)  # read the header row with the built-in next() function
        filewriter.writerow(header)  # write the header row to the output file
        for row_list in filereader:  # iterate over the remaining rows
            a_date = row_list[4]  # purchase date is the 5th column, so the index is 4
            if a_date in important_dates:  # does the date belong to important_dates?
                filewriter.writerow(row_list)  # if so, write the row to the output file
Pandas version:
#!/usr/bin/env python3

import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

important_dates = ['1/20/2014', '1/30/2014']  # same collection as in the csv version
data_frame = pd.read_csv(input_file)  # read the input file into a DataFrame
data_frame_value_in_set = data_frame.loc[data_frame['Purchase Date'].isin(important_dates), :]  # isin() is pandas's concise membership test

data_frame_value_in_set.to_csv(output_file, index=False)  # write the filtered rows to the output file as CSV
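Both versions above compare the date as a plain string, so '1/20/14' and '1/20/2014' would not be treated as the same day. If the input might mix the two spellings, one possible approach is to parse the dates before testing membership. The following is a minimal sketch under that assumption; the helper names `parse_date` and `is_important` and the two accepted formats are hypothetical and not part of the original scripts.

```python
from datetime import datetime


def parse_date(text):
    """Parse month/day/year with a 4-digit year first, then a 2-digit year."""
    for fmt in ('%m/%d/%Y', '%m/%d/%y'):
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue  # try the next format
    raise ValueError(f'unrecognized date: {text!r}')


# A set of date objects gives fast, spelling-independent membership tests.
important_dates = {parse_date(d) for d in ('1/20/2014', '1/30/2014')}


def is_important(text):
    """True if the date string names one of the important dates."""
    return parse_date(text) in important_dates
```

In the csv loop, `if is_important(row_list[4]):` could then stand in for the string comparison, at the cost of raising an error on any value that is not a recognizable date.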
2.2.3: A value in a row matches a pattern (a regular expression):
#!/usr/bin/env python3
import csv
import re  # regular expression module
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

# Compile a case-insensitive pattern that matches values starting with '001-'
pattern = re.compile(r'(?P<my_pattern_group>^001-.*)', re.I)

with open(input_file, 'r', newline='') as csv_in_file:
    with open(output_file, 'w', newline='') as csv_out_file:
        filereader = csv.reader(csv_in_file)
        filewriter = csv.writer(csv_out_file)
        header = next(filereader)
        filewriter.writerow(header)
        for row_list in filereader:
            invoice_number = row_list[1]  # invoice number is the 2nd column (index 1)
            if pattern.search(invoice_number):  # look for the pattern in invoice_number
                filewriter.writerow(row_list)  # if the pattern is found, write the row to the output file
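To see what the pattern in the script above accepts, it can be tried on a few strings directly. This is a small sketch; the helper name `matched_text` and the sample invoice numbers are made up for the illustration.

```python
import re

# Same pattern as in the script: a named group anchored at the start of the string.
pattern = re.compile(r'(?P<my_pattern_group>^001-.*)', re.I)


def matched_text(invoice_number):
    """Return the text captured by my_pattern_group, or None if there is no match."""
    match = pattern.search(invoice_number)
    return match.group('my_pattern_group') if match else None


print(matched_text('001-1001'))   # starts with '001-', so the whole value is captured
print(matched_text('50-9501'))    # does not start with '001-'
print(matched_text('X001-1001'))  # '001-' is present but not at the start
```

Because of the `^` anchor, `search` only succeeds when '001-' is at the very beginning of the value. The `re.I` flag makes matching case-insensitive, which has no visible effect on a pattern made only of digits and a hyphen.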
Pandas version:
#!/usr/bin/env python3

import pandas as pd
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_csv(input_file)
# Keep the rows whose invoice number starts with '001-'
data_frame_value_matches_pattern = data_frame.loc[data_frame['Invoice Number'].str.startswith("001-"), :]
data_frame_value_matches_pattern.to_csv(output_file, index=False)