Python Data Analysis Basics: Reading and Writing CSV Files (2)

Source: Internet
Author: User

2.2 Filtering specific rows:

    • A value in a row satisfies a condition
    • A value in a row belongs to a collection
    • A value in a row matches a pattern (that is, a regular expression)

2.2.1: A value in a row satisfies a condition:

    • Basic Python version:

        #!/usr/bin/env python3
        import csv
        import sys

        input_file = sys.argv[1]
        output_file = sys.argv[2]

        with open(input_file, 'r', newline='') as csv_in_file:
            with open(output_file, 'w', newline='') as csv_out_file:
                filereader = csv.reader(csv_in_file)
                filewriter = csv.writer(csv_out_file)
                header = next(filereader)    # read the header row with the built-in next()
                filewriter.writerow(header)  # write the header row to the output file
                for row_list in filereader:
                    # Supplier name is the first column; strip surrounding whitespace
                    supplier = str(row_list[0]).strip()
                    # Cost is the fourth column (index 3); drop the '$' and thousands separators
                    cost = str(row_list[3]).strip('$').replace(',', '')
                    if supplier == 'Supplier Z' or float(cost) > 600.0:
                        filewriter.writerow(row_list)

    • Pandas version:

        #!/usr/bin/env python3
        import pandas as pd
        import sys

        input_file = sys.argv[1]
        output_file = sys.argv[2]

        data_frame = pd.read_csv(input_file)
        # Drop the '$' and thousands separators, then convert the cost column to float
        data_frame['Cost'] = (data_frame['Cost'].str.strip('$')
                              .str.replace(',', '', regex=False).astype(float))
        # Keep rows whose supplier name contains 'Z' or whose cost exceeds 600
        data_frame_value_meets_condition = data_frame.loc[
            (data_frame['Supplier Name'].str.contains('Z')) | (data_frame['Cost'] > 600.0), :]
        data_frame_value_meets_condition.to_csv(output_file, index=False)
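To try the condition used above without creating files, the same test can be run on an in-memory CSV. This is only a minimal sketch: the sample rows, the column layout, and the helper name `clean_cost` are assumptions made for illustration and do not come from the original data set.

```python
import csv
import io

# Hypothetical sample data; the column layout mirrors the scripts above.
sample = io.StringIO(
    'Supplier Name,Invoice Number,Part Number,Cost,Purchase Date\n'
    'Supplier X,001-1001,2341,$500.00,1/20/2014\n'
    'Supplier Y,50-9501,7009,$250.00,1/30/2014\n'
    'Supplier Z,920-4803,3321,$615.00,2/3/2014\n'
    'Supplier Y,50-9505,6650,"$6,015.00",2/17/2014\n'
)


def clean_cost(text):
    """Turn a string such as '$6,015.00' into the float 6015.0."""
    return float(text.strip().strip('$').replace(',', ''))


reader = csv.reader(sample)
header = next(reader)  # skip the header row

# Keep rows from 'Supplier Z' or with a cost above 600
kept = [row for row in reader
        if row[0].strip() == 'Supplier Z' or clean_cost(row[3]) > 600.0]

for row in kept:
    print(row)
```

Removing the thousands separator matters here: `float('6,015.00')` raises a `ValueError`, so a cost written with a comma has to be cleaned before it can be compared with 600.0.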

2.2.2: A value in a row belongs to a collection:

    • Basic Python:

        #!/usr/bin/env python3
        # Goal: keep the rows whose purchase date is in ['1/20/2014', '1/30/2014']
        import csv
        import sys

        input_file = sys.argv[1]
        output_file = sys.argv[2]

        important_dates = ['1/20/2014', '1/30/2014']  # the collection of dates to keep

        with open(input_file, 'r', newline='') as csv_in_file:
            with open(output_file, 'w', newline='') as csv_out_file:
                filereader = csv.reader(csv_in_file)   # reader object for iterating over the input rows
                filewriter = csv.writer(csv_out_file)  # writer object for writing rows to the output file
                header = next(filereader)              # read the header row with the built-in next()
                filewriter.writerow(header)            # write the header row to the output file
                for row_list in filereader:            # iterate over the remaining rows
                    a_date = row_list[4]               # purchase date is the fifth column (index 4)
                    if a_date in important_dates:      # is the date in the collection?
                        filewriter.writerow(row_list)  # if so, write the row to the output file

    • Pandas:

        #!/usr/bin/env python3
        import pandas as pd
        import sys

        input_file = sys.argv[1]
        output_file = sys.argv[2]

        data_frame = pd.read_csv(input_file)  # read the input file into a DataFrame
        important_dates = ['1/20/2014', '1/30/2014']  # the collection of dates to keep
        # isin() is the concise pandas way to test membership in a collection
        data_frame_value_in_set = data_frame.loc[data_frame['Purchase Date'].isin(important_dates), :]
        # Write the filtered rows to the output file in CSV form, without the index column
        data_frame_value_in_set.to_csv(output_file, index=False)
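Membership tests on date strings compare text exactly, so '1/20/14' and '1/20/2014' count as different values. If the input might mix the two spellings, one option is to parse each value into a date before testing membership. The sketch below is an illustration under assumptions: the helper `parse_date` and the two accepted formats are hypothetical and not part of the original scripts.

```python
from datetime import datetime


def parse_date(text):
    """Parse 'M/D/YYYY' or 'M/D/YY' into a date; raise ValueError otherwise."""
    for fmt in ('%m/%d/%Y', '%m/%d/%y'):
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f'unrecognized date: {text!r}')


# A set of date objects gives fast, spelling-independent membership tests
important_dates = {parse_date(d) for d in ('1/20/2014', '1/30/2014')}

for raw in ('1/20/14', '1/30/2014', '2/3/2014'):
    print(raw, parse_date(raw) in important_dates)
```

With this approach the first two sample values are found in the set and the third is not, regardless of whether the year is written with two or four digits.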

2.2.3: A value in a row matches a pattern (regular expression):

    • Basic Python:

        #!/usr/bin/env python3
        import csv
        import re  # regular expression module
        import sys

        input_file = sys.argv[1]
        output_file = sys.argv[2]

        # Compile a case-insensitive pattern that matches values starting with '001-'
        pattern = re.compile(r'(?P<my_pattern_group>^001-.*)', re.I)

        with open(input_file, 'r', newline='') as csv_in_file:
            with open(output_file, 'w', newline='') as csv_out_file:
                filereader = csv.reader(csv_in_file)
                filewriter = csv.writer(csv_out_file)
                header = next(filereader)
                filewriter.writerow(header)
                for row_list in filereader:
                    invoice_number = row_list[1]  # invoice number is the second column (index 1)
                    # search() looks for the pattern in the value of invoice_number
                    if pattern.search(invoice_number):
                        filewriter.writerow(row_list)  # the pattern matched, so write the row

    • Pandas:

        #!/usr/bin/env python3
        import pandas as pd
        import sys

        input_file = sys.argv[1]
        output_file = sys.argv[2]

        data_frame = pd.read_csv(input_file)
        data_frame_value_matches_pattern = data_frame.loc[
            data_frame['Invoice Number'].str.startswith("001-"), :]
        data_frame_value_matches_pattern.to_csv(output_file, index=False)
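To see what the pattern accepts and rejects, it can be applied to a few strings directly. This is a small sketch with made-up invoice numbers (the values in `invoice_numbers` are assumptions, not taken from the original data). It shows that the `^` anchor restricts the match to the start of the value, which is why a plain `startswith("001-")` test selects the same rows.

```python
import re

pattern = re.compile(r'(?P<my_pattern_group>^001-.*)', re.I)

# Hypothetical invoice numbers for illustration
invoice_numbers = ['001-1001', '50-9501', '920-001-4803']

for number in invoice_numbers:
    match = pattern.search(number)
    if match:
        # The named group holds the text matched by the parenthesized part
        print(number, '->', match.group('my_pattern_group'))
    else:
        print(number, '-> no match')

matched = [n for n in invoice_numbers if pattern.search(n)]
```

The third value contains `001-` in the middle, but because of the anchor it is not matched.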
