Article Architecture
Scenario Description
During data mining you often need to process several columns (Series) at once: for example, computing the sum of selected columns, or concatenating columns to form a new column (for filtering and comparison). This article works through a small example that implements these requirements. Some data also needs to be sorted by time, most recent first, so the article also covers converting a time string to a timestamp.
Demo Experiment
Using the apply function to compute sums
# pandas.core.frame.DataFrame.apply: compute the sum of selected columns
import random
import numpy as np
import pandas as pd

ns = []  # storage for the test data
for n in range(6):  # generate 6 test records
    ns.append([random.randint(60, 100), random.randint(60, 100), random.randint(50, 100)])
scores_df = pd.DataFrame(ns, columns=['Chinese', 'Math', 'English'])  # convert the list to a DataFrame

def get_total_score(c=1, m=2, e=3):
    '''Calculate the total score.'''
    return c + m + e

if __name__ == '__main__':
    assert get_total_score() == 6
    assert get_total_score(3, 3, 3) == 9
    # Compute the sum of the selected columns (here, all columns), row by row
    scores_df['Total_score'] = scores_df.apply(
        lambda row: get_total_score(row['Chinese'], row['Math'], row['English']), axis=1)
    print(scores_df)
Output: scores_df now contains a Total_score column holding the row-wise sum of the three score columns.
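The same pattern works when only some of the columns should be summed. The following is a minimal sketch with fixed scores so that the result is reproducible; the helper name sum_selected, the column name Selected_total, and the sample values are assumptions made for illustration and are not part of the original experiment.

```python
import pandas as pd

# Fixed sample scores (illustrative values, not from the original experiment)
scores_df = pd.DataFrame(
    [[90, 80, 70], [60, 75, 85]],
    columns=['Chinese', 'Math', 'English'],
)

def sum_selected(row, columns):
    """Hypothetical helper: add up the values of the given columns in one row."""
    return sum(row[col] for col in columns)

# axis=1 makes apply call the function once per row
selected = ['Chinese', 'Math']
scores_df['Selected_total'] = scores_df.apply(
    lambda row: sum_selected(row, selected), axis=1
)

# For a plain sum, the built-in column-wise form gives the same values
vectorized = scores_df[selected].sum(axis=1)
```

For simple arithmetic such as a sum, the built-in form is usually shorter and faster; apply is most useful when the per-row logic cannot be expressed with built-in operations.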
Other uses of apply and their effects
# Examples (df is any numeric DataFrame)
import numpy
df.apply(numpy.sqrt)  # returns a DataFrame
# "equiv to" means "equivalent to"
df.apply(numpy.sum, axis=0)  # equivalent to df.sum(0): sum of each column
df.apply(numpy.sum, axis=1)  # equivalent to df.sum(1): sum of each row
Calculation results
Square root
Sum by row
Sum by column
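To make the three calls concrete, here is a small sketch on a fixed DataFrame. The DataFrame contents and the variable names roots, col_sums and row_sums are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Small fixed DataFrame (illustrative values)
df = pd.DataFrame({'a': [1.0, 4.0], 'b': [9.0, 16.0]})

roots = df.apply(np.sqrt)            # element-wise square root, returns a DataFrame
col_sums = df.apply(np.sum, axis=0)  # one value per column, like df.sum(0)
row_sums = df.apply(np.sum, axis=1)  # one value per row, like df.sum(1)

print(roots)
print(col_sums)
print(row_sums)
```

With axis=0 the function receives each column in turn, so the result is indexed by column name; with axis=1 it receives each row, so the result is indexed by row label.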
Concatenating strings with the apply function
Combining the custom function (get_total_score) with the apply function implements the summation of selected columns (Series). The custom function can be modified in the same way to concatenate string columns (Series) into a new column that is then used for filtering. Because the implementation is simple, it is not repeated here and is left to the reader.
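One possible way to do this is sketched below. The DataFrame people_df, its columns city and name, the helper join_columns and the underscore separator are all hypothetical names introduced for illustration.

```python
import pandas as pd

# Illustrative data with two string columns (hypothetical column names)
people_df = pd.DataFrame({
    'city': ['Beijing', 'Shanghai', 'Beijing'],
    'name': ['Li', 'Wang', 'Zhao'],
})

def join_columns(row, sep='_'):
    """Hypothetical helper: concatenate two string columns of one row."""
    return row['city'] + sep + row['name']

# Build the new column row by row
people_df['key'] = people_df.apply(join_columns, axis=1)

# The new column can then be used in a filter comparison
matches = people_df[people_df['key'] == 'Beijing_Zhao']
print(matches)
```

The helper assumes both columns already hold strings; numeric columns would need to be converted with str() before concatenation.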
Converting between time strings and timestamps (int) with mktime
Code implementation
import time

def s_to_timestamp(time_s='2018-03-07 12:00:07'):
    '''Convert a time string to an integer timestamp (interpreted as local time).'''
    struct_time = time.strptime(time_s, '%Y-%m-%d %H:%M:%S')
    return int(time.mktime(struct_time))

if __name__ == '__main__':
    print('%s <==> %d' % ('2018-03-07 12:00:07', s_to_timestamp()))
    print('%s <==> %d' % ('2018-03-08 12:00:07', s_to_timestamp('2018-03-08 12:00:07')))
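The conversion can be combined with apply to sort records by time, most recent first, as mentioned in the scenario description. This is a minimal sketch: the DataFrame events_df and its columns event, created and ts are hypothetical names, and the timestamps depend on the local time zone because time.mktime interprets the string as local time.

```python
import time
import pandas as pd

def s_to_timestamp(time_s):
    """Convert a 'YYYY-MM-DD HH:MM:SS' string to an integer timestamp (local time)."""
    struct_time = time.strptime(time_s, '%Y-%m-%d %H:%M:%S')
    return int(time.mktime(struct_time))

# Illustrative records in arbitrary order (hypothetical column names)
events_df = pd.DataFrame({
    'event': ['b', 'c', 'a'],
    'created': ['2018-03-08 12:00:07', '2018-03-09 08:30:00', '2018-03-07 12:00:07'],
})

# Series.apply calls the function once per value of the column
events_df['ts'] = events_df['created'].apply(s_to_timestamp)

# Sort by the integer timestamp, most recent first
latest_first = events_df.sort_values('ts', ascending=False)
print(latest_first)
```

Sorting on the integer column avoids comparing the raw strings, which only sort correctly when every value uses the same zero-padded format.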
References
Python time mktime() method. W3cschool
Python time.mktime() examples
pandas.DataFrame.apply
Python time, date, and timestamp conversion. Recommended
Pandas: How to use apply function to multiple columns. Recommended