Data import:
Import CSV:
from pandas import read_csv
df = read_csv('D://pa//4.1//1.csv')
Import text (the file should first be converted to UTF-8 without a BOM):
from pandas import read_table
df = read_table('D://pa//4.1//2.txt')
Import Excel:
from pandas import read_excel
df = read_excel('c:/pa/4.1/3.xlsx')
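To illustrate the note about the byte order mark, here is a minimal sketch that reads tab-separated text from an in-memory buffer rather than from disk. The column names and values are invented for the example, and passing `encoding="utf-8-sig"` is shown as one possible alternative to re-saving the file, not as the approach used above.

```python
from io import BytesIO
from pandas import read_table

# Hypothetical tab-separated content; encoding with "utf-8-sig" prepends a BOM
raw = "age\tname\n21\tKEN\n22\tJohn\n".encode("utf-8-sig")

# "utf-8-sig" decodes UTF-8 and drops a leading BOM if one is present
df = read_table(BytesIO(raw), encoding="utf-8-sig")
print(list(df.columns))
```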
Data export:
from pandas import DataFrame
df = DataFrame({'age': [21, 22, 23], 'name': ['KEN', 'John', 'JIMI']})
df.to_csv("C:/pa/4.1/df.csv")
# do not export the row index
df.to_csv("C:/pa/4.1/df.csv", index=False)
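The effect of `index=False` is easiest to see by comparing the two outputs side by side. The sketch below relies on `to_csv` returning the CSV text when no path is given, so nothing is written to disk.

```python
from pandas import DataFrame

df = DataFrame({'age': [21, 22, 23], 'name': ['KEN', 'John', 'JIMI']})

# With no path argument, to_csv returns the CSV text instead of writing a file
with_index = df.to_csv()
without_index = df.to_csv(index=False)

print(with_index.splitlines()[0])     # header starts with an empty index column
print(without_index.splitlines()[0])  # header holds only the data columns
```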
Duplicate value processing:
from pandas import read_csv
df = read_csv('c:/pa/4.1/data.csv')
newDF = df.drop_duplicates()
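A small in-memory sketch of the same call, with made-up rows, shows which records survive: by default a row is dropped only when every column matches an earlier row, and the first occurrence is kept.

```python
from pandas import DataFrame

# Hypothetical data: the second and third rows are identical
df = DataFrame({'id': [1, 2, 2, 3], 'name': ['a', 'b', 'b', 'c']})

# Keeps the first occurrence of each fully duplicated row
newDF = df.drop_duplicates()
print(len(df), len(newDF))
```

The original row labels are preserved, so the result keeps labels 0, 1 and 3.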
Missing value handling:
from pandas import read_csv
df = read_csv('c:/pa/4.4/data.csv')
newDF = df.dropna()
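As a sketch with invented data, the following shows that `dropna` removes a row when any of its values is missing, and that the `subset` parameter can narrow the check to particular columns.

```python
import numpy
from pandas import DataFrame

# Hypothetical data with one missing name and one missing score
df = DataFrame({'name': ['KEN', None, 'JIMI'],
                'score': [90.0, 85.0, numpy.nan]})

# Default: drop a row if any of its values is missing
newDF = df.dropna()

# Only look at the 'name' column when deciding what to drop
by_name = df.dropna(subset=['name'])
print(len(newDF), len(by_name))
```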
Whitespace value processing:
from pandas import read_csv
df = read_csv('c:/pa/4.5/data.csv')
# remove leading and trailing whitespace, then write the cleaned values back
newName = df["name"].str.strip()
df["name"] = newName
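A minimal sketch with made-up names shows the effect of `str.strip` on leading and trailing spaces.

```python
from pandas import DataFrame

# Hypothetical names padded with spaces on either side
df = DataFrame({'name': [' KEN', 'John ', ' JIMI ']})

# Strip whitespace from both ends of every value in the column
df['name'] = df['name'].str.strip()
print(list(df['name']))
```

`str.lstrip` and `str.rstrip` do the same for one side only.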
Field extraction:
astype(str) converts the column to strings so that it can be sliced by character position.
from pandas import read_csv
df = read_csv('C:/pa/4.6/data.csv')
df["Tel"] = df["Tel"].astype(str)
bands = df["Tel"].str.slice(0, 3)
areas = df["Tel"].str.slice(3, 7)
nums = df["Tel"].str.slice(7, 11)
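The slicing can be tried without the data file. The phone numbers below are invented 11-digit values, stored as integers to show why the conversion to `str` comes first.

```python
from pandas import DataFrame

# Hypothetical 11-digit numbers stored as integers
df = DataFrame({'Tel': [18922254812, 13522255003]})

# .str methods only work on strings, so convert before slicing
df['Tel'] = df['Tel'].astype(str)

bands = df['Tel'].str.slice(0, 3)   # characters 0-2
areas = df['Tel'].str.slice(3, 7)   # characters 3-6
nums = df['Tel'].str.slice(7, 11)   # characters 7-10
print(list(bands), list(areas), list(nums))
```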
Field split:
from pandas import read_csv
df = read_csv("c:/pa/4.7/data.csv")
# split once on the separator (assumed here to be a space) into two columns
newDF = df["name"].str.split(" ", n=1, expand=True)
newDF.columns = ["band", "name"]
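The sketch below shows what `n=1` and `expand=True` do, using invented product names and assuming a space as the separator: only the first space splits the string, and the two parts become separate columns.

```python
from pandas import DataFrame

# Hypothetical product names: brand first, then a model that may contain spaces
df = DataFrame({'name': ['Apple iPhone 6', 'Samsung Galaxy S5 mini']})

# n=1 splits at the first space only; expand=True returns a DataFrame
newDF = df['name'].str.split(' ', n=1, expand=True)
newDF.columns = ['band', 'name']
print(newDF)
```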
Record extraction:
import pandas
from pandas import read_csv
df = read_csv("c:/pa/4.8/data.csv", sep="|")
# comparison
df[df.comments > 1000]
# range (both ends inclusive)
df[df.comments.between(1000, 10000)]
# missing values
df[pandas.isnull(df.title)]
# substring match, treating missing titles as non-matches
df[df.title.str.contains("Taiwan Power", na=False)]
# combined conditions
df[(df.comments >= 1000) & (df.comments <= 10000)]
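Each of these filters builds a boolean mask and keeps the rows where it is true. A runnable sketch with made-up titles and comment counts:

```python
from pandas import DataFrame

# Hypothetical records; one title is missing
df = DataFrame({
    'title': ['Taiwan Power tablet', None, 'Other phone', 'Taiwan Power case'],
    'comments': [500, 1500, 10000, 20000],
})

over_1000 = df[df.comments > 1000]
in_range = df[df.comments.between(1000, 10000)]   # inclusive on both ends
no_title = df[df.title.isnull()]
# na=False makes the missing title count as "no match" instead of NaN
matches = df[df.title.str.contains('Taiwan Power', na=False)]
same_range = df[(df.comments >= 1000) & (df.comments <= 10000)]
print(len(over_1000), len(in_range), len(no_title), len(matches))
```

The parentheses around each condition in the last filter are required, because `&` binds more tightly than the comparison operators.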
Random sampling:
import numpy
from pandas import read_csv
df = read_csv("c:/pa/4.9/data.csv")
# draw 3 random row labels from 0 to 9
r = numpy.random.randint(0, 10, 3)
df.loc[r, :]
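Note that `numpy.random.randint` draws with replacement, so the same row can be picked twice, and the range 0 to 9 only covers the first ten rows. As an alternative technique (not the one used above), pandas offers `DataFrame.sample`, sketched here on invented data:

```python
from pandas import DataFrame

# Hypothetical table with ten rows
df = DataFrame({'id': range(10)})

# sample draws without replacement by default; random_state makes it repeatable
picked = df.sample(n=3, random_state=0)
print(picked)
```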
Record merge:
import pandas
from pandas import read_csv
df1 = read_csv("C:/pa/4.10/data1.csv", sep="|")
df2 = read_csv("C:/pa/4.10/data2.csv", sep="|")
df3 = read_csv("C:/pa/4.10/data3.csv", sep="|")
df = pandas.concat([df1, df2, df3])
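`concat` stacks the rows but keeps each frame's original row labels, so labels repeat in the result. A small sketch with invented frames shows this, along with the `ignore_index=True` option that renumbers the rows:

```python
import pandas
from pandas import DataFrame

# Hypothetical frames with the same columns
df1 = DataFrame({'id': [1, 2], 'name': ['a', 'b']})
df2 = DataFrame({'id': [3], 'name': ['c']})
df3 = DataFrame({'id': [4, 5], 'name': ['d', 'e']})

stacked = pandas.concat([df1, df2, df3])                        # labels repeat
renumbered = pandas.concat([df1, df2, df3], ignore_index=True)  # labels 0..4
print(list(stacked.index), list(renumbered.index))
```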
Field merge:
from pandas import read_csv
df = read_csv("C:/pa/4.11/data.csv", sep=" ", names=['band', 'area', 'num'])
# convert every column to strings so that + concatenates instead of adding
df = df.astype(str)
tel = df['band'] + df['area'] + df['num']
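The conversion to `str` matters because `+` on numeric columns adds the numbers. A sketch with invented number fragments:

```python
from pandas import DataFrame

# Hypothetical phone number fragments, read as integers
df = DataFrame({'band': [189, 135], 'area': [2225, 2225], 'num': [4812, 5003]})

# After astype(str), + joins the text of each row end to end
df = df.astype(str)
tel = df['band'] + df['area'] + df['num']
print(list(tel))
```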
Field match:
import pandas
from pandas import read_csv
items = read_csv("C:/pa/4.12/data1.csv", sep="|", names=["id", "comments", "title"])
prices = read_csv("C:/pa/4.12/data1.csv", sep="|", names=["id", "oldPrice", "newPrice"])
# join the two tables on their shared id column
itemPrices = pandas.merge(items, prices, left_on="id", right_on="id")
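By default `merge` performs an inner join, so only ids present in both tables survive. The sketch below uses invented tables to show that, plus the `how='left'` variant that keeps every row of the first table.

```python
import pandas
from pandas import DataFrame

# Hypothetical tables: id 3 has no price, id 4 has no item
items = DataFrame({'id': [1, 2, 3], 'title': ['a', 'b', 'c']})
prices = DataFrame({'id': [1, 2, 4], 'newPrice': [10.0, 20.0, 40.0]})

# Inner join (default): only ids found in both tables
itemPrices = pandas.merge(items, prices, left_on='id', right_on='id')

# Left join: every item is kept, with NaN where no price matches
allItems = pandas.merge(items, prices, on='id', how='left')
print(len(itemPrices), len(allItems))
```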
Simple calculation:
from pandas import read_csv
df = read_csv("c:/pa/4.13/data.csv", sep="|")
# multiply the two columns row by row and store the result as a new column
result = df.price * df.num
df["sum"] = result
Data normalization:
from pandas import read_csv
df = read_csv("c:/pa/4.14/data.csv")
# min-max scaling: map the scores onto the range 0 to 1
scale = (df.score - df.score.min()) / (df.score.max() - df.score.min())
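A worked instance of the formula on three invented scores: the minimum maps to 0, the maximum to 1, and the midpoint to 0.5.

```python
from pandas import DataFrame

# Hypothetical scores
df = DataFrame({'score': [50, 75, 100]})

# (x - min) / (max - min), applied to the whole column at once
scale = (df.score - df.score.min()) / (df.score.max() - df.score.min())
print(list(scale))
```

If every score is identical, the denominator is zero and the result is NaN, so that case may need separate handling.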
Data grouping:
import pandas
from pandas import read_csv
df = read_csv("C:\\pa\\4.15\\data.csv", sep='|')
bins = [min(df.cost) - 1, 20, 40, 60, 80, 100, max(df.cost) + 1]
labels = ['20 or less', '20 to 40', '40 to 60', '60 to 80', '80 to 100', '100 or more']
# intervals closed on the right (default)
pandas.cut(df.cost, bins)
# intervals closed on the left
pandas.cut(df.cost, bins, right=False)
# intervals closed on the left, with custom labels
pandas.cut(df.cost, bins, right=False, labels=labels)
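The binning can be checked on a handful of invented costs, one per interval. The bin edges of 20, 40, 60, 80 and 100 are inferred from the labels and should be treated as an assumption.

```python
import pandas
from pandas import DataFrame

# Hypothetical costs, one falling into each interval
df = DataFrame({'cost': [10, 25, 45, 65, 85, 120]})

# Seven edges define six intervals; the outer edges pad the data range
bins = [min(df.cost) - 1, 20, 40, 60, 80, 100, max(df.cost) + 1]
labels = ['20 or less', '20 to 40', '40 to 60',
          '60 to 80', '80 to 100', '100 or more']

# right=False makes each interval include its left edge and exclude its right
groups = pandas.cut(df.cost, bins, right=False, labels=labels)
print(list(groups))
```

The number of labels must be one less than the number of bin edges.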
Date conversion:
from pandas import read_csv
from pandas import to_datetime
df = read_csv("c:\\pa\\4.16\\data.csv", encoding="utf-8")
# %Y expects a four-digit year; use %y if the data stores two-digit years
df_dt = to_datetime(df["Registration time"], format="%Y/%m/%d")
Date formatting:
from datetime import datetime
from pandas import read_csv
from pandas import to_datetime
df = read_csv("C:\\pa\\4.16\\data.csv", encoding="utf-8")
df_dt = to_datetime(df["Registration time"], format="%Y/%m/%d")
# format each date as day-month-year text
df_dt_str = df_dt.apply(lambda x: datetime.strftime(x, "%d-%m-%Y"))
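Conversion and formatting can be tried together on a couple of invented date strings. The sketch assumes four-digit years and uses the `dt.strftime` accessor, a vectorized alternative to applying `datetime.strftime` row by row.

```python
from pandas import Series, to_datetime

# Hypothetical registration dates as text
raw = Series(['2011/01/01', '2012/02/12'])

# Parse the text into datetime values
df_dt = to_datetime(raw, format='%Y/%m/%d')

# Turn the datetimes back into text in day-month-year order
df_dt_str = df_dt.dt.strftime('%d-%m-%Y')
print(list(df_dt_str))
```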
Date extraction:
from pandas import read_csv
from pandas import to_datetime
df = read_csv("c:\\pa\\4.18\\data.csv", encoding="utf-8")
df_dt = to_datetime(df["Registration time"], format="%Y/%m/%d")
df_dt.dt.year
df_dt.dt.second
df_dt.dt.minute
df_dt.dt.hour
df_dt.dt.day
df_dt.dt.month
df_dt.dt.weekday
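A sketch of the `dt` accessor on invented timestamps that include a time of day, so that the hour, minute and second fields are non-trivial. `weekday` counts from Monday as 0.

```python
from pandas import Series, to_datetime

# Hypothetical timestamps with a time component
raw = Series(['2011/01/01 08:30:15', '2012/02/12 23:05:00'])
df_dt = to_datetime(raw, format='%Y/%m/%d %H:%M:%S')

# Each attribute returns one value per row
years = df_dt.dt.year
months = df_dt.dt.month
days = df_dt.dt.day
hours = df_dt.dt.hour
minutes = df_dt.dt.minute
seconds = df_dt.dt.second
weekdays = df_dt.dt.weekday  # Monday is 0, Sunday is 6
print(list(years), list(weekdays))
```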
Python data analysis-data processing