Exploring student alcohol consumption
The data is available on GitHub.
Step 1 - Import the necessary libraries
import pandas as pd
import numpy as np
Step 2 - The dataset
path4 = "./data/student-mat.csv"
Step 3 - Load the data into a variable named student
student = pd.read_csv(path4)
student.head()
Output:
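Step 4 - Slice the DataFrame from the 'school' column through the 'guardian' column
# .copy() gives an independent frame, so later column assignments do not trigger SettingWithCopyWarning
stud_alcoh = student.loc[:, "school":"guardian"].copy()
stud_alcoh.head()
Output: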
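A side note on the slice in Step 4: label slices with .loc include both endpoints, which is why 'guardian' itself ends up in the result. The sketch below illustrates this on a small made-up frame (the values are hypothetical; only the column names echo the real dataset) and contrasts it with positional slicing via .iloc.

```python
import pandas as pd

# Hypothetical toy frame; only the column names echo the real dataset.
toy = pd.DataFrame({
    "school": ["GP", "MS"],
    "sex": ["F", "M"],
    "age": [18, 17],
    "guardian": ["mother", "father"],
    "traveltime": [2, 1],
})

# Label slices with .loc include BOTH endpoints.
by_label = toy.loc[:, "school":"guardian"]

# Positional slices with .iloc exclude the stop position.
by_position = toy.iloc[:, 0:3]

print(list(by_label.columns))     # ['school', 'sex', 'age', 'guardian']
print(list(by_position.columns))  # ['school', 'sex', 'age']
```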
Step 5 - Create a lambda function that upper-cases strings
captalizer = lambda x: x.upper()
Step 6 - Upper-case every value in the 'Fjob' column
stud_alcoh['Fjob'].apply(captalizer)
Output:
0       TEACHER
1         OTHER
2         OTHER
3      SERVICES
4         OTHER
5         OTHER
6         OTHER
7       TEACHER
8         OTHER
9         OTHER
10       HEALTH
11        OTHER
12     SERVICES
13        OTHER
14        OTHER
15        OTHER
16     SERVICES
17        OTHER
18     SERVICES
19        OTHER
20        OTHER
21       HEALTH
22        OTHER
23        OTHER
24       HEALTH
25     SERVICES
26        OTHER
27     SERVICES
28        OTHER
29      TEACHER
         ...
365       OTHER
366    SERVICES
367    SERVICES
368    SERVICES
369     TEACHER
370    SERVICES
371    SERVICES
372     AT_HOME
373       OTHER
374       OTHER
375       OTHER
376       OTHER
377    SERVICES
378       OTHER
379       OTHER
380     TEACHER
381       OTHER
382    SERVICES
383    SERVICES
384       OTHER
385       OTHER
386     AT_HOME
387       OTHER
388    SERVICES
389       OTHER
390    SERVICES
391    SERVICES
392       OTHER
393       OTHER
394     AT_HOME
Name: Fjob, dtype: object
Step 7 - Print the last few rows of the dataset
stud_alcoh.tail()
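Output:
Step 8 - Notice that the original DataFrame is still lowercase, because apply returns a new Series instead of modifying the column. Assign the result back to fix this
stud_alcoh['Mjob'] = stud_alcoh['Mjob'].apply(captalizer)
stud_alcoh['Fjob'] = stud_alcoh['Fjob'].apply(captalizer)
stud_alcoh.tail()
Output: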
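To make the behavior in Steps 6 to 8 concrete, here is a minimal sketch on a made-up two-row frame (hypothetical values, not rows from student-mat.csv). It shows that apply leaves the frame untouched until the result is assigned back, and that the built-in .str.upper() accessor is one possible vectorized alternative to the lambda.

```python
import pandas as pd

# Hypothetical toy data, not rows from student-mat.csv.
toy = pd.DataFrame({"Mjob": ["at_home", "health"], "Fjob": ["teacher", "other"]})

# apply returns a new Series; the frame itself is not modified.
upper = toy["Fjob"].apply(lambda x: x.upper())
unchanged = toy["Fjob"].tolist()

# Assigning the result back stores it. .str.upper() is a vectorized
# equivalent of applying the lambda element by element.
toy["Fjob"] = toy["Fjob"].str.upper()
toy["Mjob"] = toy["Mjob"].str.upper()

print(upper.tolist())        # ['TEACHER', 'OTHER']
print(unchanged)             # ['teacher', 'other']
print(toy["Fjob"].tolist())  # ['TEACHER', 'OTHER']
```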
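Step 9 - Create a function called majority that returns a boolean, and store its result in a new column called legal_drinker (the age of majority is over 17)
def majority(x):
    # True for anyone older than 17
    if x > 17:
        return True
    else:
        return False
stud_alcoh['legal_drinker'] = stud_alcoh['age'].apply(majority)
stud_alcoh.head()
Output: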
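The legal_drinker column from Step 9 can also be produced without apply, since comparing a whole column against a number already yields a boolean Series. A small sketch on made-up ages (hypothetical values), checking that both routes agree:

```python
import pandas as pd

# Hypothetical ages chosen to sit on both sides of the threshold.
toy = pd.DataFrame({"age": [15, 17, 18, 22]})

def majority(x):
    # Same rule as in Step 9, written as a single comparison.
    return x > 17

via_apply = toy["age"].apply(majority)  # element-by-element
via_compare = toy["age"] > 17           # vectorized comparison

print(via_compare.tolist())                        # [False, False, True, True]
print(via_apply.tolist() == via_compare.tolist())  # True
```

Note that an age of exactly 17 maps to False, because the rule is strictly greater than 17.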
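Step 10 - Multiply every integer in the dataset by 10
def times10(x):
    # Only plain ints are scaled; strings and booleans are returned unchanged
    if type(x) is int:
        return 10 * x
    return x
# Note: since pandas 2.1, applymap is deprecated in favor of DataFrame.map
stud_alcoh.applymap(times10).head(10)
Output: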
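One detail of times10 is worth spelling out: it tests type(x) is int rather than isinstance(x, int), so booleans such as the legal_drinker values are left alone (bool is a subclass of int in Python). The sketch below checks that rule on plain Python values and then shows one possible column-wise alternative on a made-up frame; the loop over integer-typed columns is an illustration, not the approach used in the exercise.

```python
import pandas as pd

def times10(x):
    # type(True) is bool, not int, so booleans fall through unchanged.
    if type(x) is int:
        return 10 * x
    return x

# The element-wise rule on plain Python values.
samples = [times10(v) for v in (3, True, "GP", 2.5)]
print(samples)  # [30, True, 'GP', 2.5]

# Hypothetical toy frame mixing integer, string, and boolean columns.
toy = pd.DataFrame({
    "age": [18, 17],
    "Fjob": ["TEACHER", "OTHER"],
    "legal_drinker": [True, False],
})

# Column-wise alternative: scale only the integer-typed columns.
scaled = toy.copy()
for name in scaled.columns:
    if pd.api.types.is_integer_dtype(scaled[name]):
        scaled[name] = scaled[name] * 10

print(scaled["age"].tolist())            # [180, 170]
print(scaled["legal_drinker"].tolist())  # [True, False]
```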
pandas exercises (4) --- Using the apply function