Python Data Analysis (II): Handling Missing Values with pandas

Source: Internet
Author: User

import pandas as pd
import numpy as np

# Build a 5x3 frame of random values, then reindex it so that rows b, d and g,
# which are not in the original index, are added as all-NaN rows.
df = pd.DataFrame(np.random.randn(5, 3),
                  index=['a', 'c', 'e', 'f', 'h'],
                  columns=['one', 'two', 'three'])
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
print(df)

print('################ Detecting missing values ################')
print('-------- Detecting missing values in a Series --------')
print(df['one'].isnull())
'''
-------- Detecting missing values in a Series --------
a    False
b     True
c    False
d     True
e    False
f    False
g     True
h    False
Name: one, dtype: bool
'''

print('-------- Missing values of a Series, with their index --------')
print(df['one'][df['one'].isnull()])
'''
-------- Missing values of a Series, with their index --------
b   NaN
d   NaN
g   NaN
Name: one, dtype: float64
'''
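As a side note to the Series checks above, pandas also offers `isna()` as another name for `isnull()`, and `notnull()` for the inverse mask. The sketch below uses a made-up Series (the values and the names `s`, `mask_isnull`, `mask_isna` and `mask_present` are illustrative assumptions, not part of the original script) to show that the spellings agree.

```python
import numpy as np
import pandas as pd

# Hypothetical data: both np.nan and None count as missing in a float Series.
s = pd.Series([1.0, np.nan, 3.0, None], index=['a', 'b', 'c', 'd'])

mask_isnull = s.isnull()    # True where the value is missing
mask_isna = s.isna()        # same mask under the other name
mask_present = s.notnull()  # True where a value is present

print(mask_isnull.equals(mask_isna))        # the two spellings agree
print((~mask_isnull).equals(mask_present))  # notnull() is the inverse mask
```

Which spelling to use is a matter of taste; the original script uses `isnull()` throughout.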

print('-------- Detecting missing values in a DataFrame --------')
print(df.isnull())
'''
-------- Detecting missing values in a DataFrame --------
     one    two  three
a  False  False  False
b   True   True   True
c  False  False  False
d   True   True   True
e  False  False  False
f  False  False  False
g   True   True   True
h  False  False  False
'''

print('-------- Rows of a DataFrame with missing values, with their index --------')
data = df[df.isnull().values == True]
# Keep each selected row only once.
print(data[~data.index.duplicated()])
'''
-------- Rows of a DataFrame with missing values, with their index --------
   one  two  three
b  NaN  NaN    NaN
d  NaN  NaN    NaN
g  NaN  NaN    NaN
'''
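The original selects rows with a two-dimensional mask and then removes repeated index labels, presumably because a row can be picked once per missing cell. A possible alternative, assuming the goal is simply "every row with at least one NaN", is to reduce the mask to one boolean per row with `any(axis=1)`. The frame and the names `rows_with_nan` and `complete_rows` below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row b is entirely missing, row c is partly missing.
df = pd.DataFrame({'one': [1.0, np.nan, 3.0],
                   'two': [4.0, np.nan, np.nan]},
                  index=['a', 'b', 'c'])

# One boolean per row: True if any cell in that row is missing.
rows_with_nan = df[df.isnull().any(axis=1)]

# The complementary selection: rows where every cell is present.
complete_rows = df[df.notnull().all(axis=1)]

print(rows_with_nan)
print(complete_rows)
```

With a one-dimensional row mask each row appears at most once, so no de-duplication step is needed.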

print('-------- Columns of a DataFrame that contain missing values --------')
print(df.isnull().any())
'''
-------- Columns of a DataFrame that contain missing values --------
one      True
two      True
three    True
dtype: bool
'''

print('################ Filtering missing values ################')
print('-------- Filtering missing values in a Series --------')
print(df['one'].isnull())
'''
################ Filtering missing values ################
-------- Filtering missing values in a Series --------
a    False
b     True
c    False
d     True
e    False
f    False
g     True
h    False
Name: one, dtype: bool
'''
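Because `isnull()` returns booleans, summing the mask counts the missing values, which is often more informative than the yes/no answer from `any()`. The following is a minimal sketch on made-up data; the names `per_column`, `in_series` and `total` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in 'one' and two in 'two'.
df = pd.DataFrame({'one': [1.0, np.nan, 3.0],
                   'two': [np.nan, np.nan, 6.0]},
                  index=['a', 'b', 'c'])

per_column = df.isnull().sum()        # missing values per column
in_series = df['one'].isnull().sum()  # missing values in a single Series
total = df.isnull().sum().sum()       # missing values in the whole frame

print(per_column)
print(in_series, total)
```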

print('-------- Use dropna to remove missing data; it returns a new Series --------')
print(df['one'].dropna())
'''
-------- Use dropna to remove missing data; it returns a new Series --------
a   -0.211055
c   -0.870090
e   -0.203259
f    0.490568
h    1.437819
Name: one, dtype: float64
'''
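For a Series, `dropna()` gives the same result as indexing with the `notnull()` mask, which ties the filtering step back to the detection step. The sketch below checks this on a made-up Series; the names `dropped` and `masked` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value.
s = pd.Series([1.0, np.nan, 3.0], index=['a', 'b', 'c'])

dropped = s.dropna()     # remove missing entries
masked = s[s.notnull()]  # keep entries where a value is present

print(dropped.equals(masked))
print(dropped)
```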

print('-------- Filtering missing values in a DataFrame --------')
print(df.dropna())
'''
-------- Filtering missing values in a DataFrame --------
        one       two     three
a -0.211055 -2.869212  0.022179
c -0.870090 -0.878423  1.071588
e -0.203259  0.315897  0.495306
f  0.490568 -0.968058 -0.999899
h  1.437819 -0.370934 -0.482307
'''

print('------- how="all" drops a row only if all its values are NaN; the default how="any" drops rows with any missing value --------')
print(df.dropna(how="all"))
'''
------- how="all" drops a row only if all its values are NaN; the default how="any" drops rows with any missing value --------
        one       two     three
a -0.211055 -2.869212  0.022179
c -0.870090 -0.878423  1.071588
e -0.203259  0.315897  0.495306
f  0.490568 -0.968058 -0.999899
h  1.437819 -0.370934 -0.482307
'''

print('################ Filling missing values ################')
print('------ Fill missing values with a specified value -------')
print(df.fillna(0))
'''
################ Filling missing values ################
------ Fill missing values with a specified value -------
        one       two     three
a -0.211055 -2.869212  0.022179
b  0.000000  0.000000  0.000000
c -0.870090 -0.878423  1.071588
d  0.000000  0.000000  0.000000
e -0.203259  0.315897  0.495306
f  0.490568 -0.968058 -0.999899
g  0.000000  0.000000  0.000000
h  1.437819 -0.370934 -0.482307
'''

print('------ Fill different columns with different values ------')
# The call is truncated in the source; a dict mapping column name to fill
# value matches the output below.
print(df.fillna({'one': 1, 'two': 2, 'three': 3}))
'''
------ Fill different columns with different values ------
        one       two     three
a -0.211055 -2.869212  0.022179
b  1.000000  2.000000  3.000000
c -0.870090 -0.878423  1.071588
d  1.000000  2.000000  3.000000
e -0.203259  0.315897  0.495306
f  0.490568 -0.968058 -0.999899
g  1.000000  2.000000  3.000000
h  1.437819 -0.370934 -0.482307
'''
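One detail that is easy to miss in the fills above: `fillna` returns a new object and leaves the original untouched, which is why the script can call it repeatedly on the same `df`. A minimal sketch on made-up data (the names `original` and `filled` are illustrative assumptions):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two missing cells.
original = pd.DataFrame({'one': [1.0, np.nan],
                         'two': [np.nan, 4.0]},
                        index=['a', 'b'])

filled = original.fillna(0)  # returns a filled copy

print(original.isnull().sum().sum())  # the original still has its NaNs
print(filled.isnull().sum().sum())    # the copy has none
```

To keep the filled result, assign it back, for example `original = original.fillna(0)`.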

print('------ Forward fill ------')
# Older code spells this df.fillna(method="ffill"); recent pandas releases
# deprecate that form in favor of df.ffill().
print(df.ffill())
'''
------ Forward fill ------
        one       two     three
a -0.211055 -2.869212  0.022179
b -0.211055 -2.869212  0.022179
c -0.870090 -0.878423  1.071588
d -0.870090 -0.878423  1.071588
e -0.203259  0.315897  0.495306
f  0.490568 -0.968058 -0.999899
g  0.490568 -0.968058 -0.999899
h  1.437819 -0.370934 -0.482307
'''

print('------ Backward fill ------')
# Older code spells this df.fillna(method="bfill"); recent pandas releases
# deprecate that form in favor of df.bfill().
print(df.bfill())
'''
------ Backward fill ------
        one       two     three
a -0.211055 -2.869212  0.022179
b -0.870090 -0.878423  1.071588
c -0.870090 -0.878423  1.071588
d -0.203259  0.315897  0.495306
e -0.203259  0.315897  0.495306
f  0.490568 -0.968058 -0.999899
g  1.437819 -0.370934 -0.482307
h  1.437819 -0.370934 -0.482307
'''
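Forward and backward fills copy the nearest valid value in one direction, so two edge cases are worth knowing: a gap at the start cannot be forward-filled, and the `limit` parameter caps how many consecutive gaps are filled. The sketch below illustrates both on a made-up Series; the variable names are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical Series: a leading gap and a two-value gap in the middle.
s = pd.Series([np.nan, 1.0, np.nan, np.nan, 4.0])

forward = s.ffill()                 # leading NaN stays: nothing precedes it
forward_limited = s.ffill(limit=1)  # fill at most one consecutive NaN
backward = s.bfill()                # every gap has a later value here

print(forward.tolist())
print(forward_limited.tolist())
print(backward.tolist())
```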
print('------ Fill with the column mean ------')
print(df.fillna(df.mean()))
'''
------ Fill with the column mean ------
        one       two     three
a -0.211055 -2.869212  0.022179
b  0.128797 -0.954146  0.021373
c -0.870090 -0.878423  1.071588
d  0.128797 -0.954146  0.021373
e -0.203259  0.315897  0.495306
f  0.490568 -0.968058 -0.999899
g  0.128797 -0.954146  0.021373
h  1.437819 -0.370934 -0.482307
'''
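The mean fill works because `df.mean()` skips NaN by default and returns one value per column, which `fillna` then matches to columns by name. The same pattern applies to other per-column statistics; the sketch below swaps in the median as one possible variation (this is an illustration on made-up data, not something the original script does).

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the large values pull the mean above the median.
df = pd.DataFrame({'one': [1.0, np.nan, 3.0, 8.0],
                   'two': [10.0, 20.0, np.nan, 60.0]},
                  index=['a', 'b', 'c', 'd'])

col_means = df.mean()      # per-column means, NaN skipped
col_medians = df.median()  # per-column medians, NaN skipped

filled_mean = df.fillna(col_means)
filled_median = df.fillna(col_medians)

print(filled_mean)
print(filled_median)
```

The median is less sensitive to outliers than the mean, which may matter when a column is skewed.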
