Pandas is the data analysis and processing library for Python.
import pandas as pd
1. Read a CSV or TXT file
foodinfo = pd.read_csv("pandas_study.csv", encoding="utf-8")
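The same reader also handles delimited text files; a minimal sketch, assuming a hypothetical tab-separated file pandas_study.txt:
foodinfo_txt = pd.read_csv("pandas_study.txt", sep="\t", encoding="utf-8")  # sep sets the delimiter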
2. View the first n and last n rows
foodinfo.head(n)
foodinfo.tail(n)
3. Check the type of the object: DataFrame or ndarray
print(type(foodinfo))  # result: <class 'pandas.core.frame.DataFrame'>
4. See what columns are available
foodinfo.columns
5. See how many rows and columns there are
foodinfo.shape
6. Print one row or several rows of data
foodinfo.loc[0]
foodinfo.loc[0:2]
foodinfo.loc[[2, 5, 10]]  # note: the argument here is a list
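loc selects by index label; for purely positional selection pandas also provides iloc. A minimal sketch, not from the original:
foodinfo.iloc[0:3]  # first three rows by position, regardless of the index labels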
7. Print one column or several columns of data
foodinfo["dti"]foodinfo[["int_rate " " DTI "]] # Note that the inside is an array # or:columns = ["int_rate""dti"] Foodinfo[columns]
8. Print data types for all columns
foodinfo.dtypes
9. Some operations on columns, e.g. selecting the columns whose names end with "s"
col_columns = foodinfo.columns.tolist()
new_columns = []
for c in col_columns:
    if c.endswith("s"):
        new_columns.append(c)
        print(c)
foodinfo[new_columns]
10. Column arithmetic: multiply every value by 100 (subtraction and the other operations work the same way)
foodinfo[["int_rate""dti"] * 100
11. Add a column
new_col = foodinfo["int_rate"] * 100
foodinfo["new_col"] = new_col
12. Operations between columns
foodinfo["dti"] * foodinfo["int_rate"]
13. View the maximum, minimum, and average values for a column
foodinfo["int_rate"].max () foodinfo["int_rate" ].min () foodinfo["int_rate"].mean ()
14. Sort by a field (ascending)
# inplace controls whether a new DataFrame is returned; with inplace=True the existing one is sorted in place
foodinfo.sort_values("int_rate_one", inplace=True)
# Sort by a field (descending)
foodinfo.sort_values("int_rate_one", inplace=True, ascending=False)
15. View summary statistics of the DataFrame: maximum, minimum, mean, quartiles, etc.
foodinfo.describe()
16. Null-value operations
pin = foodinfo["pin"]
pin_isnull = pd.isnull(pin)          # boolean mask of the null values
pin_isnull_list = pin[pin_isnull]    # all rows where the value is null
len(pin_isnull_list)                 # number of null values
17. Missing-value operations
# The simple approach is to filter out the null values first
books = foodinfo["life_cycle_books"]
book_isnull = pd.isnull(books)
book_list_isnull = books[book_isnull == False]                 # keep only the non-null values
mean_books = sum(book_list_isnull) / len(book_list_isnull)     # calculate the average
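For reference, Series.mean() skips NaN values by default, so the same filtered average can also be obtained directly (a sketch, not the author's original code):
foodinfo["life_cycle_books"].mean()  # skipna=True by default, so missing values are ignored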
18. Select data that matches a condition
foodinfo[foodinfo["life_cycle_books"] = = 1]
19. Pivot Table
import numpy as np
# index: the column(s) to group by
# values: the column(s) to aggregate
# aggfunc: the aggregation function, default np.mean
data_foodinfo = foodinfo.pivot_table(index=["life_cycle_books", "potential_value_books"], values="risk_level", aggfunc=np.mean)
print(data_foodinfo)
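A small self-contained sketch with made-up data (toy column values, not from the dataset above) shows how index, values and aggfunc fit together:
import pandas as pd
import numpy as np
toy = pd.DataFrame({
    "life_cycle_books": [1, 1, 2, 2],
    "risk_level": [0.2, 0.4, 0.6, 0.8],
})
# risk_level is averaged within each life_cycle_books group
print(toy.pivot_table(index="life_cycle_books", values="risk_level", aggfunc=np.mean))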
20. Delete missing values
# Drop every column that contains missing values
na_foodinfo = foodinfo.dropna(axis=1)
# Or drop the rows that have missing values in specific columns
na_foodinfo = foodinfo.dropna(axis=0, subset=["life_cycle_books", "potential_value_books"])
21. Take arbitrary data, e.g. the life_cycle_books value of row 80
foodinfo.loc[80, "life_cycle_books"]
22. Reset the index
foodinfo.reset_index(drop=True)
23. Custom function: return the number of null values in each column
def count_null_columns(column):
    column_null = pd.isnull(column)
    list_null = column[column_null]
    count_null = len(list_null)
    return count_null

foodinfo.apply(count_null_columns)
24. Series
# pandas has three data structures: Series, DataFrame, and Panel
from pandas import Series
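A minimal sketch of constructing these structures directly (the values are illustrative only):
s = Series([1.0, 2.0, 3.0], index=["a", "b", "c"])   # one labelled column of data
df_small = pd.DataFrame({"x": [1, 2], "y": [3, 4]})  # several Series sharing one index
# Note: Panel has been removed in recent pandas versions; a MultiIndex DataFrame is used instead.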
25. A Series holds a single column of data
Series_name = taitan["name"]series_name.values
26. Locate rows by a custom index
Series_name = taitan["name"= taitan["Age" = Series (series_age.values, index = series_name) series_custom[["Ahlin, Mrs Johan (Johanna Persdotter Larsson)""asplund, Mrs Carl Oscar (Selma Augusta Emilia Johansson) c14> "]# Description: series_custom[" "] by Column series_custom[[" "]] by row
27. Take rows 5 to 10 of the data (same idea as above):
series_custom[5:10]
28. Reindexing
old_index = series_custom.index.tolist()
sort_index = sorted(old_index)
new_index = series_custom.reindex(sort_index)
print(new_index)
29. Sorting a Series by index and by value
sc1 = series_custom.sort_index()
print(sc1)
sc2 = series_custom.sort_values()
print(sc2)
30. Series Filter
Series_custom > 0.5> 0.5> 0.5) & (Series_custom < 0.9)]# Note: &, | They're all single symbols .
31. DataFrame
# A Series is a single column of data; a DataFrame holds multiple columns
# A DataFrame can be seen as consisting of multiple Series
df = pd.read_csv("titanic_train.csv")
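To illustrate that a DataFrame is made of Series, selecting a single column returns a Series (assuming the Age column used earlier):
print(type(df["Age"]))  # <class 'pandas.core.series.Series'>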
32. Changing the index of a DataFrame
# drop controls whether the "name" column is removed once it becomes the index;
# drop=False keeps it as a regular column so it can still be used in calculations
df_name = df.set_index("name", drop=False)
33. View DataFrame columns of a certain dtype
types ="float64"].indexdf_name[float_columns]
34. Compute the standard deviation (np.std) of each column in a DataFrame
float_df = df_name[float_columns]
float_df.apply(lambda x: np.std(x))
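As a design note, the same result can also be obtained with the built-in method; np.std defaults to the population standard deviation (ddof=0), while DataFrame.std defaults to the sample version (ddof=1):
float_df.std(ddof=0)  # equivalent to applying np.std column by column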