Pandas data processing basics: filtering specified rows or columns

Last Update:2018-05-03 Source: Internet

Author: User


This article introduces the basics of pandas data processing, specifically how to filter specified rows or columns. Readers who need this may find it a useful reference.

The two main data structures in pandas are Series (equivalent to a single row or column of data) and DataFrame (a tabular structure equivalent to multiple rows and columns).

To make the material easier to follow, this article draws analogies with row and column operations in Excel or SQL.

1. Re-index: reindex and ix

As described in the previous article, the default row index after data is read in is the sequence 0, 1, 2, 3, ... The column index corresponds to the field names (that is, the first row of the data). Re-indexing means replacing the default index with the index you want.

1.1 Series

For example: data = Series([4, 5, 6], index=['a', 'b', 'c']) has the row index a, b, c.

Calling data.reindex(['a', 'c', 'd', 'e']) returns a new Series with the modified index.

In other words, reindex takes the index we specify and looks up the corresponding values in the original data; labels with no match get NaN.
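The re-indexing described above can be sketched as follows. This is a minimal example that assumes pandas is imported as pd; the variable name reindexed is introduced here only for illustration.

```python
import pandas as pd

# The Series from the text: values 4, 5, 6 with row index a, b, c
data = pd.Series([4, 5, 6], index=["a", "b", "c"])

# reindex returns a new Series; labels missing from the original get NaN
reindexed = data.reindex(["a", "c", "d", "e"])
print(reindexed)
```

The original data is left unchanged, and because NaN is a floating-point value, the result is stored as floats.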

1.2 DataFrame

(1) Row index modification: the row index of a DataFrame is re-indexed in the same way as a Series.

(2) Column index modification: column re-indexing uses reindex(columns=['m1', 'm2', 'm3']), where the columns parameter specifies the new column index. The logic is the same as for the row index: the new column index is matched against the original data, and columns with no match are set to NaN.

Cases:
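A minimal sketch of row and column re-indexing on a DataFrame. The frame below and its values are hypothetical (the original example data is not available); it deliberately has a column m4 instead of m3 so that the NaN behaviour is visible.

```python
import pandas as pd

# Hypothetical data: rows a, b, c and columns m1, m2, m4
frame = pd.DataFrame(
    {"m1": [1, 2, 3], "m2": [4, 5, 6], "m4": [7, 8, 9]},
    index=["a", "b", "c"],
)

# Row re-indexing works as for a Series: the new label d has no match, so its row is NaN
by_rows = frame.reindex(["a", "c", "d"])

# Column re-indexing: m3 has no match and is filled with NaN; m4 is not requested and is left out
by_cols = frame.reindex(columns=["m1", "m2", "m3"])

print(by_rows)
print(by_cols)
```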

(3) Row and column indexes can also be modified at the same time.
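One way to re-index both axes in a single call is to pass the index and columns parameters of reindex together. This is a sketch on hypothetical data, not necessarily the form the original example used; note that the older .ix indexer was removed in pandas 1.0.

```python
import pandas as pd

# Hypothetical data: rows a, b, c and columns m1, m2, m4
frame = pd.DataFrame(
    {"m1": [1, 2, 3], "m2": [4, 5, 6], "m4": [7, 8, 9]},
    index=["a", "b", "c"],
)

# Re-index rows and columns together; unmatched labels (row d, column m3) become NaN
both = frame.reindex(index=["a", "c", "d"], columns=["m1", "m2", "m3"])
print(both)
```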

2. Discard entries on the specified axis (in plain terms, delete rows or columns): drop

Choose which row or column to delete by its index label.

data.drop(['a','c']) is roughly equivalent to delete from a where xid='a' or xid='c'

data.drop('m1',axis=1) is roughly equivalent to delete from a where yid='m1'
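The two drop calls above can be tried on a small frame. The data below is hypothetical; only the labels a, c, and m1 come from the text.

```python
import pandas as pd

# Hypothetical data: rows a to d and columns m1 to m3
data = pd.DataFrame(
    {"m1": [1, 2, 3, 4], "m2": [5, 6, 7, 8], "m3": [9, 10, 11, 12]},
    index=["a", "b", "c", "d"],
)

# Delete rows a and c (axis=0 is the default)
rows_dropped = data.drop(["a", "c"])

# Delete column m1 (axis=1 means the column axis)
col_dropped = data.drop("m1", axis=1)

print(rows_dropped)
print(col_dropped)
```

drop returns a new object in both cases, so data itself still has all four rows and three columns.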

3. Select and filter (in plain terms, filtering by conditions as in an SQL query)

Because pandas objects carry row and column indexes, filtering data is quite convenient.

3.1 Series

(1) Select by row index

obj['b'] is equivalent to select * from tb where xid='b'.

obj[['b','a','c']] is equivalent to select * from tb where xid in ('a','b','c'), except that the results are returned in the order b, a, c, which differs from SQL.

obj[0:1] and obj['a':'b'] differ as follows:

# The former excludes the end point; the latter includes it
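The selections above can be sketched as follows. The values in obj are hypothetical, and the sketch uses the explicit .iloc (position) and .loc (label) indexers for the two slices, which make the intent unambiguous across pandas versions.

```python
import pandas as pd

# Hypothetical values; only the labels come from the text
obj = pd.Series([0.5, -0.7, 1.2, -0.1], index=["a", "b", "c", "d"])

single = obj["b"]                 # one value, selected by label
several = obj[["b", "a", "c"]]    # several labels, returned in the order requested

by_position = obj.iloc[0:1]       # positional slice: the end position 1 is excluded
by_label = obj.loc["a":"b"]       # label slice: the end label b is included

print(several)
print(by_position)
print(by_label)
```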

(2) Filtering by value: obj[obj > -0.6] returns the records in obj whose value is greater than -0.6.
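A short sketch of value-based filtering on a Series with hypothetical values. The comparison produces a boolean Series, and indexing with it keeps only the entries where it is True.

```python
import pandas as pd

# Hypothetical values
obj = pd.Series([0.5, -0.7, 1.2, -0.1], index=["a", "b", "c", "d"])

mask = obj > -0.6      # boolean Series: True where the value exceeds -0.6
filtered = obj[mask]   # keeps a, c and d; b (-0.7) is filtered out

print(filtered)
```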

3.2 DataFrame

(1) Select a single row with ix or xs:

For example, the row with index b can be selected in three ways
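Since the .ix indexer was removed in pandas 1.0, the following sketch shows three ways that work in current versions: by label with .loc, by label with xs, and by position with .iloc. The data is hypothetical, and these may not be exactly the three forms the original example used.

```python
import pandas as pd

# Hypothetical data: rows a, b, c and columns m1, m2, m3
data = pd.DataFrame(
    {"m1": [1, 2, 3], "m2": [4, 5, 6], "m3": [7, 8, 9]},
    index=["a", "b", "c"],
)

row_loc = data.loc["b"]    # by row label
row_xs = data.xs("b")      # cross-section by row label
row_iloc = data.iloc[1]    # by position (b is the second row)

print(row_loc)
```

Each form returns the row as a Series whose index is the column names.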

(2) Select multiple rows:

How to select the two rows with index a and b

# Note: this cannot be written directly as data[['a', 'b']], because that form selects columns

data[0:2] selects the first and second rows. Row positions start at 0, and the end position 2 is excluded.
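The points above can be checked on a small hypothetical frame. The sketch uses .loc with a list of labels as one way to pick rows a and b, and catches the KeyError raised when the list is passed directly to data[...].

```python
import pandas as pd

# Hypothetical data: rows a, b, c and columns m1, m2, m3
data = pd.DataFrame(
    {"m1": [1, 2, 3], "m2": [4, 5, 6], "m3": [7, 8, 9]},
    index=["a", "b", "c"],
)

rows_by_label = data.loc[["a", "b"]]   # rows a and b, selected by label
rows_by_slice = data[0:2]              # positional slice, end position excluded
rows_by_iloc = data.iloc[0:2]          # the same slice, written explicitly

# A list inside plain brackets is looked up among the columns, so this fails
try:
    data[["a", "b"]]
    list_lookup_failed = False
except KeyError:
    list_lookup_failed = True

print(rows_by_label)
print(list_lookup_failed)
```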

(3) Select a single column

Select all rows of the m1 column
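A minimal sketch of single-column selection on hypothetical data, showing plain bracket access and the equivalent .loc form.

```python
import pandas as pd

# Hypothetical data: rows a, b, c and columns m1, m2, m3
data = pd.DataFrame(
    {"m1": [1, 2, 3], "m2": [4, 5, 6], "m3": [7, 8, 9]},
    index=["a", "b", "c"],
)

col_brackets = data["m1"]      # a single column comes back as a Series
col_loc = data.loc[:, "m1"]    # all rows (:) of column m1

print(col_brackets)
```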

(4) Select multiple columns

Select all rows of the two columns m1 and m3

In ix[:, ['m1', 'm2']], the colon before the comma indicates that all rows are selected.
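Selecting the columns m1 and m3 might look like the sketch below. The data is hypothetical, and .loc is used in place of .ix, which was removed in pandas 1.0; the leading colon plays the same role of selecting all rows.

```python
import pandas as pd

# Hypothetical data: rows a, b, c and columns m1, m2, m3
data = pd.DataFrame(
    {"m1": [1, 2, 3], "m2": [4, 5, 6], "m3": [7, 8, 9]},
    index=["a", "b", "c"],
)

cols_brackets = data[["m1", "m3"]]      # a list of column names returns a DataFrame
cols_loc = data.loc[:, ["m1", "m3"]]    # ":" keeps every row

print(cols_loc)
```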

(5) Filter rows or columns based on a condition on the values

For example, selecting all records where the value in a column is greater than 4 is equivalent to select * from tb where column_name > 4
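A sketch of condition-based row filtering. The data is hypothetical and the choice of m1 as the column being tested is an assumption made for illustration.

```python
import pandas as pd

# Hypothetical data: rows a, b, c and columns m1, m2, m3
data = pd.DataFrame(
    {"m1": [2, 5, 8], "m2": [1, 4, 7], "m3": [3, 6, 9]},
    index=["a", "b", "c"],
)

# Keep only the rows where m1 is greater than 4 (rows b and c here)
filtered = data[data["m1"] > 4]

print(filtered)
```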

(6) Filter all records where the value in a column is greater than 4, but show only some of the columns

Rows are filtered by the condition, and the columns are selected with [0, 2], that is, the first and third columns
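One way to combine the two steps in current pandas is to filter the rows with a boolean condition in .loc and then pick columns by position with .iloc. This is a sketch on hypothetical data, with m1 assumed as the tested column; it is not necessarily the form the original example used.

```python
import pandas as pd

# Hypothetical data: rows a, b, c and columns m1, m2, m3
data = pd.DataFrame(
    {"m1": [2, 5, 8], "m2": [1, 4, 7], "m3": [3, 6, 9]},
    index=["a", "b", "c"],
)

# Step 1: rows where m1 > 4; step 2: first and third columns by position
result = data.loc[data["m1"] > 4].iloc[:, [0, 2]]

print(result)
```

Chaining the two indexers keeps the label-based condition and the position-based column choice separate, since .loc and .iloc each accept only one kind of selector.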

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
