Pandas data processing basics: filtering specified rows or columns

Source: Internet
Author: User
This article introduces the basics of pandas data processing, focusing on how to select specified rows or columns. Readers who need this material can use it as a reference.

The two main data structures in pandas are Series (equivalent to a single row or column of data) and DataFrame (a tabular structure equivalent to multiple rows and columns).

To make the operations easier to understand, this article draws analogies to row and column operations in Excel and SQL.

1. Re-indexing: reindex and ix

As described in the previous article, the default row index after data is read in is the sequence 0, 1, 2, 3, ... The column index corresponds to the field names (that is, the first row of the data). Re-indexing means changing the default index to the one you want.

1.1 Series

For example, with data = Series([4, 5, 6], index=['a', 'b', 'c']), the row index is a, b, c.

Calling data.reindex(['a', 'c', 'd', 'e']) returns a Series with the new index.

In other words, reindex takes the index we supply and looks up each label in the original data: labels that match keep their value, and labels with no match get NaN.
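The output of this call is not preserved in the text. A minimal sketch that reproduces it, using the same values and labels as above:

```python
import pandas as pd

# Same Series as in the text: values 4, 5, 6 labelled a, b, c.
data = pd.Series([4, 5, 6], index=['a', 'b', 'c'])

# reindex returns a new Series; labels absent from the original get NaN.
reindexed = data.reindex(['a', 'c', 'd', 'e'])
print(reindexed)
# a    4.0
# c    6.0
# d    NaN
# e    NaN
# dtype: float64
```

The values become floats because NaN cannot be stored in an integer column. The original data is left unchanged.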

1.2 DataFrame

(1) Row index modification: the row index of a DataFrame is re-indexed in the same way as a Series.

(2) Column index modification: use reindex(columns=['m1', 'm2', 'm3']), where the columns parameter specifies the new column index. The logic is the same as for the row index: the new column index is matched against the original data, and columns with no match are set to NaN.

Example:
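The original example is not preserved. A minimal sketch of column re-indexing, assuming a hypothetical frame that has columns m1 and m2 but no m3:

```python
import pandas as pd

# Hypothetical frame (not from the original article): two columns, three rows.
data = pd.DataFrame({'m1': [1, 2, 3], 'm2': [4, 5, 6]}, index=['a', 'b', 'c'])

# m1 and m2 are matched from the original data; m3 has no match, so it is all NaN.
result = data.reindex(columns=['m1', 'm2', 'm3'])
print(result)
```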

(3) The row and column indexes can also be modified at the same time.
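The call the author used here is missing from the source. One way to do this in current pandas, offered as an assumption rather than as the author's method, is to pass both index and columns to reindex. The frame below is hypothetical:

```python
import pandas as pd

# Hypothetical frame (not from the original article).
data = pd.DataFrame({'m1': [1, 2, 3], 'm2': [4, 5, 6]}, index=['a', 'b', 'c'])

# Re-index rows and columns in one call.
# Row 'd' and column 'm3' do not exist in data, so they are filled with NaN.
result = data.reindex(index=['a', 'c', 'd'], columns=['m1', 'm3'])
print(result)
```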

2. Dropping entries on a specified axis (in plain terms, deleting rows or columns): drop

Choose which rows or columns to delete by their index labels.

data.drop(['a', 'c']) is equivalent to delete from a where xid='a' or xid='c'

data.drop('m1', axis=1) is the column-wise counterpart: it deletes the column labelled m1 (in SQL terms, roughly alter table a drop column m1)
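A minimal sketch of both calls, assuming a hypothetical frame with rows a, b, c and columns m1, m2, m3:

```python
import pandas as pd

# Hypothetical frame (not from the original article).
data = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=['a', 'b', 'c'],
    columns=['m1', 'm2', 'm3'],
)

rows_dropped = data.drop(['a', 'c'])    # removes rows a and c, leaving row b
cols_dropped = data.drop('m1', axis=1)  # removes column m1, leaving m2 and m3

# drop returns a new object by default; data still has all rows and columns.
print(rows_dropped)
print(cols_dropped)
```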

3. Selecting and filtering (in plain terms, the equivalent of filtering by conditions in a SQL query)

Because pandas objects carry both row and column indexes, filtering data is quite convenient.

3.1 Series

(1) Select by row index:

obj['b'] is equivalent to select * from tb where xid='b'. obj[['b', 'a', 'c']] is equivalent to select * from tb where xid in ('a','b','c'), except that the results come back in the order b, a, c, which differs from SQL. obj[0:1] and obj['a':'b'] differ as follows:

# The former (position slice) excludes the end point; the latter (label slice) includes it
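The original outputs are not preserved. A minimal sketch of these selections on a hypothetical Series; the position slice is written with .iloc here so that it is unambiguously positional:

```python
import pandas as pd

# Hypothetical Series (not from the original article).
obj = pd.Series([1.0, 2.0, 3.0], index=['a', 'b', 'c'])

single = obj['b']               # one label -> a scalar value
several = obj[['b', 'a', 'c']]  # list of labels -> rows in the order requested

by_position = obj.iloc[0:1]     # position slice: end excluded -> only 'a'
by_label = obj['a':'b']         # label slice: end included -> 'a' and 'b'

print(several)
print(by_position)
print(by_label)
```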

(2) Filter by value: obj[obj > -0.6] returns the records in obj whose value is greater than -0.6.
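A minimal sketch of this filter, using hypothetical values. The comparison produces a boolean Series, and indexing with it keeps only the entries where it is True:

```python
import pandas as pd

# Hypothetical values (not from the original article).
obj = pd.Series([-1.2, -0.5, 0.3, 0.9], index=['a', 'b', 'c', 'd'])

mask = obj > -0.6     # boolean Series: False, True, True, True
filtered = obj[mask]  # keeps b, c and d
print(filtered)
```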

3.2 DataFrame

(1) Select a single row with ix or xs:

For example, the row with index b can be selected in the following three ways:
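The three original forms are not preserved, and the .ix indexer has since been removed from pandas (as of version 1.0). A minimal sketch of three equivalent ways that work in current pandas, assuming a hypothetical frame:

```python
import pandas as pd

# Hypothetical frame (not from the original article).
data = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
    index=['a', 'b', 'c', 'd'],
    columns=['m1', 'm2', 'm3'],
)

row_loc = data.loc['b']   # by label
row_xs = data.xs('b')     # cross-section by label
row_iloc = data.iloc[1]   # by position (b is the second row)

print(row_loc)
```

Each form returns the row as a Series indexed by the column names.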

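(2) Select multiple rows:

To select the two rows indexed a and b, pass the list of row labels to a label-based indexer.

Note that this cannot be written directly as data[['a', 'b']], because indexing a DataFrame with a list of labels selects columns, not rows.

data[0:2] returns the first and second rows. Row positions start at 0, and the end position 2 is not included.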
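The original example is not preserved. A minimal sketch of the multi-row cases, assuming a hypothetical frame and using .loc in place of the removed .ix indexer:

```python
import pandas as pd

# Hypothetical frame (not from the original article).
data = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
    index=['a', 'b', 'c', 'd'],
    columns=['m1', 'm2', 'm3'],
)

rows_ab = data.loc[['a', 'b']]  # rows by label
first_two = data[0:2]           # rows by position; end position 2 is excluded

# A list inside plain [] is looked up among the columns, so this fails.
try:
    data[['a', 'b']]
    list_form_failed = False
except KeyError:
    list_form_failed = True

print(rows_ab)
print(list_form_failed)
```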

(3) Select a single column

Select all rows of the m1 column.
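A minimal sketch of single-column selection, assuming a hypothetical frame with a column named m1:

```python
import pandas as pd

# Hypothetical frame (not from the original article).
data = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=['a', 'b', 'c'],
    columns=['m1', 'm2', 'm3'],
)

col_bracket = data['m1']     # by column label
col_attr = data.m1           # attribute access; works when the name is a valid identifier
col_loc = data.loc[:, 'm1']  # all rows, column m1

print(col_bracket)
```

All three return the column as a Series indexed by the row labels.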

(4) Select multiple columns

Select all rows of the two columns m1 and m3.

In ix[:, ['m1', 'm2']], the : before the comma means that all rows are selected.
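A minimal sketch of multi-column selection on a hypothetical frame. Because .ix is no longer available in current pandas, .loc is used as the label-based replacement:

```python
import pandas as pd

# Hypothetical frame (not from the original article).
data = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=['a', 'b', 'c'],
    columns=['m1', 'm2', 'm3'],
)

cols_bracket = data[['m1', 'm3']]     # list of column labels
cols_loc = data.loc[:, ['m1', 'm3']]  # ':' keeps every row

print(cols_loc)
```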

(5) Filter rows or columns by a condition on the values

For example, selecting all records whose value in a given column is greater than 4 is equivalent to select * from tb where column_name > 4
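A minimal sketch of this row filter, assuming a hypothetical frame and taking m1 as the column in the condition:

```python
import pandas as pd

# Hypothetical frame (not from the original article).
data = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
    index=['a', 'b', 'c', 'd'],
    columns=['m1', 'm2', 'm3'],
)

# Roughly: select * from tb where m1 > 4
filtered = data[data['m1'] > 4]
print(filtered)
```

Only rows c and d are kept, since their m1 values (7 and 10) exceed 4.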

(6) Select all records with a column value greater than 4, showing only some of the columns

The rows are filtered by the condition, and the columns are selected with [0, 2], which picks the first and third columns.
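The original call is not preserved and most likely relied on .ix, which is no longer available. A minimal sketch for current pandas, assuming a hypothetical frame and m1 as the condition column. Since .loc expects labels, the positions [0, 2] are first translated to column labels:

```python
import pandas as pd

# Hypothetical frame (not from the original article).
data = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
    index=['a', 'b', 'c', 'd'],
    columns=['m1', 'm2', 'm3'],
)

# Rows where m1 > 4, restricted to the first and third columns.
wanted_columns = data.columns[[0, 2]]  # positions 0 and 2 -> labels m1 and m3
result = data.loc[data['m1'] > 4, wanted_columns]

# Equivalent two-step form: filter the rows, then pick columns by position.
result_two_step = data[data['m1'] > 4].iloc[:, [0, 2]]

print(result)
```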
