I believe many people, like me, find data selection and modification in pandas very confusing while learning Python (perhaps because of habits carried over from Matlab) ...
Today I finally figured it out completely ...
Let's start by building a DataFrame by hand.
import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(0, 60, 2).reshape(10, 3), columns=list('abc'))
This gives a table with 10 rows and 3 columns named a, b and c, filled with the even numbers from 0 to 58.
So what are the three ways to select data?
First, since every column already has a name, a whole column can be selected with df['a']. If you know both the column names and the index labels, and both are easy to type, you can use .loc
df.loc[0, 'a']
df.loc[0:3, ['a', 'b']]
df.loc[[1, 5], ['b', 'c']]
Because we did not set an index here, the DataFrame created one automatically: the numbers 0-9.
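Here is a small sketch of the .loc selections above, rebuilt on the same 10x3 table so it runs on its own. The variable names (column, single, block, picked) are only for illustration. One thing worth noticing is that a label slice such as 0:3 in .loc includes the end label.

```python
import numpy as np
import pandas as pd

# Same table as in the text: even numbers 0..58 in 10 rows, columns a, b, c.
df = pd.DataFrame(np.arange(0, 60, 2).reshape(10, 3), columns=list('abc'))

column = df['a']                      # one whole column, returned as a Series
single = df.loc[0, 'a']               # the value at row label 0, column 'a'
block = df.loc[0:3, ['a', 'b']]       # label slice: rows 0, 1, 2 and 3
picked = df.loc[[1, 5], ['b', 'c']]   # rows labelled 1 and 5 only

print(single)
print(block.shape)   # (4, 2): four rows, because the end label is included
print(picked)
```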
Second, if the column names are too long to type conveniently, or the index is a series of timestamps that is even harder to type, you can use .iloc. I think of the i as standing for index (that is, the integer position), which makes it easier to remember.
df.iloc[1,1]
df.iloc[0:3, [0,1]]
df.iloc[[0, 3, 5], 0:2]
iloc also lets us select columns with a slice.
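A short sketch comparing position slices with label slices may help here. It assumes the same table as above. The date-indexed copy ts is a made-up example (the start date is arbitrary) to show the case where the index labels are awkward to type. Unlike .loc, a slice in .iloc leaves out the end position, as ordinary Python slices do.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(0, 60, 2).reshape(10, 3), columns=list('abc'))

cell = df.iloc[1, 1]               # second row, second column
by_position = df.iloc[0:3, 0:2]    # positions 0, 1, 2 and columns 0, 1
by_label = df.loc[0:3, 'a':'b']    # labels 0 to 3 and columns a to b, ends included

print(by_position.shape)   # (3, 2)
print(by_label.shape)      # (4, 2)

# Hypothetical case: the index is a range of dates, so positions are easier to type.
ts = df.copy()
ts.index = pd.date_range('2020-01-01', periods=10)
first_days = ts.iloc[0:3, 0:2]     # first three rows, whatever their dates are
print(first_days)
```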
Third, .ix is more powerful: it lets us mix positions and names in one selection, so it can be said to cover all of the usage above. Basically, changing the earlier examples to df.ix will work, with one caveat: in
df.ix[[..1..], [..2..]], everything in box 1 must be of one kind, either all positions or all names, and the same goes for box 2. BTW, box 1 specifies the rows and box 2 specifies the columns, and of course all of the methods above follow this rule.
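One caveat for readers on a newer pandas: .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so df.ix no longer runs there. The sketch below is one possible way to get the same mixed selection with .loc, by first turning positions into labels through df.index and df.columns. It is a substitute technique, not what .ix did internally, and the variable names are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(0, 60, 2).reshape(10, 3), columns=list('abc'))

# Rows by position, columns by name: turn the row positions into index labels.
rows_pos_cols_name = df.loc[df.index[[0, 3, 5]], ['a', 'b']]

# Rows by label, columns by position: turn the column positions into names.
rows_label_cols_pos = df.loc[[1, 5], df.columns[[1, 2]]]

print(rows_pos_cols_name)
print(rows_label_cols_pos)
```

As with .ix, the first box still selects rows and the second box selects columns.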
This is my current understanding.