Reposted from: http://blog.csdn.net/wangying19911991/article/details/73928172
https://www.zhihu.com/question/58993137
How exactly is axis defined in Python? Does it refer to the rows or the columns of a DataFrame? Consider the following code:
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]], columns=["col1", "col2", "col3", "col4"])
>>> df
   col1  col2  col3  col4
0     1     1     1     1
1     2     2     2     2
2     3     3     3     3
If we call df.mean(axis=1), we get the mean computed for each row:
>>> df.mean(axis=1)
0    1
1    2
2    3
However, if we call df.drop(name, axis=1), we actually delete a column rather than a row:
>>> df.drop("col4", axis=1)
   col1  col2  col3
0     1     1     1
1     2     2     2
2     3     3     3
Can someone help me understand what is really meant by an "axis" in pandas, NumPy, and SciPy?
The highest-voted answer gets to the heart of the problem:
In fact, the confusion lies in how axis is understood: df.mean(axis=1) takes the mean across all the columns within each row; it does not keep one mean per column. It may be simplest to remember that axis=0 means moving across the rows (down) and axis=1 means moving across the columns (across), so that the axis works like an adverb describing the direction in which the method acts (translator's note).
In other words:
- Use axis=0 to apply a method downward along each column, or to the row labels (the index values)
- Use axis=1 to apply a method across each row, or to the column labels
The following diagram shows what axis=0 and axis=1 each refer to in a DataFrame:
[Figure: direction in which the axis parameter acts]
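As a small illustration of the "0 = down, 1 = across" rule, the sketch below reuses the DataFrame from the question and takes the mean along each axis. The variable names col_means and row_means are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
    columns=["col1", "col2", "col3", "col4"],
)

# axis=0: move down the rows, so each column collapses to a single value.
col_means = df.mean(axis=0)

# axis=1: move across the columns, so each row collapses to a single value.
row_means = df.mean(axis=1)

print(col_means.to_dict())  # {'col1': 2.0, 'col2': 2.0, 'col3': 2.0, 'col4': 2.0}
print(row_means.to_dict())  # {0: 1.0, 1: 2.0, 2: 3.0}
```

Note that the result is indexed by whatever was not collapsed: column labels for axis=0, row labels for axis=1.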
In addition, keep in mind that pandas follows NumPy's use of the word axis, and that usage is explained in the NumPy glossary:
Axes are defined for arrays with more than one dimension. A two-dimensional array has two axes: axis 0 runs vertically downward across the rows, and axis 1 runs horizontally across the columns.
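The same convention can be checked directly in NumPy. The following is a minimal sketch with a made-up 2x3 array (the names a, down, and across are illustrative, not from the original post): summing along an axis removes that axis from the shape.

```python
import numpy as np

# Shape (2, 3): axis 0 has length 2 (rows), axis 1 has length 3 (columns).
a = np.array([[1, 2, 3],
              [4, 5, 6]])

down = a.sum(axis=0)    # collapse axis 0: one sum per column
across = a.sum(axis=1)  # collapse axis 1: one sum per row

print(down.tolist(), down.shape)      # [5, 7, 9] (3,)
print(across.tolist(), across.shape)  # [6, 15] (2,)
```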
So the first example in the question, df.mean(axis=1), computes the mean horizontally across the columns, that is, along each individual row. The second example, df.drop(name, axis=1), deletes the column label matching name, because column labels run in the horizontal direction.
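To make the drop case concrete, here is a short sketch on the question's DataFrame showing which set of labels each axis value searches. The names no_col4 and no_row0 are hypothetical, chosen for this example.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
    columns=["col1", "col2", "col3", "col4"],
)

# axis=1: the label is looked up among the column labels.
no_col4 = df.drop("col4", axis=1)

# axis=0: the label is looked up among the row labels (the index).
no_row0 = df.drop(0, axis=0)

print(no_col4.columns.tolist())  # ['col1', 'col2', 'col3']
print(no_row0.index.tolist())    # [1, 2]
```

In both calls drop returns a new DataFrame and leaves df unchanged. Recent pandas versions also accept df.drop(columns="col4") and df.drop(index=0), which avoid the axis argument altogether.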
Distinguishing axis=0 and axis=1 in Python's NumPy