Python array, list, and DataFrame indexing and slicing operations, July 19, 2016 -- smart wave document

Source: Internet
Author: User

A short discussion of lists, one- and two-dimensional arrays, DataFrames, loc, iloc, and ix.

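Indexing and slicing, from lists to NumPy arrays:
Let's begin with the most basic case, list indexing, with some code and its result:

a = [0,1,2,3,4,5,6,7,8,9]
a[:5:-1]   # step < 0, so start defaults to 9 (the last index)
a[0:5:-1]  # start = 0 is given explicitly, so nothing lies between 0 and 5 going backwards
a[1::-1]   # step < 0, so stop defaults to the beginning and index 0 is included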

Output:

[9, 8, 7, 6]
[]
[1, 0]

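A list slice is written inside "[]" with up to two ":" delimiters, meaning [start:stop:step]. In the example above the step is -1, so the elements come out in reverse order. With a positive step, an omitted start defaults to 0 and an omitted stop to the length of the list; with a negative step, an omitted start defaults to the last element and an omitted stop runs through the first element. The step defaults to 1 and cannot be 0.

a[10:20]  # elements 11 through 20 (empty for the 10-element list above)
a[:10:2]  # the first 10 elements, taking every second one
a[::5]    # all elements, taking every fifth one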
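The defaults described above can be checked directly. The following is a minimal sketch that asks a slice object which concrete start, stop, and step it would use for a list of length 10, via the standard slice.indices method. The variable names are only illustrative.

```python
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# slice.indices(length) resolves omitted values to concrete numbers.
# The result is meant to be read as range() arguments.
forward = slice(None, None, 2).indices(len(a))   # like a[::2]
backward = slice(None, 5, -1).indices(len(a))    # like a[:5:-1]
from_one = slice(1, None, -1).indices(len(a))    # like a[1::-1]

print(forward)    # (0, 10, 2)
print(backward)   # (9, 5, -1)
print(from_one)   # (1, -1, -1)

# Rebuilding the slices from the resolved values gives the same elements.
rebuilt = [a[i] for i in range(*from_one)]
print(rebuilt)    # [1, 0]

# A step of 0 is rejected.
try:
    a[::0]
    zero_step_error = False
except ValueError:
    zero_step_error = True
print(zero_step_error)  # True
```

Note that the stop of -1 in the last tuple is a range() bound meaning "stop before index 0 is passed"; written literally as a[1:-1:-1] it would mean something else, because a negative slice bound counts from the end.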

Advanced slice operations in Python:
How slicing works:
Slicing a list calls the __getitem__, __setitem__, and __delitem__ methods internally, together with the built-in slice function. slice takes the same start, stop, and step arguments as range().
The key passed to these methods is a special slice object. Its start, stop, and step attributes describe the requested slice. A demonstration:

>>> List4 = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
>>> List4
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
>>> x = List4[1:10]  # x = List4.__getitem__(slice(1,10,None))
>>> x
[2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> List4[1:5] = [100,111,122]  # List4.__setitem__(slice(1,5,None), [100,111,122])
>>> List4
[1, 100, 111, 122, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
>>> del List4[1:4]  # List4.__delitem__(slice(1,4,None))
>>> List4
[1, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
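To look at the slice object itself, here is a small sketch with a hypothetical class, ShowKey, whose __getitem__ simply hands back whatever key the "[]" operator passes in. The class is not part of Python or of the original article; it exists only to make the key visible.

```python
class ShowKey:
    """Hypothetical helper: returns the key that [] passes to __getitem__."""

    def __getitem__(self, key):
        return key


probe = ShowKey()

key = probe[1:10]
print(key)                             # slice(1, 10, None)
print(key.start, key.stop, key.step)   # 1 10 None

stepped = probe[::2]
print(stepped)                         # slice(None, None, 2)

single = probe[3]
print(single)                          # 3 (a plain index is passed through as an int)
```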

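Boundary behavior of slices:

s = [1,2,3,4]     # valid slice bounds for s run from 0 (lower bound) to 4 (upper bound)
s[-100:100]       # returns [1,2,3,4]; -100 is below the lower bound and 100 is above the upper bound, so this equals s[0:4]
s[-100:-200]      # returns []; -100 and -200 are both below the lower bound and are clamped to it, so this equals s[0:0]
s[100:200]        # returns []; 100 and 200 are both above the upper bound and are clamped to it, so this equals s[4:4]
s[:100]           # returns [1,2,3,4]; an omitted start means "begin at element 0"
s[0:]             # returns [1,2,3,4]; an omitted stop means "run through the last element"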
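The clamping above applies to slices only. As a quick sketch, the following contrasts it with plain indexing, which raises IndexError for an out-of-range position, and uses slice.indices to show the clamped bounds.

```python
s = [1, 2, 3, 4]

# Out-of-range slice bounds are clamped to the range 0..len(s).
clamped_all = s[-100:100]
clamped_low = s[-100:-200]
clamped_high = s[100:200]
print(clamped_all, clamped_low, clamped_high)   # [1, 2, 3, 4] [] []

# slice.indices shows the bounds actually used.
resolved = slice(-100, 100).indices(len(s))
print(resolved)                                 # (0, 4, 1)

# A single out-of-range index is not clamped.
try:
    s[100]
    index_error_raised = False
except IndexError:
    index_error_raised = True
print(index_error_raised)                       # True
```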

More on slicing:

>>> id(List4)
140115516658320
# Plain assignment, List5 = List4, leaves the memory address unchanged (140115516658320). Both names refer to the same list, so removing elements through either List4 or List5 removes them for both.
>>> List5 = List4
>>> List5
[1, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
>>> List4
[1, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
>>> id(List5)
140115516658320
# A list created by slicing, however, lives at a different address: 140115516658320 != 140115516604784
>>> List6 = List5
>>> id(List6)
140115516658320
>>> List6 = List4[:]
>>> id(List6)
140115516604784
>>> # the address has changed

Some further notes on the above:

>>> listofrows = [[1,2,3,4], [5,6,7,8], [9,10,11,12]]
>>> li = listofrows
>>> id(listofrows)
206368904L
>>> id(li)  # the two ids match: both names reference the same object
206368904L
>>> listofrows[:] = [[row[0], row[3], row[2]] for row in listofrows]
>>> listofrows
[[1, 4, 3], [5, 8, 7], [9, 12, 11]]
>>> li  # slice assignment has the desired effect: the shared object changes with it
[[1, 4, 3], [5, 8, 7], [9, 12, 11]]
>>> id(listofrows)
206368904L
>>> id(li)  # neither id has changed, so slice assignment modifies the original object
206368904L
>>> listofrows = [[1,2,3,4], [5,6,7,8], [9,10,11,12]]
>>> li
[[1, 4, 3], [5, 8, 7], [9, 12, 11]]
>>> id(li)  # li is unchanged
206368904L
>>> id(listofrows)  # the ids now differ, so listofrows is bound to a new object
206412488L
>>> listofrows
[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

Writing "listofrows = ..." directly creates a new object and rebinds the name, whereas "listofrows[:] = ..." does not. Put simply, slice assignment modifies the contents of the original object instead of creating a new one.
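The difference between slice assignment and rebinding can also be checked with the "is" operator rather than by comparing id() values by eye. This is a minimal sketch; the names rows, alias, and shallow are illustrative only.

```python
rows = [[1, 2, 3, 4], [5, 6, 7, 8]]
alias = rows                       # a second name for the same list object

# Slice assignment: the existing list is modified in place.
rows[:] = [[r[0], r[3]] for r in rows]
same_after_slice_assign = alias is rows
print(same_after_slice_assign)     # True
print(alias)                       # [[1, 4], [5, 8]]

# Plain assignment: the name rows is rebound to a brand-new list.
rows = [[0, 0]]
same_after_rebind = alias is rows
print(same_after_rebind)           # False
print(alias)                       # [[1, 4], [5, 8]]

# Reading a full slice makes a new outer list (a shallow copy).
shallow = alias[:]
print(shallow == alias, shallow is alias)   # True False
```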
A sequence is a Python data structure whose elements are retrieved by index.
Python 2 has six built-in sequence types: list, tuple, str, unicode, buffer, and xrange. xrange is special: it produces its values lazily, and some of the sequence operations that the other types support do not apply to it. In general, sequence types support indexing, len, max, min, in, +, *, and slicing.
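As a rough illustration of those shared operations, the sketch below applies them to a list, a tuple, and a string. It assumes Python 3, where range plays the role of Python 2's xrange; the helper common_ops is made up for this example. It also shows one operation that range does not support.

```python
def common_ops(seq):
    """Apply the operations shared by list, tuple, and str (illustrative helper)."""
    return {
        "first": seq[0],
        "len": len(seq),
        "max": max(seq),
        "min": min(seq),
        "contains_first": seq[0] in seq,
        "concat": seq + seq,
        "repeat": seq * 2,
        "tail": seq[1:],
    }


list_ops = common_ops([3, 1, 2])
tuple_ops = common_ops((3, 1, 2))
str_ops = common_ops("cab")
print(list_ops["concat"])   # [3, 1, 2, 3, 1, 2]
print(tuple_ops["tail"])    # (1, 2)
print(str_ops["max"])       # c

# range supports indexing, len, in, and slicing, but not + or *.
r = range(5)
sliced = r[1:4]
print(list(sliced))         # [1, 2, 3]
try:
    r + r
    range_concat_ok = True
except TypeError:
    range_concat_ok = False
print(range_concat_ok)      # False
```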
A list slice with a third element is called a stepped slice; its syntax is sequence[start index:end index:step value]. The rule of thumb is "include the head, exclude the tail": the start index is included and the end index is not. If the start index is 0, it can be omitted.
When Python evaluates slice syntax, it produces slice objects. Extended slice syntax covers stepped slices, multidimensional slices, and ellipsis slices. The syntax for a multidimensional slice is sequence[start1:end1,start2:end2], or, using an ellipsis, sequence[...,start1:end1]. A slice object can also be created with the built-in function slice().
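To make the connection between the syntax and slice objects concrete, here is a short sketch using NumPy (assumed to be installed). It shows that a multidimensional index is a tuple of slice objects and that the ellipsis stands for all the dimensions not written out. The names rows and cols are illustrative.

```python
import numpy as np

b = np.arange(24).reshape(2, 3, 4)

rows = slice(1, None)   # the same as writing 1:
cols = slice(None, 2)   # the same as writing :2

# b[0, 1:, :2] and an explicit tuple of slice objects select the same block.
with_syntax = b[0, 1:, :2]
with_objects = b[(0, rows, cols)]
print(np.array_equal(with_syntax, with_objects))   # True

# The ellipsis fills in the leading dimensions: b[..., 1:3] is b[:, :, 1:3].
last_cols = b[..., 1:3]
print(last_cols.shape)                             # (2, 3, 2)
print(np.array_equal(last_cols, b[:, :, 1:3]))     # True
```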

Selecting from two-dimensional arrays:
As stated above, the syntax for slicing a multidimensional array is sequence[start1:end1,start2:end2,...,startn:endn]. We use a 3x3 two-dimensional array to illustrate selection:

>>> import numpy as np
>>> b = np.arange(9).reshape(3,3)
>>> b
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

Array subscripts start from 0. For an array A, A[m,n] selects a single element. The positions correspond as follows:

[(0,0),(0,1),(0,2)]
[(1,0),(1,1),(1,2)]
[(2,0),(2,1),(2,2)]

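The two-dimensional slice syntax is sequence[start1:end1,start2:end2]:

# The part before the comma selects rows from row 1 onward, that is
# [(1,0),(1,1),(1,2)] and [(2,0),(2,1),(2,2)].
# The part after the comma is then applied to those rows along the second
# dimension, keeping the columns up to (but not including) column 2.
>>> b[1:,:2]
array([[3, 4],
       [6, 7]])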

Once stepped slices are understood, two- and three-dimensional slices are just as easy to follow, and arguably less complicated than stepping.
You can also assign to the elements of a slice:

>>> b[1:,:2] = 1  # broadcast assignment
>>> b
array([[0, 1, 2],
       [1, 1, 5],
       [1, 1, 8]])
>>> b[1:,:2].shape
(2L, 2L)
>>> b[1:,:2] = np.arange(2,6).reshape(2,2)  # element-by-element assignment
>>> b
array([[0, 1, 2],
       [2, 3, 5],
       [4, 5, 8]])
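Broadcast assignment is not limited to a single scalar. As a hedged sketch (NumPy assumed installed), the following assigns a length-2 row to a 2x2 slice, then shows that a value whose shape cannot be broadcast to the slice is rejected with ValueError.

```python
import numpy as np

b = np.arange(9).reshape(3, 3)

# A length-2 row is broadcast down both selected rows.
b[1:, :2] = [10, 20]
after_row = b.tolist()
print(after_row)    # [[0, 1, 2], [10, 20, 5], [10, 20, 8]]

# A 2x1 column is broadcast across both selected columns.
b[1:, :2] = [[7], [9]]
after_col = b.tolist()
print(after_col)    # [[0, 1, 2], [7, 7, 5], [9, 9, 8]]

# Three values cannot be broadcast into a 2x2 block.
try:
    b[1:, :2] = [1, 2, 3]
    shape_error = False
except ValueError:
    shape_error = True
print(shape_error)  # True
```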

Three dimensions work the same way: sequence[start1:end1,start2:end2,start3:end3]. To take a single value, use A[l,m,n].
A bare ":" in position n takes all elements along the nth dimension.

>>> b = np.arange(24).reshape(2,3,4)
>>> b[1,]
array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])
>>> b[1,2]
array([20, 21, 22, 23])
>>> b[1,2,3]
23
>>> b[1,:,3]
array([15, 19, 23])

For a pandas DataFrame, iloc slices the frame just like a multidimensional array:

>>> import pandas as pd
>>> b = np.arange(9).reshape(3,3)
>>> df = pd.DataFrame(b)
>>> df.iloc[1,2]
5
>>> df.iloc[1:,2]
1    5
2    8
Name: 2, dtype: int32
>>> df.iloc[1:,:2]
   0  1
1  3  4
2  6  7
>>> df.iloc[1:,:2] = 1  # the same broadcast assignment
>>> df
   0  1  2
0  0  1  2
1  1  1  5
2  1  1  8

(Mom no longer has to worry about my DataFrame slicing.)

Next, loc. loc selects by index and column labels, and is the recommended way to assign into a DataFrame.

When the index and columns are both numeric and start from 0, compare:

>>> b = np.arange(36).reshape(6,6)
>>> b
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])
>>> df = pd.DataFrame(b)
>>> df
    0   1   2   3   4   5
0   0   1   2   3   4   5
1   6   7   8   9  10  11
2  12  13  14  15  16  17
3  18  19  20  21  22  23
4  24  25  26  27  28  29
5  30  31  32  33  34  35
>>> df.loc[1:,:2]
    0   1   2
1   6   7   8
2  12  13  14
3  18  19  20
4  24  25  26
5  30  31  32
>>> df.iloc[1:,:2]
    0   1
1   6   7
2  12  13
3  18  19
4  24  25
5  30  31

You can see that df.loc[1:,:2] includes the column labelled 2. Unlike range(0,2), a loc slice includes its end label. It does not compare label values either: loc walks the column labels in their stored order and stops as soon as it has reached the label 2, as the next example with shuffled column labels shows.

>>> df.columns = [2,1,3,4,0,5]
>>> df
    2   1   3   4   0   5
0   0   1   2   3   4   5
1   6   7   8   9  10  11
2  12  13  14  15  16  17
3  18  19  20  21  22  23
4  24  25  26  27  28  29
5  30  31  32  33  34  35
>>> df.loc[1,2]
6
>>> df.iloc[1,2]
8
>>> df.iloc[1:,:2]
    2   1
1   6   7
2  12  13
3  18  19
4  24  25
5  30  31
>>> df.loc[1:,:2]
    2
1   6
2  12
3  18
4  24
5  30
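The inclusive end label of loc versus the exclusive end position of iloc can be compared side by side. This is a minimal sketch assuming NumPy and pandas are installed; the second frame, named, and its string labels are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(36).reshape(6, 6))

by_label = df.loc[1:, :2]       # label slice: the end label 2 is included
by_position = df.iloc[1:, :2]   # position slice: the end position 2 is excluded

print(list(by_label.columns))      # [0, 1, 2]
print(list(by_position.columns))   # [0, 1]
print(by_label.shape, by_position.shape)   # (5, 3) (5, 2)

# The same inclusive rule holds for string labels.
named = pd.DataFrame(np.arange(6).reshape(2, 3), columns=["a", "b", "c"])
a_to_b = named.loc[:, "a":"b"]
print(list(a_to_b.columns))        # ['a', 'b']
```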

One advantage of loc is that it can rearrange the order of the columns:

>>> df.loc[:,[1,2,3,4]]
    1   2   3   4
0   1   0   2   3
1   7   6   8   9
2  13  12  14  15
3  19  18  20  21
4  25  24  26  27
5  31  30  32  33
>>> df.iloc[:,[1,2,3,4]]
    1   3   4   0
0   1   2   3   4
1   7   8   9  10
2  13  14  15  16
3  19  20  21  22
4  25  26  27  28
5  31  32  33  34

This is not easy to do with iloc. When the column names are strings, loc lets you pick the columns in whatever order you like.
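As an illustration with string column names, the sketch below reorders columns by label with loc and does the same by position with iloc. The column names "a" through "d" are invented for this example; NumPy and pandas are assumed to be installed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=["a", "b", "c", "d"])

# With loc, list the labels in the order you want them.
reordered = df.loc[:, ["d", "a", "c"]]
print(list(reordered.columns))       # ['d', 'a', 'c']
print(reordered.iloc[0].tolist())    # [3, 0, 2]

# With iloc you have to know each column's position instead.
same_by_position = df.iloc[:, [3, 0, 2]]
print(reordered.equals(same_by_position))   # True
```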

ix: mixed selection by label and position

>>> df.ix[:,[1,2,3,4]]
    1   2   3   4
0   1   0   2   3
1   7   6   8   9
2  13  12  14  15
3  19  18  20  21
4  25  24  26  27
5  31  30  32  33
>>> df.ix[:,:2]
    2
0   0
1   6
2  12
3  18
4  24
5  30

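Put simply, when the row and column labels are numbers, ix behaves like loc. When the labels are all strings, ix treats integers inside [] as positions; the [row, column] order does not change. Note that ix was deprecated in pandas 0.20 and removed in pandas 1.0, so current code has to use loc or iloc.

# (the row labels in the output below imply that the index had also been set to [2, 1, 3, 4, 0, 5])
>>> df.loc[:2,:2]
   2
2  0
>>> df.iloc[:2,:2]
   2  1
2  0  1
1  6  7
>>> df.ix[:2,:2]
   2
2  0
>>> df.index = ['a','c','d','b','e','f']
>>> df.ix[:2,:2]
   2
a  0
c  6
>>> df.iloc[:2,:2]
   2  1
a  0  1
c  6  7
>>> df.loc[:2,:2]  # loc raises an error here, because the index no longer contains numeric labels
Traceback (most recent call last):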
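Since ix is no longer available in current pandas, mixed selection has to be spelled with loc or iloc. The following is one possible sketch, not the only way: it picks the first two rows by position and the column labelled 2 by label, once by converting positions to labels for loc and once by converting labels to positions for iloc. NumPy and pandas are assumed to be installed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(36).reshape(6, 6),
    index=list("acdbef"),
    columns=[2, 1, 3, 4, 0, 5],
)

# Option 1: turn row positions into row labels, then use loc for everything.
mixed_loc = df.loc[df.index[:2], [2]]

# Option 2: turn column labels into column positions, then use iloc.
col_pos = df.columns.get_indexer([2])
mixed_iloc = df.iloc[:2, col_pos]

print(mixed_loc[2].tolist())          # [0, 6]
print(list(mixed_loc.index))          # ['a', 'c']
print(mixed_loc.equals(mixed_iloc))   # True
```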

That is a lot to take in.
