A simple time series data set is constructed to illustrate the indexing function.

Last Update: 2016-12-17 Source: Internet

Author: User


The axis labeling information in pandas objects serves several purposes:

- It identifies data (that is, provides metadata) using known indicators, which is important for analysis, visualization, and interactive console display.
- It enables automatic and explicit data alignment.
- It allows intuitive getting and setting of subsets of the data set.

This section focuses on the last point: how to slice, dice, and generally get and set subsets of pandas objects. The main focus is on Series and DataFrame, as they have received the most development attention in this area. More work is expected on higher-dimensional data structures (including Panel) in the future, especially for advanced label-based indexing.

Tip: The Python and NumPy indexing operator [] and the attribute operator . provide quick and easy access to pandas data structures. If you already know how to work with Python dictionaries and NumPy arrays, there is little new to learn. However, because the type of the data to be accessed is not known in advance, using the standard operators directly has some optimization limits. For production code, we recommend the optimized pandas data access methods described in this article.
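As a small illustration of the tip above, the sketch below accesses the same column with the [] operator and with the attribute operator. The DataFrame and its column names are made up for this example.

```python
import pandas as pd

# Illustrative data; the column names are assumptions for this sketch.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

by_key = df["A"]   # dictionary-style access with []
by_attr = df.A     # attribute-style access with .

# Both forms select the same column.
same = by_key.equals(by_attr)
print(same)
```

Attribute access only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute, so [] is the safer general-purpose form.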

Warning: Whether a setting operation returns a copy or a reference may depend on the context. This situation is sometimes called "chained assignment" and should be avoided.
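To make the warning concrete, here is a minimal sketch of one common way to avoid chained assignment: do the row and column selection in a single .loc call instead of two consecutive [] operations. The data is made up, and the chained form appears only in a comment because its outcome depends on context and on the pandas version.

```python
import pandas as pd

# Illustrative data for this sketch.
df = pd.DataFrame({"A": [1, -2, 3], "B": [10, 20, 30]})

# Chained form (two separate indexing steps). The first step may return a
# copy, so the assignment may never reach df:
#     df[df["A"] < 0]["B"] = 0

# Single indexing operation on df itself: rows by Boolean mask, column by label.
df.loc[df["A"] < 0, "B"] = 0
print(df)
```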

Warning: In version 0.15.0, Index was refactored so that it no longer subclasses ndarray but instead subclasses PandasObject, like the rest of the pandas objects. This change has only very limited API implications.

A variety of indexing methods

Object selection has gained a number of user-requested additions to support more explicit location-based indexing. pandas now supports three types of multi-axis indexing.

.loc is primarily label-based, but it can also be used with a Boolean array. .loc raises KeyError when an item cannot be found. Allowed inputs are:

- A single label, such as 5 or 'a' (note that 5 is interpreted as a label of the index, not as an integer position along the index)
- A list or array of labels, such as ['a', 'b', 'c']
- A slice object with labels, such as 'a':'f' (note that, in contrast to the usual Python slices, both the start and the stop are included)
- A Boolean array
- A callable function with one argument (the calling Series, DataFrame, or Panel) that returns valid output for indexing (one of the above)

.iloc is primarily integer-position-based (from 0 to length-1 of the axis), but it can also be used with a Boolean array. .iloc raises IndexError if a requested indexer is out of bounds, except for slice indexers, which allow out-of-bounds indexing. Allowed inputs are:

- An integer, such as 5
- A list or array of integers, such as [3, 0, 4]
- A slice object with integers, such as 1:7
- A Boolean array
- A callable function with one argument (the calling Series, DataFrame, or Panel) that returns valid output for indexing (one of the above)

.ix supports mixed integer-based and label-based access. It is primarily label-based, but it falls back to integer positional access unless the corresponding axis is of integer type. .ix is the most general indexer and supports any of the inputs accepted by .loc and .iloc. .ix also supports floating-point labels. .ix is particularly useful when dealing with hierarchical indexes that mix positions and labels. However, when an axis is integer-based, only label-based access is supported, not positional access. In such cases, it is usually better to be explicit and use .iloc or .loc.
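The rules for .loc and .iloc listed above can be checked with a short sketch. The Series, its labels, and the variable names are made up for illustration. .ix is left out because it was removed in later pandas releases, so code using it will not run there.

```python
import pandas as pd

# Illustrative Series with string labels.
s = pd.Series([10, 20, 30, 40, 50, 60], index=list("abcdef"))

by_label = s.loc["b":"d"]    # label slice: start and stop are both included
by_position = s.iloc[1:3]    # positional slice: stop is excluded, as in Python

# .loc raises KeyError for a label that is not in the index.
missing_label = False
try:
    s.loc["z"]
except KeyError:
    missing_label = True

# .iloc raises IndexError for a single position that is out of bounds...
out_of_bounds = False
try:
    s.iloc[10]
except IndexError:
    out_of_bounds = True

# ...but a slice that runs past the end is simply clipped.
clipped = s.iloc[4:10]

# With an integer index, .loc[5] means "the label 5", not "position 5".
t = pd.Series(["x", "y", "z"], index=[5, 6, 7])
label_five = t.loc[5]
first_item = t.iloc[0]
```

Here `by_label` holds the labels b, c, and d, while `by_position` holds only positions 1 and 2, which shows the difference in how the two indexers treat the end of a slice.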

.loc, .iloc, .ix, and [] indexing can all accept a callable as the indexer.

Getting values from an object with multi-axis selection uses the following notation (shown with .loc, but the same applies to .iloc and .ix). Any of the axis accessors may be the null slice :. Axes left out of the specification are assumed to be : (for example, p.loc['a'] is equivalent to p.loc['a', :, :]).

Object Type    Indexers
Series         s.loc[indexer]
DataFrame      df.loc[row_indexer, column_indexer]
Panel          p.loc[item_indexer, major_indexer, minor_indexer]

Basics

As mentioned in the previous section on data structures, the primary function of indexing with [] (that is, __getitem__ in Python) is selecting lower-dimensional slices. Thus:

Object Type    Selection          Return Value Type
Series         series[label]      scalar value
DataFrame      frame[colname]     Series corresponding to colname
Panel          panel[itemname]    DataFrame corresponding to itemname
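The sketch below walks through the points above on a small DataFrame: the return types of [], an omitted axis being treated as :, and a callable used as an indexer. The frame, its labels, and the lambda are made up for this example. Panel is not shown because it was removed in later pandas releases.

```python
import numpy as np
import pandas as pd

# Illustrative 4x3 frame holding the integers 0..11.
df = pd.DataFrame(
    np.arange(12).reshape(4, 3),
    index=list("wxyz"),
    columns=["A", "B", "C"],
)

col = df["A"]        # frame[colname] returns a Series
scalar = col["x"]    # series[label] returns a scalar value

# An axis left out of the specification is assumed to be the null slice.
row_short = df.loc["x"]
row_full = df.loc["x", :]

# A callable indexer receives the calling object and returns a valid indexer,
# here a Boolean Series that selects rows where column A is greater than 3.
picked = df.loc[lambda d: d["A"] > 3, ["B", "C"]]
print(picked)
```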

Here we build a simple time series data set to illustrate the indexing function:

In [1]: dates = pd.date_range('1/1/2000', periods=8)

In [2]: df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])

In [3]: df
Out[3]:
                   A         B         C         D
2000-01-01  0.469112 -0.282863 -1.509059 -1.135632
2000-01-02  1.212112 -0.173215  0.119209 -1.044236
2000-01-03 -0.861849 -2.104569 -0.494929  1.071804
2000-01-04  0.721555 -0.706771 -1.039575  0.271860
2000-01-05 -0.424972  0.567020  0.276232 -1.087401
2000-01-06 -0.673690  0.113648 -1.478427  0.524988
2000-01-07  0.404705  0.577046 -1.715002 -1.039268
2000-01-08 -0.370647 -1.157892 -1.344312  0.844885

In [4]: panel = pd.Panel({'one': df, 'two': df - df.mean()})

In [5]: panel
Out[5]:
<class 'pandas.core.panel.Panel'>
Dimensions: 2 (items) x 8 (major_axis) x 4 (minor_axis)
Items axis: one to two
Major_axis axis: 2000-01-01 00:00:00 to 2000-01-08 00:00:00
Minor_axis axis: A to D

Note: Unless otherwise specified, none of the indexing functionality is specific to time series. As described above, the most basic indexing uses []:

In [6]: s = df['A']

In [7]: s[dates[5]]
Out[7]: -0.67368970808837059

In [8]: panel['two']
Out[8]:
                   A         B         C         D
2000-01-01  0.409571  0.113086 -0.610826 -0.936507
2000-01-02  1.152571  0.222735  1.017442 -0.845111
2000-01-03 -0.921390 -1.708620  0.403304  1.270929
2000-01-04  0.662014 -0.310822 -0.141342  0.470985
2000-01-05 -0.484513  0.962970  1.174465 -0.888276
2000-01-06 -0.733231  0.509598 -0.580194  0.724113
2000-01-07  0.345164  0.972995 -0.816769 -0.840143
2000-01-08 -0.430188 -0.761943 -0.446079  1.044010

You can pass a list of columns to [] to select columns in that order. If a column is not contained in the DataFrame, an exception is raised. Multiple columns can also be set in this way:

In [9]: df
Out[9]:
                   A         B         C         D
2000-01-01  0.469112 -0.282863 -1.509059 -1.135632
2000-01-02  1.212112 -0.173215  0.119209 -1.044236
2000-01-03 -0.861849 -2.104569 -0.494929  1.071804
2000-01-04  0.721555 -0.706771 -1.039575  0.271860
2000-01-05 -0.424972  0.567020  0.276232 -1.087401
2000-01-06 -0.673690  0.113648 -1.478427  0.524988
2000-01-07  0.404705  0.577046 -1.715002 -1.039268
2000-01-08 -0.370647 -1.157892 -1.344312  0.844885

In [10]: df[['B', 'A']] = df[['A', 'B']]  # swap the values of the two columns

In [11]: df
Out[11]:
                   A         B         C         D
2000-01-01 -0.282863  0.469112 -1.509059 -1.135632
2000-01-02 -0.173215  1.212112  0.119209 -1.044236
2000-01-03 -2.104569 -0.861849 -0.494929  1.071804
2000-01-04 -0.706771  0.721555 -1.039575  0.271860
2000-01-05  0.567020 -0.424972  0.276232 -1.087401
2000-01-06  0.113648 -0.673690 -1.478427  0.524988
2000-01-07  0.577046  0.404705 -1.715002 -1.039268
2000-01-08 -1.157892 -0.370647 -1.344312  0.844885

This is useful when you want to apply a transformation in place to a subset of the columns.
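As a minimal sketch of such an in-place transformation, the example below replaces two columns with their absolute values by assigning to a list of columns with []. The choice of abs() as the transformation and the fixed random seed are assumptions made only for this illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # fixed seed, assumed here only for repeatability
dates = pd.date_range("1/1/2000", periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=["A", "B", "C", "D"])
before = df.copy()

# Transform only columns A and B; C and D are left untouched.
df[["A", "B"]] = df[["A", "B"]].abs()
print(df.head())
```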

Warning: pandas aligns all axes when setting Series and DataFrame values through .loc, .iloc, and .ix.

The following assignment does not modify df, because the column alignment happens before the values are assigned.

In [12]: df[['A', 'B']]
Out[12]:
                   A         B
2000-01-01 -0.282863  0.469112
2000-01-02 -0.173215  1.212112
2000-01-03 -2.104569 -0.861849
2000-01-04 -0.706771  0.721555
2000-01-05  0.567020 -0.424972
2000-01-06  0.113648 -0.673690
2000-01-07  0.577046  0.404705
2000-01-08 -1.157892 -0.370647

In [13]: df.loc[:, ['B', 'A']] = df[['A', 'B']]  # this does not swap the values of columns A and B

In [14]: df[['A', 'B']]
Out[14]:
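The alignment behavior can be seen on a small frame. The sketch below first repeats the .loc assignment, which leaves the data unchanged, and then shows one way to get a real swap: assigning the raw values, which carry no column labels and are therefore placed by position. The tiny DataFrame is made up for this example, and to_numpy() assumes a pandas version that provides it (older releases expose the same data through the .values attribute).

```python
import pandas as pd

# Illustrative two-column frame.
df = pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]})

# The right-hand side is a DataFrame, so its columns are aligned by name
# with the target columns before assignment: nothing changes.
aligned = df.copy()
aligned.loc[:, ["B", "A"]] = aligned[["A", "B"]]

# The right-hand side is a plain array without labels, so values are
# assigned by position: B receives the old A and A receives the old B.
swapped = df.copy()
swapped.loc[:, ["B", "A"]] = swapped[["A", "B"]].to_numpy()

print(aligned)
print(swapped)
```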


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
