Pandas data merging and reshaping (concat, join/merge)

Last Update:2018-07-24 Source: Internet

Author: User


1 concat

concat is a top-level pandas function that combines data along a chosen axis.

pd.concat(objs, axis=0, join='outer', join_axes=None, ignore_index=False, keys=None, levels=None, names=None,
          verify_integrity=False)


Parameter description
objs: a list (or other sequence) of Series, DataFrame, or Panel objects
axis: the axis to concatenate along; 0 is rows, 1 is columns
join: how the other axis is handled, either 'inner' or 'outer'

The remaining parameters are used less often and are described where they come up.

1.1 Tables with the same fields

# Put the tables into a list, then pass the list to concat
frames = [df1, df2, df3]
result = pd.concat(frames)
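
The snippet above assumes df1, df2, and df3 already exist. A minimal, self-contained sketch, using small hypothetical tables that share the columns A and B, might look like this:

```python
import pandas as pd

# Hypothetical sample tables; all three share the columns A and B.
df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"]}, index=[0, 1])
df2 = pd.DataFrame({"A": ["A2", "A3"], "B": ["B2", "B3"]}, index=[2, 3])
df3 = pd.DataFrame({"A": ["A4", "A5"], "B": ["B4", "B5"]}, index=[4, 5])

frames = [df1, df2, df3]

# axis=0 is the default, so the rows of the three tables are stacked.
result = pd.concat(frames)
print(result)
```

The original row labels are kept as they are, which is why the later sections on keys and ignore_index matter when labels overlap.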


Pass the keys parameter to record which table each row came from; the keys are added as an extra outer level of the index.

result = pd.concat(frames, keys=['x', 'y', 'z'])


The result has a hierarchical (MultiIndex) row index whose outer level holds the keys x, y, and z.
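
As a rough illustration of the keyed result, the sketch below builds three hypothetical tables, concatenates them with keys, and then selects the rows that came from the second table:

```python
import pandas as pd

# Hypothetical sample tables with identical columns.
df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"]}, index=[0, 1])
df2 = pd.DataFrame({"A": ["A2", "A3"], "B": ["B2", "B3"]}, index=[2, 3])
df3 = pd.DataFrame({"A": ["A4", "A5"], "B": ["B4", "B5"]}, index=[4, 5])

result = pd.concat([df1, df2, df3], keys=["x", "y", "z"])

# The outer index level now identifies the source table,
# so one table's rows can be pulled back out by its key.
from_y = result.loc["y"]
print(result)
print(from_y)
```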

1.2 Horizontal table concatenation (row alignment)

1.2.1 axis

When axis=1, concat aligns the tables on their row index and then places the columns of the two tables, which have different column names, side by side.

result = pd.concat([df1, df4], axis=1)
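
A small sketch of this behaviour, assuming two hypothetical tables whose row labels only partly overlap:

```python
import pandas as pd

# Hypothetical tables: different column names, partly overlapping row labels.
df1 = pd.DataFrame({"A": ["A0", "A1", "A2", "A3"],
                    "B": ["B0", "B1", "B2", "B3"]}, index=[0, 1, 2, 3])
df4 = pd.DataFrame({"D": ["D2", "D3", "D6", "D7"],
                    "F": ["F2", "F3", "F6", "F7"]}, index=[2, 3, 6, 7])

# Rows are matched by index label; labels missing from one table
# are filled with NaN in that table's columns (join='outer' is the default).
result = pd.concat([df1, df4], axis=1)
print(result)
```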


1.2.2 join

The join parameter controls how the row indexes are combined: 'inner' keeps the intersection of the two tables' indexes, and 'outer' keeps their union.

result = pd.concat([df1, df4], axis=1, join='inner')
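
To make the difference concrete, the following sketch (again with hypothetical tables) runs both settings on the same inputs:

```python
import pandas as pd

# Hypothetical tables that share only the row labels 2 and 3.
df1 = pd.DataFrame({"A": ["A0", "A1", "A2", "A3"]}, index=[0, 1, 2, 3])
df4 = pd.DataFrame({"D": ["D2", "D3", "D6", "D7"]}, index=[2, 3, 6, 7])

# 'inner' keeps only row labels present in both tables.
inner = pd.concat([df1, df4], axis=1, join="inner")

# 'outer' keeps every row label and fills the gaps with NaN.
outer = pd.concat([df1, df4], axis=1, join="outer")

print(inner)
print(outer)
```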


1.2.3 join_axes

If join_axes is passed, the data is aligned to the index given there.
For example, to align on df1, specify df1's index: the rows of df1 are preserved and the matching data from df4 is attached to them.

result = pd.concat([df1, df4], axis=1, join_axes=[df1.index])
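
Note that join_axes was removed in pandas 1.0, so the call above only runs on older versions. On newer versions, one way to get the same alignment is to concatenate first and then reindex to df1's index. The tables below are hypothetical:

```python
import pandas as pd

# Hypothetical tables with partly overlapping row labels.
df1 = pd.DataFrame({"A": ["A0", "A1", "A2", "A3"]}, index=[0, 1, 2, 3])
df4 = pd.DataFrame({"D": ["D2", "D3", "D6", "D7"]}, index=[2, 3, 6, 7])

# Outer concat, then keep exactly the row labels of df1.
# Rows of df4 that do not appear in df1 (labels 6 and 7) are dropped.
result = pd.concat([df1, df4], axis=1).reindex(df1.index)
print(result)
```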


1.3 append

append is a Series and DataFrame method. By default it concatenates along the rows (axis=0), aligning the tables on their columns.

result = df1.append(df2)
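
DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the call above fails on current versions. A minimal sketch of the equivalent operation with pd.concat, using hypothetical tables:

```python
import pandas as pd

# Hypothetical tables with the same columns.
df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"]}, index=[0, 1])
df2 = pd.DataFrame({"A": ["A2", "A3"], "B": ["B2", "B3"]}, index=[2, 3])

# Equivalent of df1.append(df2): stack the rows of df2 under df1.
result = pd.concat([df1, df2])
print(result)
```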


1.4 Ignoring the index in concat

If the indexes of the two tables carry no real meaning, pass ignore_index=True. The tables are then aligned on their column fields, merged, and given a new integer index.
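
A short sketch of ignore_index, assuming two hypothetical tables whose row labels collide and whose columns only partly match:

```python
import pandas as pd

# Hypothetical tables: both use the row labels 0 and 1.
df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"]})
df4 = pd.DataFrame({"B": ["B2", "B3"], "D": ["D2", "D3"]})

# The old labels are discarded and replaced by 0..n-1;
# columns are still aligned by name, with NaN where a table lacks a column.
result = pd.concat([df1, df4], ignore_index=True)
print(result)
```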

1.5 Adding keys to distinguish data groups when merging

The keys parameter mentioned above adds a key to the merged table so that rows from different source tables can be told apart.

1.5.1 Using the keys parameter directly

result = pd.concat(frames, keys=['x', 'y', 'z'])


1.5.2 Passing a dictionary to add the grouping keys

pieces = {'x': df1, 'y': df2, 'z': df3}
result = pd.concat(pieces)
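
A self-contained sketch of the dictionary form, with hypothetical tables. It also shows that passing keys together with a dictionary selects and orders a subset of the entries:

```python
import pandas as pd

# Hypothetical sample tables.
df1 = pd.DataFrame({"A": ["A0", "A1"]}, index=[0, 1])
df2 = pd.DataFrame({"A": ["A2", "A3"]}, index=[2, 3])
df3 = pd.DataFrame({"A": ["A4", "A5"]}, index=[4, 5])

pieces = {"x": df1, "y": df2, "z": df3}

# The dictionary keys become the outer level of the row index.
result = pd.concat(pieces)

# With an explicit keys list, only those entries are used, in that order.
subset = pd.concat(pieces, keys=["z", "y"])

print(result)
print(subset)
```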


1.6 Adding a new row to a DataFrame

The append method can insert a Series or a dictionary as a new row of a DataFrame.

s2 = pd.Series(['X0', 'X1', 'X2', 'X3'], index=['A', 'B', 'C', 'D'])
result = df1.append(s2, ignore_index=True)
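
On pandas 2.0 and later, where DataFrame.append no longer exists, one possible way to add a Series as a row is to turn it into a one-row DataFrame and concatenate. The table df1 below is a hypothetical stand-in:

```python
import pandas as pd

# Hypothetical table with the columns A, B, C, D.
df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"],
                    "C": ["C0", "C1"], "D": ["D0", "D1"]})

s2 = pd.Series(["X0", "X1", "X2", "X3"], index=["A", "B", "C", "D"])

# to_frame() gives a single column; .T transposes it into a single row
# whose column names are the Series index labels.
new_row = s2.to_frame().T

result = pd.concat([df1, new_row], ignore_index=True)
print(result)
```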

Merging tables with different column fields

If two tables do not share the same column fields but you still want to combine them, the missing values are filled with NaN. Use ignore_index to do this.

dicts = [{'A': 1, 'B': 2, 'C': 3, 'X': 4},
         {'A': 5, 'B': 6, 'C': 7, 'Y': 8}]
result = df1.append(dicts, ignore_index=True)
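
The same result can be sketched with pd.concat, which also works on pandas versions where append has been removed. The list of dictionaries is first converted into a DataFrame; df1 is a hypothetical stand-in:

```python
import pandas as pd

# Hypothetical table with the columns A, B, C, D.
df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"],
                    "C": ["C0", "C1"], "D": ["D0", "D1"]})

dicts = [{"A": 1, "B": 2, "C": 3, "X": 4},
         {"A": 5, "B": 6, "C": 7, "Y": 8}]

# Each dictionary becomes one row; keys that a row lacks become NaN.
extra = pd.DataFrame(dicts)

# Columns are unioned across both tables and the index is rebuilt.
result = pd.concat([df1, extra], ignore_index=True)
print(result)
```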

2 merge

The pandas merge function provides SQL-like in-memory join operations. According to the official documentation, these perform significantly better than other open-source implementations, such as merging data frames in R.

For a side-by-side comparison with SQL statements, see the "Comparison with SQL" page of the pandas documentation.

Parameters of merge

on: the name of the column to align on. When using this parameter, make sure the column exists under the same name in both the left and right tables.

left_on: the columns of the left table to align on; these can be column names or an array of the same length as the DataFrame.
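
The two parameters described above can be sketched as follows. The tables left and right, and the column name rkey, are hypothetical and chosen only for illustration; right_on is the right-table counterpart of left_on.

```python
import pandas as pd

# Hypothetical tables sharing a column named "key".
left = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
right = pd.DataFrame({"key": ["K0", "K1", "K3"], "B": ["B0", "B1", "B3"]})

# on: the join column has the same name in both tables.
# The default how='inner' keeps only keys present in both.
on_result = pd.merge(left, right, on="key")

# left_on / right_on: the join columns have different names.
right_renamed = right.rename(columns={"key": "rkey"})
lr_result = pd.merge(left, right_renamed, left_on="key", right_on="rkey")

print(on_result)
print(lr_result)
```

With on, the key column appears once in the result; with left_on and right_on, both key columns are kept.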

This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.
