Pandas data merging and reshaping (concat, join, merge)

Source: Internet
Author: User
1 concat
concat is a top-level pandas function that joins pandas objects together along a chosen axis.
pd.concat(objs, axis=0, join='outer', join_axes=None, ignore_index=False, keys=None, levels=None, names=None,
          verify_integrity=False)

Parameter description
objs: a sequence (usually a list) of Series, DataFrame, or Panel objects
axis: the axis to concatenate along; 0 stacks rows, 1 places columns side by side
join: how to handle the labels on the other axis, either 'inner' (intersection) or 'outer' (union)

The other parameters are used less often and are described where they come up.

1.1 Tables with the same fields

# Put the tables into a list, then pass the list to concat
In [4]: frames = [df1, df2, df3]

In [5]: result = pd.concat(frames)
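
The session above assumes df1, df2 and df3 already exist. A minimal self-contained sketch, with made-up values and row labels chosen only for illustration, might look like this:

```python
import pandas as pd

# Three small tables with the same columns but different row labels
# (values are illustrative, not taken from the article).
df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"]}, index=[0, 1])
df2 = pd.DataFrame({"A": ["A2", "A3"], "B": ["B2", "B3"]}, index=[2, 3])
df3 = pd.DataFrame({"A": ["A4", "A5"], "B": ["B4", "B5"]}, index=[4, 5])

frames = [df1, df2, df3]
result = pd.concat(frames)  # axis=0 is the default, so the rows are stacked

print(result)
```

The result keeps the two shared columns and stacks the six rows in the order the tables were listed.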

Pass the keys parameter to add an outer index level that identifies which table each row came from.

In [6]: result = pd.concat(frames, keys=['x', 'y', 'z'])

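The result has a two-level row index: the outer level holds the keys 'x', 'y' and 'z', and the inner level holds each table's original index.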
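
As a rough sketch of what keys does to the index, the example below uses three tiny made-up tables (the single column and its values are assumptions for illustration) and then selects the rows that came from one of them:

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["A0", "A1"]}, index=[0, 1])
df2 = pd.DataFrame({"A": ["A2", "A3"]}, index=[2, 3])
df3 = pd.DataFrame({"A": ["A4", "A5"]}, index=[4, 5])

result = pd.concat([df1, df2, df3], keys=["x", "y", "z"])

# The row index now has two levels: (key, original row label).
print(result.index.tolist())

# Selecting by one key returns the rows that came from that table.
from_df2 = result.loc["y"]
print(from_df2)
```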

1.2 Horizontal table concatenation (row alignment)

1.2.1 Axis

With axis=1, concat aligns the tables on their row index and places their columns side by side, which combines two tables that have different column names.

In [9]: result = pd.concat([df1, df4], axis=1)
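
A small sketch of row-aligned concatenation, assuming two made-up tables whose row labels only partly overlap (df1 has rows 0 to 2, df4 has rows 1 to 3):

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["A0", "A1", "A2"], "B": ["B0", "B1", "B2"]},
                   index=[0, 1, 2])
df4 = pd.DataFrame({"D": ["D1", "D2", "D3"], "F": ["F1", "F2", "F3"]},
                   index=[1, 2, 3])

# axis=1 lines the tables up by row label and sets the columns side by side.
# With the default join='outer', every row label from either table is kept,
# and cells with no matching row are filled with NaN.
result = pd.concat([df1, df4], axis=1)

print(result)
```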

1.2.2 Join

The join parameter controls which row labels are kept: 'inner' keeps the intersection of the two tables' indexes, while 'outer' (the default) keeps their union.

In [10]: result = pd.concat([df1, df4], axis=1, join='inner')
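
To make the intersection concrete, here is a sketch with illustrative tables in which only row labels 1 and 2 appear in both:

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["A0", "A1", "A2"], "B": ["B0", "B1", "B2"]},
                   index=[0, 1, 2])
df4 = pd.DataFrame({"D": ["D1", "D2", "D3"], "F": ["F1", "F2", "F3"]},
                   index=[1, 2, 3])

# join='inner' keeps only the row labels present in both tables,
# so no NaN padding is needed.
inner = pd.concat([df1, df4], axis=1, join="inner")

print(inner)
```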

1.2.3 Join_axes

If join_axes is passed, the result is aligned to the index you specify instead of to the union or intersection.
For example, to align on df1, pass df1's index: all of df1's rows are kept and the matching rows of df4 are attached to them. Note that join_axes was removed in pandas 1.0; in newer versions, call reindex(df1.index) on the concatenated result instead.

In [11]: result = pd.concat([df1, df4], axis=1, join_axes=[df1.index])
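
Because join_axes is not available in pandas 1.0 and later, the following is a sketch of the same alignment done with reindex on the concatenated result. The tables and their values are assumptions for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["A0", "A1", "A2"], "B": ["B0", "B1", "B2"]},
                   index=[0, 1, 2])
df4 = pd.DataFrame({"D": ["D1", "D2", "D3"], "F": ["F1", "F2", "F3"]},
                   index=[1, 2, 3])

# Concatenate with the default outer join, then keep exactly df1's row labels.
# Rows of df4 that have no counterpart in df1 (label 3 here) are dropped,
# and df1 rows without a match in df4 (label 0) get NaN in df4's columns.
result = pd.concat([df1, df4], axis=1).reindex(df1.index)

print(result)
```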

1.3 Append

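append is a method of Series and DataFrame. It concatenates along the rows (axis=0), aligning the tables on their column names. The method was removed in pandas 2.0; in newer versions pd.concat([df1, df2]) gives the same result.
In [ ]: result = df1.append(df2)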
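
Since DataFrame.append no longer exists in pandas 2.0 and later, here is a sketch of the same row-wise operation written with pd.concat. The two tables are made up, and they deliberately share only column B to show how the columns are aligned:

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"]}, index=[0, 1])
df2 = pd.DataFrame({"B": ["B2", "B3"], "C": ["C2", "C3"]}, index=[2, 3])

# Equivalent in spirit to df1.append(df2): rows are stacked, columns are
# matched by name, and columns missing from one table are filled with NaN.
result = pd.concat([df1, df2])

print(result)
```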

1.4 Ignoring the index when concatenating

If the indexes of the two tables carry no real meaning, set the ignore_index argument to True. The tables are then aligned on their column fields and merged, and the result is given a new index running from 0 to n-1.
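
A short sketch of the difference, assuming two made-up tables that both use the default 0, 1 row labels:

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"]})
df2 = pd.DataFrame({"A": ["A2", "A3"], "B": ["B2", "B3"]})

# Without ignore_index the original labels are kept, so they repeat.
kept = pd.concat([df1, df2])
print(kept.index.tolist())

# With ignore_index=True the result gets a new consecutive index.
fresh = pd.concat([df1, df2], ignore_index=True)
print(fresh.index.tolist())
```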

1.5 Adding keys that distinguish the data groups when combining

As mentioned above, keys can be attached to the merged table to tell the source tables apart.

1.5.1 Directly with the keys parameter

In [ ]: result = pd.concat(frames, keys=['x', 'y', 'z'])

1.5.2 Passing a dictionary to supply the grouping keys

In [ ]: pieces = {'x': df1, 'y': df2, 'z': df3}

In [ ]: result = pd.concat(pieces)
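
The sketch below, using small made-up tables, checks that passing a dictionary gives the same result as passing a list together with keys:

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["A0", "A1"]}, index=[0, 1])
df2 = pd.DataFrame({"A": ["A2", "A3"]}, index=[2, 3])
df3 = pd.DataFrame({"A": ["A4", "A5"]}, index=[4, 5])

# The dictionary keys become the outer level of the row index.
pieces = {"x": df1, "y": df2, "z": df3}
result = pd.concat(pieces)

# The same table built from a list plus an explicit keys argument.
same = pd.concat([df1, df2, df3], keys=["x", "y", "z"])

print(result.equals(same))
```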

1.6 Adding a new row to a DataFrame

The append method adds a Series, or a list of dictionaries, to a DataFrame as new rows.

In [ ]: s2 = pd.Series(['X0', 'X1', 'X2', 'X3'], index=['A', 'B', 'C', 'D'])

In [ ]: result = df1.append(s2, ignore_index=True)
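
On pandas versions without DataFrame.append, one way to get the same effect is to turn the Series into a one-row table and concatenate it. This is a sketch under that assumption, with a made-up df1:

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"],
                    "C": ["C0", "C1"], "D": ["D0", "D1"]})
s2 = pd.Series(["X0", "X1", "X2", "X3"], index=["A", "B", "C", "D"])

# to_frame() gives one column whose index is A..D; transposing it turns
# the Series into a single row whose columns are A..D.
new_row = s2.to_frame().T

result = pd.concat([df1, new_row], ignore_index=True)

print(result)
```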

Merging tables with different column fields

If two tables do not share the same column fields but you still want to combine them, the cells that have no value are filled with NaN. Pass ignore_index=True to do this.

In [36]: dicts = [{'A': 1, 'B': 2, 'C': 3, 'X': 4},
    ...:          {'A': 5, 'B': 6, 'C': 7, 'Y': 8}]
    ...:

In [37]: result = df1.append(dicts, ignore_index=True)
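
A sketch of the same idea without append: the list of dictionaries is first turned into a DataFrame and then concatenated. The contents of df1 are an assumption for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({"A": [0, 1], "B": [2, 3], "C": [4, 5]})

dicts = [{"A": 1, "B": 2, "C": 3, "X": 4},
         {"A": 5, "B": 6, "C": 7, "Y": 8}]

# Each dictionary becomes one row; keys missing from a dictionary become NaN.
extra = pd.DataFrame(dicts)

# Columns X and Y do not exist in df1, so its rows get NaN there.
result = pd.concat([df1, extra], ignore_index=True)

print(result)
```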


Merge

The pandas merge function provides SQL-like in-memory join operations. The official documentation describes it as more efficient than the equivalent data operations in other open-source languages such as R.

The pandas documentation also provides a comparison with SQL statements.

Parameters of merge

on: the name of the column (or columns) to join on. When this parameter is used, the join column must have the same name in both the left and right tables.

left_on: the columns of the left table to join on; these can be column names or arrays of the same length as the DataFrame
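
A brief sketch of the two parameters described above. The tables, the column name rkey, and the use of right_on (the right-table counterpart of left_on, which the text above does not reach) are assumptions added for illustration:

```python
import pandas as pd

left = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
right = pd.DataFrame({"key": ["K0", "K1", "K3"], "B": ["B0", "B1", "B3"]})

# on: both tables have a column called "key". The default is an inner join,
# so only keys present in both tables survive.
result = pd.merge(left, right, on="key")
print(result)

# left_on / right_on: use these when the join columns have different names.
right2 = right.rename(columns={"key": "rkey"})
result2 = pd.merge(left, right2, left_on="key", right_on="rkey")
print(result2)
```

With on, the join column appears once in the output; with left_on and right_on, both key columns are kept.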
