Relational database import and export tricks revealed

Source: Internet
Author: User
Tags: sqoop

As a data transfer tool, Sqoop acts as a bridge between Hadoop and traditional relational databases. So how do you import and export data with it?

First: Sqoop uses a MapReduce job to perform the import:

(1) Sqoop first examines the table to be imported

1. If the table has a primary key, Sqoop invokes MapReduce and splits the work among map tasks by primary key.

2. If there is no primary key, Sqoop runs a boundary query to determine the range of records to import (finding the min and max of a column to establish the boundaries to be divided).

3. The result of the boundary query is divided evenly by the number of tasks, so each task carries the same load.

(2) Sqoop generates a Java source file for each table being imported

1. The file is compiled and used during the import process.

2. It is retained after the import completes and can be safely deleted. A sketch of a split-controlled import follows.
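
As a minimal sketch of how the splitting can be controlled (the connection string, credentials, table, and column names here are illustrative placeholders, not from the original post), --split-by chooses the split column and --num-mappers sets the number of map tasks:

    # Import a table, splitting the work across 4 map tasks by the order_id column
    sqoop import \
      --connect jdbc:mysql://dbhost/mydb \
      --username dbuser --password pw \
      --table orders \
      --split-by order_id \
      --num-mappers 4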

Second: Use Sqoop to import an entire database:

(1) The import-all-tables tool imports the entire database

1. Data is stored as comma-delimited files.

2. By default, data is imported into the user's home directory in HDFS.

3. The data for each table is placed in its own subdirectory, as in the sketch below.

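The original screenshot is not reproducible here; a minimal sketch of the likely command, with host, database, and credentials as placeholders:

    # Import every table in the database into the HDFS home directory
    sqoop import-all-tables \
      --connect jdbc:mysql://dbhost/mydb \
      --username dbuser --password pw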

(2) Use the --warehouse-dir option to specify a different base directory, as sketched below.
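
A sketch under the same placeholder assumptions; /data/warehouse is an illustrative base directory:

    # Place each table's subdirectory under /data/warehouse instead of the home directory
    sqoop import-all-tables \
      --connect jdbc:mysql://dbhost/mydb \
      --username dbuser --password pw \
      --warehouse-dir /data/warehouse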

Third: Use Sqoop to import a single table:

(1) The import tool imports a single table

1. Example: import the accounts table, storing the data in HDFS in comma-delimited form (see the sketch below).

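A minimal sketch; only the accounts table name comes from the original, the connection details are placeholders:

    # Import the accounts table; fields are comma-delimited by default
    sqoop import \
      --connect jdbc:mysql://dbhost/mydb \
      --username dbuser --password pw \
      --table accounts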

2. To use tab-delimited fields instead, specify the delimiter explicitly, as sketched below.

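The same import with a tab delimiter; --fields-terminated-by is a standard Sqoop option, and the connection details remain placeholders:

    # Import the accounts table with tab-delimited fields
    sqoop import \
      --connect jdbc:mysql://dbhost/mydb \
      --username dbuser --password pw \
      --table accounts \
      --fields-terminated-by '\t'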

Fourth: Incremental import

(1) What if records change or are added after the last import?

1. You could re-import all records, but that is inefficient.

(2) Sqoop's lastmodified incremental mode imports new and modified records

1. It is based on a specified timestamp column.

2. You must ensure the timestamp is also updated whenever a record is added or modified. A sketch follows.

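A minimal sketch of lastmodified mode; the check column name and last value are illustrative assumptions:

    # Import rows whose modified_date is newer than the last recorded value
    sqoop import \
      --connect jdbc:mysql://dbhost/mydb \
      --username dbuser --password pw \
      --table accounts \
      --incremental lastmodified \
      --check-column modified_date \
      --last-value '2016-12-01 00:00:00'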

(3) The append incremental mode imports only new records

1. It is based on the last imported value of a specified column (see the sketch below).

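A minimal sketch of append mode; the id column and last value are illustrative assumptions:

    # Import only rows whose id is greater than the last imported value
    sqoop import \
      --connect jdbc:mysql://dbhost/mydb \
      --username dbuser --password pw \
      --table accounts \
      --incremental append \
      --check-column id \
      --last-value 1000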

Fifth: Use Sqoop to export data from Hadoop to an RDBMS

(1) Sqoop's import tool pulls data from an RDBMS into HDFS.

(2) Sometimes you need to push HDFS data back to an RDBMS, for example when a large dataset is processed in batch and the results are exported to the RDBMS for other systems to access.

(3) Sqoop uses the export tool for this; the target table must already exist in the RDBMS before exporting. A sketch follows.

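A minimal sketch of an export; the table name and HDFS directory are placeholders:

    # Push the contents of an HDFS directory into an existing RDBMS table
    sqoop export \
      --connect jdbc:mysql://dbhost/mydb \
      --username dbuser --password pw \
      --table account_summary \
      --export-dir /data/account_summary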

Once you have mastered the approaches above, you will have a clear understanding of relational database import and export. In daily study and practice it also pays to look at what others share; after all, everyone's experience with technology differs, and you may pick up something you did not expect. I like to follow the "big data cn" and "Big Data Era Learning Center" service accounts; for me personally they have been a great help, and I hope everyone learning big data gains something too!


This article is from the "11872756" blog; please be sure to retain the source: http://11882756.blog.51cto.com/11872756/1883944

