/** I took over big data development from scratch, and in many ways I was very clumsy.
These are simply notes from my project experience working on big data.
*/
Sqoop:
Used for moving data between relational databases and the big data platform.
First article:
1: Importing data into the big data cluster environment
A: First, the connection has to work (obvious, I know ...)
The database connection command looks like this (Oracle 10g, sqoop1.4.5-cdh5.2.0):
sqoop import --connect "jdbc:oracle:thin:@134.64.**.**:1521:****" --username use --password pwd
Driver, IP address, port, user, password: nothing much to say, but pay attention to the quotation marks!!!
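Before running a real import, it can help to confirm that the connection string works on its own. The following is a minimal sketch, not the commands used in the project: the host, SID and user are placeholder assumptions, and it uses the sqoop eval and sqoop list-tables tools with -P (prompt for the password) instead of --password. It prints the commands by default so they can be reviewed first.

```shell
#!/bin/sh
# Assumed placeholders: replace host, SID and user with your own values.
JDBC_URL='jdbc:oracle:thin:@db.example.com:1521:ORCL'
DB_USER='use'

# Dry run by default: prints the command. Set SQOOP=sqoop to execute it.
SQOOP="${SQOOP:-echo sqoop}"

# Run a trivial query to prove that driver, address, port and login all work.
# -P prompts for the password instead of putting it on the command line.
check_connection() {
  $SQOOP eval --connect "$JDBC_URL" --username "$DB_USER" -P \
    --query 'select 1 from dual'
}

# List the tables visible to this user.
list_tables() {
  $SQOOP list-tables --connect "$JDBC_URL" --username "$DB_USER" -P
}

check_connection
list_tables
```

The URL stays inside quotes in every call, which is the same point about quotation marks made above. Note that the dry-run output does not show the quotes, because echo prints the arguments after the shell has removed them.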
Sqoop import command: import
Business scenarios:
1: Import a full table into HDFS
2: Import selected fields into HDFS
3: Import selected fields into HDFS, filtered by a WHERE condition
4: Chinese characters (Linux and Hadoop encoding)
5: Distributed processing of large data volumes
6: Handling of delimiter characters
7: Compression
8: Character conversion
Begin
Create a folder named testsqoop on the cluster's HDFS to store the test data:
hadoop fs -mkdir /user/***/testsqoop
Now try a full-table import:
sqoop import --connect "jdbc:oracle:thin:@134.**.**.**:1521:*****" --username **** --password **** --table Zqk_bigdata_test_sqoop --target-dir /user/***/testsqoop001
When the MapReduce job finishes, the output is stored in the testsqoop001 directory.
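To check what the job actually wrote, the target directory can be inspected with the HDFS shell. This is a small sketch under assumptions: the path is a hypothetical stand-in for the masked one, and the file name part-m-00000 is the usual name of the first output file of a map-only job, which may differ on your cluster. It prints the commands by default.

```shell
#!/bin/sh
# Hypothetical HDFS path; substitute the --target-dir used for the import.
TARGET_DIR='/user/someuser/testsqoop001'

# Dry run by default: prints the command. Set HADOOP=hadoop to execute it.
HADOOP="${HADOOP:-echo hadoop}"

# List the files the map tasks wrote into the target directory.
list_output() {
  $HADOOP fs -ls "$TARGET_DIR"
}

# Show the contents of the first part file
# (map-only imports usually name their files part-m-00000, part-m-00001, ...).
show_first_part() {
  $HADOOP fs -cat "$TARGET_DIR/part-m-00000"
}

list_output
show_first_part
```

For a large table, piping the second command through head (show_first_part | head) avoids printing the whole file.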
However, this also imports fields that are not needed, so use the --query parameter to write a SELECT statement. The statement must contain "where $CONDITIONS", and the --split-by parameter has to be given as well:
sqoop import --connect "jdbc:oracle:thin:@134.**.**.**:1521:***" --username ** --password *** --query 'select t.nom1,t.nom2,t.nom3 from Zqk_bigdata_test_sqoop t where $CONDITIONS' --target-dir /user/***/testsqoop002 --split-by nom2
--split-by is normally given the primary key; if the table has no primary key, another suitable field can be used instead.
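The quoting of $CONDITIONS is easy to get wrong, so here is a short sketch of the two ways to write it. Inside single quotes the shell passes the token through untouched; inside double quotes it has to be escaped as \$CONDITIONS. The second function shows a single-mapper import (-m 1), which does not need --split-by. The URL, user and target directories are placeholder assumptions, and -P replaces --password; the commands are only printed by default.

```shell
#!/bin/sh
# Assumed placeholders: replace the URL, user and target directories.
JDBC_URL='jdbc:oracle:thin:@db.example.com:1521:ORCL'
DB_USER='use'

# Dry run by default: prints the command. Set SQOOP=sqoop to execute it.
SQOOP="${SQOOP:-echo sqoop}"

# Single quotes: the shell leaves $CONDITIONS alone, so Sqoop receives it.
QUERY_SINGLE='select t.nom1,t.nom2,t.nom3 from Zqk_bigdata_test_sqoop t where $CONDITIONS'
# Double quotes: escape the dollar sign, or the shell substitutes the
# (usually empty) value of a shell variable named CONDITIONS.
QUERY_DOUBLE="select t.nom1,t.nom2,t.nom3 from Zqk_bigdata_test_sqoop t where \$CONDITIONS"

# Parallel import: Sqoop replaces $CONDITIONS with a range on the split column.
import_parallel() {
  $SQOOP import --connect "$JDBC_URL" --username "$DB_USER" -P \
    --query "$QUERY_SINGLE" \
    --target-dir /user/someuser/testsqoop002 --split-by nom2
}

# No good split column: a single map task (-m 1) needs no --split-by.
# The query still has to contain the $CONDITIONS token.
import_single_mapper() {
  $SQOOP import --connect "$JDBC_URL" --username "$DB_USER" -P \
    --query "$QUERY_DOUBLE" \
    --target-dir /user/someuser/testsqoop002_single -m 1
}

import_parallel
import_single_mapper
```

Both query variables hold exactly the same string, so either quoting style can be used as long as the dollar sign reaches Sqoop intact.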
Next, try adding a WHERE condition: use the --where parameter to specify the query condition applied during the import. Success:
sqoop import --connect "jdbc:oracle:thin:@134.64.***.**:1521:***" --username *** --password ** --query 'select t.nom1,t.nom2,t.nom3 from Zqk_bigdata_test_sqoop t where $CONDITIONS' --target-dir /user/****/testsqoop003 --split-by nom2 --where "t.nom2<'4'"
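The Sqoop user guide describes --where together with --table imports, and it is not clear that it is applied when --query is used, so it is worth checking the row count of the result. Two alternative ways to express the same filter are sketched below, as assumptions rather than the commands used in the project: writing the condition into the SELECT itself, or using a table import with --columns and --where. The URL, user and target directories are placeholders, and the column names are copied from the query above.

```shell
#!/bin/sh
# Assumed placeholders: replace the URL, user and target directories.
JDBC_URL='jdbc:oracle:thin:@db.example.com:1521:ORCL'
DB_USER='use'

# Dry run by default: prints the command. Set SQOOP=sqoop to execute it.
SQOOP="${SQOOP:-echo sqoop}"

# Variant 1: free-form query, with the row filter written into the SELECT.
import_query_filtered() {
  $SQOOP import --connect "$JDBC_URL" --username "$DB_USER" -P \
    --query "select t.nom1,t.nom2,t.nom3 from Zqk_bigdata_test_sqoop t where t.nom2 < '4' and \$CONDITIONS" \
    --target-dir /user/someuser/testsqoop003 --split-by nom2
}

# Variant 2: table import; --columns picks the fields, --where filters rows.
# There is no table alias here, so the column is written without "t.".
import_table_filtered() {
  $SQOOP import --connect "$JDBC_URL" --username "$DB_USER" -P \
    --table Zqk_bigdata_test_sqoop --columns 'nom1,nom2,nom3' \
    --where "nom2 < '4'" \
    --target-dir /user/someuser/testsqoop003_table --split-by nom2
}

import_query_filtered
import_table_filtered
```

Variant 1 keeps everything in one SQL statement, which is convenient once joins or expressions are involved; variant 2 is shorter when the import is just a subset of one table.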
Then there is the matter of delimiters and field values: when a field in Oracle is empty, you will find that by default it is imported as the string "null".
But I want it to show as "", an empty value rather than the string "null". This needs the parameter --null-string '', where the value inside the quotes is what gets written in place of null:
sqoop import --connect "jdbc:oracle:thin:@134.**.**.**:1521:***" --username **** --password *** --query 'select t.nom1,t.nom2,t.nom3 from Zqk_bigdata_test_sqoop t where $CONDITIONS' --target-dir /user/***/testsqoop003 --split-by nom2 --where "t.nom2<'4'" --null-string ''
--null-non-string: use this parameter in the same way when the column is a non-string type.
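A sketch of using both null options together, so that string and non-string columns are written the same way. The replacement value is kept in a variable and always passed in double quotes; without the quotes an empty value would disappear from the command line and the option would swallow the next argument. The URL, user and the target directory testsqoop004 are hypothetical, and -P replaces --password.

```shell
#!/bin/sh
# Assumed placeholders: replace the URL, user and target directory.
JDBC_URL='jdbc:oracle:thin:@db.example.com:1521:ORCL'
DB_USER='use'

# Dry run by default: prints the command. Set SQOOP=sqoop to execute it.
SQOOP="${SQOOP:-echo sqoop}"

# What to write in place of a database NULL; here an empty string.
NULL_TOKEN=''

import_with_null_tokens() {
  $SQOOP import --connect "$JDBC_URL" --username "$DB_USER" -P \
    --query 'select t.nom1,t.nom2,t.nom3 from Zqk_bigdata_test_sqoop t where $CONDITIONS' \
    --target-dir /user/someuser/testsqoop004 --split-by nom2 \
    --null-string "$NULL_TOKEN" --null-non-string "$NULL_TOKEN"
}

import_with_null_tokens
```

In the dry-run output the empty values show up only as extra spaces, since echo cannot display an empty argument.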
Notes on using Sqoop <01>