A tool for migrating MySQL data to Hive/HBase

Source: Internet
Author: User
Apache Hive is currently one of the preferred free products for large data warehouses. Nobody should expect Hive to do well on small data volumes: if a small MySQL data set is moved to Hive/HBase, a SQL statement that used to run quickly is estimated to take more than 10 times as long. Large data volumes are a different matter. Hundreds of millions of records combined with complex SQL queries are a headache for MySQL but relatively easy for Hive. What is missing is a bridge for moving data between the two.

Cloudera (cloudera.com), a strong supporter of Hadoop, provides that bridge with Sqoop. As its name suggests, Sqoop stands for SQL-to-Hadoop. It abstracts the various database types through its ManagerFactory abstract class, so that data in databases such as HSQLDB, MySQL, Oracle, and PostgreSQL can be written to Hive.
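The factory idea described above can be sketched roughly as follows. This is an illustrative Python analogue, not Sqoop's actual Java code: the class names other than ManagerFactory, the `dialect` method, and the URL-prefix lookup are assumptions made for the example.

```python
from typing import Dict, List, Optional, Type


class ConnManager:
    """Hypothetical base class: knows how to talk to one kind of database."""

    def __init__(self, connect_string: str) -> None:
        self.connect_string = connect_string

    def dialect(self) -> str:
        return "generic-jdbc"


class MySQLManager(ConnManager):
    def dialect(self) -> str:
        return "mysql"


class PostgresqlManager(ConnManager):
    def dialect(self) -> str:
        return "postgresql"


class ManagerFactory:
    """Abstract factory: return a manager for the connect string, or None."""

    def accept(self, connect_string: str) -> Optional[ConnManager]:
        raise NotImplementedError


class DefaultManagerFactory(ManagerFactory):
    # Illustrative mapping from JDBC URL prefix to manager class.
    _BY_PREFIX: Dict[str, Type[ConnManager]] = {
        "jdbc:mysql:": MySQLManager,
        "jdbc:postgresql:": PostgresqlManager,
    }

    def accept(self, connect_string: str) -> Optional[ConnManager]:
        for prefix, manager_cls in self._BY_PREFIX.items():
            if connect_string.startswith(prefix):
                return manager_cls(connect_string)
        return None  # let another factory try


def pick_manager(connect_string: str, factories: List[ManagerFactory]) -> ConnManager:
    """Ask each factory in turn; the first one that accepts wins."""
    for factory in factories:
        manager = factory.accept(connect_string)
        if manager is not None:
            return manager
    raise ValueError(f"no manager available for {connect_string!r}")


manager = pick_manager("jdbc:mysql://localhost/shop", [DefaultManagerFactory()])
print(manager.dialect())
```

The point of the design is that the rest of the tool only sees the generic manager interface, so supporting another database means adding a manager and a factory entry rather than touching the import logic.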

A single command can import or export all of the data, and both tables and rows can be filtered. Fast development and simple configuration are the hallmarks of this tool. Note that even with the same machine configuration, number of machines, data volume, and data content, execution efficiency still differs between environments. Migrating from an RDBMS to Hadoop improved performance, and that is where the value of Sqoop shows.
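As a rough illustration of the "one command" workflow, the sketch below assembles a `sqoop import` command line in Python and prints it without running it. The helper name `sqoop_import_to_hive`, the host `db.example.com`, the database `shop`, the table `orders`, and its columns are all made up for the example. The flags are taken from the Sqoop 1 user guide; check them against the version you have installed.

```python
import shlex
from typing import List, Optional


def sqoop_import_to_hive(
    jdbc_url: str,
    username: str,
    table: str,
    hive_table: Optional[str] = None,
    where: Optional[str] = None,
    columns: Optional[List[str]] = None,
    num_mappers: int = 4,
) -> List[str]:
    """Build the argument list for importing one MySQL table into Hive."""
    argv = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "-P",                      # prompt for the password on the console
        "--table", table,
        "--hive-import",           # load the imported data into Hive
        "--num-mappers", str(num_mappers),
    ]
    if hive_table:
        argv += ["--hive-table", hive_table]
    if columns:
        argv += ["--columns", ",".join(columns)]   # column filter
    if where:
        argv += ["--where", where]                 # row filter
    return argv


# Hypothetical connection details, for illustration only.
cmd = sqoop_import_to_hive(
    jdbc_url="jdbc:mysql://db.example.com/shop",
    username="etl",
    table="orders",
    columns=["id", "user_id", "amount"],
    where="created_at >= '2012-01-01'",
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell on a machine where Sqoop is installed, or the list can be handed to `subprocess.run(cmd, check=True)`. Building an argument list instead of a single string avoids quoting problems with the `--where` clause.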

Main Sqoop functions mentioned at a developer conference:
JDBC-based implementation: works with popular database vendors
Auto-generation of tedious user-side code: write MapReduce applications that work with your data, faster
Integration with Hive: lets you stay in a SQL-based environment
Extensible backend: database-specific code paths for better performance

Detailed user guide (official):
http://archive.cloudera.com/cdh/3/sqoop/SqoopUserGuide.html
