MySQL releases an applier to replicate data to Hadoop in real time

Last Update:2018-12-06 Source: Internet

Author: User


MySQL replication copies data from one MySQL server (the master) to one or more other MySQL servers (the slaves). Now imagine that the slave is no longer limited to a MySQL server but can be any other database server or platform, and that replication events must be applied in real time. Can this be implemented?

The MySQL Applier for Hadoop (Hadoop Applier), newly released by the MySQL team, aims to solve this problem.

Purpose

For example, the slave in a replication setup could be a data warehouse system such as Apache Hive, which uses the Hadoop Distributed File System (HDFS) as its data store. If you have a Hive metastore associated with HDFS, the Hadoop Applier can populate Hive tables in real time: data is exported from MySQL to HDFS as text files, which then populate the Hive tables.

The operation is simple. Run a HiveQL CREATE TABLE statement in Hive to define a table structure similar to the one in MySQL, and then run the Hadoop Applier to start real-time data replication.
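As a rough illustration of that first step, the Python sketch below builds a HiveQL CREATE TABLE statement from a list of MySQL column definitions. The type mapping, the helper name hive_create_table, and the example table are assumptions made for illustration; they are not part of the Hadoop Applier, and the exact table layout the applier expects should be checked against its own documentation.

```python
# Hypothetical mapping from a few MySQL column types to Hive types.
# This is an assumption for illustration, not an official conversion table.
MYSQL_TO_HIVE = {
    "int": "INT",
    "bigint": "BIGINT",
    "double": "DOUBLE",
    "varchar": "STRING",
    "text": "STRING",
}


def hive_create_table(table, columns, delimiter=","):
    """Build a HiveQL CREATE TABLE statement for a delimited text table.

    columns is a list of (name, mysql_type) pairs, e.g. ("id", "INT").
    """
    hive_columns = []
    for name, mysql_type in columns:
        # Drop any length suffix such as "(64)" before looking up the type.
        base_type = mysql_type.lower().split("(")[0].strip()
        hive_columns.append(f"{name} {MYSQL_TO_HIVE[base_type]}")
    return (
        f"CREATE TABLE {table} ({', '.join(hive_columns)}) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}' "
        "STORED AS TEXTFILE"
    )


# Example: a made-up "orders" table with two columns.
ddl = hive_create_table("orders", [("id", "INT"), ("customer", "VARCHAR(64)")])
print(ddl)
```

The delimiter passed here should match the field separator that the applier is configured to write.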

Advantages

Before the Hadoop Applier, no tool was available for real-time transfer. The previous solution was to export data to HDFS with Apache Sqoop. Although Sqoop can transfer data in batches, the import has to be repeated regularly to keep the data up to date. While a large amount of data is being transferred, other queries become slow. In addition, even if only one change has been made, Sqoop may take a long time to load the data.

The Hadoop Applier, by contrast, reads the binary log, applies only the events that occur on the MySQL server, and inserts the data. Because no batch transfer is needed, it works faster and does not affect the execution speed of other queries.
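The difference from a repeated batch import can be sketched in a few lines of Python. This is a simplified model, not the applier's code: an in-memory list stands in for the binary log, and the RowEvent class, its fields, and apply_new_events are hypothetical names. The point is that each pass handles only events newer than the last position already applied.

```python
from dataclasses import dataclass


@dataclass
class RowEvent:
    position: int  # hypothetical offset of the event in the log
    kind: str      # e.g. "insert"
    table: str
    row: tuple


def apply_new_events(log, last_position, sink):
    """Apply insert events newer than last_position and return the new position."""
    for event in log:
        if event.position <= last_position:
            continue  # already applied in an earlier pass
        if event.kind == "insert":
            sink.setdefault(event.table, []).append(event.row)
        last_position = event.position
    return last_position


log = [
    RowEvent(1, "insert", "orders", (1, "alice")),
    RowEvent(2, "insert", "orders", (2, "bob")),
]
sink = {}
pos = apply_new_events(log, 0, sink)

# A single new change arrives; only that one event is processed.
log.append(RowEvent(3, "insert", "orders", (3, "carol")))
pos = apply_new_events(log, pos, sink)
```

After the second call, only the third row has been added to the sink, whereas a batch re-import would have transferred all three rows again.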

Implementation

The applier uses the API provided by libhdfs, a C library for manipulating files in HDFS. The real-time import process works as follows:

Each database is mapped to a separate directory, and its tables are mapped to subdirectories within the Hive data warehouse directory. The data inserted into each table is written to a text file (such as datafile1.txt). Fields are separated by commas (,) or another delimiter, which can be configured on the command line.
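The mapping described above can be pictured with a small Python sketch. The warehouse root /user/hive/warehouse, the database and table names, and both helper functions are assumptions for illustration; the actual directory layout depends on the Hive configuration, and the applier itself writes through libhdfs, not Python.

```python
import posixpath


def table_directory(warehouse_dir, database, table):
    """Return the directory for a table: one directory per database, one subdirectory per table."""
    return posixpath.join(warehouse_dir, database, table)


def format_row(values, delimiter=","):
    """Render one inserted row as a delimited line of text."""
    return delimiter.join(str(value) for value in values)


# Hypothetical warehouse root, database, and table names.
directory = table_directory("/user/hive/warehouse", "shop", "orders")
path = posixpath.join(directory, "datafile1.txt")
line = format_row((1, "alice", 9.5))

print(path)
print(line)
```

Passing a different delimiter, for example format_row(row, delimiter="|"), mirrors the configurable field separator mentioned above.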

Details: MySQL Applier for Hadoop

Download: Mysql-hadoop-applier-0.1.0-alpha.tar.gz (alpha version, not for use in production environments)

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
