MySQL Releases Applier to Replicate Data to Hadoop in Real Time


MySQL replication copies data from one MySQL server (the master) to one or more other MySQL servers (the slaves). Imagine that the slave is no longer limited to a MySQL server but can be any other database server or platform, and that replication events must be applied in real time. Can this be done?

The MySQL Applier for Hadoop (Hadoop Applier), recently released by the MySQL team, aims to solve exactly this problem.

Purpose

For example, the slave in this scenario might be a data warehouse system such as Apache Hive, which uses the Hadoop Distributed File System (HDFS) as its storage layer. If you have a Hive metastore associated with HDFS, Hadoop Applier can populate Hive tables in real time: data is exported from MySQL to HDFS as text files and then loaded into Hive.

The operation is very simple. Run a HiveQL CREATE TABLE statement in Hive to define a table structure matching the MySQL table, then run Hadoop Applier to start real-time data replication.
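For illustration, such a table definition might look like the following sketch. The table and column names here are hypothetical, and the field delimiter must match whatever delimiter the Applier is configured to write:

```sql
-- Hypothetical Hive table mirroring a MySQL table `employees(id INT, name VARCHAR(64))`.
-- The table is stored as delimited text so that the Applier's
-- comma-separated output files can be read directly by Hive.
CREATE TABLE employees (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
```

Because Hive reads the text files in place, rows written by the Applier become visible to HiveQL queries without a separate load step.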

Advantages

Before Hadoop Applier, no tool provided real-time transfer. The usual solution was to export data to HDFS with Apache Sqoop. Although Sqoop can transfer data in bulk, the import must be re-run repeatedly to keep the data up to date, and transferring large amounts of data slows down other queries. Moreover, even if only a single row changes, Sqoop may still take a long time to reload the data.

Hadoop Applier, by contrast, reads the binary log, applies only the events committed on the MySQL server, and inserts the corresponding data. It requires no batch transfers, which makes the operation faster and avoids slowing down other queries.
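Since the Applier works by reading the master's binary log, binary logging must be enabled on the MySQL server. A minimal my.cnf sketch is shown below; the option names are standard MySQL settings, and row-based logging is assumed here so that the log events carry the actual inserted row data:

```ini
[mysqld]
# Binary logging must be on for the Applier to see change events.
log-bin       = mysql-bin
server-id     = 1
# Row-based events include the inserted values themselves (assumed here).
binlog_format = ROW
```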

Implementation

Hadoop Applier uses the API provided by libhdfs, a C library for operating on files in HDFS. The real-time import process works as follows:

Each database is mapped to a separate directory, and its tables are mapped to subdirectories under the Hive data warehouse directory. The data inserted into each table is written to a text file (such as datafile1.txt) in that directory, with fields separated by commas or another symbol (configurable on the command line).
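Assuming a default warehouse path, the mapping described above would produce a layout like the following (the warehouse path, database, and table names here are hypothetical):

```
/user/hive/warehouse/        <- Hive data warehouse directory
    mydb.db/                 <- one directory per replicated database
        mytable/             <- one subdirectory per table
            datafile1.txt    <- comma-separated rows inserted into the table
```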

Details: MySQL Applier for Hadoop

Download: mysql-hadoop-applier-0.1.0-alpha.tar.gz (alpha release, not suitable for production environments)
