Members of the Hadoop family

Source: Internet
Author: User
Tags: serialization, zookeeper, sqoop

Family members of Hadoop: Hive, HBase, ZooKeeper, Avro, Pig, Ambari, Sqoop, Mahout, Chukwa.

Hive: A data warehouse tool based on Hadoop. It maps structured data files to database tables and lets you run simple MapReduce statistics quickly through SQL-like statements, so it is well suited to statistical analysis in a data warehouse without developing a dedicated MapReduce application.

Pig: A large-scale data analysis tool based on Hadoop. It provides an SQL-like language called Pig Latin, whose compiler converts data analysis requests into a series of optimized MapReduce operations.

HBase: A highly reliable, high-performance, column-oriented, scalable, distributed storage system. HBase can be used to build large structured storage clusters on inexpensive PC servers.

Sqoop: A tool for transferring data between Hadoop and relational databases. Data from a relational database (MySQL, Oracle, Postgres, etc.) can be imported into Hadoop's HDFS, and HDFS data can be exported into a relational database.

ZooKeeper: A distributed, open source coordination service designed for distributed applications. It is mainly used to solve data management problems that distributed applications often encounter, simplifies their coordination and management, and provides high-performance distributed services.

Mahout: A distributed framework for machine learning and data mining based on Hadoop. Mahout implements some data mining algorithms with MapReduce, which addresses the problem of parallel mining.

Avro: A data serialization system designed to support data-intensive, high-volume data interchange applications. Avro is a newer data serialization format and transmission tool that is intended to gradually replace Hadoop's existing IPC mechanism.

Ambari: A web-based tool that supports the provisioning, management, and monitoring of Hadoop clusters.

Chukwa: An open source data collection system for monitoring large distributed systems. It collects various types of data into files suitable for Hadoop processing and stores them in HDFS for various MapReduce operations.
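
The Hive and Pig descriptions above both rest on one idea: an SQL-like aggregation is turned into map and reduce steps. The sketch below is a rough, single-process illustration of that idea in plain Python, roughly what a query such as SELECT page, COUNT(*) ... GROUP BY page boils down to. It is not how Hive or Pig are actually implemented, and the sample rows, column names, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for a structured data file.
ROWS = [
    {"page": "/home", "user": "a"},
    {"page": "/cart", "user": "b"},
    {"page": "/home", "user": "c"},
]


def map_phase(rows, key_column):
    """Emit one (key, 1) pair per row, like a mapper for COUNT(*)."""
    for row in rows:
        yield row[key_column], 1


def shuffle_phase(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values for each key to get the per-group count."""
    return {key: sum(values) for key, values in groups.items()}


def count_group_by(rows, key_column):
    """Chain the three phases into one GROUP BY / COUNT(*) result."""
    return reduce_phase(shuffle_phase(map_phase(rows, key_column)))


if __name__ == "__main__":
    print(count_group_by(ROWS, "page"))
```

With the sample rows, the result maps "/home" to 2 and "/cart" to 1. On a real cluster the map and reduce phases run in parallel across many machines and the shuffle moves data over the network; the point of a tool like Hive is that the user writes only the query and never this plumbing.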
