Common: In version 0.20 and earlier, it contained HDFS, MapReduce, and the code shared by the other projects. Starting with 0.21, HDFS and MapReduce were split into separate sub-projects, and the remaining content became Hadoop Common.
Avro: A new data serialization format and transport tool that will gradually replace Hadoop's original IPC mechanism.
MapReduce: A parallel computing framework. Versions before 0.20 use the old org.apache.hadoop.mapred interface; version 0.20 introduced the new org.apache.hadoop.mapreduce API.
HDFS: The Hadoop Distributed File System.
Pig: Big Data analytics platform that provides users with a variety of interfaces.
Hive: A data warehousing tool, contributed by Facebook.
HBase: A distributed NoSQL column database similar to Google BigTable. (HBase and Avro became top-level Apache projects in May [year missing].)
ZooKeeper: A distributed lock facility that provides functionality similar to Google Chubby, contributed by Yahoo.
Sqoop: A tool for transferring data between Hadoop and relational databases. It can import data from a relational database (for example MySQL, Oracle, or Postgres) into Hadoop's HDFS, and it can also export HDFS data into a relational database.
Oozie: Responsible for scheduling MapReduce jobs.
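
To make the MapReduce entry in the list above a little more concrete, here is a minimal sketch of the map, shuffle, and reduce stages as a word count in plain Python. It runs in a single process and does not use Hadoop or the org.apache.hadoop.mapreduce API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three stages in sequence (a real framework would run them in parallel)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "Hadoop processes data"]))
```

In a real cluster the map and reduce calls would be distributed across many machines, and the input lines would typically be read from HDFS rather than from an in-memory list.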
Big Data Core Technology