This article describes the building blocks of Hadoop.
The overall Hadoop architecture is a distributed master/slave architecture consisting of a set of daemons running across the hosts of a cluster. The daemons are: NameNode, DataNode, Secondary NameNode, JobTracker, and TaskTracker.
The NameNode, DataNode, and Secondary NameNode are storage daemons, while the JobTracker and TaskTracker are computation daemons.
NameNode:
The NameNode is the master node of the Hadoop Distributed File System (HDFS). It does not perform I/O itself; instead, it delegates those tasks to the DataNodes it manages. The NameNode keeps the file system's metadata in memory.
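The division of labor can be seen from the HDFS Java API. The following is a minimal sketch (not an official example; the path /user/demo/sample.txt is a hypothetical placeholder) in which the client asks the NameNode where a file's blocks live. The metadata lookup is answered from the NameNode's in-memory state, while the actual bytes would later be fetched from DataNodes:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationLookup {
    public static void main(String[] args) throws Exception {
        // Connect to the cluster; the filesystem URI comes from the
        // core-site.xml on the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical path -- replace with a file that exists in your HDFS.
        Path file = new Path("/user/demo/sample.txt");

        // These calls are answered by the NameNode from its in-memory
        // metadata; no file data is transferred.
        FileStatus status = fs.getFileStatus(file);
        BlockLocation[] blocks =
                fs.getFileBlockLocations(status, 0, status.getLen());

        // The actual bytes of each block would be read from these DataNodes.
        for (BlockLocation block : blocks) {
            System.out.println("offset " + block.getOffset()
                    + " hosted on " + String.join(",", block.getHosts()));
        }
    }
}
```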
DataNode:
A DataNode is a slave node of HDFS. It performs the actual reading and writing of HDFS blocks (a large file is split into fixed-size HDFS blocks) and continuously reports its status to the NameNode.
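To make the block splitting concrete, here is a sketch (again with a hypothetical output path, and parameter values chosen only for illustration) that writes a file large enough to span several blocks. As the stream fills a block, the client pushes it to a pipeline of DataNodes; the NameNode only records which DataNodes hold which block:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteWithBlockSize {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical output path.
        Path out = new Path("/user/demo/blocks.bin");

        // Create the file with an explicit replication factor (3 copies)
        // and block size (64 MB, a common default in older Hadoop releases).
        FSDataOutputStream stream = fs.create(
                out,
                true,               // overwrite if it exists
                4096,               // I/O buffer size in bytes
                (short) 3,          // replication factor
                64L * 1024 * 1024); // block size in bytes
        try {
            // Write ~256 MB so the file occupies several 64 MB blocks,
            // each stored (and replicated) on DataNodes.
            byte[] chunk = new byte[4096];
            for (int i = 0; i < 64 * 1024; i++) {
                stream.write(chunk);
            }
        } finally {
            stream.close();
        }
    }
}
```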
Secondary NameNode:
The Secondary NameNode is a helper daemon that monitors the state of the HDFS cluster. Unlike the NameNode, it does not receive or record any real-time changes to HDFS. Instead, it communicates only with the NameNode, periodically collecting snapshots of the HDFS state; these snapshots are used primarily to restore the file system when the NameNode fails.
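How often those snapshots are taken is configurable. As a minimal sketch, assuming a Hadoop 1.x cluster (the same generation that uses the JobTracker), the relevant properties can be read back from the loaded configuration like this:

```java
import org.apache.hadoop.conf.Configuration;

public class CheckpointSettings {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // In Hadoop 1.x the checkpoint interval is controlled by
        // fs.checkpoint.period, in seconds; 3600 (one hour) is the
        // usual default.
        long periodSeconds = conf.getLong("fs.checkpoint.period", 3600);

        // A checkpoint is also triggered once the NameNode's edit log
        // grows past fs.checkpoint.size bytes (64 MB by default).
        long editLogTrigger =
                conf.getLong("fs.checkpoint.size", 64L * 1024 * 1024);

        System.out.println("Checkpoint every " + periodSeconds
                + " s, or when the edit log exceeds "
                + editLogTrigger + " bytes.");
    }
}
```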
JobTracker:
The JobTracker is the liaison between our applications and Hadoop. When we submit code to the Hadoop cluster, the JobTracker determines the execution plan: it decides which files to process, assigns tasks to nodes (more precisely, it assigns them to TaskTrackers, which then run them), and monitors all running tasks. This daemon typically runs on a master node of the cluster.
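What "submitting code" looks like in practice is shown below, as a sketch using the classic (JobTracker-era) org.apache.hadoop.mapred API. The input and output paths are hypothetical, and the pass-through IdentityMapper/IdentityReducer classes are used only so the job is complete; the point is the final runJob call, which hands the configured job to the JobTracker:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

public class SubmitToJobTracker {
    public static void main(String[] args) throws Exception {
        JobConf job = new JobConf(SubmitToJobTracker.class);
        job.setJobName("identity-demo");

        // Which files to process: the JobTracker derives input splits
        // from these paths when it builds the execution plan.
        FileInputFormat.setInputPaths(job, new Path("/user/demo/input"));
        FileOutputFormat.setOutputPath(job, new Path("/user/demo/output"));

        // Pass-through map and reduce stages; a real job would supply
        // its own Mapper and Reducer implementations here.
        job.setMapperClass(IdentityMapper.class);
        job.setReducerClass(IdentityReducer.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        // runJob hands the job to the JobTracker, which assigns map and
        // reduce tasks to TaskTrackers and monitors them to completion.
        JobClient.runJob(job);
    }
}
```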