1. HDFS (Hadoop Distributed File System)
1.1, NameNode (name node)
The master daemon of HDFS
Records how files are partitioned into blocks and on which nodes those data blocks are stored
Centrally manages memory and I/O
Is a single point of failure: if it goes down, the whole cluster becomes unavailable
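
The bookkeeping described above can be sketched roughly as follows. This is an illustration only, not Hadoop code: the function names are hypothetical, the 64 MB block size and replication factor of 3 are assumed defaults, and the round-robin placement is a simplification of the placement policy HDFS actually uses.

```python
# Hypothetical sketch of NameNode-style bookkeeping; not part of any Hadoop API.
BLOCK_SIZE = 64 * 1024 * 1024  # assumed block size (64 MB)
REPLICATION = 3                # assumed replication factor


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return (offset, length) pairs that cover a file of file_size bytes."""
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


def place_blocks(blocks, datanodes, replication=REPLICATION):
    """Map each block index to `replication` distinct DataNodes (round-robin)."""
    if replication > len(datanodes):
        raise ValueError("not enough DataNodes for the requested replication")
    placement = {}
    for i in range(len(blocks)):
        placement[i] = [datanodes[(i + r) % len(datanodes)]
                        for r in range(replication)]
    return placement


# A 150 MB file becomes two full blocks plus one 22 MB block.
blocks = split_into_blocks(150 * 1024 * 1024)
block_map = place_blocks(blocks, ["dn1", "dn2", "dn3", "dn4"])
```

Here block_map plays the role of the metadata the NameNode keeps: which blocks make up a file and which nodes hold each copy.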
1.2, SecondaryNameNode (auxiliary name node): after a NameNode failure, it can be configured manually to recover the cluster
Auxiliary daemon that monitors HDFS status
Each cluster has one
Communicates with the NameNode to save HDFS metadata snapshots on a regular basis
If the NameNode fails, its snapshots can be used to bring up a backup NameNode (the switch is manual, not automatic)
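
As a rough illustration of what a periodic metadata snapshot involves, the sketch below replays a log of edits on top of an older snapshot to produce a fresh one. The data layout (a dict of path to block IDs) and the operation names are assumptions made for this example, not the real on-disk formats.

```python
def apply_edits(image, edits):
    """Return a new snapshot: the old image with logged edits replayed in order."""
    snapshot = dict(image)  # copy, so the previous snapshot stays intact
    for op, path, block_ids in edits:
        if op == "create":
            snapshot[path] = block_ids
        elif op == "delete":
            snapshot.pop(path, None)
    return snapshot


# Hypothetical metadata: file path -> list of block IDs.
image = {"/a.txt": [0, 1]}
edits = [("create", "/b.txt", [2]), ("delete", "/a.txt", None)]
checkpoint = apply_edits(image, edits)
```

Keeping the old image untouched means a failed merge never damages the last good snapshot.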
1.3, DataNode (data node)
Each slave server runs one
Responsible for reading and writing HDFS data blocks on the local file system
2. MapReduce
2.1, JobTracker (job tracker)
Background daemon that processes jobs (user-submitted code)
Decides which files take part in processing, then splits the job into tasks and assigns them to nodes
Monitors tasks and restarts failed tasks
Each cluster has exactly one JobTracker, located on the master node
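
The split, assign, monitor, and restart cycle described above can be sketched as a small scheduling loop. Everything here is hypothetical (run_job, flaky_execute, the tracker names, and the limit of 4 attempts are assumptions for illustration); a real JobTracker also takes data locality and heartbeats into account.

```python
from collections import deque


def run_job(splits, trackers, execute, max_attempts=4):
    """Run one task per input split, assigning round-robin and requeuing failures."""
    pending = deque((split, 1) for split in splits)
    results = {}
    turn = 0
    while pending:
        split, attempt = pending.popleft()
        tracker = trackers[turn % len(trackers)]
        turn += 1
        try:
            results[split] = execute(tracker, split)
        except RuntimeError:
            if attempt >= max_attempts:
                raise  # give up: the whole job fails
            pending.append((split, attempt + 1))  # restart the failed task later
    return results


failed_once = set()


def flaky_execute(tracker, split):
    """Simulated task that fails the first time it processes 'part-1'."""
    if split == "part-1" and split not in failed_once:
        failed_once.add(split)
        raise RuntimeError("simulated task failure")
    return f"{split} done on {tracker}"


results = run_job(["part-0", "part-1", "part-2"], ["tt1", "tt2"], flaky_execute)
```

The failed task goes to the back of the queue, so the other tasks are not held up while it waits for a retry.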
2.2, TaskTracker (task tracker)
Located on the slave nodes, running alongside the DataNode
Manages the tasks on its own node (assigned by the JobTracker)
Each node has only one TaskTracker, but each TaskTracker can start multiple JVMs to run map or reduce tasks in parallel
Interacts with the JobTracker
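
To make the idea of one TaskTracker running several tasks in parallel concrete, here is a minimal word-count sketch. It uses a thread pool as a stand-in for the separate JVMs, and MAP_SLOTS, map_task, and reduce_task are names assumed for this example rather than Hadoop settings or APIs.

```python
from concurrent.futures import ThreadPoolExecutor

MAP_SLOTS = 2  # assumed number of tasks this node may run at once


def map_task(line):
    """Map step: count the words in one input line."""
    counts = {}
    for word in line.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def reduce_task(partials):
    """Reduce step: sum the per-line counts into one total per word."""
    total = {}
    for partial in partials:
        for word, n in partial.items():
            total[word] = total.get(word, 0) + n
    return total


lines = ["hdfs stores blocks", "mapreduce reads blocks", "hdfs hdfs"]

# At most MAP_SLOTS map tasks run concurrently, like slots on a TaskTracker.
with ThreadPoolExecutor(max_workers=MAP_SLOTS) as pool:
    partials = list(pool.map(map_task, lines))

word_counts = reduce_task(partials)
```

Capping the pool size mirrors the way a node limits how many tasks it runs at the same time.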
Master: the server running the NameNode, SecondaryNameNode, and JobTracker
Slave: the servers running a DataNode and a TaskTracker
Hadoop HDFS & MapReduce Core Concepts