1) NameNode, DataNode and client
The NameNode can be regarded as the manager of the distributed file system. It is mainly responsible for managing the file system namespace, the cluster configuration information, and the replication of storage blocks. The NameNode keeps the file system's metadata in memory, which mainly includes the file information, the list of blocks that make up each file, and the DataNodes on which each block is stored.
The DataNode is the basic unit of file storage. It stores blocks in its local file system, keeps the metadata of those blocks, and periodically reports all the blocks it currently holds to the NameNode.
The client is the application that needs to access files in the distributed file system.
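The metadata described above can be pictured as two in-memory maps: one from a file to its blocks, and one from a block to the DataNodes that reported it. The following is only a minimal sketch under that assumption. The class and method names (NameNodeMetadataSketch, addFile, receiveBlockReport, locate) are hypothetical and are not taken from the Hadoop code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the NameNode's in-memory metadata.
// None of these names come from Hadoop itself.
class NameNodeMetadataSketch {
    // file path -> ordered list of block IDs that make up the file
    private final Map<String, List<Long>> fileToBlocks = new HashMap<>();
    // block ID -> DataNodes that have reported holding a copy of the block
    private final Map<Long, Set<String>> blockToDataNodes = new HashMap<>();

    void addFile(String path, List<Long> blockIds) {
        fileToBlocks.put(path, new ArrayList<>(blockIds));
    }

    // Called when a DataNode sends its periodic report of all blocks it holds.
    void receiveBlockReport(String dataNode, List<Long> blockIds) {
        // Forget what this DataNode reported earlier, then record the new report.
        for (Set<String> holders : blockToDataNodes.values()) {
            holders.remove(dataNode);
        }
        for (Long id : blockIds) {
            blockToDataNodes.computeIfAbsent(id, k -> new HashSet<>()).add(dataNode);
        }
    }

    // For each block of the file, in order, the DataNodes known to hold it.
    List<Set<String>> locate(String path) {
        List<Set<String>> result = new ArrayList<>();
        List<Long> blocks = fileToBlocks.get(path);
        if (blocks == null) {
            return result; // unknown file: no locations
        }
        for (Long id : blocks) {
            Set<String> holders = blockToDataNodes.get(id);
            result.add(holders == null
                    ? Collections.<String>emptySet()
                    : new HashSet<>(holders));
        }
        return result;
    }

    public static void main(String[] args) {
        NameNodeMetadataSketch nn = new NameNodeMetadataSketch();
        nn.addFile("/logs/a.txt", Arrays.asList(1L, 2L));
        nn.receiveBlockReport("dn1", Arrays.asList(1L, 2L));
        nn.receiveBlockReport("dn2", Arrays.asList(2L));
        System.out.println(nn.locate("/logs/a.txt"));
    }
}
```

Because the block-to-DataNode map is rebuilt from the reports, it does not need to be written to disk in this sketch. A DataNode that stops reporting a block simply drops out of that block's holder set at its next report.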
2) File write
The client sends a request to the NameNode to write a file.
Based on the file size and the configured block size, the NameNode returns to the client information about a subset of the DataNodes it manages.
The client splits the file into blocks and, using the DataNode addresses it received, writes the blocks to the DataNodes in sequence.
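The write steps above can be sketched roughly as follows. This is an illustration only: the class FileWriteSketch and its methods are hypothetical, the round-robin choice of DataNodes is an assumed placeholder policy, and replication is left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the write path; not Hadoop code.
class FileWriteSketch {
    // Number of blocks needed for a file: ceiling of fileSize / blockSize.
    static int blockCount(long fileSize, long blockSize) {
        return (int) ((fileSize + blockSize - 1) / blockSize);
    }

    // "NameNode" side: choose one target DataNode per block.
    // Round-robin is an assumption made only for this example.
    static List<String> allocateTargets(long fileSize, long blockSize, List<String> dataNodes) {
        List<String> targets = new ArrayList<>();
        int n = blockCount(fileSize, blockSize);
        for (int i = 0; i < n; i++) {
            targets.add(dataNodes.get(i % dataNodes.size()));
        }
        return targets;
    }

    // "Client" side: cut the file content into blocks of at most blockSize bytes.
    static List<byte[]> split(byte[] data, int blockSize) {
        List<byte[]> blocks = new ArrayList<>();
        for (int off = 0; off < data.length; off += blockSize) {
            int end = Math.min(data.length, off + blockSize);
            blocks.add(Arrays.copyOfRange(data, off, end));
        }
        return blocks;
    }

    public static void main(String[] args) {
        byte[] file = new byte[10];
        List<String> targets = allocateTargets(file.length, 4, Arrays.asList("dn1", "dn2"));
        List<byte[]> blocks = split(file, 4);
        // The client writes the blocks one after another to the assigned DataNodes.
        for (int i = 0; i < blocks.size(); i++) {
            System.out.println("block " + i + " (" + blocks.get(i).length
                    + " bytes) -> " + targets.get(i));
        }
    }
}
```

With a 10-byte file and a 4-byte block size, the file needs three blocks, and the last block is shorter than the others.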
3) File read
The client sends a request to the NameNode to read a file.
The NameNode returns information about the DataNodes on which the file's blocks are stored.
The client reads the file data from those DataNodes.
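A minimal sketch of the read steps above, assuming the NameNode answers with an ordered list of (block, DataNode) pairs. The types FileReadSketch, DataNodeStub and BlockLocation are hypothetical stand-ins, and the "cluster" is just an in-memory map rather than real network nodes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the read path; not Hadoop code.
class FileReadSketch {
    // Stand-in for a DataNode: block ID -> block content.
    static class DataNodeStub {
        final Map<Long, byte[]> blocks = new HashMap<>();
    }

    // Stand-in for one entry of the NameNode's answer: which DataNode holds which block.
    static class BlockLocation {
        final long blockId;
        final String dataNode;

        BlockLocation(long blockId, String dataNode) {
            this.blockId = blockId;
            this.dataNode = dataNode;
        }
    }

    // Client side: fetch every block from its DataNode, in order, and join them.
    static byte[] read(List<BlockLocation> locations, Map<String, DataNodeStub> cluster) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (BlockLocation loc : locations) {
            byte[] data = cluster.get(loc.dataNode).blocks.get(loc.blockId);
            out.write(data, 0, data.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        DataNodeStub dn1 = new DataNodeStub();
        DataNodeStub dn2 = new DataNodeStub();
        dn1.blocks.put(1L, "hello ".getBytes(StandardCharsets.UTF_8));
        dn2.blocks.put(2L, "hdfs".getBytes(StandardCharsets.UTF_8));
        Map<String, DataNodeStub> cluster = new HashMap<>();
        cluster.put("dn1", dn1);
        cluster.put("dn2", dn2);

        // Pretend this list is what the NameNode returned for the requested file.
        List<BlockLocation> locations = Arrays.asList(
                new BlockLocation(1L, "dn1"),
                new BlockLocation(2L, "dn2"));
        System.out.println(new String(read(locations, cluster), StandardCharsets.UTF_8));
    }
}
```

Note that the file content never passes through the NameNode in this sketch. The NameNode only supplies the locations, and the data comes from the DataNodes.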
----------------------------------------------------------------
Introduction to the communication methods:
In a Hadoop system, the master/slaves/client roles correspond to:
Master---NameNode;
Slaves---DataNode;
Client---DFSClient;
How do they communicate with each other? Here is a general overview.
In brief:
The client and the NameNode communicate through RPC;
The DataNode and the NameNode communicate through RPC;
The client and the DataNode communicate through a plain socket connection.
Just open the DFSClient code and you can see that it has a member variable public final ClientProtocol namenode;
Then open the DataNode code and you can see that it also has a member variable public DatanodeProtocol namenode;
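In both cases the member variable is typed as a protocol interface, so the caller invokes ordinary Java methods on it while the call is actually carried to the NameNode. The sketch below illustrates that idea with Java's standard dynamic proxy (java.lang.reflect.Proxy). It is an assumption-laden illustration: NameNodeProtocolSketch, FakeNameNode and connect are hypothetical names, and the handler forwards the call to a local object instead of serializing it over the network as a real RPC layer would.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "RPC through a protocol interface"; not Hadoop code.
class RpcProxySketch {
    // Much smaller stand-in for a protocol interface such as ClientProtocol.
    interface NameNodeProtocolSketch {
        String getBlockLocations(String path);
    }

    // Stand-in for the server-side implementation living in the NameNode.
    static class FakeNameNode implements NameNodeProtocolSketch {
        public String getBlockLocations(String path) {
            return path + " -> dn1,dn2";
        }
    }

    // Records which methods went through the proxy, for demonstration only.
    static final List<String> callLog = new ArrayList<>();

    // Returns an object that looks like the protocol interface to the caller.
    // A real RPC layer would send the method name and arguments over the
    // network here; this sketch simply forwards to a local object.
    static NameNodeProtocolSketch connect(final NameNodeProtocolSketch server) {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                callLog.add(method.getName());
                return method.invoke(server, args);
            }
        };
        return (NameNodeProtocolSketch) Proxy.newProxyInstance(
                NameNodeProtocolSketch.class.getClassLoader(),
                new Class<?>[] { NameNodeProtocolSketch.class },
                handler);
    }

    public static void main(String[] args) {
        // Plays the role of the "namenode" member variable held by the client.
        NameNodeProtocolSketch namenode = connect(new FakeNameNode());
        System.out.println(namenode.getBlockLocations("/logs/a.txt"));
        System.out.println("calls seen by the proxy: " + callLog);
    }
}
```

The calling code only depends on the interface, which is why the same field name and style can serve both the client (with a client-facing protocol) and the DataNode (with a DataNode-facing protocol).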
Introduction to collaboration and communication between NameNode, DataNode, and client in Hadoop