In the traditional client/server architecture, work is coordinated and dispatched by a global transaction manager, which makes the system tightly coupled.
Current distributed database systems adopt a client/server model built on middleware.
Architectures of distributed big data systems:
Master-slave: Bigtable, HBase
Peer-to-peer ring structure: Cassandra, Dynamo
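To make the ring structure concrete, here is a minimal sketch of how a key might be assigned to a node when every node is a peer and there is no master. It assumes a simple consistent-hashing scheme; the class name HashRing, the node names, and the use of MD5 are illustrative choices, not the actual implementation in Cassandra or Dynamo.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (MD5 chosen only for illustration)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy peer-to-peer ring: every node is equal, there is no master."""

    def __init__(self, nodes):
        # A sorted list of (position, node) pairs forms the ring.
        self._ring = sorted((_hash(n), n) for n in nodes)
        self._positions = [pos for pos, _ in self._ring]

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first node at or after the key's position,
        # wrapping around to the start of the ring if needed.
        idx = bisect.bisect_left(self._positions, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])  # hypothetical node names
owner = ring.node_for("user:42")
```

Any peer can compute node_for locally, so no central coordinator is needed to route a request. In a master-slave system, by contrast, that lookup typically goes through the master or through metadata it maintains.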
Glossary of big data terms:
HDFS: an implementation of GFS; its full name is the Hadoop Distributed File System. Like FAT32 or NTFS, it is a file system, and it serves as the underlying storage layer.
The parallel computing framework in the layer above it is MapReduce.
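The MapReduce programming model can be sketched in a few lines. The example below is a single-process imitation of the map, shuffle, and reduce phases applied to word counting; the function names are hypothetical and it does not use the Hadoop API, which distributes these phases across many machines.

```python
from collections import defaultdict


def map_phase(line):
    # Emit (word, 1) for every word in the input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Group all values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Sum the counts collected for one word.
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data on HDFS", "big tables on HDFS"])
```

Because each call to map_phase and reduce_phase depends only on its own input, a real framework is free to run them in parallel on different nodes.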
HBase, Bigtable: databases suited to storing unstructured data, organized by column rather than by row. Compared with an RDBMS, they are better suited to mass storage and real-time query processing, and they fit Internet-scale applications.
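The difference between column-based and row-based organization can be illustrated with a small in-memory sketch. The ColumnStore class and the "family:qualifier" column names below are assumptions made for illustration; real HBase and Bigtable also group columns into column families on disk and keep timestamped versions of each cell, which this sketch omits.

```python
from collections import defaultdict


class ColumnStore:
    """Toy column-oriented table: data is grouped by column, not by row."""

    def __init__(self):
        # column name -> {row key -> value}; absent cells take no space.
        self._columns = defaultdict(dict)

    def put(self, row_key, column, value):
        self._columns[column][row_key] = value

    def get_row(self, row_key):
        # Reassemble a row from every column that holds a cell for it.
        return {
            col: cells[row_key]
            for col, cells in self._columns.items()
            if row_key in cells
        }

    def scan_column(self, column):
        # Read a single column without touching the others.
        return dict(self._columns.get(column, {}))


store = ColumnStore()
store.put("row1", "info:name", "alice")
store.put("row1", "info:city", "Paris")
store.put("row2", "info:name", "bob")  # row2 simply has no city cell
```

Rows need not share the same set of columns, which is one reason this layout suits loosely structured data. Scanning one column also avoids reading the rest of each row.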
Hive and Pig: Hive uses a SQL-like language (HiveQL); Pig is similar but oriented more toward data-flow style query and analysis. Underneath, both are converted into MapReduce programs.
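As a rough illustration of what "converted into MapReduce" means, the sketch below expresses one SQL-like aggregation as a map step, a sort/shuffle step, and a reduce step. The visits table and the query are hypothetical, and this shows only the general shape of the translation, not how Hive's planner actually generates jobs.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical table and query, for illustration only:
#   SELECT city, COUNT(*) FROM visits GROUP BY city
visits = [
    {"user": "u1", "city": "Paris"},
    {"user": "u2", "city": "Lima"},
    {"user": "u3", "city": "Paris"},
]

# Map: emit (group-by key, 1) for each row.
mapped = [(row["city"], 1) for row in visits]

# Shuffle/sort: bring identical keys next to each other.
mapped.sort(key=itemgetter(0))

# Reduce: aggregate each key's values (COUNT(*) becomes a sum of ones).
result = {
    city: sum(value for _, value in group)
    for city, group in groupby(mapped, key=itemgetter(0))
}
```

The GROUP BY column becomes the map output key and the aggregate function becomes the reducer, which is why such queries map naturally onto MapReduce.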