1. The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://infolab.stanford.edu/~backrub/google.html
http://cs.ucsb.edu/~chong/250c/google.pdf
2. The Google File System
http://labs.google.com/papers/gfs.html
http://os.inf.tu-dresden.de/Studium/DOS/SS2010/04-GFS-2.pdf
3. MapReduce
http://labs.google.com/papers/mapreduce.html
http://labs.google.com/papers/mapreduce-osdi04.pdf
4. Bigtable
http://labs.google.com/papers/bigtable.html
5. Hadoop
http://hadoop.apache.org
Hadoop is a distributed system for massive data storage and computation, built on a shared-nothing architecture. It consists of several components, including HDFS, MapReduce, Hive, HBase, Pig, and ZooKeeper. HDFS is an open-source implementation of Google's GFS, ZooKeeper is the open-source counterpart of Google's Chubby, and HBase is the open-source counterpart of Google's Bigtable. HDFS offers high fault tolerance and scales out almost linearly as nodes are added.
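
To make the MapReduce programming model concrete, here is a minimal single-process sketch of a word count in plain Python. It does not use Hadoop or any of its APIs; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration. In a real cluster the map and reduce calls would run on many nodes, and the framework would perform the grouping step between them.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of strings."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["hadoop stores data", "hadoop processes data"]))
```

Because each map call looks at only one record and each reduce call at only one key, the work can be split across machines that share no state, which is the property a shared-nothing design relies on.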