Facebook, the world's leading social networking site, has more than 300 million active users. About 30 million users update their status at least once a month, more than 1 billion photos and 10 million videos are uploaded every week, and roughly 1 billion pieces of content (blog posts, links, news items, and so on) are shared each week. The amount of data Facebook must store and process is therefore enormous: every day it adds about 4 TB of compressed data, scans 135 TB of data, runs more than 7,500 Hive jobs on the cluster, and consumes roughly 80,000 compute-hours. High-performance cloud platforms are thus critical for Facebook, which uses the Hadoop platform mainly for log processing, recommendation systems, and data warehousing.
Facebook stores its data in a data warehouse built on Hadoop/Hive. The warehouse has 4,800 cores and 5.5 PB of storage, each node can store 12 TB of data, and it uses a two-layer network topology, as shown in the figure below. Facebook's MapReduce cluster changes dynamically, based on the load and on the configuration information of the cluster nodes.
▲ Cluster network topology
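To show how such a two-layer (rack/node) topology is expressed in practice, here is a minimal sketch of the kind of hostname-to-rack mapping that Hadoop's rack awareness relies on. It is only an illustration: the "rackNN-nodeNN" naming scheme is an assumption, and a real cluster would normally supply this mapping through a topology script or a DNSToSwitchMapping implementation rather than code like this.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Minimal sketch of a two-layer (rack/node) topology mapping, assuming a
 * hypothetical "rackNN-nodeNN" hostname scheme. Hadoop normally obtains this
 * mapping from a topology script or a DNSToSwitchMapping implementation;
 * this class is only an illustration of the idea.
 */
public class RackMapper {

    /** Map a hostname such as "rack12-node07" to a rack path like "/rack12". */
    static String rackOf(String hostname) {
        int dash = hostname.indexOf('-');
        // Fall back to a default rack when the name does not match the scheme.
        return dash > 0 ? "/" + hostname.substring(0, dash) : "/default-rack";
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("rack01-node03", "rack02-node11", "gateway");
        for (String n : nodes) {
            System.out.println(n + " -> " + rackOf(n));
        }
    }
}
```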
The figure below shows Facebook's data warehouse architecture. Web servers and internal services generate log data, which Facebook collects with its open-source log collection system (Scribe); hundreds of log datasets are stored on NFS servers. Most of the log data is then copied into HDFS instances in the same data center, and the data stored in HDFS is loaded into a data warehouse built on Hive. Hive provides a SQL-like language that works with MapReduce to create and publish various summaries and reports and to run historical analyses over them. A browser-based interface on top of Hive lets users submit Hive queries. Oracle and MySQL databases are used to publish these summaries; they are relatively small in size but are queried frequently and require real-time response.
▲ Facebook data warehouse architecture
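The publish step of this architecture can be pictured with a short sketch: a SQL-like summary is computed in Hive (which runs as MapReduce underneath) and the small result set is copied into MySQL for low-latency serving. The host names, credentials, table names, and the query itself are hypothetical placeholders, and the Hive JDBC and MySQL Connector/J drivers are assumed to be on the classpath.

```java
import java.sql.*;

/**
 * Sketch of the publish path described above: run a SQL-like summary in Hive
 * and push the (small, frequently queried) result into MySQL for serving.
 * Host names, table names, and credentials are hypothetical placeholders.
 */
public class DailySummaryJob {
    public static void main(String[] args) throws Exception {
        // Register the HiveServer2 JDBC driver (assumed to be on the classpath).
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        try (Connection hive = DriverManager.getConnection(
                 "jdbc:hive2://hive-gateway:10000/default", "etl", "");
             Connection mysql = DriverManager.getConnection(
                 "jdbc:mysql://report-db:3306/reports", "report", "secret")) {

            // Aggregate one day of page-view logs in the warehouse (runs as MapReduce under Hive).
            String summarySql =
                "SELECT page, COUNT(*) AS views " +
                "FROM page_view_log WHERE dt = '2009-10-01' GROUP BY page";

            try (Statement hs = hive.createStatement();
                 ResultSet rs = hs.executeQuery(summarySql);
                 PreparedStatement ins = mysql.prepareStatement(
                     "INSERT INTO daily_page_views (page, views, dt) VALUES (?, ?, '2009-10-01')")) {
                while (rs.next()) {
                    ins.setString(1, rs.getString("page"));
                    ins.setLong(2, rs.getLong("views"));
                    ins.executeUpdate();  // the summary is small, so row-at-a-time is fine here
                }
            }
        }
    }
}
```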
Older data needs to be archived in time and moved to cheaper storage, as shown in the figure below. The following describes some of Facebook's work on AvatarNode and on scheduling strategies. AvatarNode is mainly used for HDFS recovery and startup. If HDFS crashes, recovery with the original mechanism takes 10-15 minutes to read the 12 GB NameNode image file and write it back, 20-30 minutes to process block reports from 2,000 DataNodes, and finally 40-60 minutes to recover the crashed NameNode and deploy the software. The table below summarizes the differences between BackupNode and AvatarNode. An AvatarNode starts up like an ordinary NameNode and processes all messages from the DataNodes. An AvatarDataNode is similar to a DataNode and supports multiple queues for multiple master nodes, but it cannot distinguish the primary from the backup. Manual failover is performed with the AvatarShell command-line tool, which carries out the failover and updates the ZooKeeper znode; the process is transparent to users. The distributed Avatar file system is implemented as a layer on top of the existing file system.
▲ Data archiving
Table: Differences between BackupNode and AvatarNode
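The role of the ZooKeeper znode in a failover can be illustrated with a small client-side sketch: the address of the currently active avatar is kept in a znode, so after AvatarShell switches the primary, a client picks up the new address simply by re-reading that znode. The znode path and ZooKeeper quorum below are hypothetical; in the real system the layered Avatar file system hides this lookup from users.

```java
import org.apache.zookeeper.ZooKeeper;

/**
 * Sketch of the client side of AvatarNode failover: the address of the
 * currently active (primary) avatar is read from a ZooKeeper znode, so a
 * failover performed via AvatarShell becomes visible to clients when the
 * znode is re-read. The znode path and quorum address are hypothetical.
 */
public class ActiveAvatarLookup {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 30_000,
                event -> { /* connection/state events ignored in this sketch */ });
        try {
            // The failover tool is assumed to write "host:port" of the primary here.
            byte[] data = zk.getData("/hdfs/avatar/primary", false, null);
            String active = new String(data, "UTF-8");
            System.out.println("Active NameNode (primary avatar): " + active);
        } finally {
            zk.close();
        }
    }
}
```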
The slot-based scheduling strategy has several problems in practice: a task that needs a lot of memory may be assigned to a TaskTracker with little memory; CPU resources are sometimes underutilized; and it is difficult to configure TaskTrackers for heterogeneous hardware. Facebook therefore uses a resource-aware fair-share scheduling strategy: the system is monitored in real time, CPU and memory usage are collected, and the scheduler analyzes real-time memory consumption and then shares memory fairly among tasks. It parses the process tree by reading the /proc/ directory, collects CPU and memory usage over the whole process tree, and then sends the information back through TaskCounters in the heartbeat.
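A minimal sketch of such a resource probe is shown below: it walks the /proc/ directory, rebuilds the process tree rooted at a task's process, and sums resident memory (VmRSS) over the tree. In Facebook's scheduler these figures would be reported back through TaskCounters in the heartbeat; this sketch only prints them, and it assumes a Linux /proc layout.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.*;

/**
 * Sketch of a /proc-based resource probe: rebuild the process tree and sum
 * resident memory (VmRSS) for a task's whole tree. Linux only; field names
 * follow /proc/<pid>/status. A real scheduler would also track CPU time and
 * ship the totals back to the master instead of printing them.
 */
public class ProcessTreeMemory {

    /** Parse "PPid:" and "VmRSS:" out of /proc/<pid>/status; returns null if unreadable. */
    static long[] ppidAndRssKb(int pid) {
        try {
            long ppid = -1, rssKb = 0;
            for (String line : Files.readAllLines(Paths.get("/proc/" + pid + "/status"))) {
                if (line.startsWith("PPid:"))  ppid = Long.parseLong(line.replaceAll("\\D+", ""));
                if (line.startsWith("VmRSS:")) rssKb = Long.parseLong(line.replaceAll("\\D+", ""));
            }
            return new long[] { ppid, rssKb };
        } catch (IOException | NumberFormatException e) {
            return null;  // the process may have exited between listing and reading
        }
    }

    /** Sum VmRSS (in kB) over rootPid and all of its descendants. */
    static long treeRssKb(int rootPid) {
        File[] entries = new File("/proc").listFiles();
        if (entries == null) return 0;  // not a Linux /proc layout
        Map<Long, List<Integer>> children = new HashMap<>();
        Map<Integer, Long> rss = new HashMap<>();
        for (File f : entries) {
            if (!f.getName().matches("\\d+")) continue;
            int pid = Integer.parseInt(f.getName());
            long[] info = ppidAndRssKb(pid);
            if (info == null) continue;
            children.computeIfAbsent(info[0], k -> new ArrayList<>()).add(pid);
            rss.put(pid, info[1]);
        }
        long total = 0;
        Deque<Integer> stack = new ArrayDeque<>(Collections.singletonList(rootPid));
        while (!stack.isEmpty()) {
            int pid = stack.pop();
            total += rss.getOrDefault(pid, 0L);
            stack.addAll(children.getOrDefault((long) pid, Collections.emptyList()));
        }
        return total;
    }

    public static void main(String[] args) {
        int pid = args.length > 0 ? Integer.parseInt(args[0]) : 1;
        System.out.println("RSS of process tree rooted at " + pid + ": " + treeRssKb(pid) + " kB");
    }
}
```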
Facebook's data warehouse uses Hive, whose architecture is shown in the figure below; Hive's query language is covered in Chapter 11. Here HDFS supports three file formats: text files (TextFile), which are easy for other applications to read and write; sequence files (SequenceFile), which only Hadoop can read and which support block compression; and RCFile, which stores data in blocks like a sequence file but lays out each block column by column, giving better compression and query performance. Facebook plans further improvements to Hive to support new features such as indexes, views, and subqueries.
▲ Hive architecture
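How a table ends up in one of these three formats can be shown with a short HiveQL sketch driven over JDBC: the on-disk format is chosen per table with the STORED AS clause. The connection URL and table schema are hypothetical placeholders, as in the warehouse example earlier.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

/**
 * Sketch showing how the three storage formats above are selected per table
 * in HiveQL via the STORED AS clause. The connection URL and table layout
 * are hypothetical; the DDL is issued through the Hive JDBC driver.
 */
public class StorageFormatDemo {
    public static void main(String[] args) throws Exception {
        String[] formats = { "TEXTFILE", "SEQUENCEFILE", "RCFILE" };
        try (Connection hive = DriverManager.getConnection(
                 "jdbc:hive2://hive-gateway:10000/default", "etl", "");
             Statement st = hive.createStatement()) {
            for (String fmt : formats) {
                // Same logical schema, different on-disk layout and compression behavior.
                st.execute("CREATE TABLE IF NOT EXISTS clicks_" + fmt.toLowerCase() +
                           " (user_id BIGINT, url STRING, ts TIMESTAMP)" +
                           " STORED AS " + fmt);
            }
        }
    }
}
```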
The challenges Facebook currently faces with Hadoop include:
• Quality of service and isolation: large jobs can degrade the performance of the whole cluster;
• Security: what to do if a software vulnerability corrupts the NameNode transaction log;
• Data archiving: how to choose which data to archive, and how to archive it;
• Performance improvement: how to resolve bottlenecks effectively.
Author: Lu Jiaheng, author of Hadoop in Action (Hadoop实战), associate professor at Renmin University of China, Ph.D., and former postdoctoral fellow at the University of California, Irvine.