Facebook's system architecture

Source: Internet
Author: User

Based on my current reading and conversations, I understand that Facebook is structured as follows:

The web front end is written in PHP. Facebook's HipHop converts the PHP into C++ and compiles it with g++, which provides high performance for the template and web logic/business layers.

Business logic is exposed as services using Thrift. Depending on requirements, these services are implemented in PHP, C++, or Java (some other languages are also used...).

Services written in Java don't use any enterprise application server; instead they run on Facebook's own custom application server. This may look like reinventing the wheel, but since these services are exposed only via Thrift (the vast majority of them, anyway), Tomcat is too heavyweight, and even Jetty adds more than what Facebook needs.

Persistence is handled by MySQL, Memcached, Facebook's Cassandra, and Hadoop's HBase [5]. Memcached serves as an in-memory cache in front of MySQL. Facebook engineers admit that their use of Cassandra is declining; they prefer HBase for its simpler consistency model and its MapReduce capabilities.
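The Memcached-in-front-of-MySQL read path described above is the classic look-aside cache pattern. A minimal sketch, with plain dicts standing in for memcached and for a MySQL row lookup (the key and data are made up for illustration):

```python
# Look-aside caching: try the cache first, fall back to the database,
# then populate the cache so later reads are served from memory.

cache = {}                       # stands in for memcached
database = {"user:1": "Alice"}   # stands in for a MySQL row lookup

def get_user(key):
    """Cache-aside read path."""
    value = cache.get(key)
    if value is not None:
        return value             # cache hit: no database round trip
    value = database.get(key)    # cache miss: read from the database
    if value is not None:
        cache[key] = value       # populate the cache for later reads
    return value
```

The corresponding write path typically updates MySQL and invalidates (rather than updates) the cached entry, so stale data expires on the next read.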

Offline processing is done with Hadoop and Hive.
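The map/reduce style of offline processing that Hadoop provides can be sketched in-process in a few lines (pure Python; no Hadoop API is used, and the word-count job is just an illustrative example):

```python
# Toy map/reduce: the map phase emits (key, 1) pairs from raw records,
# the reduce phase sums counts per key. Hadoop runs the same shape of
# computation distributed across a cluster.
from collections import defaultdict

def map_phase(lines):
    for line in lines:
        for word in line.split():
            yield (word, 1)

def reduce_phase(pairs):
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

logs = ["click photo", "click feed", "photo"]
counts = reduce_phase(map_phase(logs))
```

Hive layers a SQL-like query language on top of jobs like this, so analysts don't have to write the map and reduce functions by hand.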

Log, click, and feed data are collected with Scribe and aggregated into HDFS via Scribe-HDFS, which allows extended analysis using MapReduce.
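The essence of that aggregation pattern is buffering many small log entries and flushing them to durable storage in bulk. A rough sketch, with a plain list standing in for an HDFS file (the class and threshold are illustrative, not Scribe's actual design):

```python
# Scribe-style aggregation: clients append entries to a local buffer,
# which is flushed to durable storage in bulk once it grows large enough.

class LogAggregator:
    def __init__(self, flush_threshold=3):
        self.buffer = []
        self.store = []                  # stands in for an HDFS file
        self.flush_threshold = flush_threshold

    def log(self, entry):
        self.buffer.append(entry)
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        self.store.extend(self.buffer)   # one bulk write, not many small ones
        self.buffer.clear()
```

Batching like this matters because HDFS is optimized for large sequential writes, not for millions of tiny appends.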

BigPipe [8] is their custom technology for speeding up page rendering.
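The idea behind BigPipe is that instead of rendering the whole page and sending it at once, the server flushes a page skeleton immediately and then streams each "pagelet" as it becomes ready, so the browser can paint incrementally. A hedged sketch (the pagelet names and HTML shape are made up; this is the concept, not Facebook's implementation):

```python
# BigPipe-style streaming: yield HTML chunks as each pagelet finishes,
# rather than buffering the complete page.

def render_page(pagelets):
    """Yield the skeleton first, then one chunk per pagelet."""
    yield "<html><body><!-- skeleton flushed immediately -->"
    for name, render in pagelets:
        # each pagelet arrives as a self-contained chunk the client slots in
        yield f'<div id="{name}">{render()}</div>'
    yield "</body></html>"

chunks = list(render_page([
    ("newsfeed", lambda: "feed items"),
    ("chat", lambda: "chat roster"),
]))
```

In the real system the browser-side script places each arriving chunk into its slot, which is why slow pagelets don't block fast ones.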

Varnish Cache [9] serves as an HTTP proxy. They use it for its speed and efficiency.

The billions of photos uploaded by users are handled by Haystack, an ad-hoc storage solution Facebook developed itself, which applies low-level optimizations and append-only writes [11].
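The append-only idea can be illustrated with a toy model: photos are appended to one large store file, and an in-memory index maps each photo id to an (offset, length) pair, so a read needs no per-file filesystem metadata lookups. This is a sketch of the concept only, not Haystack's actual on-disk format:

```python
# Toy append-only blob store: a bytearray stands in for the large store
# file, and a dict is the in-memory index from photo id to location.

class AppendOnlyStore:
    def __init__(self):
        self.data = bytearray()   # stands in for the store file
        self.index = {}           # photo id -> (offset, length)

    def put(self, photo_id, blob):
        offset = len(self.data)
        self.data += blob         # append only; existing bytes never move
        self.index[photo_id] = (offset, len(blob))

    def get(self, photo_id):
        offset, length = self.index[photo_id]
        return bytes(self.data[offset:offset + length])
```

Because writes only ever append, the store avoids seek-heavy random writes, and a read is a single seek plus a single sequential read.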

Facebook Messages uses its own architecture, built on a dynamic cluster infrastructure. Business logic and persistence are encapsulated in so-called 'Cells'. Each Cell handles a portion of the users, and new Cells can be added as load grows. Persistent storage uses HBase.
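Partitioning users across Cells requires a stable mapping from user to Cell. The hash-modulo mapping below is an assumption for illustration (Facebook's actual user-to-Cell directory service is not public):

```python
# Illustrative Cell assignment: hash the user id and take it modulo the
# number of cells, so every request for a given user lands in the same cell.
import hashlib

def cell_for_user(user_id, num_cells):
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % num_cells
```

A real deployment would use a lookup/directory service rather than pure hashing, since naive modulo reshuffles most users whenever a Cell is added.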

The Facebook Messages search engine is built from an inverted index stored in HBase.
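An inverted index of that kind maps each term to the set of messages containing it. A minimal sketch, with a plain dict standing in for the HBase table (the sample messages are made up):

```python
# Minimal inverted index: term -> set of message ids containing the term.
from collections import defaultdict

def build_index(messages):
    index = defaultdict(set)
    for msg_id, text in messages.items():
        for term in text.lower().split():
            index[term].add(msg_id)
    return index

index = build_index({
    1: "lunch tomorrow",
    2: "lunch was great",
})
```

A search for a term is then a single lookup, and multi-term queries intersect the resulting id sets.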

As far as I know, the implementation details of Facebook's main search engine are currently unknown.

Typeahead search uses custom storage and retrieval logic.
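The article doesn't say how that custom logic works internally; a prefix trie is one common structure for this kind of as-you-type lookup. A sketch under that assumption (the class names and ranking-by-insertion-order are illustrative only):

```python
# Prefix trie for typeahead: each node caches the completions reachable
# through it, so a suggestion query is a single walk down the prefix.

class TrieNode:
    def __init__(self):
        self.children = {}
        self.names = []   # completions reachable through this node

class Typeahead:
    def __init__(self):
        self.root = TrieNode()

    def add(self, name):
        node = self.root
        for ch in name.lower():
            node = node.children.setdefault(ch, TrieNode())
            node.names.append(name)

    def suggest(self, prefix, limit=5):
        node = self.root
        for ch in prefix.lower():
            if ch not in node.children:
                return []
            node = node.children[ch]
        return node.names[:limit]
```

Caching candidates at every node trades memory for latency, which fits typeahead's requirement of answering on every keystroke.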

Chat is based on an epoll server developed in Erlang and accessed via Thrift.

Here is some information about the resources behind the components above (some figures are unknown):

Facebook estimates there are more than 60,000 servers [16]. Their latest data center in Prineville, Oregon, is based on fully custom designed hardware that was recently released as Open Compute.

300 TB of data is held in Memcached.

Their Hadoop and Hive cluster consists of 3,000 servers, each with eight cores, 32 GB of memory, and 12 TB of disk — in total, 24,000 CPU cores, 96 TB of memory, and 36 PB of disk.

100 billion daily hits, 50 billion photos, 3 trillion cached objects, and 130 TB of logs per day (July 2010 data).
