Exploring the Technology Behind Facebook, the World's Largest PHP Site


During this year's Facebook F8 developer conference, 51CTO walked you through Facebook's latest open platform strategy and semantic search. Today we look at the software behind Facebook to see how one of the most visited sites in the world keeps its systems running reliably for 500 million users.

Facebook's Scalability Challenge

Before we get into the details, here are some figures that show the scale Facebook's software already has to handle:

Facebook serves 570 billion page views per month (according to Google Ad Planner)

Facebook holds more photos than all other photo sites combined (including sites like Flickr)

More than 3 billion photos are uploaded every month

Facebook's systems serve 1.2 million photos per second, not counting photos served by the CDN

More than 2.5 billion pieces of content (status updates, comments, etc.) are shared every month

Facebook has more than 30,000 servers (and this figure is from last year)
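To put the monthly figures above in perspective, they can be converted into rough per-second averages. The sketch below assumes a 30-day month, which is a simplification that the article does not state, and the helper name `per_second` is purely illustrative.

```python
# Rough per-second averages from the monthly figures quoted above.
# Assumption: a 30-day month (not stated in the article).
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000


def per_second(monthly_total: float) -> float:
    """Convert a monthly total into an average per-second rate."""
    return monthly_total / SECONDS_PER_MONTH


page_views_per_second = per_second(570e9)   # roughly 220,000
photo_uploads_per_second = per_second(3e9)  # roughly 1,160

print(f"{page_views_per_second:,.0f} page views per second")
print(f"{photo_uploads_per_second:,.0f} photo uploads per second")
```

These are averages only; real traffic has peaks well above the mean, which is what the systems actually have to be sized for.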

The software Facebook relies on to scale

In some ways Facebook is still a LAMP site, but it has had to go well beyond an ordinary LAMP stack, incorporating many other components and services and changing how it uses the existing ones.

For example:

Facebook still uses PHP, but it has built a compiler for it so that PHP can be turned into native code on its web servers, which improves performance.

Facebook uses Linux, but has optimized it for its own purposes, especially for network throughput.

Facebook uses MySQL, but primarily as a persistent key-value store. Joins and application logic run on the web servers, because they are easier to optimize there.

There are also custom-written systems, such as Haystack, a highly scalable object store used for Facebook's photos, and Scribe, a logging system that can operate at Facebook's scale.
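The idea of using the database as a key-value store and doing the "join" on the web server can be sketched as follows. This is a minimal illustration, not Facebook's code: an in-memory SQLite database stands in for MySQL, and the table layout, key format, and function names are all assumptions made for the example.

```python
import json
import sqlite3

# Stand-in for MySQL: one table used purely as a key-value store.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
db.executemany(
    "INSERT INTO kv VALUES (?, ?)",
    [
        ("user:1", json.dumps({"name": "Alice", "friend_ids": [2, 3]})),
        ("user:2", json.dumps({"name": "Bob", "friend_ids": [1]})),
        ("user:3", json.dumps({"name": "Carol", "friend_ids": [1]})),
    ],
)


def get(key: str):
    """Single-key lookup: the only kind of query the database sees."""
    row = db.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return json.loads(row[0]) if row else None


def friend_names(user_id: int) -> list[str]:
    """The 'join' happens here, in application code on the web server."""
    user = get(f"user:{user_id}")
    if user is None:
        return []
    friends = (get(f"user:{fid}") for fid in user["friend_ids"])
    return [f["name"] for f in friends if f is not None]


print(friend_names(1))
```

Because every database access is a lookup by primary key, each one is also easy to put behind a cache, which is where the next section picks up.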

Now let's look at the software used by the world's largest social networking site.

Memcached

Memcached is by now one of the best-known pieces of software on the Internet. It is a distributed in-memory caching system that Facebook uses as a caching layer between the web servers and the MySQL servers (because database access is relatively slow). Over the years, Facebook has made a number of optimizations to Memcached and some of the surrounding software, such as optimizing the network stack.

At any given moment, Facebook has 10 TB of data cached in Memcached across thousands of servers. It is probably the largest Memcached installation in the world.

HipHop for PHP

PHP, being a scripting language, is relatively slow compared with code that runs natively on the server. HipHop converts PHP into C++ code, which can then be compiled for better performance. Because Facebook relies heavily on PHP, this lets it get much more out of its web servers.

A small team of engineers at Facebook (initially just three people) spent 18 months developing HipHop, and it is now in production use.

Haystack

Haystack is Facebook's high-performance photo storage and retrieval system (strictly speaking it is an object store, so it is not limited to photos). It has a lot of work to do: there are more than 20 billion uploaded photos, and each one is saved in four different resolutions, which adds up to more than 80 billion stored photos.

It is not just a matter of being able to store billions of photos; performance is critical too. As mentioned earlier, Facebook serves around 1.2 million photos per second, a number that does not include photos served by the CDN. That is a staggering figure. For more on Facebook's photo storage, see 51CTO's earlier report "Facebook Image storage Architecture Technology full resolution".
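To make the term "object store" a little more concrete, here is a toy version that keeps every photo variant in one append-only blob and finds it again through an in-memory index. This is only a sketch of the general idea: the article does not describe Haystack's internals, and the class, the size names, and the in-memory buffer are all assumptions for illustration.

```python
import io

# Four hypothetical names for the four stored resolutions.
SIZES = ("thumbnail", "small", "medium", "large")


class ObjectStore:
    """Toy object store: one append-only blob plus an in-memory index."""

    def __init__(self) -> None:
        # Assumption: an in-memory buffer stands in for a large file on disk.
        self._blob = io.BytesIO()
        # (photo_id, size) -> (offset, length) inside the blob
        self._index: dict[tuple[int, str], tuple[int, int]] = {}

    def put(self, photo_id: int, size: str, data: bytes) -> None:
        offset = self._blob.seek(0, io.SEEK_END)  # always append at the end
        self._blob.write(data)
        self._index[(photo_id, size)] = (offset, len(data))

    def get(self, photo_id: int, size: str) -> bytes:
        offset, length = self._index[(photo_id, size)]
        self._blob.seek(offset)
        return self._blob.read(length)


store = ObjectStore()
for size in SIZES:
    store.put(1, size, f"photo-1-{size}".encode())

print(store.get(1, "small"))
```

A read here costs one index lookup and one contiguous read, which is the kind of property that matters when the request rate is measured in photos per second.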

BigPipe

BigPipe is a dynamic web page serving system developed by Facebook. Facebook uses it to serve each page in sections (called "pagelets") for optimal performance.

For example, the chat window is retrieved separately, the news feed is retrieved separately, and so on. These pagelets can be retrieved in parallel, which is where the performance gain comes from. It also means that users still get a partly working page even if some part of the site is disabled or broken.
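The two properties described above, parallel retrieval and tolerance of a broken section, can be sketched in a few lines. This is an illustration of the idea rather than BigPipe itself: the pagelet names and renderer functions are hypothetical, and a thread pool stands in for whatever mechanism the real system uses.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical pagelet renderers; each returns an HTML fragment.
def render_chat() -> str:
    return "<div id='chat'>chat</div>"


def render_news_feed() -> str:
    return "<div id='feed'>news feed</div>"


def render_ads() -> str:
    raise RuntimeError("ads backend is down")  # simulate a broken pagelet


PAGELETS = {"chat": render_chat, "feed": render_news_feed, "ads": render_ads}


def render_page() -> dict[str, str]:
    """Render all pagelets in parallel; a failing pagelet is left empty."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn) for name, fn in PAGELETS.items()}
    page = {}
    for name, future in futures.items():
        try:
            page[name] = future.result()
        except Exception:
            page[name] = ""  # the rest of the page is still delivered
    return page


page = render_page()
print(page)
```

The failure of `render_ads` is contained to its own entry, so the chat and news feed fragments are unaffected.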

Cassandra

Cassandra is a distributed storage system with no single point of failure. It is one of the flagship projects of the NoSQL movement and has been released as open source (it has even become an Apache project). Facebook uses it for its search feature.

Other services besides Facebook use it too, Digg for example. Twitter, however, has recently abandoned Cassandra. For more about Cassandra, see 51CTO's special topic "Towards freedom? Cassandra Database application Guide".

Scribe

Scribe is a flexible logging system that Facebook uses heavily internally. It is built to handle logging at Facebook's scale, and it automatically handles new log categories as they appear (Facebook has hundreds of them).
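"Automatically handles new log categories" means that nothing has to be registered before a new kind of message can be logged. A minimal sketch of that behavior follows; it does not use Scribe's actual API, and the class and method names are invented for the example.

```python
from collections import defaultdict


class CategoryLogger:
    """Toy log collector: a category exists as soon as something logs to it."""

    def __init__(self) -> None:
        self._streams = defaultdict(list)  # category -> list of messages

    def log(self, category: str, message: str) -> None:
        self._streams[category].append(message)  # no prior registration

    def categories(self) -> list[str]:
        return sorted(self._streams)

    def messages(self, category: str) -> list[str]:
        return list(self._streams.get(category, []))


logger = CategoryLogger()
logger.log("web_errors", "500 on /home")
logger.log("photo_uploads", "photo 17 stored")
logger.log("web_errors", "timeout on /feed")

print(logger.categories())
```

The second category, `photo_uploads`, was never declared anywhere; it simply comes into existence on first use.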

Hadoop and Hive

Hadoop is an open source MapReduce implementation that makes it possible to run computations over massive amounts of data. Facebook uses it for data analysis (and as we all know, Facebook has a lot of data). Hive originated at Facebook and makes it possible to run SQL queries against Hadoop, which makes it easier for non-programmers to use.

Both Hadoop and Hive are open source (Apache projects) and have a large number of users, Yahoo and Twitter among them.
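For readers who have not met MapReduce before, the single-process sketch below walks through its three phases on a tiny input. It does not use Hadoop; the log format, the function names, and the table name in the closing comment are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical input: each line lists the pages viewed in one session.
log_lines = ["home photo", "photo chat", "photo home"]


def map_phase(line: str):
    """Map: emit a (key, 1) pair for every page name in a log line."""
    return [(page, 1) for page in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


mapped = chain.from_iterable(map_phase(line) for line in log_lines)
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)

# A Hive-style query expressing the same aggregation (illustrative only):
#   SELECT page, COUNT(*) FROM page_views GROUP BY page;
```

The closing comment shows why a SQL layer is attractive: the whole pipeline above collapses into a single declarative statement.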

Thrift

Facebook uses several different languages for its different services. PHP is used for the front end, Erlang is used for chat, Java and C++ are also used in various places, and there may well be other languages too. Thrift is an internally developed cross-language framework that ties these languages together so that they can communicate with each other. This makes it much easier for Facebook to keep up its cross-language development.

Facebook has released Thrift as open source, and support for even more languages has since been added.

Varnish

Varnish is an HTTP accelerator that can act as a load balancer and also cache content, which can then be delivered at lightning speed.

Facebook uses Varnish to serve photos and profile pictures, handling billions of requests every day. Like almost everything else Facebook uses, Varnish is open source.

Other things that keep Facebook running smoothly

We have covered the software that makes up Facebook's systems and helps it run at scale. But running a system this large is a complex task, so here are a few more of the things Facebook does to keep the site running smoothly.

Gradual releases and dark launches

Facebook has a system called Gatekeeper that lets it run different code for different sets of users. This lets Facebook release features gradually, run A/B tests, and enable certain features only for Facebook employees.

Gatekeeper also lets Facebook do "dark launches", which means activating parts of a feature behind the scenes before users can actually use it (users do not notice anything, hence "dark"). This acts as a real-world stress test ahead of the official launch and helps uncover malfunctions and other problems. A dark launch usually starts two weeks before the official launch.

Profiling the live system

Facebook monitors its systems carefully and, interestingly, also monitors the performance of every single PHP function in the live production environment. This profiling of the live PHP environment is done with an open source tool called XHProf.
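Function-level profiling boils down to recording, for each function, how often it is called and how much time it consumes. The sketch below shows that idea with a Python decorator. It is not XHProf, which is a PHP tool, and the names `profiled`, `stats`, and `render_profile` are hypothetical.

```python
import functools
import time
from collections import defaultdict

# Per-function totals: call count and cumulative wall-clock seconds.
stats = defaultdict(lambda: {"calls": 0, "seconds": 0.0})


def profiled(fn):
    """Record how often fn is called and how long it takes in total."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            entry = stats[fn.__qualname__]
            entry["calls"] += 1
            entry["seconds"] += time.perf_counter() - start
    return wrapper


@profiled
def render_profile(user_id: int) -> str:
    return f"<h1>user {user_id}</h1>"


for uid in range(3):
    render_profile(uid)

print(stats["render_profile"]["calls"])
```

Doing the bookkeeping in a `finally` block means calls that raise an exception are still counted and timed.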

Gradually disabling features to improve performance

If Facebook runs into performance problems, one option is to gradually switch off less important features in order to boost the performance of Facebook's core features.
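One way to picture this is a list of features ordered by importance and a rule that sheds the least important ones as load rises. The feature names, the load scale, and the thresholds below are all assumptions chosen for the example; the article does not say how Facebook decides what to switch off.

```python
# Hypothetical features, ordered from least to most important.
FEATURES_BY_PRIORITY = ["suggestions", "chat", "news_feed"]
CORE_FEATURES = {"news_feed"}  # never switched off in this sketch


def enabled_features(load: float) -> set[str]:
    """Return the features to keep on for a load level between 0.0 and 1.0.

    Assumed thresholds: above 0.7 drop the least important feature,
    above 0.9 drop every non-core feature.
    """
    if load > 0.9:
        to_disable = len(FEATURES_BY_PRIORITY)
    elif load > 0.7:
        to_disable = 1
    else:
        to_disable = 0
    disabled = set(FEATURES_BY_PRIORITY[:to_disable]) - CORE_FEATURES
    return set(FEATURES_BY_PRIORITY) - disabled


for load in (0.5, 0.8, 0.95):
    print(load, sorted(enabled_features(load)))
```

Subtracting `CORE_FEATURES` from the disabled set guarantees that the core feature survives even at the highest load level.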

Things we didn't mention

We have not covered hardware here, but it is also an important part of scalability. For example, like other large sites, Facebook uses a CDN to serve static content. Facebook also has a huge data center that helps it scale out further.

Facebook and open source

Not only does Facebook use (and contribute to) open source software such as Linux, Memcached, MySQL, Hadoop and many others, it has also released much of its internally developed software as open source.

Facebook has also open sourced Tornado, a high-performance web server framework developed by the FriendFeed team. A list of its open source software can be found on Facebook's Open Source page.
