Architecture secrets of a PHP website with millions of users: a German social network

Source: Internet
Author: User
Tags php server website performance rabbitmq couchdb

After looking at the back-end technology of Facebook, the world's largest PHP site, today we look at the architecture of a PHP site with millions of users: a social networking site in Germany. It is very small compared with Facebook and Flickr, but it has a very good architecture that combines many technologies, such as Nginx, MySQL, CouchDB, Erlang, Memcached, RabbitMQ, PHP, Graphite, Red5, and Tsung.

Statistical information

2 million registered users;

20,000 concurrent users;

200,000 private messages per day;

250,000 logins per day;

The project team has 11 developers, two designers, and two system administrators.

Business model

The website uses a freemium model. Users can use any of the following services free of charge:

Search for other users;

Send a message to a friend;

Upload images and videos;

Find a friend;

Video chat;

More ...

However, users who want to send messages and upload images without restrictions have to pay for one of several types of membership. Video chat and the other services on the site follow the same strategy.


All services are delivered through Nginx. Two front-end Nginx servers handle 150,000 requests per minute at peak times; each machine is four years old and has only one CPU and 3GB of RAM. The site also has three separate image servers, each running Nginx, which together serve 80,000 requests per minute.

A cool design in the Nginx setup is that many requests are answered straight from memcached, so a request for cached content never has to reach the PHP machines. For example, the user profile is one of the most processing-intensive pages on the site; if the profile page is already cached in memcached, Nginx fetches the content directly from memcached. About 8,000 requests per minute are handled by memcached in this way.
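The cache-first request path described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the dictionary stands in for memcached, `render_profile` stands in for the PHP machines, and the `page:` key scheme is hypothetical. For brevity the front end fills the cache itself on a miss; in the real setup the application would normally be the one writing to memcached.

```python
from typing import Callable, Dict, Optional


class CacheFirstFrontend:
    """Serve a page from the cache when present, else fall back to the backend."""

    def __init__(self, cache: Dict[str, str], backend: Callable[[str], str]):
        self.cache = cache          # stands in for memcached
        self.backend = backend      # stands in for the PHP machines
        self.backend_calls = 0      # counts how often the backend was needed

    def handle(self, uri: str) -> str:
        key = "page:" + uri         # hypothetical key scheme
        cached: Optional[str] = self.cache.get(key)
        if cached is not None:
            return cached           # cache hit: the backend is never touched
        self.backend_calls += 1
        body = self.backend(uri)    # cache miss: render the page
        self.cache[key] = body      # keep it for later requests
        return body


def render_profile(uri: str) -> str:
    # Placeholder for the expensive profile rendering.
    return "<html>profile for " + uri + "</html>"


frontend = CacheFirstFrontend({}, render_profile)
first_response = frontend.handle("/user/42")
second_response = frontend.handle("/user/42")
```

The second request for the same URI is answered from the cache, so the backend is only called once.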

The three Nginx image servers keep local image caches, while users upload images to a central file server. When an image is requested from one of the three Nginx servers and does not exist locally, it is downloaded from the central file server, cached on that server, and then served. This load-balanced, distributed image server architecture is designed to reduce the load on the primary storage device.
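A minimal sketch of this pull-through image cache, assuming two plain directories stand in for the central file server and for one image node. The class name `ImageCacheNode` and the file name are hypothetical; the real servers do this inside Nginx rather than in application code.

```python
import shutil
import tempfile
from pathlib import Path


class ImageCacheNode:
    """One image server: serve from local disk, pull from central storage on a miss."""

    def __init__(self, local_dir: Path, central_dir: Path):
        self.local_dir = local_dir
        self.central_dir = central_dir
        self.central_fetches = 0    # how often central storage was hit

    def serve(self, name: str) -> bytes:
        local_path = self.local_dir / name
        if not local_path.exists():
            # Local miss: copy the file from central storage and keep it.
            shutil.copyfile(self.central_dir / name, local_path)
            self.central_fetches += 1
        return local_path.read_bytes()


central = Path(tempfile.mkdtemp(prefix="central_"))
local = Path(tempfile.mkdtemp(prefix="node_"))
(central / "avatar.jpg").write_bytes(b"fake image bytes")

node = ImageCacheNode(local, central)
first_image = node.serve("avatar.jpg")   # fetched from central storage
second_image = node.serve("avatar.jpg")  # served from the local cache
```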


The site runs on PHP-FPM. There are 28 PHP machines, each with two CPUs and 6GB of memory and each running 100 PHP-FPM worker processes. They use PHP 5.3.x with APC enabled. Moving to PHP 5.3 reduced CPU and memory usage by more than 30%.

The application code is built on the Symfony 1.2 framework. This brings two benefits: first, the team can draw on external resources; second, development moves faster, and a well-known framework makes it easier for new developers to join the team. Nothing is perfect, but the benefits of Symfony let the team focus more on the site's business development.

Performance optimization is done with XHProf, a profiling library open-sourced by Facebook. The framework is very easy to customize and configure, which makes it possible to cache most of the expensive server-side computations.


MySQL is the main RDBMS of the website, and it runs on several servers. A 4-CPU, 32GB server stores user-related information, such as basic profile data and photo descriptions. This machine has been in use for 4 years, and the plan is to replace it with a sharded cluster. The design is still being worked out, with the aim of keeping the data access code simple. Data will be partitioned by user ID, because most of the information on the site is user-centric: photos, videos, messages, and so on.
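Partitioning by user ID can be illustrated with a simple modulo scheme. This is a sketch under assumptions, not the site's actual design: the shard count of 4, the `shardN` naming, and both function names are hypothetical.

```python
SHARD_COUNT = 4  # assumed number of shards, for illustration only


def shard_for_user(user_id: int, shard_count: int = SHARD_COUNT) -> int:
    """Map a user ID to a shard index with a simple modulo."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return user_id % shard_count


def table_location(user_id: int, table: str) -> str:
    """Return a hypothetical 'shardN.table' name for a user's rows."""
    return "shard{}.{}".format(shard_for_user(user_id), table)


photos_location = table_location(42, "photos")
messages_location = table_location(42, "messages")
```

Because the shard depends only on the user ID, a user's photos, videos, and messages all land on the same shard, which is what keeps user-centric queries on a single machine.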

Three servers in a master-slave-slave configuration serve the user forum. A separate group of servers stores the site's custom message system, which now holds 250 million messages; its four machines are arranged in master-slave configurations. In addition, a 4-machine NDB cluster handles write-intensive data, such as user visit statistics.

The table design avoids joins as much as possible and caches as much data as possible. As a result, the database is heavily denormalized. To make searching easier, a dedicated data mining table was created. Most tables are MyISAM tables, which provide fast lookups. The problem now is that more and more operations run into full-table locks, so the team is considering migrating to the XtraDB storage engine.


Memcached is used heavily in the site architecture, with more than 45GB of cache across 51 nodes. It caches sessions, view caches, and function execution caches. The architecture includes a system that automatically updates the cached data whenever a record is modified. One possible future improvement to cache updates is to use the new Redis Hash API or MongoDB.
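The idea of updating the cache automatically whenever a record changes (a write-through cache) might look roughly like this. The two dictionaries stand in for MySQL and memcached, and the `UserStore` class and `user:` key prefix are hypothetical names chosen for the example.

```python
from typing import Any, Dict


class UserStore:
    """Write-through store: every save updates the database and the cache together."""

    def __init__(self) -> None:
        self.db: Dict[int, Dict[str, Any]] = {}      # stands in for MySQL
        self.cache: Dict[str, Dict[str, Any]] = {}   # stands in for memcached

    def save(self, user_id: int, record: Dict[str, Any]) -> None:
        self.db[user_id] = dict(record)
        # Refresh the cache in the same step so readers never see stale data.
        self.cache["user:{}".format(user_id)] = dict(record)

    def load(self, user_id: int) -> Dict[str, Any]:
        key = "user:{}".format(user_id)
        if key in self.cache:
            return self.cache[key]
        record = self.db[user_id]
        self.cache[key] = dict(record)
        return record


store = UserStore()
store.save(7, {"name": "anna"})
store.save(7, {"name": "anna b"})
loaded = store.load(7)
```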


RabbitMQ was introduced into the architecture in 2009. It is a good messaging solution that was easy to deploy and integrate into this architecture; two RabbitMQ servers run behind LVS. Over the last month more and more work has been moved onto the queue, so that the 28 PHP servers now handle about 500,000 queued requests per day. Logs, mail notifications, system messages, image uploads, and more are sent to this queue.

Queue messages are published with the help of the fastcgi_finish_request() function in PHP-FPM, which allows them to be sent to the queue asynchronously. The function is called once the HTML or JSON response for the user is ready; it delivers the response immediately, so the user does not have to wait for the rest of the PHP script to finish and clean up.
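The ordering this gives (answer the user first, do the slow work afterwards) can be sketched as follows. This is a Python stand-in, not PHP: `out.flush()` merely marks the point where PHP would call fastcgi_finish_request(), and `handle_login` and the message fields are hypothetical.

```python
import io
from typing import Callable, List


def handle_login(out: io.StringIO, publish: Callable[[dict], None]) -> List[str]:
    """Write the response first, then do the slow follow-up work."""
    steps: List[str] = []

    out.write('{"status": "ok"}')
    out.flush()                      # stand-in for fastcgi_finish_request()
    steps.append("response_sent")

    # Everything below runs after the client already has its answer.
    publish({"event": "login", "user_id": 42})
    steps.append("message_published")
    return steps


published: List[dict] = []
response = io.StringIO()
order = handle_login(response, published.append)
```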

This system improves the management of resources in the architecture. For example, the service handles 1,000 login requests per minute at peak times, which used to mean 1,000 concurrent updates to the user table to save each user's login time. With the queue in place, these queries can be run one after another instead. If more processing speed is needed, it is enough to add more queue consumers, or even more servers to the cluster, without modifying any configuration or redeploying.
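A rough sketch of how a queue turns concurrent login-time updates into sequential ones. Python's in-process `queue.Queue` and a single worker thread stand in for RabbitMQ and a queue consumer, and the `last_login` dictionary stands in for the user table; all names are assumptions made for the example.

```python
import queue
import threading
from typing import Dict, Optional, Tuple

login_queue: "queue.Queue[Optional[Tuple[int, int]]]" = queue.Queue()
last_login: Dict[int, int] = {}   # stands in for the user table


def worker() -> None:
    """Apply queued login-time updates one at a time, in arrival order."""
    while True:
        item = login_queue.get()
        if item is None:          # sentinel: no more work
            login_queue.task_done()
            break
        user_id, timestamp = item
        last_login[user_id] = timestamp
        login_queue.task_done()


consumer = threading.Thread(target=worker)
consumer.start()

# Producers only enqueue; they never touch the "table" directly.
for i in range(1000):
    login_queue.put((i % 100, 1_000_000 + i))
login_queue.put(None)

login_queue.join()
consumer.join()
```

Adding capacity would mean starting more consumers on the same queue; the producers do not need to change.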


CouchDB, used for log storage, runs on a single machine. On this machine logs can be queried and grouped by module/action or by error type, which is useful for locating problems. Before CouchDB was used as a log aggregation service, one had to log in to each PHP server and search its logs to track down a problem, which was very tedious. Now all logs are sent through the queue and saved centrally in CouchDB, so the team can focus on checking and analyzing problems.
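Grouping logs by module/action or by error type can be sketched like this. In CouchDB itself this kind of grouping is normally done with map/reduce views; the plain Python below only shows the idea, and the log fields and sample entries are invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Invented sample log documents; the field names are assumptions.
logs = [
    {"module": "profile", "action": "show", "error": "Timeout"},
    {"module": "profile", "action": "show", "error": "Timeout"},
    {"module": "message", "action": "send", "error": "DbError"},
    {"module": "profile", "action": "edit", "error": "DbError"},
]


def count_by(docs: Iterable[Dict[str, str]], *fields: str) -> "Counter[Tuple[str, ...]]":
    """Count documents per combination of the given fields."""
    return Counter(tuple(doc[field] for field in fields) for doc in docs)


by_action = count_by(logs, "module", "action")
by_error = count_by(logs, "error")
```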


The website uses Graphite to collect real-time information and statistics, from requests per module/action to memcached hits and misses, RabbitMQ status monitoring, Unix load, and so on. The Graphite service averages 4,800 update operations per minute. It has proven very useful for monitoring what is happening on the site, and its simple text protocol and graphing capabilities make it easy to plug into any system that needs monitoring.
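Graphite's plaintext protocol is one line per data point: metric path, value, and Unix timestamp, separated by spaces and terminated by a newline. A minimal sender sketch follows; the metric name is hypothetical, and the host and port are whatever your Carbon daemon listens on (2003 is the usual default for the plaintext listener).

```python
import socket
import time
from typing import Optional


def format_metric(path: str, value: float, timestamp: float) -> str:
    """Build one plaintext-protocol line: '<path> <value> <timestamp>\\n'."""
    return "{} {} {}\n".format(path, value, int(timestamp))


def send_metric(host: str, port: int, path: str, value: float,
                timestamp: Optional[float] = None) -> None:
    """Open a TCP connection to Carbon and send a single data point."""
    ts = time.time() if timestamp is None else timestamp
    line = format_metric(path, value, ts)
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical metric name; no network access is needed to format the line.
example_line = format_metric("site.memcached.hits", 42, 1300000000)
```

Because the protocol is just text over a socket, almost any component of the site can report a number with a few lines of code.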

One cool thing is using Graphite to monitor two versions of the site at the same time. In January a new version of the Symfony framework was deployed, with the previous code kept as a backup. Because the new version could introduce performance problems, Graphite was used to compare the two versions online.

The Unix load on the new version turned out to be higher, so both versions were profiled with XHProf to find where the problem was.


The site also offers users two kinds of video service: user-uploaded videos, and video chat, where users interact and share over video. As of 2009, the site served 17TB of video traffic per month.


Tsung is a distributed benchmarking tool written in Erlang. The site uses it mainly for HTTP benchmarking and for comparing MySQL with alternative storage engines such as XtraDB. A tool records the traffic of the main MySQL server and converts it into Tsung benchmark sessions. The traffic is then replayed, with Tsung generating thousands of concurrent users against the lab servers, which brings the test environment very close to real conditions.
