Title: Architecture and Statistics of a Facebook Friend-Trading Application with 300 Million Page Views per Month
Source: Spring career
Time: Wed, 07 Oct 2009 22:48:48 +0000
Author: xiechunye
Address: http://www.xiechunye.cn/read.php/672.htm
Content:
In just three months, Friends for Sale became a top 10 Facebook application. It handles 200 requests per second and generates 300 million page views per month. On the technical side, it relies on the Ruby on Rails framework, two part-time programmers, a dozen servers, and a sound architecture.
Architecture
- Ruby on Rails
- CentOS 5 (64-bit)
- Capistrano: updates and restarts the application servers
- memcached
- MySQL
- nginx
- Starling: distributed queue service
- Softlayer: hosting
- Pingdom: site monitoring
- LVM: logical volume manager
- Dr Nic's Magic Multi-Connections gem: database read/write splitting
Current situation
- Top 10 Facebook application
- Nearly 600,000 active users
- 500,000 unique visitors per day, and growing
- 300 million page views per month
- Steady growth of 300% per month
- 2.1 million unique visitors in the past month
- 200 requests per second
- 5 TB of bandwidth per month
- Two part-time developers (one now full time) and one remote DBA
- 4 database servers, 6 application servers, 1 staging server, and 1 front-end server
- Six application servers, each with 4 cores and 8 GB of memory
- Each application server runs 16 mongrels, for a total of 96 mongrels
System design goals:
Support a Facebook application for buying and selling friends.
It is modeled on a fluctuating financial market.
It is currently one of the 10 most popular Facebook applications.
Tip: think of it as buying and selling pets! You can play with them, give them gifts, or show them off. A savvy investor can turn friends into tradable goods. In fact, this is just to satisfy
Why was this system built?
It was built to experiment with and learn about the Facebook platform.
What are the challenges and innovations in the design, architecture, and implementation?
As a Facebook application, it cannot serve any request from a cached page. It is also a write-heavy system, so it places high demands on database optimization.
How to deal with these challenges?
We use memcached as a middle layer, so most requests do not go directly to the database. We use Rails fragment caching to cache the presentation layer.
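As an illustration of the middle-layer idea, here is a minimal read-through sketch in plain Ruby. It is not the application's actual code: FakeCache and FakeDatabase are hypothetical in-process stand-ins for a memcached client and the MySQL database, and fetch_user and the "user:ID" key format are made up for the example.

```ruby
# Hypothetical stand-in for a memcached client: a plain in-process Hash.
class FakeCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end
end

# Hypothetical stand-in for the database; counts how often it is queried.
class FakeDatabase
  attr_reader :queries

  def initialize(rows)
    @rows = rows
    @queries = 0
  end

  def find_user(id)
    @queries += 1
    @rows[id]
  end
end

# Read-through lookup: try the cache first, fall back to the database
# only on a miss, then populate the cache for later requests.
def fetch_user(cache, db, id)
  key = "user:#{id}"
  cached = cache.get(key)
  return cached unless cached.nil?

  user = db.find_user(id)
  cache.set(key, user) unless user.nil?
  user
end

cache = FakeCache.new
db = FakeDatabase.new({ 1 => { name: "alice", price: 500 } })

first  = fetch_user(cache, db, 1) # miss: goes to the database
second = fetch_user(cache, db, 1) # hit: served from the cache
puts "database queries: #{db.queries}"
```

The second lookup never touches the database, which is the effect the middle layer is meant to have on read traffic.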
What is the current system scale?
Yesterday we had about 500,000 unique visitors, and traffic keeps growing. We expect about 300 million page views this month.
Bandwidth usage
Last month we used 3 TB of bandwidth. This month it will be at least 5 TB, and that is only for some icons and XHTML/CSS.
Number of files, number of images, and data
We store no files and no user-uploaded content. There are only a few images.
Growth rate?
Page views grew from about 1 million per day two months ago to 3 million per day a month ago and 10 million per day now. That is a growth rate of roughly 300% per month. The system handles 200 requests per second.
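The figures above can be checked with a little arithmetic. The sketch below assumes one request per page view, spread evenly over 24 hours, which is a simplification. Under that assumption the average comes out well below 200 requests per second, so the quoted figure presumably reflects peak load or additional non-page requests.

```ruby
# Daily page views quoted in the text, oldest first.
daily_page_views = [1_000_000, 3_000_000, 10_000_000]

# Month-over-month multiplier between consecutive figures.
multipliers = daily_page_views.each_cons(2).map { |prev, cur| cur.to_f / prev }

# Average requests per second implied by the latest daily figure,
# assuming one request per page view spread evenly over the day.
seconds_per_day = 24 * 60 * 60
average_rps = daily_page_views.last.to_f / seconds_per_day

rounded = multipliers.map { |m| m.round(2) }
puts "month-over-month multipliers: #{rounded.inspect}"
puts "average requests per second: #{average_rps.round(1)}"
```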
Do users need to pay?
All free
User growth rate
About 1% per day on average, and up to 3% per day at times.
How many user activities were there last month?
According to Google Analytics, 2.1 million unique visitors came to the site last month.
How does the system architecture work?
First, the system is built on a very stable Rails cluster, with nginx handling load balancing and serving static content. There are six application servers, each with a 4-core CPU and 8 GB of memory. Each application server runs 16 mongrels, for a total of 96. The load balancer forwards requests directly to the mongrel ports. There is also a 4 GB memcached instance.
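To make the 6 x 16 layout concrete, the sketch below enumerates the 96 backend addresses and picks them in round-robin order. The host names (app1.internal and so on) and the starting port 8000 are hypothetical. In production nginx does the balancing itself, so the RoundRobin class only illustrates the idea.

```ruby
# Assumed layout: 6 application servers, each running 16 mongrels.
# Host names and the starting port are hypothetical.
APP_SERVERS = (1..6).map { |n| "app#{n}.internal" }
MONGREL_PORTS = (8000...8016).to_a

# Every backend address the load balancer would forward to.
BACKENDS = APP_SERVERS.product(MONGREL_PORTS).map { |host, port| "#{host}:#{port}" }

# Simple round-robin selection over the backends.
class RoundRobin
  def initialize(backends)
    @backends = backends
    @index = 0
  end

  def next_backend
    backend = @backends[@index % @backends.size]
    @index += 1
    backend
  end
end

balancer = RoundRobin.new(BACKENDS)
first_three = Array.new(3) { balancer.next_backend }
puts "backends: #{BACKENDS.size}"
puts "first three: #{first_three.inspect}"
```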
God is used to monitor processes.
The data layer uses two servers, each with 32 GB of memory, a 4-core CPU, and four 15K RPM SCSI disks in RAID 10. Dr Nic's Magic Multi-Connections gem is used for read/write splitting.
More slave servers are now being added for better read performance and redundancy.
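The read/write splitting idea can be sketched in plain Ruby as follows. This is not the API of the Magic Multi-Connections gem: FakeConnection and SplitRouter are hypothetical classes invented for the example, and deciding by a leading SELECT is a deliberately crude rule.

```ruby
# Hypothetical connection object that just records the SQL it receives.
class FakeConnection
  attr_reader :name, :log

  def initialize(name)
    @name = name
    @log = []
  end

  def execute(sql)
    @log << sql
    name
  end
end

# Routes writes to the master and spreads reads across the slaves.
class SplitRouter
  def initialize(master, slaves)
    @master = master
    @slaves = slaves
    @next_slave = 0
  end

  def execute(sql)
    connection_for(sql).execute(sql)
  end

  private

  # Anything that is not a SELECT is treated as a write.
  def connection_for(sql)
    return @master unless sql.strip.match?(/\Aselect\b/i)
    return @master if @slaves.empty?

    slave = @slaves[@next_slave % @slaves.size]
    @next_slave += 1
    slave
  end
end

master = FakeConnection.new("master")
slaves = [FakeConnection.new("slave1"), FakeConnection.new("slave2")]
router = SplitRouter.new(master, slaves)

served_by = [
  router.execute("SELECT * FROM users WHERE id = 1"),
  router.execute("UPDATE users SET price = 600 WHERE id = 1"),
  router.execute("SELECT * FROM users WHERE id = 2")
]
puts served_by.inspect
```

One caveat with any such split is replication lag: a read sent to a slave right after a write may still see the old value.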
Managed Service Providers.
How to plan the architecture
Nothing special is done at the application layer, because scaling it is fairly trivial. On the database side, we have only a single master database, which we keep as fast as possible. Vertical partitioning of the database improves availability.
What's unique?
Three aspects:
1. Neither of the two developers had done large-scale Rails development before.
2. Our growth rate is rare in the history of Rails development.
3. Almost no page caching is possible. Every request goes through the full Rails stack.
What have you learned? Why did you succeed? Do you want to do other things in the future?
We learned that good hosting, good hardware, and a good DBA are all very important. We were first hosted on Railsmachine, a very good hosting provider that gave us great support. The move to Softlayer took about two hours and went smoothly, and since then we have almost never had hardware problems. Choosing a good hosting provider is especially important.
Another important lesson is that in a scalable architecture the problems are almost always in the database. Check the database first. Most problems can be solved through the database server, the database configuration, queries, and indexes.
Build on a good provider.
We use Rails because it is free and fast to develop with. In practice, it let two developers finish the application with very little time available.
How are your groups made up?
We have two Rails developers, including me. In addition, we brought on a DBA who works remotely part time.
How many people do you have?
On the technical side: two part-time developers, one of them now full time, and one remote DBA.
Where are you now?
The two full-time staff members are in the SoMa district of San Francisco.
What are the responsibilities of these people?
The two developers are the founders, and both started out doing front-end and application development. After gaining some experience, I also took on network administration. Alex, my co-founder, concentrates on Rails development and wrote most of the application. I now work mainly on the database.
What are the unique management methods?
First, find the smartest people, treat them as well as you can, and let them do their best. The best way to manage a company is to let people do their best work. That is how I try to manage the company, although I think I often fall short here.
How do you make a scattered team work?
Use good communication tools. Remote work is quite painful. Core development should be done locally, while work such as database administration can be remote.
Development Status
We use Rails. Much of the caching uses Chris Wanstrath's solution, and database connections use Dr Nic's gem. We use Vim as the editor.
Development language
Ruby/rails
Number of servers
12 servers
Server Application
Four databases, six applications, one staging server, and one front-end server
Hosting provider
Softlayer
Server Operating System
Centos 5 64bit
Web Server
Nginx
Database Software
MySQL 5.1
Reverse Proxy
Nginx
How to deploy
Dedicated hosting at Softlayer
Storage
NAS for backups; otherwise SCSI hard disks.
Storage Capacity
5 TB
How to expand storage
Ad hoc. We have not paid much attention to this, and it is a weakness of ours.
Storage Server
Nope
How to handle session
In the database. Memcached might be a better choice.
How to plan Databases
We currently use a master with slaves, and balance reads across the slaves to improve read performance.
Load Balancing
Nginx
Framework and Ajax class library used
Rails
Whether Message Service is used
No
What distributed task management system is used?
Starling (Queue Management)
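The pattern behind a queue service is that the request handler only enqueues work and a background worker processes it later. The sketch below shows that pattern with Ruby's built-in Queue class as an in-process stand-in. It does not use Starling's client API, and the job payload and the notification message are made up for the example.

```ruby
# In-process stand-in for a queue server; a real deployment would talk
# to the queue service over the network instead.
jobs = Queue.new
results = Queue.new

# Background worker: takes jobs off the queue until it sees :stop.
worker = Thread.new do
  loop do
    job = jobs.pop
    break if job == :stop

    results << "sent notification to user #{job[:user_id]}"
  end
end

# The request handler only enqueues work and returns immediately.
[101, 102, 103].each { |id| jobs << { user_id: id } }
jobs << :stop
worker.join

# Collect what the worker produced, in processing order.
processed = []
processed << results.pop until results.empty?
puts processed
```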
How to deal with advertising services
We use eCPM.
Whether a standard API is used
Nope
How many members are there in the team?
Two developers
Team skills
Me: front-end development and Rails development. Recently I have focused on databases and on making Rails highly scalable.
Alex: application development, front-end development, and application architecture.
Development Environment
Alex uses OS X and I am on Ubuntu. We use SVN for version control. I use the Vim editor and Alex uses TextMate.
Development Progress
Test-driven development for the logic layer, and iterative development at the application layer.
Cache Policy
Memcached with no TTL; we expire entries manually.
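A minimal sketch of what "no TTL, manual expiration" means in practice: cached entries never time out, so every write has to delete the affected key itself. NoTtlCache, PRICES, price_for, and update_price are hypothetical names, and the Hash stands in for both memcached and the database table.

```ruby
# Hypothetical stand-in for memcached with no TTL: entries live until deleted.
class NoTtlCache
  def initialize
    @store = {}
  end

  # Returns the cached value, or stores and returns the block's result.
  def fetch(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end

  def delete(key)
    @store.delete(key)
  end
end

PRICES = { 1 => 500 } # stands in for the database table
CACHE = NoTtlCache.new

def price_for(user_id)
  CACHE.fetch("price:#{user_id}") { PRICES[user_id] }
end

# Every write expires the cached entry explicitly, since nothing times out.
def update_price(user_id, new_price)
  PRICES[user_id] = new_price
  CACHE.delete("price:#{user_id}")
end

before = price_for(1) # fills the cache
update_price(1, 650)  # writes and expires the key
after = price_for(1)  # cache miss, reloads the new value
puts "before=#{before} after=#{after}"
```

The cost of this approach is discipline: any write path that forgets to delete its key leaves stale data in the cache indefinitely.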
Whether the client cache mechanism is used
No
System Management
How to ensure performance
We use Pingdom to monitor web page performance.
Server and network availability monitoring
We currently use our own monitoring tool and the ping monitoring provided by Softlayer. Recently we also started using FiveRuns as a server monitoring tool.
Network and server performance graphs
Not done
How to test the system
We test module by module and deploy the finished parts to the application servers.
Performance Analysis
We analyze every SQL statement to make sure it is efficient. We have no formal benchmarks.
How to ensure security
Be careful
Which features need to be improved or kept
Reflection and criticism. We are very cautious about adding features.
How to perform Web Analysis
Google Analytics. We also run a scheduled job that tracks viral growth.
Whether to perform a/B Testing
All the time
How to set up a data center
How to back up and restore the system
LVM, with an incremental backup every week and a basic backup every day
How to upgrade software and hardware
Mostly by hand, except for application updates, where we use Capistrano to update and restart the application servers.
How to upgrade the primary database
Generally, we first switch the master role to another server, and then switch it back after the upgrade.
Your Development Plan
Not very good
Do you have an independent operation team?
Hope to have
Whether to use the Content Delivery System
Nope
Profit Model
CPM advertising, which brings in good revenue. We also earn money from virtual currency.
How do you market your product?
Word of mouth and viral spread.
What's unique in algorithms?
Ruby is already excellent. We only need simple applications.
Whether to store images in the database
No. That would be a bad idea.
How much design should be done up front?
You do not know what will happen until it happens. Once you have been through it, you have the knowledge to solve the next problem.
Are there any good or bad things worth attention?
Unreliable hardware and providers that are hard to communicate with are painful. The most important thing is to choose a vendor that supports your application. (Sun adds: it seems their operations and maintenance are handled by the data center provider.) The other thing worth noting is how far commodity hardware in a master-slave configuration can take you. It can easily support traffic on the order of 1 billion requests.
When will the system add new extensions?
There is no such plan. We plan only when the need arises.
What is worth carrying forward?
Memcached. With it you can partition your architecture however you like.
Whether to adjust the architecture in the future
We will partition the user database soon, because it is about to reach the limit of a single database.
Facebook marketing ideas
- Facebook has successfully digitized social relationship networks.
- Social relationships will be important in the future.
- Facebook has put fast-growing social relationships on the Internet.
- Your application idea should be social, engaging, and universal.
- Viral social marketing
- Monetization
- Broad potential
- Friend trading is social because you buy and sell your social relationships.
- It is engaging because it is only a concept, with no pressure and a bit of fun.
- It is fair because everyone is virtual, has a price, and wants to be attractive.
- Each application can bring in some new users.
- According to the growth figures, each user influences 1.4 other people.
- Each user sends many invitations and also views notifications, reads feeds, views profiles, and so on.
- Each channel can track users' clicks, modifications, and uninstalls.
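To see what a factor of 1.4 implies, the sketch below assumes that every newly acquired user brings in 1.4 further users in the next invitation cycle. The seed of 1,000 users, the five cycles, and the notion of a fixed "cycle" are assumptions for illustration and are not figures from the article.

```ruby
# Assumption: every newly acquired user brings in 1.4 further users
# in the next invitation cycle (the figure quoted above).
VIRAL_FACTOR = 1.4

# Returns the number of new users in each cycle, starting from `seed`.
def new_users_per_cycle(seed, cycles, factor = VIRAL_FACTOR)
  (0...cycles).map { |n| (seed * factor**n).round }
end

cohorts = new_users_per_cycle(1000, 5)
puts "new users per cycle: #{cohorts.inspect}"
puts "total after 5 cycles: #{cohorts.sum}"
```

Because the factor is above 1, each cycle is larger than the one before it, which is what makes the growth self-sustaining.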
Lessons learned
- On Facebook, scalability is a requirement from the very beginning. Their daily page views grew to a large volume within a week.
- Ruby on Rails can scale.
- Scalability comes from the architecture, so focus on architecture and operations.
- You need a good DBA, a good hosting provider, and reasonable hardware.
- With caching and powerful hardware, you can put off optimizing the database for load.
- The social relationships are real. They are built on Facebook users and make an excellent basis for virtual applications.
- Most problems are in the database: the database server, database configuration, queries, and indexes.
- Some people still use vi.
This article comes from the CSDN blog. Please cite the source when reposting: http://blog.csdn.net/phphot/archive/2008/12/16/3533094.aspx