Introduction to the Twitter site architecture

Source: Internet
Author: User
Tags: Ruby on Rails

http://www.kaiyuanba.cn/html/1/131/147/7539.htm

As the creator of the 140-character message, Twitter is both very simple and very complicated. It is simple because a mere 140 characters have let a number of world events spread faster than through any other medium. It is complicated because delivering this seemingly simple 140-character service to 200 million users is hard: the simplicity of the product is exactly what makes the system behind it complex. Unfortunately, Twitter is inaccessible from mainland China, but as a programmer who loves architecture I had to get over the wall, and the world outside it is more exciting. Today, drawing on material available online, I will share my understanding of the architecture of the Twitter site, hoping it offers passing readers some inspiration.

I. Overview of Twitter Site Basics
As of April 2011, Twitter had about 175 million registered users and was gaining about 300,000 new registrations per day. Its truly active user base is far smaller, however: most registered accounts have no followers or follow no one, so the figure is not comparable with Facebook's 600 million active users.
Twitter has 180 million unique visitors per month, and 75% of its traffic comes from outside Twitter.com. The API handles 3 billion requests per day, users post an average of 55 million tweets per day, 37% of active users tweet from a mobile phone, and about 60% of tweets come from third-party apps.
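To put the daily API figure in perspective, it can be converted to an average per-second rate. The following is only a back-of-the-envelope sketch: the helper name `average_per_second` is made up for illustration, and an average says nothing about peak load, which is presumably much higher.

```ruby
# Back-of-the-envelope conversion of a daily total into an average rate.
# `average_per_second` is a hypothetical helper, not part of any Twitter code.
SECONDS_PER_DAY = 24 * 60 * 60

def average_per_second(per_day)
  per_day / SECONDS_PER_DAY # integer division: whole requests per second
end

# 3 billion API requests per day, as quoted above.
puts "average API requests per second: #{average_per_second(3_000_000_000)}"
```

This works out to roughly 35,000 API requests per second on average.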
Platform: Ruby on Rails, Erlang, MySQL, Mongrel, Munin, Nagios, Google Analytics, AWStats, Memcached
The overall architecture design of Twitter:
[Figure: overall architecture design of Twitter]

II. The Twitter Platform
The Twitter platform roughly comprises twitter.com, mobile phones, and third-party applications:
[Figure: composition of the Twitter platform]

The main sources of traffic are mobile phones and third-party applications. The components involved are:
Ruby on Rails: web application framework
Erlang: general-purpose, concurrency-oriented programming language, open source project address: http://www.erlang.org/
AWStats: real-time log analysis system, open source project address: http://awstats.sourceforge.net/
Memcached: distributed in-memory cache component
Starling: lightweight message queue written in Ruby
Varnish: high-performance open source HTTP accelerator
Kestrel: message middleware written in Scala, open source project address: http://github.com/robey/kestrel
Comet Server: Comet is an Ajax long-lived connection technique that lets the server proactively push data to a web browser, avoiding the performance penalty of client polling
Libmemcached: a memcached client library
MySQL: database server
Mongrel: Ruby HTTP server, used mainly for Rails, open source project address: http://rubyforge.org/projects/mongrel/
Munin: server-side monitoring program, project address: http://munin-monitoring.org/
Nagios: network monitoring system, project address: http://www.nagios.org/
III. Cache
Caching plays an important role in large web projects; after all, the closer data sits to the CPU, the faster it can be accessed. The Twitter cache architecture:
[Figure: Twitter cache architecture]

Memcached is used heavily for caching
For example, if computing a count is very slow, you can memoize the count in memcached in about 1 millisecond
Getting a friend's status is complicated, with security and other issues involved, so instead of running a query, a friend's status is updated directly in the cache when it changes. The database is not touched
ActiveRecord objects are large, so they are not cached. Twitter stores the critical attributes in a hash and lazy-loads the remaining attributes on access
90% of requests are API requests, so no page or fragment caching is done on the front end: pages are so time-sensitive that caching them does little good. Twitter does cache API requests, however
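Two of the ideas above, memoizing a slow count and keeping only critical attributes in a hash while loading the rest on demand, can be sketched as follows. This is an illustration under stated assumptions, not Twitter's code: `FakeCache` stands in for a memcached client, and `SlowDatabase`, `follower_count`, and `LightUser` are hypothetical names invented for this example.

```ruby
# In-process stand-in for a memcached client (hypothetical; a real
# deployment would talk to memcached over the network).
class FakeCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end
end

# Hypothetical slow data source that records how often it is queried.
class SlowDatabase
  attr_reader :queries

  def initialize(rows)
    @rows = rows
    @queries = 0
  end

  def count_followers(user_id)
    @queries += 1
    @rows.count { |row| row[:followed_id] == user_id }
  end
end

# Return the cached count when present; otherwise compute it once and store it.
def follower_count(cache, db, user_id)
  key = "follower_count:#{user_id}"
  cached = cache.get(key)
  return cached unless cached.nil?

  count = db.count_followers(user_id)
  cache.set(key, count)
  count
end

# Keeps only the critical attributes in a plain hash and calls the loader
# block for the remaining attributes the first time one of them is requested.
class LightUser
  def initialize(critical, &loader)
    @attrs = critical.dup
    @loader = loader
    @loaded = false
  end

  def [](name)
    load_rest unless @attrs.key?(name) || @loaded
    @attrs[name]
  end

  private

  def load_rest
    @attrs = @loader.call.merge(@attrs) # critical attributes take precedence
    @loaded = true
  end
end
```

With this shape, repeated calls to `follower_count` for the same user reach the database only once, and a `LightUser` never invokes its loader as long as callers ask only for the attributes stored up front.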
The memcached caching strategy was improved in the following ways:
1. Create a write-through vector cache that holds arrays of tweet IDs, each ID a serialized 64-bit integer. Its hit rate is 99%.
2. Add a write-through row cache, which holds database records: users and tweets. This cache has a 95% hit rate.
3. Introduce a read-through fragment cache, which holds the serialized versions of tweets accessed by API clients; these can be packaged in JSON, XML, or Atom format. It also has a 95% hit rate.
4. Create a separate cache pool for the page cache. The page cache pool uses a generational key scheme instead of direct invalidation.
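The write-through caches (items 1 and 2) and the generational keys (item 4) can be illustrated with a small sketch. It assumes that "write-through" means every write updates the database and the cache together, and that "generational key" refers to the common scheme of embedding a version number in each cache key. Plain hashes stand in for MySQL and memcached, and the class names `TimelineStore` and `PageCache` are invented for this example.

```ruby
# Write-through vector cache: each write updates the backing store and the
# cached array of tweet IDs together, so timeline reads rarely hit the database.
class TimelineStore
  attr_reader :db_reads

  def initialize
    @db = Hash.new { |hash, key| hash[key] = [] } # stand-in for the database
    @cache = {}                                   # stand-in for memcached
    @db_reads = 0
  end

  def add_tweet(user_id, tweet_id)
    @db[user_id] << tweet_id
    @cache[user_id] = @db[user_id].dup # write through to the cache
  end

  def tweet_ids(user_id)
    return @cache[user_id] if @cache.key?(user_id)

    @db_reads += 1 # cache miss: fall back to the database
    @cache[user_id] = @db[user_id].dup
  end
end

# Generational keys: every key embeds a per-user generation number. Bumping
# the generation makes all of that user's old pages unreachable at once;
# nothing is deleted explicitly. A real pool would evict the stale entries
# over time, and the generation counter would itself live in the cache;
# here it is kept in a local hash for simplicity.
class PageCache
  def initialize
    @pool = {}
    @generation = Hash.new(0)
  end

  def key_for(user_id, path)
    "page:#{user_id}:g#{@generation[user_id]}:#{path}"
  end

  def fetch(user_id, path)
    @pool[key_for(user_id, path)] ||= yield
  end

  def invalidate(user_id)
    @generation[user_id] += 1
  end
end
```

The appeal of the generational scheme is that invalidation is a single counter increment, however many pages were cached for that user.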
IV. Message Queuing
Twitter uses messaging heavily. Producers create messages and put them in queues, from which they are distributed to consumers. Twitter's main function is to act as a message bridge between different formats (SMS, web, IM, etc.)
It started with DRb, which stands for Distributed Ruby: a library that lets you send and receive messages from remote Ruby objects over TCP/IP, but it proved a bit fragile
It moved to Rinda, a shared queue that uses the tuplespace model, but the queue is not persistent, so messages are lost on failure
Erlang was tried next
Finally it moved to Starling, a distributed queue written in Ruby
The distributed queue survives system crashes by writing its messages to disk. Other large sites also take this simple approach
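The idea of surviving a crash by writing the queue to disk can be sketched with a small journaled queue. This is a minimal illustration, not Starling's actual implementation or on-disk format: the class name `JournaledQueue` and the JSON-lines journal are assumptions made for this example.

```ruby
require "json"
require "tmpdir"

# Minimal sketch of a queue that journals every operation to disk so that
# pending messages survive a crash. Hypothetical; not Starling's real format.
class JournaledQueue
  def initialize(path)
    @path = path
    @items = []
    replay if File.exist?(@path) # rebuild in-memory state from the journal
    @journal = File.open(@path, "a")
  end

  # Producer side: record the message on disk, then enqueue it.
  def push(message)
    log({ "op" => "push", "msg" => message })
    @items << message
  end

  # Consumer side: record the removal, then hand out the oldest message.
  def pop
    return nil if @items.empty?

    log({ "op" => "pop" })
    @items.shift
  end

  def size
    @items.size
  end

  private

  def log(record)
    @journal.puts(JSON.generate(record))
    @journal.flush # hands the line to the OS; a real queue might also fsync
  end

  def replay
    File.foreach(@path) do |line|
      record = JSON.parse(line)
      if record["op"] == "push"
        @items << record["msg"]
      else
        @items.shift
      end
    end
  end
end

# Usage: a second instance opened on the same journal stands in for a
# process restarted after a crash.
path = File.join(Dir.mktmpdir, "queue.log")
queue = JournaledQueue.new(path)
queue.push("tweet 1")
queue.push("tweet 2")
queue.pop                          # a consumer takes "tweet 1"
recovered = JournaledQueue.new(path)
puts recovered.pop                 # prints "tweet 2"
```

The journal here only ever grows; a production queue would need to rotate or compact it, which is left out to keep the sketch short.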
V. Summary
1. The database must be properly indexed.
2. Get to know your system as quickly as possible, which requires being able to use a variety of tools flexibly.
3. Cache, cache, cache: cache everything that can be cached, and let your app fly.
