This is a sketch of the Twitter architecture, pieced together from public information on the Internet and focused mainly on caching. It may not match the actual architecture, because the author has filled in some gaps.
Some data:
- The cache is divided into page cache, fragment cache, row cache, and vector cache, each with its own cache hit rate.
- The fragment cache stores data in the various API response formats, including XML, JSON, RSS, and Atom.
- A posted tweet is first put into Kestrel, a message queue, and then processed asynchronously. Kestrel uses the memcached protocol.
- API requests: 550 requests/s.
- Posted tweets: the peak is 80 tweets/s in ordinary times. During Obama's inauguration it reached 350 tweets/s.
- The aggregator module issues several hundred memcached multi-get requests per second.
- Varnish is also used as a front-end reverse proxy in front of Ruby on Rails.
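
The write and read paths described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Twitter's code: `FakeCache`, `post_tweet`, `run_worker`, and `read_timeline` are hypothetical names, a `deque` stands in for Kestrel, and dictionaries stand in for memcached and the database. It also assumes that the vector cache holds lists of tweet ids and the row cache holds the tweet records, which the notes above do not state explicitly, and it simplifies a timeline to one user's own tweets.

```python
from collections import deque


class FakeCache:
    """In-memory stand-in for a memcached client (hypothetical, illustration only)."""

    def __init__(self):
        self.data = {}
        self.hits = 0
        self.misses = 0

    def set(self, key, value):
        self.data[key] = value

    def get_multi(self, keys):
        # Like a memcached multi-get: one call, returns only the keys present.
        found = {k: self.data[k] for k in keys if k in self.data}
        self.hits += len(found)
        self.misses += len(keys) - len(found)
        return found

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


queue = deque()             # stand-in for Kestrel
database = {}               # stand-in for persistent storage: tweet id -> row
row_cache = FakeCache()     # assumed layout: tweet id -> row
vector_cache = FakeCache()  # assumed layout: user -> tweet ids, newest first


def post_tweet(tweet_id, user, text):
    """Request path: only enqueue the tweet, then return immediately."""
    queue.append({"id": tweet_id, "user": user, "text": text})


def run_worker():
    """Asynchronous path: drain the queue and write through to the caches."""
    while queue:
        tweet = queue.popleft()
        database[tweet["id"]] = tweet
        row_cache.set(tweet["id"], tweet)
        ids = vector_cache.data.get(tweet["user"], [])
        vector_cache.set(tweet["user"], [tweet["id"]] + ids)


def read_timeline(user):
    """Read the id list from the vector cache, then multi-get the rows."""
    ids = vector_cache.get_multi([user]).get(user, [])
    rows = row_cache.get_multi(ids)
    for tweet_id in ids:
        if tweet_id not in rows:  # row cache miss: fall back to storage
            rows[tweet_id] = database[tweet_id]
            row_cache.set(tweet_id, rows[tweet_id])
    return [rows[i]["text"] for i in ids]
```

In this sketch a posted tweet does not appear in `read_timeline` until `run_worker` has drained the queue, which is the trade-off of asynchronous processing. Fetching all rows with a single `get_multi` call, rather than one lookup per tweet, is the reason the aggregator's load is counted in multi-get requests.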
References:
- QCon London 2009: Upgrading Twitter without service disruptions
- Improving running components at Twitter (PDF slides)