Few websites have had performance as worrying as Twitter's, and seeing the big whale always leaves one feeling helpless. At Velocity 2009, Twitter operations expert John Adams gave a technical talk titled "Fixing Twitter" (PDF). John Adams joined Twitter in July 2008 and has done a lot of work on the stability of the Twitter site.
Duties of the Twitter operations team:
- Software performance (back-end)
- Availability
- Capacity planning (metrics-driven)
- Configuration management
Reading through this PDF of nearly 50 pages does more than satisfy a bit of our technical curiosity; we may also learn something from it.
Do not reinvent the wheel
For monitoring, Twitter uses rrdtool, Ganglia, and MRTG. These have become standard components for many websites, and using them avoids writing a lot of duplicate functionality yourself. It is worth noting that Twitter has been using Google Analytics for business analysis.
Not reinventing the wheel still leaves room to polish it, for example by customizing scripts for specific functions.
Invent the wheels that do not exist yet
Twitter open-sourced an Apache module called mod_memcache_block (a distributed IP blocking system).
It limits how frequently clients can make requests. Anyone familiar with Twitter will know that this is a necessary feature given its third-party applications. Without it, their traffic can produce DDoS-like effects.
John Adams said that this module is something he had been looking forward to for many years. I believe that if someone else had already built the same thing, Twitter would not have written another one.
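To make the idea of limiting access frequency concrete, here is a minimal sketch in Python. It is not how mod_memcache_block is implemented; it is a simple fixed-window counter per IP, and the class name, parameters, and in-memory dictionary are all assumptions made for illustration. A distributed setup would keep the counters in a shared store such as memcached so that every web server sees the same counts.

```python
import time


class FixedWindowLimiter:
    """Hypothetical limiter: refuse a client that exceeds max_requests per window."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        # ip -> [window_start, request_count]. In-memory only; a shared
        # store would be needed for this to work across several servers.
        self.counters = {}

    def allow(self, ip):
        now = self.clock()
        entry = self.counters.get(ip)
        # Start a fresh window for new clients or when the old one expired.
        if entry is None or now - entry[0] >= self.window_seconds:
            entry = [now, 0]
            self.counters[ip] = entry
        entry[1] += 1
        return entry[1] <= self.max_requests


if __name__ == "__main__":
    limiter = FixedWindowLimiter(max_requests=2, window_seconds=60)
    for _ in range(3):
        print(limiter.allow("192.0.2.10"))
```

A fixed window is the simplest choice; it allows short bursts at window boundaries, which a sliding window or token bucket would smooth out.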
Automate as much as possible
Whether it is configuration management or the "switches" for various features, automate as much as possible. Relying on people to control things may make them easy to "standardize", but the process becomes redundant and the pace slow.
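As a rough illustration of a feature "switch" driven by configuration rather than by hand, consider the sketch below. The talk does not describe Twitter's actual mechanism, so the JSON format, the `FeatureSwitches` class, and the `handle_search` function are all hypothetical names chosen for this example.

```python
import json


class FeatureSwitches:
    """Hypothetical switch table loaded from a JSON config document."""

    def __init__(self, config_text):
        self.flags = json.loads(config_text)

    def enabled(self, name):
        # Unknown switches default to off, so a typo cannot enable a feature.
        return bool(self.flags.get(name, False))


def handle_search(switches, query):
    # The code path checks the switch on every request, so turning a
    # feature off is a config change rather than a code deployment.
    if not switches.enabled("search"):
        return "search is temporarily disabled"
    return "results for " + query


if __name__ == "__main__":
    switches = FeatureSwitches('{"search": false, "trends": true}')
    print(handle_search(switches, "velocity"))
```

Because the switch state lives in a config document, the same tooling that manages the rest of the configuration can push the change to every server.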
Better understanding of hardware
Embracing new technology and using more cost-effective hardware (such as 8-core CPUs) brings better returns. This has to rest on a correct understanding of the hardware.
Remember the following words:
- Disk is the new tape. (Memory is the new disk; disk is the new tape.)
- Kill long-running queries before they kill you. (Effective monitoring!)
- Use metrics to make decisions, not guesses.
- "Cache everything!" is not the best policy.
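The advice about long-running queries can be sketched as a small check that a monitoring job might run periodically. How running queries are listed and killed depends on the database, so the `RunningQuery` record and the `queries_to_kill` function below are assumptions for illustration only; the snapshot would come from whatever process list your database exposes.

```python
from dataclasses import dataclass


@dataclass
class RunningQuery:
    """Hypothetical snapshot row describing one in-flight query."""
    query_id: int
    seconds_running: float
    sql: str


def queries_to_kill(running, max_seconds):
    """Return the ids of queries that have run longer than max_seconds."""
    return [q.query_id for q in running if q.seconds_running > max_seconds]


if __name__ == "__main__":
    snapshot = [
        RunningQuery(1, 2.0, "SELECT 1"),
        RunningQuery(2, 95.0, "SELECT * FROM big_table"),
    ]
    # The monitoring job would then issue the database's own kill
    # command for each returned id and record the event as a metric.
    print(queries_to_kill(snapshot, max_seconds=30))
```

Keeping the decision in a pure function makes the threshold easy to test and to tune from measured query times rather than guesses.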
Maybe you should learn more...
-- EOF --