WikiPedia Technical Architecture learning and sharing (Introduction)

Source: Internet
Author: User
Tags mediawiki wiki software

Wikipedia ( ranks eighth among the top 10 websites in the world. This is the power of openness.

Point direct data:

  • The peak value is 30 thousand "> HTTPRequest
  • 3 GB per secondBitTraffic, almost375 MB
  • 350 PCsServer (Data Source)


The architecture is as follows:
Copy @ Mark Bergsma


In my Blog on these website architectures, what is the first appearance of GeoDNS? "A 40-line patch for BIND to add geographical filters support to the existent views in BIND" to bring the user to the nearest server. GeoDNS's responsibility in the WikiPedia architecture is, of course, determined by the content nature of WikiPedia-oriented to various countries and regions.

Server Load balancer: LVS

WikiPedia uses LVS for load balancing. It is a project initiated by Dr. Zhang wenxiao and is proud of the few Chinese people in the Open Source Field. An old LVS maintenance problem is monitoring. Wikipedia technicians use pybal.

Image server: Lighttpd

Lighttpd is now configured as a quasi-standard image server. Not much.

Wiki software: MediaWiki

The MediaWiki application layer is optimized to the extreme. Use a relatively small overhead method to locate code hotspots. For more information, see real-time performance reports. See the figure tree to see where the bottleneck is. Another important experience is to discard complex algorithms, expensive queries, and MediaWiki features that may bring excessive overhead.

Cache! Cache! Cache!

The first key factor for successful Wikipedia websites is Cache. CDN (actually called Cache) distributes content to different continents and uses Squid as the reverse proxy. The database Cache uses Memcached and 30 servers, each of which is 2 GB. Cache is used as much as possible for all possible data, but they also remind that the Cache overhead is not always the smallest. It is used as much as possible, but cannot be used excessively.

Database: MySQL

The DB used by MediaWiki is a common extension solution of MySQL. MySQL in Web 2.0 technology. They are also using it. Replication, read/write splitting... load balancing of applications on the DB is achieved through LoadBalancer. php, which can be a good reference.

For websites like this, WikiPedia spends $2 million a year, with only 6 technicians, which is incredibly efficient.


Wikimedia architecture (PDF)
Todd Hoff's article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.