Learn about the technologies behind Instagram

Address: http://www.infoq.com/cn/news/2012/05/instagram

Instagram, the well-known mobile photo-sharing app that Facebook acquired for $1 billion, has recently attracted a great deal of attention. In less than a month, the Android version has been downloaded more than 10 million times, and the total number of users is about to exceed 50 million. Mike Krieger, co-founder of Instagram, said they spent eight weeks building the first version of Instagram, but the current system is certainly very different. The Instagram engineering team once published an article about the technology behind Instagram, and Mike Krieger recently shared more details in a talk titled "Scaling Instagram", explaining how five engineers supported the entire system.

The process of uploading a photo is as follows:

    1. Write the data to the media database synchronously.
    2. If the photo has a location tag, submit it to Solr asynchronously for indexing.
    3. Add the photo ID to each recipient's feed list, which is stored in Redis.
    4. When a feed is displayed, take a small slice of photo IDs from the list and look them up in memcached.
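
The four steps above can be sketched roughly as follows. This is only an illustration, not Instagram's code: plain Python containers stand in for the media database, Redis, memcached, and the Solr indexing queue, and the names `upload_photo`, `render_feed`, and `FEED_PAGE_SIZE` are hypothetical. Falling back to the media database on a cache miss is also an assumption; the article only says the IDs are looked up in memcached.

```python
from collections import defaultdict, deque

# In-memory stand-ins (assumptions) for the real stores named in the article.
media_db = {}                # stands in for the media database
feeds = defaultdict(deque)   # stands in for per-user feed lists kept in Redis
cache = {}                   # stands in for memcached
index_queue = []             # stands in for the asynchronous Solr indexing queue

FEED_PAGE_SIZE = 3           # hypothetical number of photos shown per feed page


def upload_photo(photo_id, owner, followers, location=None):
    """Steps 1-3: store the photo, queue indexing, fan out to followers."""
    # Step 1: synchronous write to the media database.
    media_db[photo_id] = {"id": photo_id, "owner": owner, "location": location}
    # Step 2: only photos with a location tag are queued for indexing.
    if location is not None:
        index_queue.append(photo_id)
    # Step 3: push the photo ID onto the front of every recipient's feed list.
    for follower in followers:
        feeds[follower].appendleft(photo_id)


def render_feed(user):
    """Step 4: take a small slice of IDs and resolve them through the cache."""
    photo_ids = list(feeds[user])[:FEED_PAGE_SIZE]
    photos = []
    for photo_id in photo_ids:
        if photo_id not in cache:
            # Assumed cache-miss behaviour: load from the database and cache it.
            cache[photo_id] = media_db[photo_id]
        photos.append(cache[photo_id])
    return photos
```

Because the fan-out happens at upload time, displaying a feed only needs a list slice and a handful of cache lookups, which keeps the read path cheap.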

Instagram's design philosophy is simplicity: optimize to minimize the operations burden and monitor everything. The core principle is to keep things simple, never reinvent the wheel, and use proven, stable, reliable technologies wherever possible.

Because there are only five engineers (of whom only 2.5 work on the backend) and their capacity is limited, Amazon's cloud services are a good choice. They currently run more than 100 EC2 instances providing various services. The operating system is Ubuntu 11.04; some earlier versions were not stable enough under high traffic. For load balancing they use Amazon's Elastic Load Balancer (ELB), with three nginx instances running behind it. SSL is terminated at the ELB, which reduces the CPU load on nginx. DNS and CDN are provided by Route 53 and CloudFront respectively, and all photos are stored in S3, which currently holds several terabytes of data.

The application servers that handle requests run on Amazon High-CPU Extra-Large instances. Because their requests are more CPU-intensive than memory-intensive, this instance type gives a better balance of CPU and memory. The development framework is Django, and the WSGI server is gunicorn. Using Fabric, they deploy to all machines in parallel in only a few seconds.

Most data is stored in PostgreSQL. The sharded master cluster runs on 12 High-Memory Quadruple Extra-Large instances, with 12 replicas located in different zones that are kept in sync through streaming replication managed with repmgr. Because Elastic Block Store (EBS) does not offer high disk IOPS, all of the working-set data must be held in memory, and vmtouch helps manage which data is in memory. They use mdadm on top of EBS to build software RAID and improve write throughput. The database file system is XFS, which allows the RAID array to be frozen while a snapshot is taken, ensuring that the snapshot is consistent.

When applications connect to the database, pgbouncer provides connection pooling. Instagram's data is currently sharded by user ID. Because some shards could outgrow the capacity of a physical node, they divide the data into many logical shards and map them onto a smaller number of physical nodes. When a node fills up, some of its logical shards can be moved to another node to relieve the pressure. As the data volume grows, they plan to partition vertically as well. Django's DB router makes all of this easier.
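
A minimal sketch of the logical-to-physical shard mapping described above. The shard count, the node names `db-a`, `db-b`, and `db-c`, and the modulo rule for picking a logical shard are all assumptions made for illustration; the article does not say how Instagram assigns users to logical shards.

```python
NUM_LOGICAL_SHARDS = 8  # hypothetical; a real deployment would use far more

# Many logical shards map onto a few physical nodes (node names are made up).
shard_to_node = {
    shard: ("db-a" if shard < 4 else "db-b")
    for shard in range(NUM_LOGICAL_SHARDS)
}


def logical_shard(user_id):
    """Pick the logical shard for a user (assumed rule: user ID modulo count)."""
    return user_id % NUM_LOGICAL_SHARDS


def node_for_user(user_id):
    """Resolve the physical node that currently hosts the user's logical shard."""
    return shard_to_node[logical_shard(user_id)]


def move_shard(shard, new_node):
    """Repoint one logical shard at another node.

    Only the mapping changes here; copying the shard's data to the new node
    is outside the scope of this sketch.
    """
    shard_to_node[shard] = new_node
```

The point of the indirection is that a user's logical shard never changes, so relieving a full node only requires moving whole logical shards and updating the mapping, not re-hashing every user.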

Instagram also uses Redis extensively to store complex objects (of bounded size) for the main feed, the activity feed, the session system, and other related systems. Because all Redis data must fit in memory, Redis also runs on High-Memory Quadruple Extra-Large instances, and the data is partitioned across them. A Redis instance gradually becomes a bottleneck once it reaches about 40,000 requests per second, so Redis also uses master-slave replication. The replicas regularly dump their data to disk, and EBS snapshots are used for backups.

In addition to Redis, they use memcached for caching. Six instances are currently running, and the application servers connect to them through pylibmc and libmemcached. Although Amazon offers the ElastiCache service, it is not cheap, and running their own memcached instances is more cost-effective. The asynchronous task queue uses Gearman, with about 200 worker processes currently handling various tasks, such as sharing photos to Twitter and Facebook and notifying users of new photos. pyapns has handled billions of push notifications and has been very stable. They also developed node2dm, a Node.js-based service, to send push notifications to Android devices.
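
The task-queue pattern described above can be illustrated with the standard library alone. This sketch does not use Gearman; threads and `queue.Queue` stand in for the job server and worker processes, and the function name `run_workers` and the job kinds are hypothetical.

```python
import queue
import threading


def run_workers(jobs, num_workers=4):
    """Process (kind, photo_id) jobs on a pool of worker threads."""
    tasks = queue.Queue()
    done = []
    lock = threading.Lock()

    def worker():
        while True:
            job = tasks.get()
            if job is None:
                # Sentinel: no more work for this worker.
                return
            kind, photo_id = job
            # A real worker would call Twitter, Facebook, or a push service here.
            with lock:
                done.append(f"{kind}:{photo_id}")

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for job in jobs:
        tasks.put(job)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    # Completion order is not deterministic, so return a sorted list.
    return sorted(done)
```

The request handler only enqueues a job and returns immediately; slow external calls happen in the workers, so they never hold up a user-facing request.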

For monitoring, Instagram uses Munin to graph the running state of the whole system, and writes custom plug-ins with python-munin to display business metrics. The network daemon statsd collects and aggregates data in real time. dogslow watches running processes; when it finds one that has been running for too long, it saves a snapshot of the process for later analysis. For example, requests that take longer than 1.5 seconds usually turn out to be stuck in memcached's set() and get_many() methods. For Python errors, hooking up Sentry is enough to receive error reports in real time.
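
As a rough illustration of how an application reports data to statsd, the sketch below formats counter and timing metrics in statsd's plain-text line format and sends them over UDP. The metric names are hypothetical, and the host and port are assumptions (8125 is the customary statsd default); a real application would more likely use an existing statsd client library.

```python
import socket


def counter(name, value=1):
    """Format a counter metric, e.g. 'photos.uploaded:1|c'."""
    return f"{name}:{value}|c"


def timing(name, millis):
    """Format a timing metric in milliseconds, e.g. 'feed.render:320|ms'."""
    return f"{name}:{millis}|ms"


def send(metric, host="127.0.0.1", port=8125):
    """Send one metric line over UDP; no reply is expected or awaited."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(metric.encode("ascii"), (host, port))
```

For example, `send(counter("photos.uploaded"))` after each upload and `send(timing("feed.render", 320))` after rendering a feed would give the daemon the raw data it aggregates. Because the transport is UDP, reporting adds almost no latency to the request being measured.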

HighScalability also distilled some useful lessons from Mike Krieger's talk, such as:

    • Pick technologies and tools you are familiar with, and try simple solutions first.
    • Do not use two tools for the same job.
    • Prepare a degradation plan in advance so that load can be shed when needed.
    • Don't over-optimize or expect to know in advance how the site will need to scale. For a start-up social website, there are few scalability problems that cannot be solved.
    • If one approach doesn't work, move on to the next.

If you want to learn more about the technical details behind Instagram, visit the blog of its technical team.
