[Architecture] Instagram

Source: Internet
Author: User
Tags: elastic load balancer

 

# Design Philosophy

Simplicity: minimize the operations and maintenance (O&M) workload, and monitor everything.

 

# Core Principles

Keep it simple, do not reinvent the wheel, and prefer proven, stable, and reliable technology.

 

# Instagram Workflow

-Write the data to the media database synchronously.

-If a photo contains a location tag, the photo is submitted to Solr asynchronously for indexing.

-Push the photo ID onto the feed list of each of the creator's followers; these lists are stored in Redis.

-When the feed is rendered, take a small number of photo IDs and look them up in memcached.
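
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: plain dictionaries and lists stand in for the media database, Solr, Redis, and memcached, and the names `post_photo`, `render_feed`, and `recipients` are hypothetical, not taken from Instagram's code.

```python
from collections import defaultdict, deque

# In-memory stand-ins (assumptions for illustration, not real clients).
media_db = {}                    # stands in for the media database
index_queue = deque()            # photo IDs waiting for asynchronous Solr indexing
feed_lists = defaultdict(list)   # stands in for the per-user ID lists in Redis
cache = {}                       # stands in for memcached


def post_photo(photo_id, owner, recipients, location=None):
    """Write path for one new photo."""
    photo = {"id": photo_id, "owner": owner, "location": location}
    media_db[photo_id] = photo                 # step 1: synchronous write
    if location is not None:
        index_queue.append(photo_id)           # step 2: indexing is deferred
    for user in recipients:
        feed_lists[user].insert(0, photo_id)   # step 3: newest ID first
    return photo


def render_feed(user, page_size=2):
    """Read path: take a few IDs and resolve them through the cache."""
    photos = []
    for photo_id in feed_lists[user][:page_size]:   # step 4: small slice of IDs
        photo = cache.get(photo_id)
        if photo is None:                            # cache miss: fall back to the database
            photo = media_db[photo_id]
            cache[photo_id] = photo
        photos.append(photo)
    return photos
```

Only the ID lists are touched when a photo is posted; the photo records themselves are fetched lazily, and only for the slice of the feed that is actually displayed.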

 

# General

-Extensive unit tests and functional tests

-Keep it DRY (don't repeat yourself)

-Loose coupling using notifications/signals

-Do most of our work in Python, drop to C when necessary

-Frequent code reviews and pull requests to keep things in the "shared brain"

-Extensive monitoring
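
As an illustration of loose coupling through signals, here is a minimal stand-alone sketch. Django provides its own signal dispatcher; the `Signal` class below is not that implementation, and the signal and receiver names (`photo_posted`, `notify_followers`, `share_to_twitter`) are hypothetical.

```python
class Signal:
    """A tiny publish/subscribe hub: the sender never imports its receivers."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        """Register a callable; returns it so this works as a decorator."""
        self._receivers.append(receiver)
        return receiver

    def send(self, **payload):
        """Call every receiver with the payload and collect the return values."""
        return [receiver(**payload) for receiver in self._receivers]


photo_posted = Signal()  # hypothetical signal name


@photo_posted.connect
def notify_followers(photo_id, **_):
    return f"notify:{photo_id}"


@photo_posted.connect
def share_to_twitter(photo_id, **_):
    return f"tweet:{photo_id}"
```

The code that posts a photo only calls `photo_posted.send(photo_id=...)`; new reactions can be added by connecting another receiver, without changing the sender.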

 

# Architecture

-Amazon Web Service

-AWS EC2: Ubuntu 11.04

-Load balancing: Amazon Elastic Load Balancer (ELB)

Three nginx instances run behind the ELB. SSL is terminated at the ELB, which reduces the CPU load on nginx.

-DNS/CDN: Amazon Route 53/CloudFront

-Picture storage: AWS S3

-DB: PostgreSQL

 

# Application Server

-AWS EC2: Amazon High-CPU Extra-Large instances

-Web Framework: Django

-WSGI server: Gunicorn

-Parallel deployment: Fabric
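
For readers unfamiliar with WSGI, the sketch below shows the kind of callable a WSGI server such as Gunicorn loads. It is a bare illustration, not Instagram's application: a Django project exposes an equivalent callable from its own WSGI module, and the response body here is made up.

```python
def application(environ, start_response):
    """Minimal WSGI callable: receives the request environ, returns body chunks."""
    body = b"ok\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Saved as a hypothetical `app.py`, this could be served with `gunicorn app:application`; the number of worker processes is set with Gunicorn's `--workers` option.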

 

# Redis & memcached

Redis is used extensively to store complex objects (of limited size) for the main feed, the activity feed, the session system, and other related systems. Because all Redis data is held in memory, it runs on High-Memory Quadruple Extra-Large instances, and the data is sharded across instances. At around 40,000 requests per second a Redis instance gradually becomes a bottleneck, so Redis also runs with master-slave replication. The replicas frequently dump their data to disk, and those dumps are backed up with EBS snapshots.
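
The source does not say how the data is split across instances. As a minimal sketch of one common approach, assuming a stable hash of the key modulo the number of shards, the hypothetical `ShardedStore` below uses plain dictionaries in place of Redis instances.

```python
import zlib


def shard_for(key, shard_count):
    """Map a key to a shard index using CRC32, which is stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


class ShardedStore:
    """Routes each key to one of several backends (dicts stand in for Redis)."""

    def __init__(self, shard_count):
        self.shards = [dict() for _ in range(shard_count)]

    def set(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key, default=None):
        return self.shards[shard_for(key, len(self.shards))].get(key, default)
```

A stable hash matters here: Python's built-in `hash()` for strings is randomized per process, so it would route the same key to different shards on different application servers. Note that simple modulo sharding moves most keys when the shard count changes.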

In addition to Redis, they also use memcached for caching. Currently, six instances are running, and the application servers connect to them through pylibmc and libmemcached. Amazon offers the ElastiCache service, but it is not cheap; running your own memcached instances is more cost-effective.

 

# Task queue

The asynchronous task queue uses Gearman. About 200 worker processes currently handle tasks such as sharing photos to Twitter and Facebook and notifying users of new photos.
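
The queue-and-workers pattern can be sketched with the standard library alone. This is an in-process stand-in, not Gearman: Gearman is a separate job server with worker processes, whereas the hypothetical `run_workers` helper below uses threads and a local queue purely to show the shape of the pattern.

```python
import queue
import threading


def run_workers(tasks, handler, worker_count=4):
    """Process every task with a pool of worker threads and return the results."""
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task = pending.get_nowait()
            except queue.Empty:
                return                      # queue drained: this worker exits
            outcome = handler(task)
            with lock:                      # protect the shared results list
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

The results arrive in completion order, not submission order, which is why callers of a real task queue should not rely on ordering between independent jobs.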

 

# Push notifications

pyapns has handled billions of push notifications and has proven very stable. They also developed node2dm, a Node.js-based server, to send push notifications to Android devices.

 

# Monitoring

-Munin and Python-Munin graph the system status.

-statsd, a network daemon, collects and aggregates data in real time.

-Dogslow watches running processes. When it finds a request that has been running for too long, it saves a snapshot of the process for later analysis. For example, requests that take longer than 1.5 seconds are usually stuck in memcached's set() and get_many() methods.

-For Python errors, Sentry reports the error information in real time.
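
For the real-time metrics daemon (statsd), the application side is very small: each metric is a short text line sent over UDP. The sketch below builds counter and timer lines in the statsd wire format; the helper names and the metric names in the usage note are hypothetical, and the host and port defaults assume a statsd daemon listening locally on its usual port, 8125.

```python
import socket


def format_counter(name, value=1):
    """Build a statsd counter line: '<name>:<value>|c'."""
    return f"{name}:{value}|c".encode("ascii")


def format_timer(name, millis):
    """Build a statsd timer line: '<name>:<millis>|ms'."""
    return f"{name}:{millis}|ms".encode("ascii")


def send_metric(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; the caller never waits for the daemon."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

For example, `send_metric(format_counter("photos.posted"))` would record one event, and `send_metric(format_timer("feed.render", 120))` a 120 ms timing. Because UDP is connectionless, a slow or absent daemon does not slow down request handling.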

 


 

 

 

 

 

 
