A Brief Introduction to the Architecture of a Photo Album Website

Source: Internet
Author: User
Tags: database sharding

Our team is developing a photo album product and hopes to create a storage-centered photo album service.

The basic architecture of the product is as follows:

1. Load balancing: LVS + keepalived

● High load capacity

The logic of LVS is very simple: it only distributes requests at layer 4 of the network and generates no traffic of its own. It handles higher concurrency than nginx, and the default configuration can support up to 100,000 concurrent connections.

● Scalability

When the service load increases, real servers are added behind LVS to achieve higher throughput. The overhead grows only linearly and service quality does not degrade.

● High availability

Keepalived checks the health of the servers in the pool and automatically fails over between load balancer instances, which keeps the LVS load balancer itself highly available. If a real server behind LVS stops serving because of an upgrade or for any other reason (for example, an online upgrade of nginx), LVS as a whole keeps serving clients without interruption.

● Low cost

Hardware load balancers such as F5 BIG-IP and NetScaler cost tens of thousands or even hundreds of thousands of yuan. An LVS setup needs, at a minimum, two ordinary servers.

★How do we keep sessions?

To keep the servers scalable, our web servers are stateless: any web server can be replaced by any other web server.

We use an encrypted cookie-based session to maintain the user's login state. (Of course, LVS's session persistence mechanism could also solve this problem.)
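As a rough illustration of a cookie-based session that any stateless web server can verify, here is a minimal sketch. It is not the team's implementation: it only signs the cookie with HMAC-SHA256 (tamper-proof, but readable by the client), whereas the encrypted session described above would additionally encrypt the payload with a cipher. SECRET_KEY, encode_session and decode_session are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical shared secret; it must be identical on every web server.
SECRET_KEY = b"change-me"


def encode_session(data, secret=SECRET_KEY):
    """Serialize session data and append an HMAC-SHA256 signature."""
    raw = json.dumps(data, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw)
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return (payload + b"." + signature.encode("ascii")).decode("ascii")


def decode_session(cookie, secret=SECRET_KEY, now=None):
    """Return the session dict, or None if the cookie is forged,
    malformed or expired."""
    try:
        payload, signature = cookie.encode("ascii").rsplit(b".", 1)
    except ValueError:  # no separator, or non-ASCII characters
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking the signature byte by byte.
    if not hmac.compare_digest(signature, expected.encode("ascii")):
        return None
    try:
        data = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:
        return None
    current = time.time() if now is None else now
    if data.get("expires", 0) < current:
        return None
    return data
```

Because the cookie carries everything needed to validate it, no server has to remember the session, which is what allows any web server to handle any request.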

2. RPC mechanism: Ice + protobuf + zookeeper

● Ice is cross-platform and language neutral, and supports mainstream languages such as Java, C++, and Python. It supports multiple protocols (TCP/UDP/SSL), has built-in cluster management, supports both synchronous and asynchronous calls, and has complete documentation. Ice also has some restrictions. Because it is language neutral, a method cannot return null directly; return an empty object instead. Method overloading is not supported, so just pick another method name (^_^). Runtime exceptions thrown by an Ice server cannot be caught by the client; declare the exceptions explicitly or return a custom code/msg pair to work around this.

● ZooKeeper is a decentralized distributed cluster with a watcher mechanism for monitoring data changes. It can be used for distributed service configuration and service registration.

● Protobuf compacts data and encodes and decodes it efficiently, which reduces the footprint in network transmission and in the cache layer. Protobuf does not natively support maps, so they have to be implemented with a custom struct; Thrift supports complex structures such as maps.

★Why Ice instead of Thrift?

Because our team is familiar with Ice and has used it successfully before.

We will consider Thrift in future improvements. It integrates seamlessly with Tornado and makes good use of Tornado's asynchronous features.

3. Image Processing Service: Python + tornado

Because Java is inefficient at image compression, we use Python's PIL (Python Imaging Library) to compress images.

Tornado is the open-source web framework behind FriendFeed, which was acquired by Facebook. It is a non-blocking server and is fast. Thanks to its non-blocking design and its use of epoll, Tornado can handle thousands of connections per second, which makes it an ideal framework for real-time web services.

4. Use caching to improve system performance (every millisecond saved counts)

● Rely on tools and data; design for monitoring and measurement.

(We use Spring dynamic proxies to intercept method calls and log them, so that we can monitor the time spent in each method while a request is processed and identify performance bottlenecks.)

● Use batch operations wherever possible to fetch all the required data in one go.

(For example, use JDBC batch operations for the database, and the batch interfaces or pipelining for the Redis cache.) On the feed and album list pages, merging hundreds of requests into a few batched requests cut the network I/O overhead from 1 second to 10 ms, a 100-fold improvement.

● Split cache granularity to get the best cache utilization.

(Cache image data separately from image view counts, and cache an album's image ID list separately from the image objects.)

● Fetch only the data that will be displayed.

(For example, when the album list page shows four pictures for each album, there is no need to fetch the view counts of those four pictures. This avoids unnecessary requests and time overhead.)

● Short-circuit logic.

If every requested item hits the cache, return the data directly. Only when nothing is returned, or fewer items are returned than were requested, fall back to loading the data from the database.

● Prioritize operations.

Primary operations take priority. Secondary or long-running operations are handed to the backend for execution through a message queue.

(For example, when an image is stored, the running count of stored original images has to be updated.)

● Increase the parallel portion of the program.

According to Amdahl's law, the speedup gained from concurrency is limited by the portion of the work that must run serially. (For example, when images are transferred to OSS, a thread pool issues concurrent requests to the cloud storage service to copy the image files.)

● Build a multi-level cache.

The page layer can be split into several JSON requests that are cached segment by segment, and the service layer can cache the assembled view data. (Because current performance basically meets the target and caches carry a maintenance cost, this will be implemented last.)
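The batch-operation and short-circuit points in the list above can be sketched together as follows. This is an illustration only: FakeCache is an in-memory stand-in for a cache client with a batch read (comparable to Redis MGET), and load_photos, the photo: key prefix and the load_from_db callback are hypothetical names, not the project's code.

```python
class FakeCache:
    """In-memory stand-in for a cache client with batch operations."""

    def __init__(self):
        self.store = {}
        self.round_trips = 0  # counts simulated network round trips

    def get_many(self, keys):
        self.round_trips += 1  # one round trip, however many keys
        return [self.store.get(key) for key in keys]

    def set_many(self, mapping):
        self.round_trips += 1
        self.store.update(mapping)


def load_photos(photo_ids, cache, load_from_db):
    """Fetch photos in one batch and query the database only for misses."""
    keys = ["photo:%d" % pid for pid in photo_ids]
    result = dict(zip(photo_ids, cache.get_many(keys)))
    missing = [pid for pid, value in result.items() if value is None]
    if not missing:
        return result  # short circuit: every key hit the cache
    loaded = load_from_db(missing)  # one batched query for the misses only
    cache.set_many({"photo:%d" % pid: row for pid, row in loaded.items()})
    result.update(loaded)
    return result
```

However many photos a page needs, a warm cache costs one round trip and a cold one costs three (read, database query, write-back), instead of one round trip per photo.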

After these simple optimizations, basically all requests complete within 50 ms, and 90% of requests respond within 20 ms.

Summary: all the optimization measures above are at the code level. In fact, optimization should first consider product requirements, business processes, and system architecture, and only then optimize the frequently called, time-consuming operations identified from actual online monitoring logs.
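To find those frequently called, time-consuming operations, each method's elapsed time has to be recorded somewhere. The article does this with Spring dynamic proxies in Java; the sketch below shows the same idea with a Python decorator, purely as an illustration. TIMINGS, timed, slowest, load_album and render_page are hypothetical names.

```python
import functools
import time
from collections import defaultdict

# Method name -> list of elapsed times in milliseconds.
TIMINGS = defaultdict(list)


def timed(func):
    """Record how long each call to func takes, even when it raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            TIMINGS[func.__name__].append(elapsed_ms)
    return wrapper


def slowest(n=3):
    """Return up to n method names ordered by total time, slowest first."""
    totals = {name: sum(samples) for name, samples in TIMINGS.items()}
    return sorted(totals, key=totals.get, reverse=True)[:n]


@timed
def load_album(album_id):
    time.sleep(0.02)  # stands in for a database or RPC call
    return {"album_id": album_id}


@timed
def render_page(album_id):
    return "<album %d>" % load_album(album_id)["album_id"]
```

Calling render_page once records one sample for each of the two methods; in production the samples would go to a log or metrics system rather than to a module-level dictionary.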

★Redis

● Redis is configured without persistence. Redis persistence can momentarily drive the CPU to 100%, which causes high latency.

● Redis lists do not support LRU eviction (or anything like MongoDB's capped collections). One option is the redis.fpush command that Huaban implemented themselves.

● When the database has no content for a key, requests keep penetrating the Redis cache layer and querying MySQL. To prevent this, @ Xiao Jianshi invented emptyable_list: even if the list returned by the database is empty, a flag is still written to Redis, indicating that the data has already been loaded from the database and does not need to be queried again.

● To avoid an avalanche effect when cache entries expire, take a distributed lock (implemented with memcached or ZooKeeper) while loading data, or load only the data that needs to be displayed. A request thread that cannot obtain the lock can return immediately. Even if the user sees no data on the first visit, the page will be normal on the next click.

● We look forward to the cluster feature in Redis 3.0.
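The empty-result flag and the load lock described in the list above can be sketched as follows. This is an assumption-laden illustration, not emptyable_list itself: a plain dict stands in for Redis, per-key threading.Lock objects stand in for the distributed lock, and ListCache, EMPTY and load_from_db are hypothetical names.

```python
import threading

# Hypothetical marker meaning "the database had no rows for this key".
EMPTY = "__empty__"


class ListCache:
    """Read-through cache that remembers empty results and lets only
    one thread load a given key at a time."""

    def __init__(self, load_from_db):
        self.load_from_db = load_from_db
        self.store = {}  # stand-in for Redis
        self.locks = {}  # stand-in for per-key distributed locks
        self.guard = threading.Lock()

    def _lock_for(self, key):
        with self.guard:
            return self.locks.setdefault(key, threading.Lock())

    def get(self, key):
        value = self.store.get(key)
        if value is not None:
            return [] if value == EMPTY else value
        lock = self._lock_for(key)
        if not lock.acquire(blocking=False):
            # Another thread is already loading this key: return at once
            # instead of piling more queries onto the database.
            return []
        try:
            value = self.store.get(key)  # a loader may have just finished
            if value is None:
                rows = self.load_from_db(key)
                value = rows if rows else EMPTY  # cache the empty result too
                self.store[key] = value
            return [] if value == EMPTY else value
        finally:
            lock.release()
```

With the marker in place, a key that has no rows costs one database query in total rather than one per request.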

5. Database Design

User information is not split into multiple tables.

Sharded data is distributed across databases and tables by a CRC32 hash of the user ID.

★How do we handle the uneven table data distribution caused by celebrity users with huge numbers of followers?

After a user's followers are sharded by user ID, they are sharded a second time: user -> user_followers_shard_map -> user_followers_data.

For details, see the Pinterest database sharding architecture.
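A minimal sketch of the two sharding levels, under stated assumptions: the shard counts are invented, and the second level here is computed with another hash, whereas the user_followers_shard_map in the article is presumably a stored lookup table. NUM_SHARDS, FOLLOWER_BUCKETS and the function names are hypothetical.

```python
import zlib

NUM_SHARDS = 16       # hypothetical number of physical shards
FOLLOWER_BUCKETS = 8  # hypothetical number of buckets per follower list


def crc32_of(value):
    """CRC32 of the value's decimal/text form as an unsigned 32-bit int."""
    return zlib.crc32(str(value).encode("utf-8")) & 0xFFFFFFFF


def user_shard(user_id):
    """First level: map a user ID to a shard."""
    return crc32_of(user_id) % NUM_SHARDS


def follower_shard(user_id, follower_id):
    """Second level: split one user's followers into buckets and place
    each bucket on its own shard, so that a celebrity's follower list
    does not pile up in a single table."""
    bucket = crc32_of(follower_id) % FOLLOWER_BUCKETS
    return crc32_of("%s:%s" % (user_id, bucket)) % NUM_SHARDS
```

For example, user_shard(123456789) is 6, because the CRC32 of the text "123456789" is the standard check value 0xCBF43926 and 0xCBF43926 mod 16 is 6. Note that changing NUM_SHARDS remaps most keys, so a real deployment needs a migration plan or a lookup table.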

6. Write maintainable code: keep the architecture simple

● Master the basic object-oriented design principles.

See "10 object-oriented design principles that Java programmers should understand": http://www.iteye.com/news/24488

● Refactor continuously to keep the code maintainable.

Team members review each other's code and refactor complicated or obscure logic.

● Testability.

Actively write reliable unit tests to improve software quality.

I would like to thank @ Xiao Jianshi for promoting unit testing and sticking with it.

● Coding conventions.

Use concise method names and standard naming rules. They make the code logic clearer and the code easier to understand. For more information, see Code Complete, Refactoring, and Clean Code.

Case studies:

● Originally, the logic for interacting with Redis was coupled with the cache business logic, so we split them apart. xxxredisdao handles the code that interacts with Redis, which simplifies class responsibilities; xxxcacheddao is only responsible for the logic between the cache and the database.

● Originally, each Ice service layer assembled a complex business view. For example, to display an image, it also had to fetch the user's nickname, profile picture, and other business data related to the image. After the rework, the service layer returns only basic business data. For example, a feed item contains only the photo_id field; the final photo information is assembled in the web layer by calling other services, so business views can be reused or customized for different presentation needs.

● Use Jafka messages to distribute business events and remove tight coupling. Examples: initializing the settings of a new user, or sending a feed item to friends when an image is uploaded.

● Encapsulate functionality in utility classes or internal classes and prefer composition, to achieve componentization and modularization. The overall architecture becomes clearer and more reasonable.
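The event-based decoupling described above can be illustrated with a tiny in-process publish/subscribe sketch. It only shows the shape of the idea: a queue.Queue and a worker thread stand in for a broker such as Jafka, and EventBus, its methods and the photo_uploaded topic are hypothetical names, not Jafka's API.

```python
import queue
import threading


class EventBus:
    """In-process stand-in for a message broker."""

    def __init__(self):
        self.handlers = {}
        self.events = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def subscribe(self, topic, handler):
        self.handlers.setdefault(topic, []).append(handler)

    def publish(self, topic, payload):
        # Returns immediately; the publisher never waits for subscribers.
        self.events.put((topic, payload))

    def _run(self):
        while True:
            topic, payload = self.events.get()
            try:
                for handler in self.handlers.get(topic, []):
                    handler(payload)
            except Exception:
                pass  # sketch only; a real consumer would log and retry
            finally:
                self.events.task_done()

    def wait_idle(self):
        """Block until every published event has been handled."""
        self.events.join()
```

The upload code only publishes a photo_uploaded event; the feed sender and any counters subscribe separately, so new follow-up work can be added without touching the upload path.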

7. Others

We use Snowflake, open-sourced by Twitter, to generate unique IDs.
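For readers unfamiliar with Snowflake-style IDs, the sketch below shows the general scheme: a 64-bit integer built from a millisecond timestamp, a worker ID and a per-millisecond sequence number. It is a simplified illustration, not Twitter's code: it uses a single 10-bit worker field (the original splits it into datacenter and worker parts), and IdGenerator and the injectable clock parameter are hypothetical.

```python
import threading
import time

EPOCH_MS = 1288834974657  # custom epoch used by Twitter's Snowflake
WORKER_BITS = 10
SEQUENCE_BITS = 12
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1


def _now_ms():
    return int(time.time() * 1000)


class IdGenerator:
    """Generates roughly time-ordered 64-bit IDs, unique per worker."""

    def __init__(self, worker_id, clock=_now_ms):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.clock = clock
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & SEQUENCE_MASK
                if self.sequence == 0:
                    # Sequence exhausted: wait for the next millisecond.
                    while now <= self.last_ms:
                        now = self.clock()
            else:
                self.sequence = 0
            self.last_ms = now
            timestamp = now - EPOCH_MS
            return ((timestamp << (WORKER_BITS + SEQUENCE_BITS))
                    | (self.worker_id << SEQUENCE_BITS)
                    | self.sequence)
```

Because the timestamp occupies the high bits, IDs from one worker sort by creation time, and no coordination between workers is needed as long as each has a distinct worker_id.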

Supervisord is used to monitor the Tornado processes to keep the system stable and reliable.

ApsaraDB for MongoDB is used for convenient, quick storage in the management backend and for plaza push content.

The web front end uses JavaScript frameworks such as jQuery, RequireJS, and doT.

JSON responses are not gzipped for now. Except in Chrome, browsers take longer to reach DOM ready when they have to decompress the data.

We use git for source code management, and Hudson + Maven for continuous integration and deployment.

We use Scribe, open-sourced by Facebook, to collect logs.

We use the open-source tool Ganglia and a self-developed support platform to monitor and analyze online services.

8. Advertisement

We are hiring. We welcome outstanding UI designers, web front-end engineers, Java engineers, and Python engineers to join us!

E-mail: ryanchen@sohu-inc.com

You are welcome to try our product at http://pp.sohu.com and to send us your valuable comments.

References

LVS load balancing tutorial http://os.51cto.com/art/201202/319979.htm

Snowflake https://github.com/twitter/snowflake

Pinterest database sharding architecture http://www.slideshare.net/ebola9020/pinterest-14178300

10 object-oriented design principles that Java programmers should understand http://www.iteye.com/news/24488

Jafka introduction http://www.blogjava.net/xylz/category/51811.html

Redis practices on Sina Weibo Open Platform

[Micro-architecture design] Weibo counter design (part 2) http://blog.cydu.net/2012/09/weibo-counter-service-design-2.html

The architecture journey of Huaban
