Sina Weibo platform architecture for hundreds of millions of users

Source: Internet
Author: User

Preface

In March 2014 Sina Weibo announced that its monthly active users (MAU) had reached 143 million, and users posted 808,298 microblogs in the first minute of the 2014 New Year. A user base and business volume of this size need the support of a powerful back-end system with high availability (HA), high concurrency and low latency.

The first generation of the Weibo platform used a LAMP architecture: the database used the MyISAM storage engine, the back end was written in PHP, and the cache was memcache.

As the application grew, the second-generation architecture was modularized and service-oriented, and the back end was rewritten from PHP to Java. This gradually formed a service-oriented architecture (SOA) that supported the business development of the Weibo platform for a long time.

On this basis, after a long period of refactoring, production operation, reflection and accumulated experience, the platform arrived at its third-generation architecture.

Let's look at the core business diagram of Weibo (below). Does it look complex? Yet this is already a business diagram simplified as far as it can go. The third-generation technology system exists to support Weibo's core business and to release new product features quickly, efficiently and reliably.

[Figure: core business diagram of Weibo]

Third-generation technology system

The third-generation technology system of the Weibo platform is modeled with orthogonal decomposition. In the horizontal direction it uses a typical three-layer model: interface layer, service layer and resource layer. In the vertical direction it is further subdivided into business architecture, technical architecture, monitoring platform and service governance platform. The platform's overall architecture diagram follows.

[Figure: overall architecture diagram of the Weibo platform]


As the diagram shows, orthogonal decomposition splits the whole picture into 3*4=12 regions. Each region is the intersection of one horizontal dimension and one vertical dimension, and defines the core function points of that region; region 5, for example, covers the technical framework of the service layer. The following sections describe the design principles of the horizontal and vertical directions, focusing on the technical components in regions 4, 5 and 6 and their role in the overall architecture.
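The region numbering can be illustrated with a small sketch. It assumes the regions are numbered down each vertical dimension in turn, which is consistent with the two facts given in the text (region 5 is the service layer's technical framework, and regions 4, 5 and 6 hold the technical components); the actual numbering in the diagram may differ, and the function names are hypothetical.

```python
from typing import Tuple

# Names come from the article; the numbering order is an assumption.
LAYERS = ["interface layer", "service layer", "resource layer"]
DIMENSIONS = [
    "business architecture",
    "technical architecture",
    "monitoring platform",
    "service governance platform",
]


def region_number(layer: str, dimension: str) -> int:
    """Return the 1-based region number, counting down one dimension at a time."""
    return DIMENSIONS.index(dimension) * len(LAYERS) + LAYERS.index(layer) + 1


def region_name(number: int) -> Tuple[str, str]:
    """Inverse of region_number: map a region number back to (layer, dimension)."""
    column, row = divmod(number - 1, len(LAYERS))
    return LAYERS[row], DIMENSIONS[column]


if __name__ == "__main__":
    for n in (4, 5, 6):
        print(n, region_name(n))
```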

Horizontal layering

Dividing a system along the horizontal dimension is fundamental to the design of medium and large Internet back-end systems, and it is reflected in every generation of the platform's technology system. It is introduced briefly here to set up the more detailed discussion of the vertical dimension that follows:

    1. The interface layer handles interaction with web pages and mobile clients and defines a unified interface specification. The platform's three core interface services are the content (Feed) service, the user relationship service and the communication service (one-to-one private messages, mass messages, group chat).

    2. The service layer turns the core business into modules and services. It contains two kinds of services. Atomic services are service modules that do not depend on any other service; the commonly used short-URL service and the ID generator service belong to this kind, and the diagram isolates them in swim lanes. Composite services combine several atomic services with business logic; the Feed service and the communication service, for example, depend on the short-URL, user and ID generator services in addition to their own business logic.

    3. The resource layer mainly stores the data model. It includes the general-purpose cache resources Redis and MC, persistent database storage in MySQL and HBase, and the distributed file system TFS and the Sina S3 service.

One property of horizontal layering is that dependencies run from top to bottom: services in an upper layer depend on the layer below, and lower-layer services never depend on an upper layer. This keeps the dependency relationships simple and direct.
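The two dependency rules described above (dependencies only point downward, and atomic services depend on no other service) can be checked mechanically. The following is a minimal sketch, not the platform's actual tooling; the component names and the `find_violations` helper are hypothetical.

```python
# Lower rank = higher layer. Dependencies may only point to the same or a lower layer.
LAYER_RANK = {"interface": 0, "service": 1, "resource": 2}

# name -> (layer, kind); kind only matters for the service layer.
COMPONENTS = {
    "feed_api": ("interface", None),
    "feed_service": ("service", "composite"),
    "short_url_service": ("service", "atomic"),
    "user_service": ("service", "atomic"),
    "mysql": ("resource", None),
}


def find_violations(components, dependencies):
    """Return (caller, callee, reason) for every dependency that breaks a rule."""
    violations = []
    for caller, callee in dependencies:
        caller_layer, caller_kind = components[caller]
        callee_layer, _ = components[callee]
        if LAYER_RANK[callee_layer] < LAYER_RANK[caller_layer]:
            violations.append((caller, callee, "depends on an upper layer"))
        elif caller_kind == "atomic" and callee_layer == "service":
            violations.append((caller, callee, "atomic service depends on another service"))
    return violations


GOOD = [
    ("feed_api", "feed_service"),
    ("feed_service", "short_url_service"),
    ("feed_service", "user_service"),
    ("user_service", "mysql"),
]
BAD = [("mysql", "user_service"), ("short_url_service", "user_service")]

if __name__ == "__main__":
    print(find_violations(COMPONENTS, GOOD))
    print(find_violations(COMPONENTS, BAD))
```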

Corresponding to the layered model, the servers in the Weibo system fall into three main types: front-end machines (providing API services), queue machines (processing upstream business logic, mainly data writes), and storage (MC, MySQL, MCQ, Redis, HBase, etc.).

Vertical extension: technical architecture

As the business architecture developed and was optimized, platform R&D produced many excellent middleware products to support the core business. These components grew out of business needs, and as they became richer they formed a complete platform technical framework that greatly improved product development efficiency and business stability.

Unlike the horizontal direction, where upper layers depend on lower layers, the vertical direction takes the technical framework as its foundation and pivot, driving the business architecture on one side and the monitoring platform and service governance platform on the other. The core components are described below.

Interface layer: Web V4 framework

The interface framework simplifies and standardizes the development of business interfaces. It packages common interface-layer functionality into the framework and adopts Spring's aspect-oriented programming (AOP) design philosophy. The framework is built as an extension of Jersey: interfaces (URL, parameters) are defined with annotations, and auth, frequency control, access logging and degradation are built in. It also supports the monitoring platform and service governance at the interface layer, and serializes beans to JSON/XML automatically.
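The idea of declaring an interface once and letting the framework add auth, frequency control and access logging around it can be sketched with a decorator. This is only an analogy in Python, not the Web V4 framework itself (which is Java, built on Jersey); the `endpoint` decorator, the `/statuses/show` path, the fixed-window limiter and the status codes are all assumptions made for illustration.

```python
import functools
import time

ACCESS_LOG = []  # stand-in for a real access-log sink


def _allow(windows, user, limit, window_seconds, now):
    """Fixed-window counter: at most `limit` calls per user per window."""
    start, count = windows.get(user, (now, 0))
    if now - start >= window_seconds:
        start, count = now, 0
    if count >= limit:
        windows[user] = (start, count)
        return False
    windows[user] = (start, count + 1)
    return True


def endpoint(path, require_auth=True, limit=None, window_seconds=60, clock=time.monotonic):
    """Wrap a handler with auth, frequency control and access logging."""
    def decorate(handler):
        windows = {}  # user -> (window_start, calls_in_window)

        @functools.wraps(handler)
        def wrapper(request):
            user = request.get("user")
            if require_auth and user is None:
                status, body = 401, {"error": "authentication required"}
            elif limit is not None and not _allow(windows, user, limit, window_seconds, clock()):
                status, body = 429, {"error": "rate limit exceeded"}
            else:
                status, body = 200, handler(request)
            ACCESS_LOG.append((path, user, status))
            return status, body

        return wrapper
    return decorate


# The handler contains business logic only; cross-cutting concerns live in the decorator.
@endpoint("/statuses/show", limit=2)
def show_status(request):
    return {"id": request["id"], "text": "hello"}
```

The handler stays free of cross-cutting code, which is the property the AOP-style design aims for.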

Service Layer Framework

The service layer mainly involves the RPC remote call framework and the message queue framework, the two frameworks the platform uses most widely in the service layer.

MCQ message queue

A message queue provides a first-in, first-out communication mechanism. Within the platform, the most common scenario is to write data persistence operations to the queue asynchronously; the queue handlers then read in batches and write to the DB. First, the asynchronous mechanism shortens the response time of the front-end machines; second, batching the DB operations indirectly improves DB performance. In another scenario, the platform uses message queues to provide real-time data to search, big data, and business operations.

The MCQ (SimpleQueue Service over Memcache) message queue service is used extensively within the Weibo platform. It is based on the Memcache protocol, persists message data to BerkeleyDB, and has only two commands, get and set. It is also easy to monitor (stats queue), has rich client libraries, has run in production for many years, and performs much better than generic MQ products.
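The write path described above (the front end enqueues and returns at once, a queue handler drains in batches and writes to the DB) can be sketched as follows. This is an in-memory stand-in that only mirrors the get/set shape of the interface; it does not speak the Memcache protocol or persist anything, and the class, function names and batch size are hypothetical.

```python
from collections import deque


class SimpleQueue:
    """In-memory stand-in for a queue that exposes only set (enqueue) and get (dequeue)."""

    def __init__(self):
        self._items = deque()

    def set(self, message):
        self._items.append(message)

    def get(self):
        # Returns None when the queue is empty, like a cache miss.
        return self._items.popleft() if self._items else None


def publish_status(queue, status):
    """Front-end side: enqueue and return immediately, without touching the DB."""
    queue.set(status)
    return "accepted"


def drain_in_batches(queue, write_batch, batch_size=100):
    """Queue-handler side: read messages in FIFO order and write them in batches."""
    batch, written = [], 0
    while True:
        message = queue.get()
        if message is None:
            break
        batch.append(message)
        if len(batch) == batch_size:
            write_batch(batch)
            written += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        write_batch(batch)
        written += len(batch)
    return written
```

A usage sketch: call `publish_status(queue, post)` from the request handler, and run `drain_in_batches(queue, db_bulk_insert)` in a separate worker, where `db_bulk_insert` is whatever bulk write the storage layer offers.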

Motan RPC framework

The Motan RPC service uses the Netty network framework as its underlying communication engine. The serialization protocol supports Hessian and Java serialization, and the communication protocol supports Motan, HTTP, TCP, MC and others. The Motan framework is used internally and has relatively mature solutions for system robustness and service governance. For robustness, high availability and load balancing policies are implemented on top of the Config configuration management service: flexible failover and failfast HA policies, and round robin, LRU, consistent hash and other load balancing policies. For service governance, it generates complete service call chain data, service request performance data such as response time and QPS, and standardized error and exception logs.
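The difference between the failover and failfast HA policies, combined with round-robin load balancing, can be sketched in a few lines. This is a conceptual illustration, not Motan's API; the class and function names, the retry count and the use of `ConnectionError` are assumptions.

```python
import itertools


class RoundRobin:
    """Cycle through the server list in order."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._counter = itertools.count()

    def select(self):
        return self._servers[next(self._counter) % len(self._servers)]


def call_failfast(balancer, invoke):
    """Failfast: try one server and let any error propagate to the caller."""
    return invoke(balancer.select())


def call_failover(balancer, invoke, retries=2):
    """Failover: on a connection error, retry on the next server the balancer picks."""
    last_error = None
    for _ in range(retries + 1):
        server = balancer.select()
        try:
            return invoke(server)
        except ConnectionError as error:
            last_error = error
    raise last_error


def flaky_invoke(server):
    """Hypothetical remote call: server 'a' is down, the others answer."""
    if server == "a":
        raise ConnectionError("a is unreachable")
    return "ok from " + server
```

Failover suits idempotent reads, where a retry on another node is harmless; failfast suits calls where a duplicate request would be worse than an error.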

Resource layer framework

The resource layer has many frameworks: a key-list DAL middleware that encapsulates MySQL and HBase, a custom counting component, and a proxy that supports distributed MC and Redis. The industry has already shared plenty of experience on these, so here I describe the platform's object library and SSD Cache component instead.

Object library

The object library makes it easy to serialize and deserialize object data in Weibo. It serializes objects in JVM memory into HBase and generates a unique ObjectID; the object is later accessed through that ObjectID. The object library supports any type of object and supports the PB, JSON and binary serialization protocols. Its largest application scenario is defining the videos, pictures and articles referenced in Weibo posts as objects: dozens of object types are defined in total, and a standard object metadata schema is abstracted from them. The object content is uploaded to the object storage system (Sina S3), and the object metadata keeps the Sina S3 location.
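The put-and-get-by-ObjectID flow can be sketched as below. This is a simplified stand-in, not the platform's object library: a dictionary replaces HBase, `pickle` stands in for a binary protocol, PB is left out because it needs a third-party library, and the ID format and all names are assumptions.

```python
import json
import pickle
import uuid

# format name -> (encode, decode); "binary" uses pickle purely as an example.
SERIALIZERS = {
    "json": (
        lambda obj: json.dumps(obj).encode("utf-8"),
        lambda raw: json.loads(raw.decode("utf-8")),
    ),
    "binary": (pickle.dumps, pickle.loads),
}


class ObjectLibrary:
    def __init__(self):
        self._rows = {}  # stand-in for an HBase table keyed by object id

    def put(self, object_type, value, fmt="json"):
        """Serialize the value, store it, and return a newly generated object id."""
        encode, _ = SERIALIZERS[fmt]
        object_id = f"{object_type}:{uuid.uuid4().hex}"  # hypothetical id format
        self._rows[object_id] = (fmt, encode(value))
        return object_id

    def get(self, object_id):
        """Look the object up by id and deserialize it with the stored format."""
        fmt, payload = self._rows[object_id]
        _, decode = SERIALIZERS[fmt]
        return decode(payload)
```

Storing the format next to the payload lets each object type choose its own serialization protocol while readers only need the object id.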

SSD Cache

As SSDs have become common, their superior IO performance has led them to replace more and more traditional SATA and SAS disks. There are three common application scenarios: 1) replacing the hard disks of a MySQL database; the community has no MySQL version optimized for SSDs, but even so a direct upgrade to SSDs can bring roughly an 8x IOPS improvement; 2) replacing the hard disks used by Redis to improve its performance; 3) using SSDs in the CDN to speed up static resource loading.

The Weibo platform applies SSDs to distributed caching, extending the traditional Redis/MC + MySQL model to a Redis/MC + SSD Cache + MySQL model in which SSD Cache serves as an L2 cache. This first reduces the problems of high cost and small capacity with MC/Redis, and it also relieves the database access pressure caused by requests that penetrate the cache and reach the DB.
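The read path of the Redis/MC + SSD Cache + MySQL model can be sketched as a two-tier lookup: a small, fast L1, a larger L2, and the database as the last resort. The sketch below uses in-memory LRU caches for both tiers; the eviction policy, capacities and names are assumptions, since the article does not describe how SSD Cache is implemented.

```python
from collections import OrderedDict


class LruCache:
    """Bounded cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        while len(self._data) > self.capacity:
            self._data.popitem(last=False)


class TieredReader:
    """L1 (small, e.g. MC/Redis) -> L2 (large, e.g. SSD cache) -> database."""

    def __init__(self, l1_capacity, l2_capacity, database):
        self.l1 = LruCache(l1_capacity)
        self.l2 = LruCache(l2_capacity)
        self.database = database  # any mapping with .get(); stands in for MySQL
        self.db_reads = 0

    def get(self, key):
        value = self.l1.get(key)
        if value is not None:
            return value
        value = self.l2.get(key)
        if value is None:
            self.db_reads += 1  # both cache tiers missed
            value = self.database.get(key)
            if value is None:
                return None
            self.l2.put(key, value)
        self.l1.put(key, value)  # promote to L1 for the next read
        return value
```

With a small L1, entries evicted from L1 are still served from L2, so the database only sees keys that neither tier holds.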

Vertical extension: monitoring and service governance

As services grew in scale and the business became more complex, even business architects could hardly describe the dependencies between services accurately, and managing and operating the services became more and more difficult. Against this background, and drawing on Google's Dapper and Twitter's Zipkin, the platform built its own large-scale distributed tracing system, Watchman.

Watchman: large-scale distributed tracing system

Like other medium and large Internet applications, the Weibo platform consists of many distributed components. Every HTTP request a user sends from a browser or mobile client passes through many business systems or system components after it reaches the application server, and leaves footprints along the way. Scattered as they are, these data are of limited help in troubleshooting or process optimization. For such a typical cross-process, cross-thread scenario, aggregating and analyzing these logs is particularly important. In addition, collecting performance data for each footprint and applying flow control or degradation to each subsystem according to policy are important factors in keeping the Weibo platform highly available. The goals, then, were to trace the complete call chain of each request, collect performance data for each service on the chain, track all errors and exceptions in the system, and feed the results back into flow control by computing performance data against performance indicators (SLA). The Watchman system was created to meet these goals.

A core principle of its design is low intrusion (non-invasiveness): as a non-business component, it should intrude on other business systems as little as possible, or not at all, and remain transparent to its users, which greatly reduces the burden on developers and the barrier to adoption. For this reason, all log collection points are placed in the technical framework middleware, including the interface framework, the RPC framework and the resource middleware.

Watchman is a framework built by the technical team and applied in all business scenarios. The operations team improves the monitoring platform on top of it, and the business and operations teams use it together to carry out distributed service governance, including service scaling out and in, service degradation, traffic switching, and service release with grayscale rollout.

Conclusion

Today the technical framework plays an increasingly important role in the platform, driving its technology upgrades, business development, and system operations. Space is limited, so not everything could be covered in this article; follow-up articles will continue to introduce the design principles and system architecture of the core middleware.



This article is from the "djh01" blog; please keep this source attribution: http://djh01.blog.51cto.com/10177066/1811653
