Architecture Technology in Practice, Viewed Through Double 11 Projects
Every year, "Double 11" is a major e-commerce event and a carnival day for consumers. This year's Double 11 is particularly significant: it has grown into a feast for e-commerce merchants and consumers around the world. For technical staff, Double 11 has undoubtedly become a major test, covering the overall architecture, basic middleware, operations tools, personnel, and more.
Successful preparation depends not only on the system and architecture optimizations made for the event itself, such as flow control, caching strategy, dependency control, and performance optimization, but also on a long period of technical accumulation and polishing. Below I briefly introduce Alipay's overall architecture to give everyone a preliminary understanding, and then take "Ant Huabei", which took part in this big promotion, as an example to outline how a new business prepares for the big promotion from scratch.
Architecture
Alipay's architectural design must take into account the characteristics of Internet financial business, such as higher business continuity, better scalability, and faster support for new business development. The current architecture is as follows:
The entire platform is divided into three tiers:
Operations platform (IaaS): mainly provides scalable basic resources, such as network, storage, database, virtualization, and IDC, to ensure the stability of the underlying system platform;
Technology platform (PaaS): mainly provides scalable, highly available distributed transaction processing and service computing capabilities, enables elastic resource allocation and access control, and provides a basic middleware runtime environment that shields the complexity of the underlying resources;
Business platform (SaaS): provides highly available payment services anytime, anywhere, and a secure, easy-to-use open platform for payment application development.
Architecture characteristics
Logical Data Center architecture
With Double 11 business volume doubling year after year, Alipay faces an ever greater test. System capacity keeps growing, and servers, networks, databases, and data centers have all been expanded, which brings some fairly large problems: the system keeps getting bigger and more complex, and the earlier point-by-point scaling approach no longer meets requirements. We need an integrated scalability solution that can scale out one unit at a time, support scaling to remote sites, provide an N+1 disaster recovery scheme, and provide an overall recovery system. Based on these requirements, we proposed the logical data center architecture. Its core idea is to carry horizontal data splitting up to the upper layers, as far as the access layer and the terminal: starting from the access layer, the system is divided into several units, and a unit has the following characteristics:
Each unit is closed to the outside, including interactions between systems and access to the various kinds of storage;
The real-time data of each unit is independent and not shared, but data with low timeliness requirements, such as member or configuration data, can be shared;
Communication between units is centrally controlled and uses asynchronous messages wherever possible; synchronous messages go through a unit proxy;
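The unit properties listed above can be illustrated with a minimal Python sketch. It assumes a hypothetical routing rule (user ID modulo the number of units) and an in-memory inbox standing in for the asynchronous message channel; the names `Unit`, `AccessLayer`, and `unit_for_user` are invented for illustration and are not Alipay's actual components.

```python
from collections import deque

NUM_UNITS = 4  # hypothetical number of units


def unit_for_user(user_id: int, num_units: int = NUM_UNITS) -> int:
    """Pick the unit that owns a user's real-time data (assumed rule: modulo)."""
    return user_id % num_units


class Unit:
    """One closed unit: private real-time data plus an inbox for async messages."""

    def __init__(self, unit_id: int):
        self.unit_id = unit_id
        self.balances = {}    # real-time data, never shared with other units
        self.inbox = deque()  # asynchronous messages sent by other units

    def apply(self, user_id: int, amount: int) -> int:
        """Apply a balance change to data owned by this unit."""
        self.balances[user_id] = self.balances.get(user_id, 0) + amount
        return self.balances[user_id]

    def drain_inbox(self) -> int:
        """Process pending cross-unit messages; return how many were handled."""
        handled = 0
        while self.inbox:
            user_id, amount = self.inbox.popleft()
            self.apply(user_id, amount)
            handled += 1
        return handled


class AccessLayer:
    """Routes each request to the owning unit, starting at the access layer."""

    def __init__(self, num_units: int = NUM_UNITS):
        self.units = [Unit(i) for i in range(num_units)]

    def route(self, user_id: int) -> Unit:
        return self.units[unit_for_user(user_id, len(self.units))]

    def transfer(self, payer: int, payee: int, amount: int) -> None:
        src, dst = self.route(payer), self.route(payee)
        src.apply(payer, -amount)
        if dst is src:
            dst.apply(payee, amount)            # stays inside one unit
        else:
            dst.inbox.append((payee, amount))   # cross-unit: async message
```

In this sketch a transfer between users in different units only enqueues a message for the destination unit, so the payee's balance changes once that unit calls `drain_inbox()`.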
The following is a conceptual diagram of Alipay's logical data center architecture:
This architecture addresses several key issues:
Remote deployment becomes possible because cross-unit interaction is minimized and made asynchronous. The horizontal scalability of the whole system is greatly improved, and it no longer depends on IDCs in the same city;
An N+1 disaster recovery strategy can be implemented, greatly reducing the cost of disaster recovery while ensuring that the disaster recovery facilities are truly usable;
The whole system has no single point of failure, which greatly improves overall availability. Multiple units deployed in the same city and at remote sites can serve as disaster recovery facilities for one another and be switched quickly through the operations control platform, giving a chance of reaching 100% continuous availability;
Under this architecture, the business-level traffic entry and exit points form unified, controllable, routable control points, which greatly improves the controllability of the whole system. On this basis, operations and control practices that used to be hard to implement, such as online stress testing, flow control, and gray release, can now be implemented very easily.
The main same-city framework of the new architecture was completed in 2013 and passed the test of Double 11, which validated the rollout of the whole architecture well.
In 2015, the remotely deployed "multi-active" architecture based on logical data centers was completed. "Remote multi-active" means that, building on the scalability of logical data centers, logical data centers are deployed in IDCs in different regions, and every one of them is "live", carrying real online business; when a failure occurs, traffic can be switched quickly between logical data centers.
This protects business continuity better than the traditional "three-center" architecture. In the "remote multi-active" architecture, the disaster recovery IDC that backs up a failed IDC is itself a "live" IDC that normally carries online business, so its stability and business correctness are continuously verified.
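As a rough illustration of switching between live IDCs, the sketch below moves a failed IDC's traffic share onto the remaining live ones in proportion to their current shares. The IDC names, the weights, and the proportional rule are assumptions made for illustration; the article does not describe the actual switching algorithm.

```python
def fail_over(weights: dict, failed: str) -> dict:
    """Return new traffic shares after `failed` is taken out of rotation.

    `weights` maps an IDC name to its share of online traffic.
    The proportional redistribution rule is an assumption of this sketch.
    """
    if failed not in weights:
        raise KeyError(f"unknown IDC: {failed}")
    survivors = {name: w for name, w in weights.items() if name != failed}
    total = sum(survivors.values())
    if total <= 0:
        raise RuntimeError("no live IDC left to take over traffic")
    new_weights = {name: w / total for name, w in survivors.items()}
    new_weights[failed] = 0.0  # the failed IDC receives no traffic
    return new_weights


# Hypothetical three-IDC deployment; every IDC is "live" before the failure.
shares = {"idc_a": 0.5, "idc_b": 0.25, "idc_c": 0.25}
after = fail_over(shares, "idc_a")
```

Because every surviving IDC was already serving production traffic, the switch only changes shares and does not activate a cold standby.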
The following is the architecture diagram of Alipay's "remote multi-active" deployment:
Besides better fault response, logical data centers also give us "blue-green release" or "gray release" verification capability. We divide a single logical data center (hereafter LDC) internally into two logical rooms, A and B, which are fully equivalent in function. In everyday operation, call requests are routed to A or B at random with equal probability. When blue-green mode is turned on, the upper routing component adjusts its routing policy to isolate calls between A and B: applications in group A call only each other and do not access group B.
The blue-green release process is roughly as follows:
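The routing behavior described here can be sketched in a few lines of Python. The class name `GroupRouter` and its interface are hypothetical; the sketch only shows the two policies from the text: equal-probability routing in normal operation, and keeping calls inside the caller's group when blue-green mode is on.

```python
import random

GROUPS = ("A", "B")


class GroupRouter:
    """Hypothetical routing component for one LDC split into groups A and B."""

    def __init__(self, blue_green_mode: bool = False, rng=None):
        self.blue_green_mode = blue_green_mode
        self.rng = rng or random.Random()

    def pick_group(self, caller_group=None) -> str:
        """Choose the group that should serve a call.

        caller_group is the group of the calling application, or None for a
        request entering from outside (entry traffic is not modeled here and
        simply falls back to the equal-probability rule).
        """
        if self.blue_green_mode and caller_group in GROUPS:
            return caller_group          # isolate: A calls A, B calls B
        return self.rng.choice(GROUPS)   # everyday mode: equal probability


router = GroupRouter(blue_green_mode=True)
target = router.pick_group(caller_group="A")  # stays in group A
```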
Step 1. Before the release, cut "blue" traffic to 0% and release all applications in "blue" in 2 batches, in no particular order.
Step 2. Route 1% of traffic to "blue" and observe; if there are no anomalies, gradually raise the share to 100%.
Step 3. Cut "green" traffic to 0% and release all applications in "green" in 2 batches, in no particular order.
Step 4. Restore normal day-to-day operation, with the blue and green units each carrying 50% of online business traffic.
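The four steps above can be written down as a traffic schedule. This is a minimal sketch: the intermediate ramp percentages (10 and 50) are assumptions, since the text only gives the 1% starting point and the 100% end point.

```python
def blue_green_schedule(ramp=(1, 10, 50, 100)):
    """Return a list of (action, blue_pct, green_pct) for one release cycle.

    `ramp` is the sequence of blue traffic percentages used in step 2.
    Only the first value (1) and the last (100) come from the text; the
    values in between are assumptions of this sketch.
    """
    if not ramp or ramp[0] != 1 or ramp[-1] != 100:
        raise ValueError("ramp must start at 1 and end at 100")
    if list(ramp) != sorted(set(ramp)):
        raise ValueError("ramp must be strictly increasing")

    steps = [("release blue in 2 batches", 0, 100)]       # step 1
    for pct in ramp:                                       # step 2
        steps.append(("observe blue", pct, 100 - pct))
    steps.append(("release green in 2 batches", 100, 0))  # step 3
    steps.append(("restore normal operation", 50, 50))    # step 4
    return steps


for action, blue, green in blue_green_schedule():
    print(f"{action:28s} blue={blue:3d}%  green={green:3d}%")
```

At the end of step 2 blue carries all traffic, so green is already at 0% when its own release begins in step 3.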
Distributed Data architecture
During the peak of Double 11 in 2015, Alipay handled a payment peak of 85,900 transactions per second, already the largest of any payment system in the world. Alipay is already one of the world's largest OLTP processors. Its sensitivity to transactions makes Alipay's data architecture different from that of other Internet companies, yet it also inherits the enormous user base typical of Internet companies. Most notably, Alipay is more sensitive to transaction costs than traditional financial firms, so the history of Alipay's data architecture is one of evolution toward a low-cost, linearly scalable, distributed data architecture.
Alipay's data architecture has now been upgraded from centralized minicomputers and high-end storage to a distributed PC-server solution, and the overall data architecture is as free of vendor dependence and as standardized as possible.
Alipay's distributed data architecture scales along three main dimensions:
Vertical split by business type
Horizontal split by customer request (the usual data sharding strategy)
Read/write separation and data replication for data that is read far more often than it is written
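The three dimensions can be combined into a single lookup, sketched below in Python. The business names, shard counts, modulo rule, and node naming scheme are all hypothetical and chosen only to make the three splits visible.

```python
# Hypothetical shard counts per business type (vertical split).
SHARDS_PER_BUSINESS = {"trade": 4, "member": 2}


def locate(business: str, customer_id: int, is_write: bool) -> str:
    """Return the name of the database node that should serve a request."""
    # 1. Vertical split: each business type has its own cluster.
    shard_count = SHARDS_PER_BUSINESS[business]
    # 2. Horizontal split: shard by customer (assumed rule: modulo).
    shard = customer_id % shard_count
    # 3. Read/write separation: writes go to the primary, reads to a replica.
    role = "primary" if is_write else "replica"
    return f"{business}-{shard:02d}-{role}"


print(locate("trade", 10, is_write=True))    # trade-02-primary
print(locate("member", 7, is_write=False))   # member-01-replica
```

A modulo rule is the simplest choice for a sketch; the article does not say which sharding function Alipay uses.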
The following figure shows the scalable design of Alipay's internal transaction data:
Trading system data is mainly divided into three large database clusters:
Master transaction database cluster: the creation and state changes of every transaction are completed here first, and the resulting changes are replicated through a reliable data replication center to the other two database clusters, the consumption record database cluster and the merchant query database cluster. The data in this cluster is split horizontally. To ensure both scalability and high reliability, every node has a corresponding standby node and failover node, and when a failure occurs the cluster can switch to the failover node within seconds.
Consumption record database cluster: serves consumers' needs and provides them with a better user experience;