This article is listed in the index post for the "Large Web Site Technology Architecture" reading notes series; see that index for more content.
1. Extensibility and scalability: easy to confuse
The previous notes covered scalable architecture. In practice, though, many people, architects included, confuse scalability with extensibility and say "extensible" when they mean "scalable". So, following the author, let us pin down the two concepts here so that we do not mix them up later.
(1) Extensibility
Extensibility is the ability to keep extending or enhancing a system's functions with minimal impact on the existing system. It brings to mind one of the main principles of object-oriented design, the open/closed principle: open for extension, closed for modification. In other words, adding a new feature should not require modifying the structure or code of the existing system.
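As a small illustration of the open/closed idea, here is a minimal Python sketch. The registry `EXPORTERS`, the decorator `exporter`, and the two format handlers are hypothetical names invented for this example; they are not from the book.

```python
from typing import Callable, Dict

# Hypothetical registry: export format name -> handler function.
EXPORTERS: Dict[str, Callable[[dict], str]] = {}


def exporter(fmt: str):
    """Register the decorated function as the handler for one format."""
    def register(func: Callable[[dict], str]) -> Callable[[dict], str]:
        EXPORTERS[fmt] = func
        return func
    return register


def export(record: dict, fmt: str) -> str:
    """Existing entry point. It is never edited when a format is added."""
    return EXPORTERS[fmt](record)


@exporter("csv")
def to_csv(record: dict) -> str:
    return ",".join(str(value) for value in record.values())


# A feature added later: only new code, no change to export() or to_csv().
@exporter("kv")
def to_kv(record: dict) -> str:
    return ";".join(f"{key}={value}" for key, value in record.items())


record = {"id": 1, "name": "li"}
csv_line = export(record, "csv")
kv_line = export(record, "kv")
```

The new `kv` format is "open for extension" because it only registers itself; the existing functions stay "closed for modification".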
(2) Scalability
Scalability is the ability of a system to increase (or decrease) its processing capacity by increasing (or decreasing) its own resources. In website architecture this usually means using clusters and adding servers to raise the system's overall transaction throughput.
The core idea in designing an extensible website architecture is modularization, and on that basis reducing the coupling between modules and improving their reusability. In a large website the modules are also deployed in a distributed way: independent modules run on independent servers (or clusters), which physically separates them, reduces coupling further, and so improves reusability.
2. Using distributed message queues to reduce system coupling
As noted above, the goal is to remove coupling between modules. If modules never call each other directly, then adding or modifying a module has minimal impact on the others, and the system is clearly more extensible. Is there an architecture built on this idea? This leads us to event-driven architecture.
2.1 Event-Driven architecture
Event-driven architecture (EDA) is defined as follows: modules stay loosely coupled by passing event messages to one another, and they cooperate by communicating through those messages. The classic example of EDA is the producer-consumer model familiar from operating systems. In large website architectures there are many implementations, but the most common is the distributed message queue.
As the figure shows, a message queue works on the publish-subscribe pattern: the message sender publishes a message, and one or more message receivers subscribe to it. The sender is the message source; after finishing its own processing it sends the message to the distributed message queue, and the receiver continues processing once it has obtained the message from the queue. Clearly there is no direct coupling between sender and receiver: for the sender, handing the message to the queue is the end of the operation, and the receiver only needs to fetch the message from the queue without knowing where it came from. So when a new business is interested in a kind of message, it simply subscribes to it, with no impact on the existing systems and businesses. This is how the website's business becomes extensible.
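The decoupling described above can be sketched in a few lines of Python. This is only an in-process stand-in for a real distributed message queue; the class `MessageBroker`, the topic name `order.created`, and the message fields are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class MessageBroker:
    """In-process stand-in for a distributed message queue (illustrative)."""

    def __init__(self) -> None:
        # topic name -> handlers subscribed to that topic
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver to every subscriber of the topic; return how many got it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = MessageBroker()
shipped: List[int] = []
emailed: List[int] = []

# Existing business: the shipping module subscribes to new orders.
broker.subscribe("order.created", lambda m: shipped.append(m["order_id"]))
broker.publish("order.created", {"order_id": 1})

# New business added later: it only subscribes. The publisher is untouched.
broker.subscribe("order.created", lambda m: emailed.append(m["order_id"]))
broker.publish("order.created", {"order_id": 2})
```

The publisher never refers to the shipping or e-mail code; it only knows the topic name, which is the whole point of the pattern.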
2.2 Distributed message queues
A queue is a first-in, first-out (FIFO) data structure. A distributed message queue can be seen as this data structure deployed on a separate server: applications use it through a remote access interface to put and get messages, and in this way implement distributed asynchronous calls.
As the figure shows, there are three steps:
① The message producer application pushes a message to the message queue server through the remote access interface; the server writes the message to its local in-memory queue and immediately returns a success response to the producer.
② The message queue server looks up the consumer applications that subscribe to the message in its subscription list, and sends the messages in the queue to them over the remote communication interface in FIFO order.
③ The consumer application receives the pushed message, carries out its own processing, and the flow ends.
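The three steps above can be sketched with Python's standard `queue` and `threading` modules. This is a single-process toy under stated assumptions, not a real queue server: the class `QueueServer` and its method names are hypothetical, and a background thread stands in for the server pushing messages over the network.

```python
import queue
import threading
from typing import Callable, List


class QueueServer:
    """Toy message queue server: in-memory FIFO plus a subscription list."""

    def __init__(self) -> None:
        self._queue: queue.Queue = queue.Queue()  # FIFO by construction
        self._subscribers: List[Callable[[str], None]] = []
        worker = threading.Thread(target=self._dispatch, daemon=True)
        worker.start()

    def subscribe(self, consumer: Callable[[str], None]) -> None:
        self._subscribers.append(consumer)

    def send(self, message: str) -> bool:
        """Step 1: store the message and acknowledge immediately."""
        self._queue.put(message)
        return True

    def _dispatch(self) -> None:
        """Step 2: take messages in FIFO order and push them to subscribers."""
        while True:
            message = self._queue.get()
            for consumer in list(self._subscribers):
                consumer(message)  # Step 3 runs inside the consumer
            self._queue.task_done()

    def wait_until_delivered(self) -> None:
        """Block until every queued message has been dispatched."""
        self._queue.join()


server = QueueServer()
processed: List[str] = []
server.subscribe(processed.append)

# The producer gets its acknowledgements without waiting for the consumer.
acks = [server.send(f"msg-{i}") for i in range(3)]
server.wait_until_delivered()
```

Note that `send` returns as soon as the message is queued, which is what makes the call asynchronous from the producer's point of view.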
PS: Can messages be lost when a message queue server goes down? In real operations this does happen. So how do we avoid it? The author describes one approach: messages that have been sent successfully to the message queue are also kept on the message producer server, and are deleted only after the message consumer server has actually processed them. When a message queue server goes down, the producer server picks another server in the distributed message queue cluster and publishes the messages through it.
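A minimal sketch of that scheme, assuming a producer that keeps a local copy of each message until it is confirmed as processed and that falls back to the next server in the cluster. All class and method names here (`QueueNode`, `Producer`, `confirm_processed`, `republish_pending`) are hypothetical, and the "crash" is simulated with a flag.

```python
from typing import Dict, List, Tuple


class QueueServerDown(Exception):
    """Raised when a queue server cannot accept messages."""


class QueueNode:
    """Hypothetical queue server that can be switched off to mimic a crash."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True
        self.messages: List[Tuple[int, str]] = []

    def accept(self, msg_id: int, body: str) -> None:
        if not self.alive:
            raise QueueServerDown(self.name)
        self.messages.append((msg_id, body))


class Producer:
    def __init__(self, cluster: List[QueueNode]) -> None:
        self.cluster = cluster
        self.pending: Dict[int, str] = {}  # local copies kept until consumed
        self._next_id = 0

    def publish(self, body: str) -> str:
        """Keep a local copy, then send; return the name of the node used."""
        self._next_id += 1
        self.pending[self._next_id] = body
        return self._send(self._next_id)

    def _send(self, msg_id: int) -> str:
        for node in self.cluster:  # try each server in the cluster in turn
            try:
                node.accept(msg_id, self.pending[msg_id])
                return node.name
            except QueueServerDown:
                continue
        raise QueueServerDown("no queue server reachable")

    def confirm_processed(self, msg_id: int) -> None:
        """Called once the consumer has really handled the message."""
        self.pending.pop(msg_id, None)

    def republish_pending(self) -> List[str]:
        """After a crash, resend every message that is still unconfirmed."""
        return [self._send(msg_id) for msg_id in sorted(self.pending)]


node_a, node_b = QueueNode("mq-a"), QueueNode("mq-b")
producer = Producer([node_a, node_b])

first = producer.publish("order 1")   # goes to mq-a
producer.confirm_processed(1)         # consumer finished, local copy dropped
second = producer.publish("order 2")  # goes to mq-a, not yet confirmed
node_a.alive = False                  # mq-a goes down, taking message 2 along
resent = producer.republish_pending() # message 2 is resent through mq-b
```

One consequence of resending is that a consumer may occasionally see the same message twice, so in a design like this the consumer side should tolerate duplicates.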
In addition, a distributed message queue can be built on a NoSQL product. Redis, for example, provides a queue-like data type that makes this easy. If you are interested, see my other blog post: "Using Redis as a Message Queuing service scenario case".
3. Using distributed services to build a reusable business platform
Distributed message queues break up system coupling through message objects: different subsystems process the same message. Distributed services break up system coupling through interfaces: different subsystems invoke a service through the same interface description.
3.1 The problems of a behemoth application
As a website grows from small to large, the whole site typically evolves by expanding a single system. As the site's functions grow more complex, the application gradually becomes a behemoth, as the figure shows: a large number of applications and service components are bundled into one application. Such a behemoth causes great trouble for the site's development (slow, difficult builds and painful code branch management), maintenance (adding new business is hard), and deployment.
3.2 Split, split, and split again
The solution is once more the splitting we have mentioned many times: deploy modules independently to reduce system coupling. Splitting comes in two forms, vertical and horizontal. Let us look at both again:
(1) Vertical split: a large application is split into several smaller applications. If a new business is fairly independent, it is designed and deployed directly as a separate web application.
(2) Horizontal split: reusable business logic is split out and deployed independently as distributed services. A new business only needs to call these services and does not depend on the code of any specific module. If the business logic inside a module changes, the calling programs and other modules are unaffected as long as the interface stays the same.
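The "interface stays the same" point can be sketched in Python with `typing.Protocol`. Everything here is hypothetical and in-process: `UserService`, the two implementations, and `order_summary` are invented for the example, and in a real deployment the implementation would sit behind a remote call rather than a local object.

```python
from typing import Dict, Protocol


class UserService(Protocol):
    """Hypothetical service contract shared by every calling business."""

    def get_display_name(self, user_id: int) -> str: ...


class InMemoryUserService:
    """First implementation of the service."""

    def __init__(self, names: Dict[int, str]) -> None:
        self._names = dict(names)

    def get_display_name(self, user_id: int) -> str:
        return self._names.get(user_id, "unknown")


class MaskedUserService:
    """Changed internal logic (names are masked), same interface."""

    def __init__(self, names: Dict[int, str]) -> None:
        self._names = dict(names)

    def get_display_name(self, user_id: int) -> str:
        name = self._names.get(user_id, "unknown")
        return name[0] + "*" * (len(name) - 1)


def order_summary(users: UserService, user_id: int, item: str) -> str:
    """A new business that depends only on the interface."""
    return f"{users.get_display_name(user_id)} bought {item}"


names = {1: "Li"}
before = order_summary(InMemoryUserService(names), 1, "book")
after = order_summary(MaskedUserService(names), 1, "book")
```

`order_summary` is unchanged across the two runs; only the service behind the interface was swapped.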
4. Extensible data structures
To guarantee the correctness of relational operations (through SQL statements), a traditional relational database requires the table's schema (field names, data types, and so on) to be fixed when the table is designed, and the design must follow the normal forms (1NF, 2NF, 3NF, etc.). One problem with these rules is that a rigid data structure struggles to keep up with changing requirements. Some system designers cope by reserving a few redundant fields in advance (I saw this many times during my year as an intern; it does solve the problem, but as a design it is really ugly), and this is obviously bad database design.
So is there a way to design extensible data structures? Can fields be added without modifying the table structure? The answer is yes: the ColumnFamily (column family) design used by many NoSQL databases today is one solution. ColumnFamily was first used in Google's Bigtable; it is a column-family-oriented sparse matrix storage format. If that is still unclear, the following table should help:
This is a table of basic student information. Different students have different contact details and take different elective courses, and more contact methods and courses will be added to the table in the future. With a traditional database design, no number of pre-reserved redundant fields would ever be enough, and the design would always be struggling to keep up. With a NoSQL database that uses the ColumnFamily structure, you only specify the names of the column families when creating the table, not the fields (columns); the columns are specified when the data is written. In this way a single table can contain millions of fields, so the application's data structure can be extended freely.
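To make the idea concrete, here is a toy in-memory model of a column-family table in Python. It only illustrates the data layout described above; `ColumnFamilyTable` and its methods are hypothetical and do not correspond to the API of Bigtable or any real NoSQL product, and the student data is made up.

```python
from collections import defaultdict
from typing import Any, Dict


class ColumnFamilyTable:
    """Toy sparse table: column families are fixed, columns are not."""

    def __init__(self, *families: str) -> None:
        self._families = set(families)
        # row key -> column family -> column name -> value
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key: str, family: str, column: str, value: Any) -> None:
        """Write one cell; the column is created on first write."""
        if family not in self._families:
            raise KeyError(f"unknown column family: {family}")
        self._rows[row_key][family][column] = value

    def get_row(self, row_key: str) -> Dict[str, Dict[str, Any]]:
        """Return only the cells that were actually written for this row."""
        families = self._rows.get(row_key, {})
        return {family: dict(cols) for family, cols in families.items()}


# Only the column families are declared when the table is created.
students = ColumnFamilyTable("contact", "courses")

students.put("s001", "contact", "phone", "555-0100")
students.put("s001", "courses", "math", 90)

# Another student uses different columns; no schema change is needed.
students.put("s002", "contact", "email", "s002@example.com")
students.put("s002", "contact", "wechat", "s002_wx")

row_1 = students.get_row("s001")
row_2 = students.get_row("s002")
```

Each row stores only its own columns, so the table is sparse: `s002` has no `courses` entry at all rather than a set of empty fields.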
5. Using an open platform to build a website ecosystem
The value of a website lies in the value it creates for its users. To serve users better, a large website develops more value-added services, and it also wraps some of its internal services as callable interfaces and opens them to external third-party developers. A platform of such open interfaces is called an open platform. Third-party developers use these interfaces to build applications (apps) or websites that bring value to even more users. The website, its users, and third-party developers thus depend on one another and form a website ecosystem, which gives users more value and improves the competitiveness and profitability of both the website and the third-party developers.
BAT (Baidu, Alibaba, Tencent) and the other domestic internet giants are all building their own open platforms, trying to use their huge user bases to attract third-party developers, assemble an ever larger "carrier battle group", and secure an unbeatable position in the market.
6. Study summary
A website's survival instinct is to keep launching new products: whoever can ship more new products faster and better will live more comfortably. Marx's labor theory of value holds up in the IT industry too: a product's intrinsic value lies in the labor time it embodies, and that means not the time an individual puts in but the labor time generally required in the industry. The capitalist pays only for the industry's general labor time; if you are less efficient than that, sorry, please make up the difference with your own overtime.
It reminds me of my internship year at cdeic, when I worked a lot of overtime. Looking back, if our system had had a more extensible architecture and we could have shipped new products faster, maybe we would have been able to leave work on time. Those with a girlfriend could take her to dinner and a movie; those without could go and find one (am I still single because I worked too much overtime and had no time to look?). Or we could read books, listen to music, take a walk, sing to the moon and muse on how short life is.
Mind Map of this chapter
Zhou Xurong
Source: http://www.cnblogs.com/edisonchou/
The copyright of this article belongs to the author and cnblogs (Blog Park). Reprints are welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original must be given in a prominent place on the article page.