Taobao Platform Architect Talks About Technical Architecture for Massive-Scale Internet Services

Lin Hao, known online as Bluedavy, is director of the China OSGi User Group and an architect in Taobao's Platform Architecture Division. His personal research focuses on Java modularity, building dynamic systems, and building high-performance, large-scale distributed Java systems. He has written two OpenDocs, "OSGi in Practice" and "OSGi Advanced", which played a major role in spreading OSGi in China.
Wang: On data clustering: once data grows to a certain order of magnitude, it has to be deployed in a distributed fashion, along with backup, disaster recovery, capacity splitting, and related work. At what order of magnitude does distributed deployment become necessary, how should the data be distributed sensibly, and what needs to be considered?
Lin Hao: Generally there is no fixed order of magnitude. The threshold is usually determined by the available hardware resources and the acceptable performance (for example, a query must complete within 3 ms). Once a performance bottleneck is reached, a data splitting or backup strategy is typically needed. The most important consideration in this process is the impact on applications, so a strong, transparent data layer is often needed to shield applications from the effects of data splitting, backup, and migration. The other goal is to complete the work with as little downtime as possible. This is of course difficult, because multiple sets of data structures have to coexist, which brings redundancy and synchronization problems.
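To make the idea of a transparent data layer concrete, here is a minimal sketch in Python. It is not Taobao's implementation: the class name `ShardRouter`, the shard names `db0` to `db2`, and the use of plain dictionaries in place of real databases are all assumptions made for illustration. The point is only that the application calls `put` and `get` with a key and never learns which shard holds the data.

```python
import hashlib


class ShardRouter:
    """Hypothetical data-layer facade: callers pass a key, never a shard name."""

    def __init__(self, shards):
        # shards maps a shard name to a dict that stands in for a real database.
        self._shards = shards
        self._names = sorted(shards)

    def _shard_for(self, key):
        # A stable hash, so the same key always maps to the same shard.
        digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
        return self._names[int(digest, 16) % len(self._names)]

    def put(self, key, value):
        self._shards[self._shard_for(key)][key] = value

    def get(self, key):
        return self._shards[self._shard_for(key)].get(key)


# The application only ever talks to the router.
shards = {"db0": {}, "db1": {}, "db2": {}}
router = ShardRouter(shards)
for user_id in range(100):
    router.put(user_id, {"id": user_id})
```

Simple modulo hashing like this forces most keys to move when a shard is added, which is one reason the migration and coexistence problems mentioned above are hard in practice.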
Wang: On data backup: when backing up large volumes of data, how can it be done technically without affecting normal service? How should the strategy, method, and time windows for cold backup and hot standby be set sensibly? And when data is corrupted or the primary server's hardware fails, how can the fault be detected and requests redirected to the backup server in the shortest possible time?
Lin Hao: For large-volume data backup, in most cases it is better to implement the backup through asynchronous message notification, or to rely on a feature of a high-end database (such as Oracle's standby). For cold and hot backups, the basic requirement is not to affect normal business functions, so the only suitable time window is a period of low system traffic. The method depends on the amount of data and the speed of the backup; most sites run hot backups relatively frequently and cold backups less often. When data is corrupted or the primary server's hardware fails, reacting quickly depends on a strong, timely monitoring system that raises an alarm as soon as the primary server has a problem. Ideally there is a mechanism that automatically promotes the standby database to primary and notifies all applications to switch their connections to the new primary. Without such automation, the switch still has to be carried out by hand.
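The automatic switch described in this answer can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not a description of any real product: `FailoverMonitor`, the `is_healthy` probe, the node names `db-a` and `db-b`, and the three-failure threshold are all hypothetical. The monitor promotes the standby after several consecutive failed health checks and notifies every subscribed application.

```python
class FailoverMonitor:
    """Hypothetical monitor: promotes the standby after repeated failed checks."""

    def __init__(self, primary, standby, is_healthy, max_failures=3):
        self.primary = primary
        self.standby = standby
        self._is_healthy = is_healthy      # callable(node) -> bool, assumed probe
        self._max_failures = max_failures  # consecutive failures before switching
        self._failures = 0
        self._listeners = []

    def subscribe(self, callback):
        # Applications register here to learn about a new primary.
        self._listeners.append(callback)

    def check_once(self):
        """Run one health check; return True if a failover happened."""
        if self._is_healthy(self.primary):
            self._failures = 0
            return False
        self._failures += 1
        if self._failures < self._max_failures or self.standby is None:
            return False
        # Promote the standby and tell every application to reconnect.
        self.primary, self.standby = self.standby, None
        self._failures = 0
        for callback in self._listeners:
            callback(self.primary)
        return True


down = {"db-a"}  # simulate a primary that no longer answers
monitor = FailoverMonitor("db-a", "db-b", lambda node: node not in down)
notified = []
monitor.subscribe(notified.append)
results = [monitor.check_once() for _ in range(3)]
```

Requiring several consecutive failures before switching is a design choice in this sketch: it trades a slightly slower reaction for fewer switches caused by a single lost probe.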
Wang: On open platform design: what should be considered when designing the calling protocol for an open platform API? For request-style calls, do you favor the "?a=a&b=b" query-string style (convenient for callers, but limited for binary transfers such as uploading pictures), or a text-based format such as WSDL or XML? What token mechanism is used for user authentication? And are there any considerations for enforcing QoS on the integrating parties?
Lin Hao: Facebook is essentially the technology leader in open platforms at present, so most protocols are HTTP and the interfaces are designed to be RESTful. The token mechanism for user authentication is usually based on matching a public key, and the token must be issued by the company that runs the open platform. An open platform will certainly impose QoS limits on access, and these often tie into the platform's pricing. In implementation, most platforms calculate charges in real time on top of a cache. The telecommunications industry is probably the strongest in this area.
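As a rough illustration of cache-based QoS limiting, the sketch below counts calls per application key in fixed time windows. Everything in it is an assumption for illustration: the class name `QuotaLimiter`, the application key `app-1`, the limit of two calls per 60 seconds, and the in-memory dictionary that stands in for a shared cache. A real platform would use a cache with expiring entries and would likely feed the same counters into billing.

```python
import time


class QuotaLimiter:
    """Hypothetical fixed-window call quota, counted in an in-memory 'cache'."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self._limit = limit
        self._window = window_seconds
        self._clock = clock      # injectable so the example can control time
        self._counters = {}      # (app_key, window index) -> calls so far

    def _slot(self, app_key):
        return (app_key, int(self._clock() // self._window))

    def allow(self, app_key):
        """Count one call and report whether it is within the quota."""
        slot = self._slot(app_key)
        count = self._counters.get(slot, 0)
        if count >= self._limit:
            return False
        self._counters[slot] = count + 1
        return True

    def used(self, app_key):
        # Calls counted in the current window, e.g. as input to charging.
        return self._counters.get(self._slot(app_key), 0)


now = [0.0]  # fake clock so the example is deterministic
limiter = QuotaLimiter(limit=2, window_seconds=60, clock=lambda: now[0])
first_window = [limiter.allow("app-1") for _ in range(3)]
now[0] = 61.0  # move into the next window
after_reset = limiter.allow("app-1")
```

This sketch never evicts old windows; a cache entry with a time-to-live equal to the window length would handle that.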
Wang: Deploying application modules across IDCs becomes inevitable once a business grows to a certain stage, and cross-IDC line resources are relatively limited. How can an architect sensibly plan and use intra-city and inter-city lines for data transmission, and what disaster recovery measures should be in place for incidents?
Lin Hao: Cross-IDC deployment is genuinely difficult technically. Verifying the result of the deployment is the most critical part, followed by the bandwidth and time costs of deployment. The usual way to verify the result is to run business test scripts. Multicast is often needed to keep the bandwidth cost down, and an automated deployment system is often needed to keep the time cost down.
Wang: Web 2.0 sites store huge numbers of small files, such as user avatars and photo album thumbnails. These files are small (under 100 KB) and numerous (millions), so storing, reading, and backing them up is a problem. What specific solutions would you suggest?
Lin Hao: At present, Internet companies such as Google and Youku have their own solutions for storing small or large files, and they do not rely on high-end storage devices. One reason is cost and the other is scalability, so for storing, reading, and backing up these files most of them use a GFS-like scheme or directly use the HDFS solution provided by Hadoop.
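The interview does not say how such systems handle small files internally. One idea commonly used when many small files must live in a block-oriented store is to pack them into one large volume and keep an index of offsets, so the store manages a few big objects instead of millions of tiny ones. The Python sketch below illustrates only that idea; `PackedVolume` is a hypothetical name, and an in-memory buffer stands in for a large file on a distributed file system.

```python
import io


class PackedVolume:
    """Hypothetical append-only volume that packs many small files together."""

    def __init__(self):
        self._data = io.BytesIO()  # stands in for one large file on GFS/HDFS
        self._index = {}           # file name -> (offset, length)

    def put(self, name, payload):
        # Append at the end of the volume and remember where the bytes went.
        offset = self._data.seek(0, io.SEEK_END)
        self._data.write(payload)
        self._index[name] = (offset, len(payload))

    def get(self, name):
        # One index lookup plus one ranged read, regardless of file count.
        offset, length = self._index[name]
        self._data.seek(offset)
        return self._data.read(length)

    def location(self, name):
        return self._index[name]


volume = PackedVolume()
volume.put("avatar-1.jpg", b"abc")
volume.put("thumb-7.jpg", b"hello")
```

Backing up the volume then means copying one large object plus its index, rather than walking millions of individual files.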
Wang: Deployment is a key step for Internet products. Many Internet companies still release product versions manually, but this is complex, inefficient, and very error-prone. When several closely related products are released at the same time, an error in one release affects the others. As an architect, how do you solve such problems in your day-to-day work? Is your team considering automated, dynamic deployment, and what is the specific approach?
Lin Hao: On deployment, it seems that only a few foreign Internet companies do it well, and the most typical is eBay. eBay built an automated deployment system many years ago. With it, eBay can run a dependency analysis across the products in a release to determine the release order, and then release, verify, and roll back automatically. I believe this kind of system is the goal Chinese Internet companies are currently pursuing.
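The dependency analysis and rollback described here can be sketched with a topological sort. This is not eBay's system, whose internals the interview does not describe; it is a minimal Python illustration using the standard library's `graphlib` (Python 3.9 or later). The product names and the `deploy`, `verify`, and `rollback` callbacks are hypothetical.

```python
from graphlib import TopologicalSorter


def release_order(dependencies):
    """Return products ordered so that every dependency is released first.

    dependencies maps a product to the set of products it depends on.
    graphlib raises CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


def release_all(dependencies, deploy, verify, rollback):
    """Deploy in dependency order; on a failed check, undo in reverse order."""
    done = []
    for product in release_order(dependencies):
        deploy(product)
        done.append(product)
        if not verify(product):
            for released in reversed(done):
                rollback(released)
            return False
    return True


# Hypothetical products: "web" needs "api", which needs "auth" and "db-schema".
deps = {
    "web": {"api"},
    "api": {"auth", "db-schema"},
    "auth": {"db-schema"},
    "db-schema": set(),
}
order = release_order(deps)

log = []
ok = release_all(
    deps,
    deploy=lambda p: log.append(("deploy", p)),
    verify=lambda p: p != "api",  # pretend the check for "api" fails
    rollback=lambda p: log.append(("rollback", p)),
)
```

Rolling back in reverse release order is an assumption of this sketch; it keeps a product from being left running against a dependency that has already been reverted.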
Wang: As an Internet technology architect, can you briefly summarize the concepts, principles, and methods of technical architecture for massive-scale Internet services?
Lin Hao: I think eBay's five-point summary is pretty much complete:
(1) Split: split both the database and the application. This of course requires strong technical support, and the goal is usually to make it easy for the application to scale horizontally without limit;
(2) Be asynchronous, which requires that the business allows it;
(3) Automate, as with an automated deployment system;
(4) Remember that everything fails, which is very important;
(5) Tolerate inconsistency, which means using strong transactions as little as possible and using eventual-consistency schemes instead.
Of course, beyond these five points there are others, such as caching and implementing key technologies in-house (to keep control over stability, performance, and timely response).
Wang: There are many good software architects who, lacking opportunities to work on and practice with massive-scale service technology, cannot design massive-scale service architectures well. Can you give them some advice and share how you learned and grew? What are the ways to broaden one's technical vision? For example, which books and which good websites would you recommend?

Lin Hao: That question gets to the point. Many architects do not know how to deal with large-scale, high-concurrency scenarios, mainly because they lack opportunities to practice. Such opportunities currently exist only in large enterprise systems or at Internet companies, and without hands-on practice it is hard to understand these scenarios fully. Most Internet technical solutions grew out of a lot of blood and tears. My only suggestion is to read all kinds of articles introducing Internet technology: Google has shared a great deal, and many other Internet companies have published articles on their technical architecture, especially on how their technology evolved. You can imagine how you would solve the problem if you ran into it yourself, and in this way you may gradually grasp and understand the solutions used in large-scale, high-concurrency systems.

As for books, various high-performance titles are now starting to appear in China, for example "MySQL Performance Tuning and Architecture Design", "Building High-Performance Web Sites", and "Building Oracle High Availability Environment". These books usually come from the authors' personal experience and are well worth studying. Keep in mind that achieving high performance usually means having a good grasp of both the software (including the OS) and the hardware, so books like "Deep Understanding of the JDK", "Understanding the Linux Kernel", and "Deep Understanding of Computer Systems" are well worth reading.

As for websites, http://highscalability.com/ and http://www.javaperformancetuning.com/ are both very good.


This blog post is reproduced from http://my.oschina.net/cuilk/blog/13675
