Taobao technical expert: design and tuning of large-scale trading websites

2009 was a year of both challenges and opportunities. Most people got used to the financial crisis and worked hard to get through it. The technical community was no different: those who were laid off have surely found new jobs by now, so everyone is concentrating on practical technology. To get to the point, let me tell a few stories from 2009, partly as a record and partly for fun.

Discussion and summary of data scalability

The financial crisis was an opportunity for e-commerce, so 2009 was a year of rapid growth for Taobao. When a website grows from millions or tens of millions of records to hundreds of millions and then billions, quantitative change turns into qualitative change: simple hardware upgrades hit a bottleneck, and the overall architecture has to be reworked. We spent most of 2009 on scaling data.

For an e-commerce site, orders are the core data and also the fastest-growing data. The most traditional, simple, and effective way to scale data is to split it across databases and tables (sharding). What sparks fly when orders meet sharding? In early 2009 the two collided for a long time and produced very few sparks. The biggest problem was the partitioning rule. Horizontal partitioning without a rule inevitably brings the cost of merging data at query time. Partitioning by a business rule did not work either, because buyers and sellers query orders differently, so no single key serves both. The only workable option was to store every order twice, one copy partitioned by buyer and one by seller, but that costs more and places very high demands on data synchronization.
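As a rough illustration of the "store every order twice" option, the Python sketch below keeps one copy partitioned by buyer and one by seller, so each side's query touches a single shard. The shard count, the modulo rule, and all names (`NUM_SHARDS`, `shard_for`, `save_order`, and so on) are assumptions made for this example, not Taobao's implementation, and in-memory dictionaries stand in for databases.

```python
NUM_SHARDS = 4  # hypothetical shard count, chosen only for the example


def shard_for(user_id: int) -> int:
    """Map a buyer or seller id to a shard with a simple modulo rule."""
    return user_id % NUM_SHARDS


# One set of shards per access path; each dict maps a user id to a list of orders.
buyer_shards = [dict() for _ in range(NUM_SHARDS)]
seller_shards = [dict() for _ in range(NUM_SHARDS)]


def save_order(order: dict) -> None:
    """Write the order twice: once keyed by buyer, once keyed by seller."""
    buyer_id, seller_id = order["buyer_id"], order["seller_id"]
    buyer_shards[shard_for(buyer_id)].setdefault(buyer_id, []).append(order)
    seller_shards[shard_for(seller_id)].setdefault(seller_id, []).append(order)


def orders_for_buyer(buyer_id: int) -> list:
    """Read from the single shard that owns this buyer."""
    return list(buyer_shards[shard_for(buyer_id)].get(buyer_id, []))


def orders_for_seller(seller_id: int) -> list:
    """Read from the single shard that owns this seller."""
    return list(seller_shards[shard_for(seller_id)].get(seller_id, []))
```

The two writes inside `save_order` are where the extra cost and the synchronization burden come from: in a real system they would hit different databases and would have to be kept consistent.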

So we initially decided to store orders in two copies. Then one day we took a careful look at how orders were actually accessed and found that more than 90% of the load on the order database came from queries, and more than 90% of that query load came from non-core business that only displays order data and has very low consistency and real-time requirements.

A large data volume puts heavy pressure on the database, so the natural instinct is to spread that pressure out by splitting databases and tables. Sometimes it pays to think about the problem more directly: if the pressure is high, can we simply reduce it? From the access analysis we found that more than 80% of the load on the expensive main database (roughly 90% of 90%) was serving unimportant requests. That was the key to our optimization. In the end, orders adopted a read-write separation scheme: the high-cost main database handles writes and important queries, while the more than 80% of reads that are unimportant go to low-cost database servers. Because those reads tolerate stale data, the replication requirements are low and the scheme is not hard to implement.
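The routing decision behind read-write separation can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the class name `ReadWriteRouter`, the `important` flag, and the round-robin choice of replica are all invented for illustration, and plain strings stand in for database connections.

```python
import itertools


class ReadWriteRouter:
    """Send writes and important reads to the primary, other reads to replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over the low-cost replicas; None when there are none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def for_write(self):
        """All writes go to the primary database."""
        return self.primary

    def for_read(self, important: bool = False):
        """Important reads need fresh data, so they stay on the primary."""
        if important or self._replicas is None:
            return self.primary
        return next(self._replicas)


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
print(router.for_write())                # primary
print(router.for_read(important=True))   # primary
print(router.for_read())                 # replica-1
```

Only the caller knows which reads are important, which is why the flag is part of the call rather than something the router guesses.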

Another interesting case is the scaling of product data. Horizontal partitioning of products is easy: split by seller. With orders as a precedent, our first thought was read-write separation, because it can be done cheaply. After it had been in place for a while, we reconsidered the overall requirements for product data and suddenly realized that products do not actually have the same requirements as orders. Do they really need a high-cost main database? Would it be feasible to run the database entirely on low-cost commodity servers? After careful evaluation we concluded that this is acceptable, which made part of the read-write separation work already under way for products unnecessary!

Every story deserves a short summary. First, an abstract point: a clear understanding of the original requirements is the prerequisite for system decisions; otherwise detours are inevitable. Understanding the original requirements is not easy, because there is a lot of interference and resistance. The examples above look simple, but in a system that has been running for five years, seeing the essence and then making the change is not easy at all. In addition, experience sometimes gets in the way of system decisions. This is a paradox, so we need to approach problems with a beginner's mindset. In the final analysis: go back to the origin.

Now something more practical. For data access in a large distributed system, a unified data layer is essential. It should encapsulate horizontal and vertical data partitioning, read-write separation, and data-access routing, replication, merging, migration, and hot-spot handling, and it should be transparent to applications. A layer aimed at applications can be wrapped at the JDBC level; a layer aimed at a specific database can be wrapped at the database protocol level, as Amoeba does.
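To make the idea of a data layer that hides routing from the application more concrete, here is a minimal Python sketch. It is not how a JDBC-level or protocol-level layer such as Amoeba is built; the class `DataLayer`, the `item` table, and the modulo-by-seller rule are assumptions for illustration, and in-memory SQLite databases stand in for physical shards.

```python
import sqlite3


class DataLayer:
    """Hide shard routing: callers pass a routing key and plain SQL."""

    def __init__(self, num_shards: int):
        # Each in-memory SQLite database stands in for one physical shard.
        self._shards = [sqlite3.connect(":memory:") for _ in range(num_shards)]
        for conn in self._shards:
            conn.execute(
                "CREATE TABLE item (id INTEGER PRIMARY KEY, seller_id INTEGER, title TEXT)"
            )

    def _route(self, seller_id: int) -> sqlite3.Connection:
        """Pick the shard that owns this seller (hypothetical modulo rule)."""
        return self._shards[seller_id % len(self._shards)]

    def execute(self, seller_id: int, sql: str, params: tuple = ()) -> list:
        """Run one statement on the shard that owns this seller."""
        conn = self._route(seller_id)
        with conn:  # commits on success, rolls back on error
            return conn.execute(sql, params).fetchall()

    def execute_all(self, sql: str, params: tuple = ()) -> list:
        """Fan a query out to every shard and merge the rows."""
        rows = []
        for conn in self._shards:
            rows.extend(conn.execute(sql, params).fetchall())
        return rows
```

The application still writes ordinary SQL, and only `_route` knows how data is partitioned, so the partitioning rule can change without touching callers. `execute_all` shows the merge cost that appears whenever a query cannot be routed by key.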

Focus on the interaction between the system and people

There is another story. In an earlier version of the data layer, to achieve transparent routing we did not use SQL at all; all database access was written as code. After going live we discovered a painful problem: the code could not be mapped back to SQL, so troubleshooting was very hard. Once a DBA found a query that consumed too many database resources and handed the optimized SQL to the developers, and the developers spent several days without finding the code that issued that query.
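One common way to keep queries traceable to code, which the article does not say Taobao adopted, is to tag each statement with the name of the code path that issued it. The sketch below is a minimal Python illustration; the function `tag_sql` and the name `OrderDao.listByBuyer` are hypothetical, and whether the comment survives into a given database's slow-query log depends on the driver and the database.

```python
def tag_sql(sql: str, query_name: str) -> str:
    """Prefix a SQL statement with a comment naming the code path that issued it."""
    if "*/" in query_name:
        # Refuse names that would close the comment early and alter the statement.
        raise ValueError("query name must not contain '*/'")
    return f"/* {query_name} */ {sql}"


print(tag_sql("SELECT * FROM orders WHERE buyer_id = ?", "OrderDao.listByBuyer"))
# /* OrderDao.listByBuyer */ SELECT * FROM orders WHERE buyer_id = ?
```

With a tag like this, a DBA who spots an expensive statement can give developers a name to search for instead of raw SQL text.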

Another impression from 2009 concerns the industry's move to service orientation. Many organizations are adopting services, and on the system side this is very successful: communication, load balancing, messaging systems, service containers, and so on have all produced good results. But after a period of use the outcome is often disappointing: complex dependencies, chaotic changes, low efficiency. Fundamentally, the cause is too little attention to people: services are not operated as products, and service governance is missing.

Both examples above come from a lack of attention to people. Technicians who build systems mostly care about the technology and overlook its creators and users: people. The testability of software or services matters to testers, maintainability and manageability matter to the people who operate the system, and the ease of use of a framework matters to every user. Unless you can build a self-evolving Skynet (the computer-based artificial-intelligence defense system in the Terminator series, created by humans in the late 20th century, initially for military purposes), you should pay more attention to the interaction between systems and people.

Focus on availability

Another impression is that the industry does not pay enough attention to availability, the most basic indicator of all. Almost every framework advertises how high its performance is, but very few mention how strong their monitoring is, how easy troubleshooting is, how failures are isolated, or how to degrade. In this respect commercial products do much better. Search for articles about performance and you find plenty, covering all kinds of optimization strategies and methods; search for availability and you find very little systematic knowledge. I hope that is only my own ignorance.
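As a small illustration of what "degrading" can mean in code, the Python sketch below wraps a dependency so that after repeated failures callers get a fallback result for a cool-down period instead of waiting on a broken service. It is a simplified circuit-breaker-style sketch, not a pattern the article describes in detail; the class `Degradable`, its thresholds, and the injectable `clock` are assumptions for the example.

```python
import time


class Degradable:
    """Call a dependency; after repeated failures, serve a fallback for a while."""

    def __init__(self, call, fallback, max_failures=3, cooldown_seconds=30.0,
                 clock=time.monotonic):
        self._call = call
        self._fallback = fallback
        self._max_failures = max_failures
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._open_until = None  # time until which the dependency is skipped

    def __call__(self, *args, **kwargs):
        if self._open_until is not None and self._clock() < self._open_until:
            # Degraded: isolate the failing dependency and answer from the fallback.
            return self._fallback(*args, **kwargs)
        try:
            result = self._call(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self._max_failures:
                self._open_until = self._clock() + self._cooldown
                self._failures = 0
            return self._fallback(*args, **kwargs)
        self._failures = 0
        return result
```

A non-core feature, such as a recommendation panel on an order page, could be wrapped this way so that its failure returns an empty result rather than slowing down the core flow.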

Looking back at the past and ahead to the future: in 2010 there is a lot to do, including isolation and degradation for service-oriented systems, better system maintainability, and full use of cooperative and asynchronous modes in web applications ...

Disclaimer: I am very pragmatic and will use whatever means it takes to solve a problem and get the work done, and I do not really understand what "architecture" means. Any resemblance between the views above and anyone else's is purely coincidental! If you disagree, criticism is welcome!

About the author: Yeu Xuqiang is a technical expert at Taobao. He joined Taobao in 2004 and has witnessed the whole course of Taobao's business and technical development. Over the past five years he has taken part in the transformation of almost all of Taobao's core systems and has led the construction of the core business centers intended to support Taobao's future development. He is now responsible for the design and planning of the website's overall business architecture and has extensive experience in the design and tuning of large-scale trading sites.


