Uncovering eBay's Architecture and Storage
eBay has set the standard in the web field. The numbers show that the data and computing workload of such an ultra-large-scale website places very high demands on technology. To guarantee normal access for more than 200 million users around the world, eBay needs not only an excellent technical architecture but also careful storage of the massive volume of data generated every day. How do its developers and information managers cope with such an arduous task? To answer that question, this magazine has put together two technical articles on eBay for its readers.
Scaling: Not Only About Architecture
Text: Frank Sommers / Translation: Dawn
At the 2006 SD Forum, two eBay architects gave a talk on two topics: how eBay's architecture handles a billion page requests every day, and how that architecture evolved from the original Perl scripts to the current 10,000 application instances running in eight data centers. One conclusion of the talk is that scaling is only partly a question of architecture.
The Evolution of eBay's System Architecture
At the 2006 SD Forum, Randy Shoup and Dan Pritchett, both of eBay, jointly presented on eBay's architecture. Pritchett later posted the presentation slides on his blog under the title The eBay Architecture.
The talk cited some astonishing statistics:
- 200 million registered users
- 1 billion page views per day
- 26 billion SQL queries and updates per day
- more than 2 trillion pieces of data
- $1,590 worth of goods traded per second
- more than 1 billion images stored
- 7 languages supported
- 99.94% system availability
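To put the daily figures in per-second terms, the arithmetic can be sketched as follows. The input numbers come from the list above; the assumption that load is spread evenly over an 86,400-second day is mine, and it ignores peak traffic.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400; assumes load is spread evenly over the day


def per_second(daily_total: int) -> float:
    """Convert a per-day total into an average per-second rate."""
    return daily_total / SECONDS_PER_DAY


page_views_per_sec = per_second(1_000_000_000)   # 1 billion page views per day
sql_ops_per_sec = per_second(26_000_000_000)     # 26 billion SQL operations per day
goods_value_per_day = 1590 * SECONDS_PER_DAY     # $1,590 of goods traded per second

print(f"{page_views_per_sec:,.0f} page views/s")       # 11,574 page views/s
print(f"{sql_ops_per_sec:,.0f} SQL operations/s")      # 300,926 SQL operations/s
print(f"${goods_value_per_day:,} in goods per day")    # $137,376,000 in goods per day
```

On average, then, each page view corresponds to about 26 SQL operations, and the site moves roughly $137 million in goods per day.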
Other statistics related to the development process and features are as follows:
- more than 300 new features are added to the website each quarter
- more than 100,000 lines of code are released every two weeks
According to the talk, even at this scale eBay's architecture has a further goal: to handle an additional tenfold growth in traffic, which eBay expects to see within just a few years. Another architectural goal is to handle peak loads and to ensure that components degrade gracefully under heavy load or system failure.
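One common way to read the second goal is load shedding: when the system runs over capacity, non-essential features are switched off so that core functions keep working. The following is a minimal sketch of that idea only. The feature names and thresholds are hypothetical and are not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    essential: bool     # essential features are never shed
    shed_above: float   # load ratio above which a non-essential feature is disabled


# Hypothetical features and thresholds, for illustration only.
FEATURES = [
    Feature("bidding", essential=True, shed_above=1.0),
    Feature("search", essential=True, shed_above=1.0),
    Feature("recommendations", essential=False, shed_above=0.8),
    Feature("seller_statistics", essential=False, shed_above=0.9),
]


def enabled_features(load_ratio: float) -> list[str]:
    """Return the features that stay on at the given load (current load / capacity)."""
    return [
        f.name
        for f in FEATURES
        if f.essential or load_ratio <= f.shed_above
    ]


print(enabled_features(0.5))   # everything is on
print(enabled_features(0.85))  # recommendations are shed first
print(enabled_features(1.2))   # only essential features remain
```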
According to the talk, eBay's system architecture is now moving toward its fourth version. The most interesting technical part of the talk is, of course, the detail about this version. For example, the speakers describe the first step in scaling the application tier: discarding most J2EE features. As they note, "eBay scales on servlets and a rewritten connection pool."
Another interesting aspect of scaling the application tier, according to the presentation, is that the application tier stores no session state at all. Instead, "transient state is kept in a cookie or a scratch database." For data access, eBay uses an internally developed Java O/R mapping solution.
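Keeping transient state in a cookie means any server can handle the next request, because the state travels with the request itself. A minimal sketch of the idea, assuming an HMAC-signed cookie value so the client cannot alter the state undetected; the function names and the key are hypothetical, and this is not eBay's implementation.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET_KEY = b"demo-only-secret"  # hypothetical; a real key would be kept out of source


def encode_state(state: dict) -> str:
    """Serialize transient state into a signed string suitable for a cookie value."""
    payload = base64.urlsafe_b64encode(
        json.dumps(state, sort_keys=True).encode("utf-8")
    )
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_state(cookie_value: str) -> Optional[dict]:
    """Return the state if the signature is valid, otherwise None."""
    payload, sep, signature = cookie_value.rpartition(".")
    if not sep:
        return None  # no signature part at all
    expected = hmac.new(
        SECRET_KEY, payload.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Compare as bytes in constant time to avoid leaking where the mismatch is.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("ascii")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = encode_state({"item_id": 42, "step": 2})
print(decode_state(cookie))  # the original state, recovered without server-side storage
```

A cookie suits small state; the quote's other option, a scratch database, would hold anything too large or too sensitive to send to the browser.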
On scaling search, the speakers point out a requirement that a general web search engine such as Google does not face: eBay users expect changes to their data to show up in search results immediately. Sellers also know exactly which results to expect; for example, an item they have just listed must appear in every relevant search. Apparently, before eBay's latest re-architecture, updating a single search index took nine hours.
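The requirement that a new listing be searchable at once suggests updating the index one item at a time rather than rebuilding it in bulk. The toy sketch below illustrates that idea with an in-memory inverted index. The class and method names are hypothetical, and it says nothing about how eBay's search engine is actually built.

```python
from collections import defaultdict


class ListingIndex:
    """Toy in-memory inverted index that is updated one listing at a time."""

    def __init__(self) -> None:
        self._postings: dict[str, set[int]] = defaultdict(set)  # word -> item ids
        self._titles: dict[int, str] = {}                       # item id -> title

    def upsert(self, item_id: int, title: str) -> None:
        """Add or replace a listing; the change is searchable immediately."""
        self.remove(item_id)
        self._titles[item_id] = title
        for word in title.lower().split():
            self._postings[word].add(item_id)

    def remove(self, item_id: int) -> None:
        """Drop a listing from the index if it is present."""
        old_title = self._titles.pop(item_id, None)
        if old_title is None:
            return
        for word in old_title.lower().split():
            self._postings[word].discard(item_id)

    def search(self, query: str) -> set[int]:
        """Return ids of listings whose titles contain every query word."""
        words = query.lower().split()
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result


index = ListingIndex()
index.upsert(1, "Vintage Camera Lens")
print(index.search("camera"))  # the new listing is found right after it is added
```

Each update touches only the words of the affected listing, so its cost does not grow with the size of the whole index.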
The speakers covered many similarly challenging problems and discussed their solutions in depth. The topic that interested me most in the talk, however, was the account of how eBay's architecture evolved. It is worth recalling what the first version looked like:
- Pierre Omidyar built the first version over a weekend in 1995
- every item was a separate file generated by a Perl script
- no search, only browsing by category
- the system hardware was assembled from parts that could be bought at Fry's
eBay ran on this architecture from 1995 until September 1997. As the talk notes, eBay was already a well-known website by then, and this architecture topped out at 50,000 items.
The next several iterations took eBay to a three-tier architecture, first on Microsoft's IIS server and later on Java. The most recent versions show that many J2EE features had to be dropped in favor of a highly customized architecture that meets eBay's unique needs.
One way to view the four major architecture versions eBay has gone through is as an evolution. Another is that the four versions come full circle: eBay started with a custom-built solution and eventually returned to one.
Given this account of the different architecture stages, I would like to know how far eBay's architects, while solving the pressing scaling problems of the day, also managed to build in the scalability needed for future loads. Looking ahead requires architects to predict, under certain assumptions, how the system will have to scale. To what extent have they been able to do that?
One problem with such predictions is that even with the large amount of data available from the current production system, usage patterns will change: users may, for example, come to prefer video to simple images, or want voice calls as part of their interaction with the system. According to the talk, such changes can come very quickly, especially given that the average architecture lifecycle is about two to three years. Two or three years ago, for instance, few people had heard of YouTube, yet within the span of a single architecture lifecycle millions of users have become accustomed to online video.
Achieving Scalability: Organizational Capability + Architecture
I think this last point is the main message of the eBay talk. To me, the most striking aspect of eBay's architectural evolution is not only the technical excellence of the solutions used at each stage, but also the fact that eBay met each challenge it encountered by continuously improving its system, and that the effort at every stage has kept the site going.
Interestingly, this suggests that you can start from almost any architecture, whether Perl, Rails, or JSP pages, as long as you know how to move to the next step when you need to scale your application and have the ability to carry it out. It also suggests that the key to scalability is not simply how well each architecture stage scales, but how well a company or organization can move an application from one stage to the next. In that sense, scaling is as much an individual and organizational issue as a technical one.
Of course, this is not surprising, because, as with architectural design, a technical path to scaling can always be found. (The last part of the eBay talk covers scalability in practice, for example how 10,000 application instances are managed across eight data centers.) Viewed from this broader perspective, however, two common approaches to scaling may be of little use in practice.
The first is to overemphasize scalability from the very beginning. Most developers know that scaling requirements cannot be fully known at the outset, yet some architects still spend too much effort trying to design an architecture that will serve the application for a long time. Pierre Omidyar evidently did not share this view, which is why his initial version used Perl scripts and one file per item rather than something built to last.
The second is to treat scalability and performance as afterthoughts and to resist considering scalability in the early stages of application development. XP advocates sometimes use this argument to justify writing code quickly without considering how that code will be extended to handle future workloads.
In practice, neither idea is of much use. A more pragmatic third view is to treat scaling as an organizational, or even business-level, capability. This view accepts that future workloads are very hard to predict. Where prediction is possible, it is used mainly to design an architecture that handles near-term growth while allowing features to be deployed quickly, so that actual users of the application can generate the business justification for future architecture updates. That is quite different from designing for the distant future: it means developing an organizational, and even business, capability to handle changes in system architecture. This seems to be the point the eBay architects were making at the 2006 SD Forum.
So when do you consider scalability in your own projects?