"Why do sites that look fairly simple, such as Facebook and Taobao, need so many top engineers to build?"
Answer: Zi Liu, jack-of-all-trades programmer at Taobao
Let me take Taobao as the example, as a bit of popular science for newcomers.
First, the most important things you can see on the page:
"Search for goods" -- If you have a few thousand items, a single SELECT query does the job. But when you have 10,000,000,000 (10 billion) items, no single database can hold them, so how do you search? You need a distributed data storage scheme. On top of that, search cannot fetch data directly from the database; it has to go through a search engine (put simply, a search engine is faster). Good, so now the items can be found; are we done? Not even close. Whose items appear on the first page? That takes a huge, complex ranking algorithm. And personalized recommendations based on your buying behavior are something a group of formidable algorithm engineers spend their careers on.
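To make the difference between a database scan and a search engine concrete, here is a minimal sketch of an inverted index, the basic structure behind keyword search. It is an illustration only, not Taobao's implementation: the class name, the sample items and the "rank by sales" rule are all assumptions standing in for the far more complex ranking described above.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy inverted index: maps each word to the set of item ids containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> {item_id, ...}
        self.sales = {}                   # item_id -> sales count (hypothetical ranking signal)

    def add(self, item_id, title, sales=0):
        for word in title.lower().split():
            self.postings[word].add(item_id)
        self.sales[item_id] = sales

    def search(self, query, limit=10):
        words = query.lower().split()
        if not words:
            return []
        # Intersect the posting sets: only items containing every query word match.
        matches = set.intersection(*(self.postings.get(w, set()) for w in words))
        # Placeholder ranking: highest sales first, item id as tie-breaker.
        return sorted(matches, key=lambda i: (-self.sales[i], i))[:limit]


index = InvertedIndex()
index.add(1, "red cotton shirt", sales=50)
index.add(2, "blue cotton shirt", sales=120)
index.add(3, "red leather shoes", sales=80)
print(index.search("cotton shirt"))  # [2, 1]
print(index.search("red"))           # [3, 1]
```

A lookup touches only the posting sets for the query words instead of scanning every item, which is why search is served from an index rather than from the database.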
"Product details" -- Once the search is done and you see something you like, you click through to the product page, which shows the product's attributes, detailed description, reviews, seller information and so on. This page is displayed more than 3 billion times per day. The same reasoning applies: if your site gets 10 visitors a day you never feel any pressure on the server, but at 3 billion there are many more problems to solve. First, these requests cannot be sent straight to the database. Any database, stand-alone or distributed, would collapse miserably under 3 billion requests a day. The technology to use here is large-scale distributed caching: seller information, reviews and product descriptions are all read from the cache. Take an even more extreme case, the product's view count, which changes every time the page is opened. Would you guess that it can be served from the cache too? Taobao did it; the entire product detail page comes from the cache.
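One common way to keep a constantly changing value such as a view count in the cache is to read through the cache and write increments back to the database only in batches. The sketch below assumes that approach; the answer does not say how Taobao actually does it, and the class, the plain dictionaries standing in for the cache and database, and the `flush_every` parameter are all hypothetical.

```python
class ProductPageCache:
    """Cache-aside reads plus a buffered view counter (illustrative only)."""

    def __init__(self, database, flush_every=100):
        self.database = database       # stand-in for the real product store
        self.cache = {}                # stand-in for a distributed cache
        self.pending_views = {}        # view increments not yet written back
        self.flush_every = flush_every
        self.db_reads = 0
        self.db_writes = 0

    def get_details(self, product_id):
        # Only a cache miss reaches the database.
        if product_id not in self.cache:
            self.db_reads += 1
            self.cache[product_id] = dict(self.database[product_id])
        return self.cache[product_id]

    def record_view(self, product_id):
        details = self.get_details(product_id)
        details["views"] += 1          # the cached copy is always up to date
        count = self.pending_views.get(product_id, 0) + 1
        if count >= self.flush_every:  # write back one batch, not one row per view
            self.database[product_id]["views"] += count
            self.db_writes += 1
            count = 0
        self.pending_views[product_id] = count
        return details["views"]


db = {42: {"title": "red cotton shirt", "views": 0}}
pages = ProductPageCache(db, flush_every=100)
for _ in range(250):
    pages.record_view(42)
print(pages.get_details(42)["views"])   # 250
print(db[42]["views"])                  # 200
print(pages.db_reads, pages.db_writes)  # 1 2
```

Here 250 page views cost one database read and two writes. The trade-off is that the database lags the cache by up to `flush_every` views, and unflushed counts are lost if the cache fails.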
"Product pictures" -- A product has 5 pictures and the description has even more. Guess how many pictures Taobao has to store? More than 10 billion. If they were all on your hard drive, how would you find one of them? If a classmate wanted a copy, how many hard drives would he need to bring? How much bandwidth would you need? Could your network card take it? How long would the copy take? Unfortunately, no commercial solution on the market works at this scale, so in the end we had to develop our own storage system. If you have heard of Google's GFS, ours is similar and is called TFS. By the way, Tencent has one as well, also called TFS.
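A typical idea behind storage systems for billions of small files is to pack many small files into a few large blocks and keep an index that says where each one lives, so the file system never has to manage billions of tiny entries. The sketch below illustrates that idea only; it is not the actual TFS design, and the class name, the in-memory blocks and the tiny block size are assumptions chosen for the demo.

```python
import io


class BlockStore:
    """Append small blobs into large blocks; locate them by (block, offset, size)."""

    def __init__(self, block_size=64):
        self.block_size = block_size   # tiny for the demo; real blocks would be many megabytes
        self.blocks = [io.BytesIO()]   # stand-ins for large files on disk
        self.index = {}                # name -> (block_no, offset, size)

    def put(self, name, data):
        block = self.blocks[-1]
        # Start a new block when the current one cannot hold this blob.
        if block.tell() > 0 and block.tell() + len(data) > self.block_size:
            block = io.BytesIO()
            self.blocks.append(block)
        offset = block.tell()
        block.write(data)
        self.index[name] = (len(self.blocks) - 1, offset, len(data))

    def get(self, name):
        block_no, offset, size = self.index[name]
        return self.blocks[block_no].getvalue()[offset:offset + size]


store = BlockStore(block_size=64)
store.put("a.jpg", b"x" * 40)
store.put("b.jpg", b"y" * 40)   # does not fit after a.jpg, so it opens a second block
store.put("c.jpg", b"z" * 10)   # fits behind b.jpg in the second block
print(store.index["c.jpg"])     # (1, 40, 10)
print(store.get("c.jpg"))       # b'zzzzzzzzzz'
```

Finding a picture becomes one index lookup plus one read at a known offset. In a distributed version the index entry would also have to say which machine holds the block.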
"Advertising system" -- Taobao carries a lot of ads. What, you didn't know? That means our ads are done well: a surprising number of people don't even recognize them as advertising. How do sellers bid for Taobao ad slots? How are the ads displayed? How is their effectiveness measured? That is yet another system of algorithms and engineering.
"BOSS system" -- How do Taobao staff manage such a large system? For example, suppose it is suddenly announced that all of a certain writer's works must disappear from Taobao. The relevant data has to vanish within a few minutes from the database, the search engine and the advertising system, which requires a formidable back-office support system.
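One way to picture such a takedown is a single operator command fanned out to every subsystem that holds a copy of the data. The sketch below is a hypothetical illustration of that pattern; the class names, the subsystem names and the in-memory dictionaries are assumptions, not a description of Taobao's back-office tooling.

```python
class Subsystem:
    """Stand-in for one system that holds its own copy of item data."""

    def __init__(self, name, items):
        self.name = name
        self.items = items             # item_id -> author

    def remove_author(self, author):
        doomed = [i for i, a in self.items.items() if a == author]
        for item_id in doomed:
            del self.items[item_id]
        return (self.name, len(doomed))


class TakedownBus:
    """Fans one takedown command out to every registered subsystem."""

    def __init__(self):
        self.handlers = []

    def subscribe(self, handler):
        self.handlers.append(handler)

    def take_down(self, author):
        # Each handler reports how many items it removed, giving the operator an audit trail.
        return [handler(author) for handler in self.handlers]


catalog = {1: "author_a", 2: "author_b", 3: "author_a"}
database = Subsystem("database", dict(catalog))
search = Subsystem("search", dict(catalog))
ads = Subsystem("ads", {1: "author_a"})

bus = TakedownBus()
for system in (database, search, ads):
    bus.subscribe(system.remove_author)

print(bus.take_down("author_a"))
# [('database', 2), ('search', 2), ('ads', 1)]
```

In a real deployment each handler would be a remote call that can fail or lag, so the hard part is confirming within minutes that every copy is actually gone.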
"Operations system" -- How many servers do you think it takes to support a site this large? The number you have in mind is probably only a fraction of it. With that many servers, which operating system is deployed on them, and can its kernel be tuned? Can the Java virtual machine be optimized? Is there any room left to squeeze out of the communication modules? How is software deployed? Have you ever installed and tuned an operating system, then had 360 patch its vulnerabilities, and watched it crash? There is a lot of know-how here.
I'll stop here. Beyond what is mentioned above, there is a lot more technology to build. Of course, none of this is unattainable: every huge, complex thing grows from small to large. It takes formidable experts, and it also takes rookies full of curiosity. As for that last sentence, you may assume I have an ulterior motive.
Answer: Shup, Facebook engineer
The features are not complicated, but there are many details to get right. For example, the recommendation algorithm behind the news feed is important; it generates the feed from the user's past activity and relationships with friends. In addition, machine learning and data mining are applied to the user's information and behavior to pick out the best-matching ads. That is also labor-intensive work.
Facebook also has a huge number of users. If you were only building a social networking site for internal use, it would be fairly straightforward. But think about billions of people using it. First, your servers must form a distributed cluster, and you must make sure it can withstand that much traffic. At the same time, for good performance you have to add memcache and load web pages in blocks. On top of that, the data produced each day (statuses, messages, photos, shares and so on) is on the order of terabytes, which your database and related systems have to absorb.
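Once the cache is spread over a cluster, every request has to decide which cache server holds a given key, and that decision should change as little as possible when a server is added or lost. Consistent hashing is one widely used technique for this; the sketch below assumes it purely for illustration and does not describe Facebook's actual setup. The server names and the `replicas` parameter are hypothetical.

```python
import bisect
import hashlib


def _hash(value):
    # Stable hash so every client maps the same key to the same point on the ring.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self.ring = []                 # sorted list of (point, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            bisect.insort(self.ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self.ring = [entry for entry in self.ring if entry[1] != node]

    def node_for(self, key):
        if not self.ring:
            raise LookupError("ring is empty")
        # First point at or after the key's hash, wrapping around at the end.
        idx = bisect.bisect_left(self.ring, (_hash(key), ""))
        return self.ring[idx % len(self.ring)][1]


ring = HashRing(["cache-1", "cache-2", "cache-3"])
keys = [f"user:{n}" for n in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.remove("cache-2")                 # simulate losing one cache server
after = {k: ring.node_for(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
# Only keys that lived on the removed server are remapped.
assert all(before[k] == "cache-2" for k in moved)
print(len(moved), "of", len(keys), "keys moved")
```

With plain `hash(key) % server_count`, removing one server would remap most keys and empty nearly the whole cache at once; the ring limits the damage to the keys of the lost server.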
Also, you need a strong security team to keep the site safe from attack and to stop spam, offensive ads and malicious programs from spreading. Globalization brings the problem of supporting multiple languages as well.
In short, once a website grows large, many problems appear; it is nothing like doing a semester project on campus.