A classic technology explainer: you just bought something on Taobao

Source: Internet
Author: User
First, you will find that depending on your region or network (Telecom, Unicom, Mobile), the IP address that the domain name resolves to is likely to differ. This is the first step of load balancing: DNS resolution of the domain name steers your visit to different portals, while trying to ensure that the portal you reach is the one likely to be fastest among all of them (this is different from the CDN discussed later).
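
As a rough illustration of that DNS step, the sketch below picks a portal IP according to the visitor's network. The table, the function name `resolve`, and the addresses (taken from the reserved documentation ranges) are all assumptions made up for this example; they do not describe Taobao's real DNS setup.

```python
# Hypothetical table: which portal IPs serve which network.
# The network names come from the text; the addresses are placeholders.
PORTALS = {
    "telecom": ["192.0.2.10", "192.0.2.11"],
    "unicom": ["198.51.100.10"],
    "mobile": ["203.0.113.10"],
}
DEFAULT_NETWORK = "telecom"  # assumed fallback for unknown networks


def resolve(client_network, request_no=0):
    """Return one portal IP for the client's network.

    Requests from the same network rotate over that network's portals,
    so the answer differs by network and, within a network, by request.
    """
    candidates = PORTALS.get(client_network, PORTALS[DEFAULT_NETWORK])
    return candidates[request_no % len(candidates)]
```

A Unicom visitor is always answered with the Unicom portal, while two consecutive Telecom visitors are answered with two different Telecom portals.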

Through this portal you have now reached Taobao's actual portal IP address. At this point you generate one PV (Page View), that is, one page visit. A website's total daily PV is an important indicator of its scale. On ordinary (non-promotional) days, Taobao's site-wide PV is roughly between 1.6 and 2.5 billion. Meanwhile, as an individual user, all the Taobao pages you visit together count as a single UV (Unique Visitor). The recently notorious 12306.cn peaks at about 1 billion PV per day, yet its UV count is at least ten times smaller than Taobao's, and I trust everyone knows why.
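
The difference between PV and UV is easy to see in code. This is a minimal sketch, assuming each log entry is a hypothetical `(visitor_id, url)` pair; real systems identify visitors with cookies or accounts and count at far larger scale.

```python
def count_pv_uv(page_views):
    """Count page views (PV) and unique visitors (UV).

    page_views: iterable of (visitor_id, url) pairs, one per page load.
    """
    pv = 0
    visitors = set()
    for visitor_id, _url in page_views:
        pv += 1                   # every page load is one PV
        visitors.add(visitor_id)  # a visitor counts once, however many pages
    return pv, len(visitors)


# Hypothetical log: alice loads three pages, bob loads one.
log = [
    ("alice", "/"),
    ("alice", "/search?q=sweater"),
    ("bob", "/"),
    ("alice", "/item/1"),
]
pv, uv = count_pv_uv(log)
print(pv, uv)  # prints: 4 2
```

A site whose visitors refresh the same page over and over therefore shows a high PV with a comparatively low UV.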

Because so many people visit Taobao at the same time, even the job of generating the Taobao home page cannot be handled by a single server. There may be hundreds or thousands of servers used only to generate the home page, and on each visit the task of generating the page for you is assigned to one of them. The assignment is meant to be fair and even, so that each of these servers handles a similar number of users. This is a complex process carried out by several systems, the most critical of which is LVS (Linux Virtual Server), one of the world's most popular load-balancing systems. It was developed by Dr. Zhang Wensong, who currently works at Taobao.
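
To make "fair and even" concrete, here is a minimal sketch of a least-connections policy, one of the scheduling strategies LVS offers. The class and server names are assumptions for illustration only; LVS itself works inside the Linux kernel and is configured rather than programmed like this.

```python
class LeastConnectionBalancer:
    """Toy scheduler: send each new visit to the least-loaded server."""

    def __init__(self, servers):
        # Active connection count per server, all starting at zero.
        self.active = {name: 0 for name in servers}

    def assign(self):
        """Pick the server with the fewest active connections."""
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a visit finishes and its connection closes."""
        self.active[server] -= 1


balancer = LeastConnectionBalancer(["web1", "web2", "web3"])
assigned = [balancer.assign() for _ in range(6)]
```

After six visits each of the three hypothetical servers holds two connections, which is the even spread the paragraph describes.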

After a series of complex logical operations and data processing, the HTML of the Taobao home page you see on this visit is generated. Readers with a little front-end knowledge will know that the browser next loads the CSS, JS, images, and other resource files used by the page. Perhaps fewer will know that a browser limits how many resources it loads concurrently from the same domain name: IE6 and IE7 allow 2, IE8 allows 6, and Chrome varies by version but generally allows 4 to 6. I just checked, and my visit to the Taobao home page loaded 126 resources, so with so few concurrent connections the page would naturally take a long time to load. Front-end developers therefore tend to spread these resource files across several domain names, which effectively sidesteps the browser limit and also prepares the ground for the CDN described below.
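
The effect of spreading resources over several domains can be estimated with simple arithmetic. The sketch below is an idealization: it assumes every resource takes the same time and that the browser always fills every available connection, neither of which holds exactly in practice. The function name and the choice of 4 domains are illustrative assumptions.

```python
import math


def download_rounds(resources, connections_per_domain, domains=1):
    """Estimate how many sequential 'rounds' of downloads a page needs."""
    parallel = connections_per_domain * domains  # total concurrent downloads
    return math.ceil(resources / parallel)


# 126 resources, as observed on the home page in the text.
print(download_rounds(126, 2))      # IE6-7, one domain: 63
print(download_rounds(126, 6))      # IE8, one domain: 21
print(download_rounds(126, 6, 4))   # IE8, four domains: 6
```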

According to unconfirmed reports, on the peak day of Double 11, Taobao's peak access traffic reached 871 GB/s. It would take 1.78 million 4 Mb/s broadband lines to carry that much, which is more than enough to saturate the entire Internet bandwidth of a small or medium-sized city. Obviously, this traffic cannot all be concentrated in one place. And as we all know, exchanging traffic between different networks (Telecom, Unicom, and so on) is very slow, yet you will find that Taobao rarely feels slow. This is the work of the CDN (Content Delivery Network). Taobao has built on the order of a hundred CDN nodes throughout the country, and uses various means to ensure that the place you fetch from (this refers to JS, CSS, images, and so on) is the CDN node closest to you, so that the heavy traffic is spread across acceleration nodes around the country.
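
The 1.78 million figure can be checked directly. This sketch assumes 871 GB/s means gigabytes per second with 1 GB = 1024 MB, and that a "4Mb" line means 4 megabits per second; under those assumptions the numbers in the text agree with each other.

```python
def lines_needed(peak_gigabytes_per_s, line_megabits_per_s):
    """How many broadband lines it takes to carry a given peak traffic."""
    megabytes_per_s = peak_gigabytes_per_s * 1024   # GB/s -> MB/s
    megabits_per_s = megabytes_per_s * 8            # bytes -> bits
    return megabits_per_s / line_megabits_per_s


lines = lines_needed(871, 4)
print(f"{lines / 1e6:.2f} million lines")  # prints: 1.78 million lines
```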

This raises a problem: if a seller publishes a new item and uploads several new pictures of it, how does Taobao ensure that these pictures are available on CDN nodes all over the country for users to load? This involves many technologies for content distribution and synchronization. Taobao developed the distributed file system TFS (Taobao File System) to handle this type of problem.

So you have finally finished loading the Taobao home page. Out of habit you type "sweater" into the search box and hit Enter, which generates another PV, and Taobao's main search system begins to serve you. It first runs what you typed through a word segmenter. As we all know, English is written in words separated by spaces, whereas Chinese is written in characters, and the characters of a sentence must be grouped into words before its meaning is clear. For example, the English sentence "I am a student" is "我是一个学生" in Chinese. A computer can easily tell that "student" is a word, but it cannot easily tell that the two characters "学" and "生" together form one word. Dividing a sequence of Chinese characters into meaningful words is called Chinese word segmentation, which some people also call word cutting. For "我是一个学生", the result of segmentation is: 我 / 是 / 一个 / 学生 (I / am / a / student).
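
One of the simplest segmentation techniques is forward maximum matching: at each position, take the longest run of characters that appears in a dictionary. The sketch below uses it on the sentence 我是一个学生 ("I am a student"). It is only an illustration with a tiny made-up vocabulary; the text does not say which algorithm Taobao's segmenter uses, and production segmenters are considerably more sophisticated.

```python
def segment(text, vocabulary, max_word_len=4):
    """Forward maximum matching: greedily take the longest known word."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking one character at a time.
        for length in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            # A single character is always accepted as a fallback.
            if length == 1 or candidate in vocabulary:
                words.append(candidate)
                i += length
                break
    return words


vocabulary = {"一个", "学生"}  # hypothetical two-entry dictionary
print(segment("我是一个学生", vocabulary))  # ['我', '是', '一个', '学生']
```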

After word segmentation, the system also analyzes your shopping intention from the search terms you entered. Users' searches usually fall into the following categories of intention: (1) Browsing: no clear shopping target or intention, buying while looking around; the user is casual and guided by feeling. Example queries: "2010 top 10 perfume rankings", "2010 popular sweaters", "How many kinds of Zippo are there?"; (2) Query: there is some shopping intention, expressed as requirements on attributes. Example queries: "mobile phone suitable for the elderly", "500-dollar watch"; (3) Comparison: the shopping intention has narrowed to a few specific products. Example queries: "Nokia E71 E63", "AKG k450 Px200"; (4) Definite: the decision has basically been made and the focus is on one item. Example queries: "Nokia N97", "IBM T60". Depending on the shopping intention it infers, the main search shows completely different results.

After these steps, the main search system lists the search results based on the conditions above and on other, more complex ones, work that is done by more than 1,000 search servers. Then you start clicking through to the items you found and looking at their detail pages. Regular online shoppers will have noticed that after you buy an item, even if the merchant changes the item's detail page many times, you can still see a snapshot of the page as it was at the time through "Purchased items". This is to keep merchants from going back on what they promised in the item details. Obviously, saving and quickly retrieving snapshots for the tens of billions of transactions made each year is no simple matter. Several systems cooperate to do this, the most important of which is Tair, a distributed KV (key-value) storage solution developed in-house by Taobao.
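
The snapshot idea maps naturally onto a key-value store. The sketch below is an assumption-laden illustration: plain Python dictionaries stand in for the distributed store, the class and method names are invented, and storing each distinct detail page once under its content hash is a design choice of this example, not something the text attributes to Tair.

```python
import hashlib
import json


class SnapshotStore:
    """Toy key-value snapshot store (dicts stand in for a distributed KV)."""

    def __init__(self):
        self.blobs = {}     # content hash -> serialized item details
        self.by_order = {}  # order id -> content hash

    def save(self, order_id, item_details):
        """Freeze the item details as they are at purchase time."""
        payload = json.dumps(item_details, sort_keys=True, ensure_ascii=False)
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        self.blobs.setdefault(digest, payload)  # identical pages stored once
        self.by_order[order_id] = digest

    def load(self, order_id):
        """Return the details exactly as they were when the order was placed."""
        return json.loads(self.blobs[self.by_order[order_id]])


store = SnapshotStore()
listing = {"title": "Wool sweater", "promise": "free returns"}
store.save("order-1", listing)
store.save("order-2", listing)       # same page, no extra copy stored
listing["promise"] = "no returns"    # the merchant edits the page later
store.save("order-3", listing)
```

The buyer of `order-1` still sees "free returns" even though the live listing has changed, and the two identical snapshots occupy a single stored copy.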

Then, whether or not you actually make a purchase, your browsing behavior is faithfully recorded by the system for later business logic and data analysis. Access logs are among the most important of these records, but they are scattered across many different servers in various regions, and because of the huge number of users they are very large, routinely reaching the terabyte level. To transmit and synchronize this log data quickly and promptly, Taobao developed TimeTunnel, which delivers the data in real time to back-end systems for computing reports and other operations.

Your browsing data, transaction data, and many other records are all preserved, so Taobao's stored historical data easily reaches a dozen or more PB (1 PB = 1,024 TB = 1,048,576 GB). This enormous volume of data is compressed by Taobao's systems at ratios of up to 1:120 and stored in Taobao's data warehouse. It is continuously analyzed and mined by a large-scale data system called "Ladder", made up of more than 2,000 servers.

From this data Taobao can learn a vast amount of information: on the small scale, who you are, what you like, how old your child is, whether you are in a relationship, and what kind of drinks World of Warcraft players prefer; on the large scale, the retail situation across all industries and the rise and fall of every kind of product.

For all that has been said, this describes only a few of the tens of thousands of systems running at Taobao. Even if you only visit the Taobao home page, the technology and scale of the systems involved are beyond what you might imagine. They are the painstaking work of more than 2,000 top engineers at Taobao, among them Changjiang Scholars, winners of national science and technology awards, and many other leading experts. Likewise, the business systems of Baidu, Tencent, and others are no simpler than Taobao's. What you should know is that the Internet products you use every day may look simple and easy to understand, but it is hard to imagine the ingenuity and labor behind them.
