Anatomy of Twitter (5): Data flow and control flow






As noted earlier, Twitter's architecture has two main highlights: caching and message queuing. The role of the message queue is to "isolate user requests and related operations in order to flatten traffic peaks (move operations out of the synchronous request cycle, amortize load over time)".
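The idea of moving work out of the synchronous request cycle can be illustrated with a minimal sketch. It uses Python's standard `queue` and `threading` modules as stand-ins for a real message queue; `handle_request` and `worker` are hypothetical names, and this is not Twitter's actual code.

```python
import queue
import threading

jobs = queue.Queue()   # stands in for the message queue
processed = []         # records work completed in the background

def handle_request(user_id, text):
    """Accept the request quickly and defer the expensive work."""
    jobs.put((user_id, text))
    return "accepted"

def worker():
    """Drain the queue at its own pace, independent of request arrival."""
    while True:
        job = jobs.get()
        if job is None:          # sentinel: stop the worker
            jobs.task_done()
            break
        processed.append(job)
        jobs.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()

# A burst of requests is accepted immediately...
replies = [handle_request(i, f"tweet {i}") for i in range(5)]

# ...and the worker catches up afterwards.
jobs.put(None)
jobs.join()
t.join()
```

The request handler only enqueues, so its response time does not depend on how long the deferred work takes; the load is spread over time by the worker.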





Letting the Apache process spin in an empty loop, so that it can quickly accept the user's connection while postponing the actual service, is plainly a stalling tactic. Its purpose is to keep users from receiving an "HTTP 503" error. A 503 error means "Service Unavailable", that is, the site refuses the request.
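The difference between refusing a request and parking it can be shown with a small simulation. This is only a sketch under stated assumptions: `CAPACITY` and both function names are hypothetical, and nothing here reflects real Apache configuration.

```python
from collections import deque

CAPACITY = 2  # hypothetical number of requests the backend can serve per round

def reject_when_busy(requests):
    """Serve up to CAPACITY requests and answer the rest with 503."""
    return [200 if i < CAPACITY else 503 for i, _ in enumerate(requests)]

def park_when_busy(requests):
    """Accept every request; parked ones are served in later rounds."""
    waiting = deque(requests)
    rounds = []
    while waiting:
        batch = [waiting.popleft() for _ in range(min(CAPACITY, len(waiting)))]
        rounds.append(batch)
    return rounds
```

With five simultaneous requests, the first strategy turns three users away, while the second serves everyone, at the cost of making some of them wait.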





When Yu the Great tamed the floods, the emphasis was on dredging and channeling. Real flood-fighting capacity shows in two respects: flood storage and flood discharge. Flood storage is easy to understand: build reservoirs, either one very large reservoir or many small ones. Flood discharge has two parts: 1. drainage, 2. channels.





For the Twitter system, the large server cluster, and especially the large memcached-based cache, embodies the flood storage capacity. The means of drainage is the Kestrel message queue, which is used to pass control instructions. The channels are the data paths between machines, especially the data paths leading to memcached. Whether a channel is good or bad comes down to whether it is unobstructed.





Twitter's design resembles Yu the Great's approach in form and is close to it in substance. Twitter's flood control measures are effective because they control the data flow: when a traffic peak arrives, data can be dispersed to multiple machines in time, which avoids an excessive concentration of load and the paralysis of the whole system.





In June 2009, Purewire crawled the Twitter site to trace the "following" and "followed by" relationships between Twitter users, and estimated the total number of Twitter users at around 7,000,000 [26]. This figure of 7 million does not include orphaned users, who neither follow anyone nor are followed by anyone. Nor does it include island groups, whose members follow only each other and have no links to the outside world. Even if these orphaned and island users are added, the total number of Twitter users is probably no more than 10 million.





As of March 2009, China Mobile had reached 470 million users [27]. If China Mobile's Fetion [28] and 139 Shuoke [29] services also wanted to move in Twitter's direction, how much flood-fighting capacity would they need to be designed for? Simply put, the current scale of the Twitter system would need to be magnified at least 47 times. This is why some commentators on the mobile internet industry say, "What can be done in China can be done in America. The converse does not hold."
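The factor of 47 follows directly from the two user counts quoted above. A short check, using the article's upper estimate of 10 million Twitter users:

```python
# Figures taken from the text; 10 million is the article's upper estimate.
twitter_users = 10_000_000
china_mobile_users = 470_000_000

scale_factor = china_mobile_users / twitter_users
print(f"required scale-up: about {scale_factor:.0f}x")
```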





In any case, as the saying goes, stones from other hills can be used to polish jade. That is the purpose of studying Twitter's system architecture, and especially its flood control mechanism.








Figure 7. Twitter internal flows

Courtesy http://farm3.static.flickr.com/2766/4095392354_66bd4bcc30_o.png





The following simple example walks through the internal flow of the Twitter site, to examine how the Twitter system realizes the three elements of flood control: "reservoir", "drainage", and "channel".





Suppose two authors post tweets on Twitter through their browsers, and one reader, also through a browser, visits the site to read what they wrote.





1. The author's browser establishes a connection to the website, and the Apache web server assigns it a worker process. The author logs in; Twitter looks up the author's ID and stores it as a cookie in the header of the HTTP message.
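How an ID travels in an HTTP cookie header can be sketched with Python's standard `http.cookies` module. The cookie name `author_id` and both helper functions are assumptions made for illustration; the source does not say what Twitter's cookie actually looks like.

```python
from http.cookies import SimpleCookie

def make_login_header(author_id):
    """Build the response header that stores the author ID in the browser."""
    cookie = SimpleCookie()
    cookie["author_id"] = str(author_id)   # hypothetical cookie name
    cookie["author_id"]["path"] = "/"
    return cookie.output(header="Set-Cookie:")

def read_author_id(cookie_header):
    """Recover the author ID from the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return int(cookie["author_id"].value)
```

A real site would store a signed or opaque session token rather than a bare ID; the sketch only shows the header mechanics.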





2. The browser uploads the author's new message (tweet). Apache receives the tweet and forwards it, along with the author ID, to a Mongrel Rails server. The Apache process then enters an empty loop, waiting for Mongrel to update the author's homepage with the new tweet.





3. Mongrel receives the tweet, assigns it an ID, and caches the tweet ID together with the author ID on the vector memcached server.





At the same time, Mongrel asks vector memcached which readers follow the author. If vector memcached has not cached this information, it automatically fetches the result from the MySQL database and caches it for future use. The reader IDs are then returned to Mongrel.





Then Mongrel caches the tweet ID and the tweet text on the row memcached server.





4. Mongrel notifies the Kestrel message queue server to open a queue for each author and reader, where the queue name encodes the user ID. If these queues already exist on the Kestrel server, they are reused.





For each tweet, Mongrel already knows from vector memcached which readers follow its author. Mongrel puts the tweet's ID into each of those readers' queues, and into the author's own queue.
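The fan-out in step 4 can be sketched as follows. In-memory deques stand in for Kestrel queues, and the `timeline_<id>` naming scheme is an assumption; the source only says that the queue name implies the user ID.

```python
from collections import defaultdict, deque

# Queue name -> pending tweet IDs. A queue is created on first use
# and reused afterwards, mirroring the behaviour described above.
queues = defaultdict(deque)

def queue_name(user_id):
    return f"timeline_{user_id}"   # hypothetical naming scheme

def fan_out(tweet_id, author_id, follower_ids):
    """Put the tweet ID (not the text) into the author's and each follower's queue."""
    for user_id in [author_id, *follower_ids]:
        queues[queue_name(user_id)].append(tweet_id)
```

Only the small tweet ID is copied into each queue; the text itself stays in the row cache, which keeps the control messages short.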





5. The same Mongrel server, or another one, processes the messages in a Kestrel queue. Before processing, it parses the corresponding user ID from the queue name; the user may be either a reader or an author.





Mongrel then takes messages from the Kestrel queue one by one, parses the tweet ID contained in each message, and looks up the corresponding tweet text in the row memcached cache.





At this point Mongrel has both the user ID and the tweet text. It then updates the user's homepage, adding the text of the tweet.
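The consumer side in step 5 can be sketched in the same style. The data below is sample data, and the queue naming scheme and helper names are hypothetical; a dict of lists stands in for the rendered homepages.

```python
from collections import deque

row_cache = {101: "hello", 102: "world"}        # tweet ID -> tweet text
queues = {"timeline_7": deque([101, 102])}      # queue name -> tweet IDs
homepages = {}                                  # user ID -> tweet texts shown

def user_id_from(queue_name):
    """Parse the user ID out of a queue name such as 'timeline_7'."""
    return int(queue_name.rsplit("_", 1)[1])

def drain(queue_name):
    """Process every pending message in one user's queue."""
    user_id = user_id_from(queue_name)
    pending = queues[queue_name]
    while pending:
        tweet_id = pending.popleft()
        # Resolve the ID to the text via the row cache, then update the page.
        homepages.setdefault(user_id, []).append(row_cache[tweet_id])
    return user_id
```

Because the queue name carries the user ID and each message carries only a tweet ID, any worker can process any queue without shared state beyond the caches.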





6. Mongrel hands the updated author's homepage to the Apache process waiting in its empty loop. That process pushes the author's homepage to the author's browser.





If the reader's browser has previously logged in to Twitter and established a connection, Apache has also assigned a process to that reader, and that process is likewise waiting in an empty loop. Mongrel hands the updated reader's homepage to that process, which sends it to the reader's browser.
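The waiting process in step 6 can be modeled as a connection that blocks until a page is delivered. This sketch uses `threading.Event` from Python's standard library rather than a literal busy loop; the `ParkedConnection` class is hypothetical and is not how Apache is implemented.

```python
import threading

class ParkedConnection:
    """Models a process that holds a browser connection open until content arrives."""

    def __init__(self):
        self._ready = threading.Event()
        self._page = None

    def deliver(self, page):
        """Called by the backend when the updated homepage is ready."""
        self._page = page
        self._ready.set()

    def wait_for_page(self, timeout=1.0):
        """Block until a page is delivered, or return None on timeout."""
        if self._ready.wait(timeout):
            return self._page
        return None

conn = ParkedConnection()
sent = []  # what the waiting side finally pushes to the browser

waiter = threading.Thread(target=lambda: sent.append(conn.wait_for_page()))
waiter.start()
conn.deliver("<html>updated homepage</html>")
waiter.join()
```

The waiting side consumes no backend capacity while parked, which is what lets the front end accept connections faster than the backend can serve them.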





At first glance, the process seems uncomplicated. Where do the three elements of flood control, "reservoir", "drainage", and "channel", show up in it? What exactly is clever about Twitter's design? These questions deserve close scrutiny.





References





[26] Twitter user statistics by Purewire, June 2009. (http://www.nickburcher.com/2009/06/twitter-user-statistics-purewire-report.html)


[27] As of March 2009, China Mobile had reached 470 million users. (http://it.sohu.com/20090326/n263018002.shtml)


[28] China Mobile Fetion website. (http://www.fetion.com.cn/)


[29] China Mobile 139 Shuoke website. (http://www.139.com/)

