This article is mainly theoretical. We suggest pairing it with the related reading, a practical and useful study of the website architecture of Flickr, the large photo-sharing site.
Learning how large websites are built means collecting scattered articles and organizing fragmented material. The work is worthwhile, but it is also difficult. In our experience, a good way in is to pick a few of the topics below, analyze how specific large websites handle them, and then compare the sites side by side.
1. Database
Data storage has always been troublesome, especially at large volumes: often a single database is not enough, and sometimes even a database cluster is not enough. A common solution is partitioning (sharding): split the data into chunks, for example by user ID, and store each chunk in a separate database. The drawback is that join operations across chunks become less efficient.
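The split by user ID can be sketched as follows. This is a minimal illustration, not any particular site's scheme: the shard count, the `photos` table, and the helper names are all assumptions, and in-memory SQLite databases stand in for separate database servers.

```python
import sqlite3

NUM_SHARDS = 4  # assumed shard count, chosen only for illustration

# One in-memory SQLite database stands in for each separate database server.
shards = [sqlite3.connect(":memory:") for _ in range(NUM_SHARDS)]
for db in shards:
    db.execute("CREATE TABLE photos (user_id INTEGER, title TEXT)")


def shard_for(user_id):
    """Pick the database that owns all rows for this user."""
    return shards[user_id % NUM_SHARDS]


def add_photo(user_id, title):
    shard_for(user_id).execute(
        "INSERT INTO photos (user_id, title) VALUES (?, ?)", (user_id, title)
    )


def photos_of(user_id):
    """Single-user query: touches exactly one shard."""
    rows = shard_for(user_id).execute(
        "SELECT title FROM photos WHERE user_id = ? ORDER BY title", (user_id,)
    )
    return [title for (title,) in rows]


def count_all_photos():
    """Cross-user query: every shard is visited and results merged in code."""
    return sum(
        db.execute("SELECT COUNT(*) FROM photos").fetchone()[0] for db in shards
    )
```

Queries for one user stay cheap because they hit a single database. Anything that spans users, such as `count_all_photos` or a join between two users' rows, has to be assembled in application code, which is the efficiency cost of splitting.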
How efficient is Google Bigtable? What are its benefits and drawbacks? What kind of workload suits it best? And how efficient is the open-source software built on Bigtable's principles, Hadoop/HBase?
2. Cache
When users visit a website, they usually read far more often than they write. To speed up reads, it helps to cache frequently requested content in memory, which reduces disk I/O.
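The idea of serving repeated reads from memory can be sketched with a small in-process cache. This is an illustration only: the class name, the `load_from_disk` callback, and the expiry time are assumptions, and a production site would more likely use a dedicated caching service than a Python dictionary.

```python
import time


class ReadThroughCache:
    """Keep recently read values in memory; fall back to the slow store on a miss."""

    def __init__(self, load_from_disk, ttl_seconds=60.0):
        self._load = load_from_disk   # hypothetical slow lookup (disk or database)
        self._ttl = ttl_seconds       # how long an entry stays valid
        self._entries = {}            # key -> (value, expiry time)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]
        # Miss or expired entry: read from the slow store and remember the result.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Call after a write so the next read fetches fresh data."""
        self._entries.pop(key, None)
```

Because reads dominate, most calls to `get` return from memory. Writes are the awkward part: each one must invalidate or refresh the cached entry, otherwise readers see stale content until it expires.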
Memcached has become very popular recently: Wikipedia, YouTube, Digg, Twitter, and many other large websites use it as a caching tool. Tools such as Squid and Varnish are also used for caching. Twitter's approach is to combine Memcached and Varnish and use them together. Which caching tool suits which kind of content? How should different tools be coordinated? What lessons have the major websites learned from running them in production?
3. File System
Some content, such as logs and images, neither needs to be stored in a database nor suits a cache. This is what the file system is for. When the amount of content is huge, a distributed file system is needed. Which scenarios does Google File System suit, and which does it not? Distributed file systems often need a locking mechanism so that concurrent reads and writes do not interfere with each other. What are the benefits of Chubby?
MogileFS is said to suit large numbers of small files, such as images, whereas Google File System suits a smaller number of large files. Is it possible to combine many small files into one large file and store that in Google File System? If so, how would the performance of MogileFS and Google File System compare?
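The idea of combining many small files into one large file can be sketched as an append-only blob plus an index of offsets. This is an illustration under stated assumptions: the `PackedStore` class is hypothetical, an in-memory `io.BytesIO` stands in for the large file, and nothing here reflects how Google File System or MogileFS is actually implemented.

```python
import io


class PackedStore:
    """Append small blobs to one large file-like object and index them by name."""

    def __init__(self, big_file=None):
        # Any seekable binary file works; BytesIO keeps the sketch self-contained.
        self._file = big_file if big_file is not None else io.BytesIO()
        self._index = {}  # name -> (offset, length)

    def put(self, name, data):
        self._file.seek(0, io.SEEK_END)   # always append at the end
        offset = self._file.tell()
        self._file.write(data)
        self._index[name] = (offset, len(data))

    def get(self, name):
        offset, length = self._index[name]
        self._file.seek(offset)
        return self._file.read(length)
```

The large file is now what the underlying file system sees, while the index answers lookups for individual images. The open questions in practice are where the index lives and how deleted or replaced files are reclaimed, which this sketch does not address.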
4. Thread Management
An operation usually consists of several tasks. In the multithreaded approach, a single thread is responsible for carrying one operation through from start to finish. The alternative is to cut the operation into stages, with one or several threads responsible for each stage; this approach is called the workbench.
The multithreaded approach is the more common one. The workbench approach, however, makes it easier to concentrate computing resources on the heavy tasks and so avoid bottlenecks. Its drawback is that intermediate state must be recorded and passed between threads. When is multithreading the better fit, and when the workbench?
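The two approaches can be contrasted in a short sketch. The tasks (`resize`, `watermark`) and function names are made up for illustration, and the second function is one reading of the workbench idea as a staged pipeline connected by queues.

```python
import queue
import threading


def resize(photo):
    return photo + ":resized"


def watermark(photo):
    return photo + ":watermarked"


def one_thread_per_operation(photos):
    """Multithreaded approach: each thread carries one photo through every task."""
    results = [None] * len(photos)

    def run(i, photo):
        results[i] = watermark(resize(photo))

    threads = [threading.Thread(target=run, args=(i, p)) for i, p in enumerate(photos)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


STOP = object()  # sentinel that tells a stage to shut down


def staged_pipeline(photos):
    """Workbench approach: one thread per stage, intermediate state passed via queues."""
    to_resize, to_watermark, done = queue.Queue(), queue.Queue(), queue.Queue()

    def stage(task, inbox, outbox):
        while True:
            item = inbox.get()
            if item is STOP:
                outbox.put(STOP)  # propagate shutdown to the next stage
                return
            outbox.put(task(item))

    workers = [
        threading.Thread(target=stage, args=(resize, to_resize, to_watermark)),
        threading.Thread(target=stage, args=(watermark, to_watermark, done)),
    ]
    for w in workers:
        w.start()
    for p in photos:
        to_resize.put(p)
    to_resize.put(STOP)

    results = []
    while True:
        item = done.get()
        if item is STOP:
            break
        results.append(item)
    for w in workers:
        w.join()
    return results
```

In the staged version, a heavy stage could be given more threads than a light one, which is the resource-concentration advantage. The queues are the cost: every intermediate result has to be handed from one thread to the next.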
5. Scheduler
A single site usually provides a variety of services, and different services invoke different business logic. Some business logic can run on a single server, but complex logic requires several servers to cooperate. Services differ in audience and in traffic; traffic varies over time, and at any given time it differs from one service to another. Computing resources therefore need to be allocated dynamically. This is the scheduler's job.
When the scheduler assigns work to a server, the simplest way is to start a program that is preinstalled on that server. Since no program is guaranteed to be free of bugs, a failing program should not be allowed to bring down the whole server and disrupt other work. Is it necessary to use virtual machines to isolate jobs from one another?
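A lighter form of isolation than a virtual machine is to run each job in its own operating-system process, so that a crash or hang stays contained. The sketch below assumes jobs are given as snippets of Python code and uses a hypothetical `run_job` helper; it illustrates the containment idea only and does not settle whether virtual machines are needed.

```python
import subprocess
import sys


def run_job(code, timeout_seconds=5.0):
    """Run a job in a child Python process and return (ok, output).

    A crash or hang in the job cannot take down the calling scheduler process.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # The child is killed when the timeout expires.
        return False, "timed out"
    if proc.returncode != 0:
        return False, proc.stderr.strip()
    return True, proc.stdout.strip()
```

Process isolation protects the scheduler from a failing job, but jobs still share the machine's memory, disk, and CPU. Stronger separation of those resources is where virtual machines or similar mechanisms come in.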
6. Signal and Data Flow
The back end of a large website often consists of many servers that exchange data from time to time. For example, a web server parses a user request and forwards it to an app server; that app server completes part of the work and forwards the intermediate data to a second app server. When the second app server finishes its task, the whole job is done and the result should be returned to the web server.
The questions are how the first app server knows to hand its intermediate results to the second app server, and how the second app server knows that its destination is the web server. A more efficient approach is to separate data flow from control flow. A permanent channel between servers is reserved for control flow and carries the instructions that direct the data streams. Data flow does not occupy the control channel; a data channel is established only when needed.
How control flow and data flow are organized should be designed around the specific business logic, so as to reduce bandwidth consumption and shorten data transfer time.
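The separation of control flow from data flow can be sketched in a single process. Everything here is hypothetical: the server names, the `plan_route` function, and the two string operations standing in for real work. Small routing instructions play the role of control flow, and the payload moves only when a server hands it to the next hop.

```python
def plan_route(request_id):
    """Control flow: small instructions telling each server where to send its output."""
    return {
        "app1": {"request_id": request_id, "send_to": "app2"},
        "app2": {"request_id": request_id, "send_to": "web"},
    }


def handle_request(text):
    """Data flow: the payload moves hop by hop, following the control instructions."""
    handlers = {"app1": str.strip, "app2": str.upper}  # stand-ins for real work
    route = plan_route(request_id=1)

    hops = []
    server, payload = "app1", text
    while server != "web":
        payload = handlers[server](payload)      # this server's share of the work
        next_server = route[server]["send_to"]   # look up the instruction
        hops.append((server, next_server))       # a data channel opened on demand
        server = next_server
    return payload, hops
```

Neither app server has its destination hard-coded; each learns it from the instruction it was sent. In a real deployment the instructions would travel over the permanent control channel and each entry in `hops` would be a short-lived data connection.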
7. Instrumentation
Is every part of the site's back end functioning normally? Where are the bottlenecks, and where is capacity sitting idle? Answering these questions requires real-time monitoring. Monitoring not only helps prevent the whole back end from collapsing, it also reveals patterns in how each part operates, which points to ways of optimizing the system.
The question is which monitoring tools provide valuable information while disturbing the running system as little as possible.
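One low-overhead way to find bottlenecks is to record how long each component takes. The sketch below is an assumption-laden illustration, not a recommendation of a specific tool: the `timed` decorator, the `timings` table, and the `slowest` helper are all hypothetical names.

```python
import time
from collections import defaultdict
from functools import wraps

timings = defaultdict(list)  # component name -> list of observed durations (seconds)


def timed(name):
    """Decorator that records the wall-clock duration of every call under `name`."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even if the call raises, so failures are not invisible.
                timings[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


def slowest(n=1):
    """Return the n component names with the highest mean latency."""
    means = {name: sum(values) / len(values) for name, values in timings.items()}
    return sorted(means, key=means.get, reverse=True)[:n]
```

The instrumentation costs two clock reads and a list append per call, which keeps the disturbance to the monitored code small. A real system would also need to ship these numbers somewhere and bound the memory they use.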
8. Anti-Abuse
A website usually serves all kinds of users. Most behave well, but a few may deliberately cause trouble. Without precautions designed in advance, a small number of malicious users acting freely can disrupt normal service for everyone else.
The question is how to prevent malicious behavior and how to stop it in time when it does occur.
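One common precaution is to limit how many requests each user may make in a given time window. The sketch below shows a sliding-window limiter; the class name and parameters are assumptions, and it covers only request flooding, which is one of many kinds of abuse.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per user within any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock                 # injectable so the limiter can be tested
        self._recent = defaultdict(deque)   # user id -> timestamps of recent requests

    def allow(self, user_id):
        now = self._clock()
        stamps = self._recent[user_id]
        # Forget requests that have fallen out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_requests:
            return False                    # over the limit: reject this request
        stamps.append(now)
        return True
```

Each user is tracked separately, so one user exceeding the limit does not affect anyone else. With several web servers, the counters would have to live in shared storage rather than in one process's memory.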
9. Exception Handling
Whatever assumptions are made, unexpected situations will always arise in actual operation. Sensitive words, for example, often appear without warning. The system architecture should therefore give site administrators the tools they need to deal with unexpected events.