To learn the architecture of large websites systematically, we need to pull together the scattered articles on the subject and organize what they cover. This is worthwhile but hard to do well. One workable approach is to take the topics below, analyze real large websites one at a time against each topic, and then compare them side by side.
1. Database
Data storage has always been troublesome. When a site has to store massive amounts of data, the capacity of a single database is often insufficient, and sometimes even a database cluster is not enough. A common solution is to partition (shard) the data. For example, you can divide the data into blocks by user ID and store each block in an independent database. The drawback is that partitioning makes join operations across blocks less efficient.
How efficient is Google Bigtable? What are its strengths and weaknesses, and which scenarios suit it best? How efficient is HBase, the open-source Bigtable-style store that runs on Hadoop?
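The partitioning idea can be sketched in a few lines. This is only an illustration under stated assumptions: the shard count, the hash choice, and the names `shard_for_user` and `ShardedStore` are hypothetical, and plain dictionaries stand in for independent databases.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, chosen only for illustration


def shard_for_user(user_id, num_shards=NUM_SHARDS):
    """Map a user ID to a shard index with a hash that is stable across runs."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Each dict stands in for one independent database."""

    def __init__(self, num_shards=NUM_SHARDS):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, user_id, record):
        self.shards[shard_for_user(user_id, len(self.shards))][user_id] = record

    def get(self, user_id):
        # A lookup by user ID touches exactly one shard.
        return self.shards[shard_for_user(user_id, len(self.shards))].get(user_id)

    def scan_all(self, predicate):
        # A query that is not keyed by user ID must visit every shard,
        # which is why joins become less efficient after partitioning.
        return [r for shard in self.shards for r in shard.values() if predicate(r)]
```

A lookup by user ID stays cheap, while `scan_all` shows the cost that cross-shard queries pay.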
2. Cache
Users of a website usually read far more often than they write. To speed up reads, you can cache frequently requested content in memory and reduce disk I/O.
Memcached has become popular recently. It is used as a caching tool by large websites such as Wikipedia, YouTube, Digg, and Twitter. Squid and Varnish are also caching tools, and Twitter combines memcached with Varnish. What kind of content should be cached? Which caching tools should be used? How should different tools be coordinated? What experience and lessons have various websites drawn from running them in production?
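One common way to apply this is the cache-aside pattern: check memory first and fall back to the slow store only on a miss. The sketch below assumes an in-process dictionary as the cache and a hypothetical `SlowStore` class in place of a disk-backed database; it is not how memcached or any particular site implements caching.

```python
import time


class SlowStore:
    """Stand-in for a disk-backed database; counts how often it is read."""

    def __init__(self, data):
        self._data = dict(data)
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self._data.get(key)


class CacheAside:
    """Read-through wrapper with a time-to-live per entry."""

    def __init__(self, store, ttl_seconds=60.0, clock=time.monotonic):
        self._store = store
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no disk I/O
        value = self._store.read(key)  # cache miss: go to the slow store
        self._cache[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        # Call this after a write so readers do not see stale content.
        self._cache.pop(key, None)
```

Repeated reads of the same key reach the slow store only once per TTL window, unless the key is invalidated after a write.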
3. File System
Some content is stored neither in the database nor in the cache, such as logs and images. This is where a file system is needed. When the volume of content is large, a distributed file system is required. Which scenarios does Google File System suit? Distributed file systems often need a lock mechanism so that concurrent reads and writes do not interfere with each other. What are the advantages of Chubby? Under what circumstances does it not apply?
It is said that MogileFS is better suited to storing large numbers of small files, such as images, while Google File System is better suited to a limited number of large files. Could many small files be merged into one large file and stored in Google File System? If so, how would the performance of MogileFS and Google File System compare?
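To make the idea of merging small files concrete, here is a minimal sketch that packs several small files into one blob and keeps an index of offsets. The functions `pack` and `read_packed` are hypothetical, the blob lives in memory, and nothing here reflects how Google File System or MogileFS actually store data.

```python
import io


def pack(files):
    """Concatenate small files into one blob.

    Returns (blob, index), where index maps name -> (offset, length).
    """
    buf = io.BytesIO()
    index = {}
    for name, content in files.items():
        index[name] = (buf.tell(), len(content))
        buf.write(content)
    return buf.getvalue(), index


def read_packed(blob, index, name):
    """Read one small file back out of the large blob using the index."""
    offset, length = index[name]
    return blob[offset:offset + length]
```

The trade-off is that the index becomes one more piece of state to store and keep consistent, and deleting or updating a single small file is no longer a simple operation.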
4. Thread Management
Handling a request usually involves several tasks. In the multithreaded approach, a single thread is responsible for the entire process from start to finish. Another approach is to cut the process into several segments, with one or more threads responsible for each segment. This approach is called the workbench.
Multithreading is the more common approach. The workbench approach, however, makes it easier to concentrate computing resources on the heavy tasks and avoid bottlenecks. Its drawback is that data recording the intermediate state must be passed between threads. In which situations is multithreading appropriate, and in which is the workbench?
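The workbench approach can be sketched as a pipeline of stages connected by queues. The sketch assumes one thread per stage for simplicity (the text allows several), the names `run_stage` and `run_pipeline` are hypothetical, and there is no error handling: an exception inside a stage would stall the pipeline.

```python
import queue
import threading

_STOP = object()  # sentinel that tells each stage to shut down


def run_stage(func, inbox, outbox):
    """One segment of the process: take an item, work on it, pass it on."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        # The intermediate state travels between threads through the queue.
        outbox.put(func(item))


def run_pipeline(stage_funcs, items):
    """Run items through the stages in order and collect the results."""
    queues = [queue.Queue() for _ in range(len(stage_funcs) + 1)]
    threads = [
        threading.Thread(target=run_stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(stage_funcs)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_STOP)

    results = []
    while True:
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results
```

For example, `run_pipeline([parse, render], requests)` would let one thread parse while another renders. A heavy stage could be given more threads, at the cost of the queue traffic the text describes.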
5. Scheduler
A website usually provides multiple services, and different services call different business logic. Some business logic can run on a single server, but complex logic needs several servers to cooperate. Services also differ in audience, traffic volume, and peak hours, and their shares of traffic differ even within the same period. Computing resources therefore need to be allocated dynamically. This is the job of the scheduler.
When the scheduler assigns jobs to servers, the simplest way is to start the relevant programs that were installed on each server in advance. Since no program can be guaranteed to be perfect, an error in one program must not crash the whole server and disrupt the other jobs running on it. Is it necessary to use virtual machines to isolate different jobs?
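One simple policy for dynamic allocation is to send each new job to the server that currently carries the least load. The sketch below is an assumption-laden illustration: `LeastLoadedScheduler` is a hypothetical name, job cost is a single number supplied by the caller, and job completion is not modeled.

```python
import heapq


class LeastLoadedScheduler:
    """Assign each job to the server with the smallest accumulated load."""

    def __init__(self, server_names):
        # Heap of (current_load, server_name); ties break by name.
        self._heap = [(0, name) for name in sorted(server_names)]
        heapq.heapify(self._heap)

    def assign(self, job_cost):
        """Pick the least-loaded server, charge it the job cost, return its name."""
        load, name = heapq.heappop(self._heap)
        heapq.heappush(self._heap, (load + job_cost, name))
        return name

    def loads(self):
        """Current load per server, for inspection."""
        return {name: load for load, name in self._heap}
```

A real scheduler would also need to release load when jobs finish and to account for servers that can only run certain services.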
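Virtual machines are one isolation option; a lighter one is to run each job in its own operating-system process so that a crash or hang stays contained. The sketch below shows only that lighter option, under the assumption that a job can be expressed as a snippet of Python code. The function `run_isolated` is hypothetical, and process isolation does not protect against everything a virtual machine does (for example, a job that exhausts memory or disk).

```python
import subprocess
import sys


def run_isolated(code, timeout_seconds=5.0):
    """Run a job in a child process and return (ok, output).

    A crash, non-zero exit, or timeout in the job is reported to the
    caller instead of taking down the calling program.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    if proc.returncode != 0:
        return False, proc.stderr.strip()
    return True, proc.stdout.strip()
```

The caller keeps running whether the job succeeds, raises, or hangs past the timeout.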
6. Control Flow and Data Flow
The back end of a large website is usually composed of multiple servers, and data is exchanged between servers from time to time. For example, after the web server parses a user request, it forwards the request to an app server. After completing some work, that app server forwards its intermediate data to a second app server. When the second app server completes its part, the whole task is finished and the result is returned to the web server.
The question is how the first app server knows to send its intermediate result to the second app server, and how the second app server knows that its destination is the web server. An efficient approach is to separate data flow from control flow. A permanent channel between servers is reserved for control flow and carries the commands that direct the transmission of data. The data flow does not occupy the control channel; a data channel is established only when needed.
Control flow and data flow must be organized around the specific business logic so that the design reduces bandwidth consumption and shortens data transmission time.
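One way to picture the separation is a small control message that carries the remaining route, while the payload travels through a separate data channel. Everything in this sketch is an assumption made for illustration: the names `ControlMessage`, `DataChannel`, and `dispatch` are hypothetical, and the servers are simulated as in-process functions rather than networked machines.

```python
from dataclasses import dataclass


@dataclass
class ControlMessage:
    """Small command sent on the permanent control channel."""
    task_id: str
    route: list    # remaining hops, e.g. ["app1", "app2", "web"]
    data_key: str  # where the payload can be fetched on the data channel


class DataChannel:
    """Stand-in for bulk transfer that is set up only when needed."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, payload):
        self._blobs[key] = payload

    def take(self, key):
        return self._blobs.pop(key)


def dispatch(msg, handlers, data):
    """Follow the route: each hop processes the payload, then forwards control."""
    visited = []
    while msg.route:
        hop = msg.route[0]
        payload = data.take(msg.data_key)             # data flow
        data.put(msg.data_key, handlers[hop](payload))
        visited.append(hop)
        # Control flow: the next hop learns its destination from the route.
        msg = ControlMessage(msg.task_id, msg.route[1:], msg.data_key)
    return visited, data.take(msg.data_key)
```

Because the route lives in the control message, neither app server needs hard-coded knowledge of where its output goes next.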
7. Instrumentation
Is every part of the back end running normally? Where are the bottlenecks, and where is capacity sitting idle? Answering these questions requires real-time monitoring. Monitoring not only helps prevent a crash of the entire back end in time, but also reveals the operating patterns of each part, which points to ways of optimizing the system.
The question is which monitoring tools provide valuable information while interfering as little as possible with the programs being monitored.
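As a small illustration of low-overhead instrumentation, the sketch below wraps a function so that each call records a count, an error count, and elapsed time. The decorator `monitored` and the `snapshot` helper are hypothetical names, the counters live in process memory, and the code is not thread-safe; a production setup would export these numbers to a dedicated monitoring system.

```python
import functools
import time
from collections import defaultdict

# name -> counters; kept in memory only for this sketch
_stats = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def monitored(name):
    """Decorator that records call count, error count, and total latency."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                _stats[name]["errors"] += 1
                raise  # monitoring must not swallow the error
            finally:
                entry = _stats[name]
                entry["calls"] += 1
                entry["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


def snapshot():
    """Copy of the current counters, safe to hand to a reporting tool."""
    return {name: dict(counters) for name, counters in _stats.items()}
```

Dividing `total_seconds` by `calls` gives an average latency per component, which is one simple way to spot bottlenecks and idle parts.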
8. Anti-abuse
Websites face all kinds of users. The vast majority are well behaved, but a small number may cause trouble deliberately. Without preventive measures designed in advance, the misbehavior of a few malicious users will keep other users from enjoying normal service.
The question is how to detect and stop malicious behavior in time.
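One widely used preventive measure is per-user rate limiting. The sketch below uses a sliding window; the class name `SlidingWindowLimiter` and its parameters are assumptions for illustration, and the state is kept in one process, whereas a large site would need to share it across servers.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per user within any window_seconds span."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # user_id -> timestamps of recent requests

    def allow(self, user_id):
        now = self._clock()
        hits = self._hits[user_id]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False  # over the limit: reject or throttle this request
        hits.append(now)
        return True
```

Each user is tracked independently, so a single abusive client is throttled without affecting anyone else.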
9. Exception Handling
No matter how carefully a system is planned, unexpected situations always arise in actual operation. For example, newly sensitive words often appear with no warning. When designing the system architecture, it is therefore necessary to give site administrators the tools they need to cope with emergencies.
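For the sensitive-word example, one such tool is a blocklist that administrators can change while the site keeps running. The sketch below is a minimal assumption-based illustration: the class `Blocklist` is hypothetical, matching is a plain case-insensitive substring test, and the list is held in one process.

```python
import threading


class Blocklist:
    """Set of blocked terms that can be updated at runtime without a restart."""

    def __init__(self, terms=()):
        self._lock = threading.Lock()
        self._terms = frozenset(t.lower() for t in terms)

    def add(self, term):
        with self._lock:
            # Replace the whole set so readers never see a half-updated state.
            self._terms = self._terms | {term.lower()}

    def remove(self, term):
        with self._lock:
            self._terms = self._terms - {term.lower()}

    def is_blocked(self, text):
        lowered = text.lower()
        terms = self._terms  # take one consistent snapshot
        return any(term in lowered for term in terms)
```

An administrator can call `add` as soon as a new term becomes a problem, and content checks pick up the change on their next call.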