Reading Notes on Large-Scale Web Services

  1. Guarantee scalability: horizontal scaling (adding servers) and vertical scaling (upgrading a server).
  2. Ensure redundancy: when one or more servers fail, the system must keep operating normally.
  3. With hundreds of servers, operations and maintenance (O&M) costs rise. O&M must track the purpose of each server and its running status (load, disk, security settings, whether a fault has occurred), and these tasks must be automated with monitoring software and server-management tools. For example: can a failed server be restarted automatically?
  4. Both the development method and the number of developers change. O&M and development are separated, and many people develop collaboratively. This requires standardizing the implementation scheme: a unified development language, shared libraries and frameworks, coding conventions, version management, source-code management, and bug tracking. It also raises the questions of how to add manpower while improving development efficiency, how to enforce the development standards, how to train new hires, and whether staff can grow along with the company's development.
  5. Memory is vastly faster than the hard disk: random access to memory takes on the order of 100 nanoseconds, while a disk seek takes milliseconds, a gap of roughly 10^5.
  6. The bottleneck is generally the CPU or I/O. A database's bottleneck is usually I/O; an application server's bottleneck is usually the CPU. Locate the bottleneck first: for an I/O bottleneck, add memory; for a CPU bottleneck, upgrade the server or add more servers.
  7. While the scale is still small, the simplest approach works best, and premature optimization is not the answer: building a complete load-balancing solution from the very beginning costs too much, but ignoring the question entirely is also wrong. A balance point must be found.
  8. Communication becomes a problem when there are many employees; keep communication records for future reference.
  9. If the data volume is too large to be processed in memory, processing is slow; if memory is large enough to hold the data, processing is very fast.
  10. Remember: every decision must be backed by evaluation data. Let the data speak; do not take things for granted.
  11. Bottlenecks in the overall picture include the CPU, I/O, network bandwidth, network cabling, the data center, overall server configuration, and network distribution.
  12. The difficulty of expansion lies in database expansion. Moreover, the bottleneck there is often I/O, which is hard to solve.
  13. I/O can often be served from memory: the operating system caches disk pages (the page cache). When system memory is sufficient, pages stay cached, so newly added memory is not consumed immediately but improves access speed. If a running machine is restarted, the cache is gone; if many requests then arrive at once, the machine cannot respond quickly, must go to disk for every computation, and the resulting pile-up of blocked requests can crash the system. A newly installed server must therefore be preheated so that its working set is served from cache before it is pulled into the cluster; only then can it withstand the cluster's load. Otherwise it will be crushed.
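The preheating step above can be as simple as reading the hot data files once so the OS page cache is populated before the server joins the cluster. A minimal sketch (the file paths are hypothetical; a real warm-up would read whatever the database or application actually serves):

```python
def warm_page_cache(paths, chunk_size=1 << 20):
    """Read files sequentially so the OS page cache holds them
    before the server starts taking cluster traffic."""
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)  # each read pulls pages into the cache
                if not chunk:
                    break
                total += len(chunk)
    return total  # bytes touched, as a rough progress measure

# Hypothetical usage before pulling the server into the cluster:
# warm_page_cache(["/var/db/messages.ibd", "/var/db/users.ibd"])
```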
  14. A large volume of updates can be buffered in cache and then written in batches; note that this risks losing any data not yet flushed.
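A minimal sketch of this write-behind batching idea (names are my own; the trade-off noted above applies: anything still buffered is lost if the process crashes):

```python
class WriteBehindBuffer:
    """Collect updates in memory and flush them in batches,
    turning many small writes into a few bulk writes."""

    def __init__(self, flush_fn, batch_size=100):
        self.flush_fn = flush_fn      # e.g. a bulk INSERT
        self.batch_size = batch_size
        self.pending = []

    def write(self, record):
        self.pending.append(record)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.flush_fn(self.pending)  # one bulk write instead of many
            self.pending = []

# Usage: batches of 3 land in `sink`; the final flush drains the remainder.
sink = []
buf = WriteBehindBuffer(sink.append, batch_size=3)
for i in range(7):
    buf.write(i)
buf.flush()
# sink == [[0, 1, 2], [3, 4, 5], [6]]
```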
  15. Methods for dispersing database I/O: splitting (splitting tables, or splitting into new databases). Split tables by business and by data; split out columns that hold large amounts of data; split by security requirements. The problem with table hashing and table mapping is that further expansion is troublesome. At the same time, keep some redundancy to reduce the number of join queries.
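A sketch of the hash-based table split mentioned above (shard count and table names are illustrative): a stable hash sends each key to a fixed split table, which also shows why later expansion is painful, since changing `N_SHARDS` remaps almost every row.

```python
import hashlib

N_SHARDS = 16  # reserve more shards than servers up front; remapping later is costly

def shard_for(user_id: str) -> int:
    """Stable hash -> shard number, so a row always lands in the same split table."""
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return int(digest, 16) % N_SHARDS

def table_name(user_id: str) -> str:
    # Hypothetical naming scheme: messages_00 .. messages_15
    return f"messages_{shard_for(user_id):02d}"
```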
  16. Separate requests by service or request source. For example, ordinary requests use one channel, the Baidu and Google crawlers use another, and everything else uses a third. Keep request classes apart.
  17. I/O bottlenecks that cannot be solved by adding memory call for a distributed solution. Go distributed only when a single machine cannot solve the problem.
  18. Database field types matter for storage: for a table with hundreds of millions of rows, different field types produce a huge difference in final table size.
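The arithmetic behind this is simple. Taking MySQL's 4-byte INT versus 8-byte BIGINT as an example, over a hundred million rows the type choice alone changes the table size by hundreds of megabytes, before indexes repeat the column again:

```python
ROWS = 100_000_000               # hundreds of millions of rows
INT_BYTES, BIGINT_BYTES = 4, 8   # MySQL INT vs BIGINT column widths
diff_gb = (BIGINT_BYTES - INT_BYTES) * ROWS / 1024**3
# One oversized column costs ~0.37 GB extra at this row count.
```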
  19. Indexes: B-trees, B+ trees, and binary trees. Without an index, a lookup is a linear scan over the table, O(n); with an index (binary search down a B-tree), it is O(log n). With 40 million rows, the worst case without an index is 40 million comparisons; with a B-tree, only about 25.25 (log2 of 40 million). The gap is enormous: the index reduces both the number of I/O operations and the CPU work. Related concepts: composite indexes, clustered indexes, and non-clustered indexes.
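The 25.25 figure follows directly from the logarithm:

```python
import math

N = 40_000_000              # rows in the table
linear_worst = N            # no index: O(n) scan, up to 40,000,000 comparisons
btree_worst = math.log2(N)  # binary search: O(log n), about 25.25 comparisons
```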
  20. Database expansion: master/slave, with one master and multiple slaves (groups of at least four: the master takes writes, the slaves take reads, and if one slave is damaged the group can still be used for backup and restoration; when splitting, grow in groups of four, from 4 to 8 to 12 to 16); multiple masters and multiple slaves; and, when traffic is very high, hierarchical distribution of multiple masters and slaves. Also split tables with hash mapping, or create a hash-mapping table; during early planning, reserve more hash buckets than needed, because hash-based splits are difficult to expand later.
  21. Sorted data can be stored compactly as deltas. For example, [3, 5, 20, 21, 23, 76, 78] can be stored as [3, 2, 15, 1, 2, 53, 2]; each value is recovered by accumulating the preceding numbers.
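The encoding and its inverse in a few lines; the deltas are small numbers, which take fewer bits and compress better:

```python
from itertools import accumulate

def delta_encode(sorted_vals):
    """Store the first value, then each successive difference."""
    return [sorted_vals[0]] + [b - a for a, b in zip(sorted_vals, sorted_vals[1:])]

def delta_decode(deltas):
    """Recover the original values by accumulating the deltas."""
    return list(accumulate(deltas))

vals = [3, 5, 20, 21, 23, 76, 78]
enc = delta_encode(vals)          # [3, 2, 15, 1, 2, 53, 2]
assert delta_decode(enc) == vals  # lossless round trip
```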
  22. Algorithm complexity: O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(n^3) < ... < O(n^k) < O(2^n). O(1) means the running time is a constant, independent of the input size; it is the best case.
  23. No comparison-based sort, however optimized, can beat O(n log n); that is the fastest a comparison sorting algorithm can be.
  24. Time complexity versus space complexity: sometimes space is sacrificed for speed; sometimes, when space is insufficient, time is sacrificed so the algorithm can still complete.
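Memoization is the classic example of trading space for time: spending memory on a cache of results turns the exponential naive Fibonacci recursion into a linear-time computation.

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # spend memory on a result cache...
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)

fib(200)  # ...so each value is computed once: linear, not exponential
```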
  25. Data compression: Gzip compression should generally be enabled on servers.
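A quick illustration of why this matters for web responses: HTML is highly repetitive, so gzip typically shrinks it dramatically (the sample markup here is made up):

```python
import gzip

html = b"<html>" + b"<p>hello</p>" * 1000 + b"</html>"
compressed = gzip.compress(html)
ratio = len(compressed) / len(html)
# Repetitive markup compresses heavily; ratio is well under 0.1 here,
# i.e. more than a 10x saving in bytes on the wire.
```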
  26. The trie is a data structure that stores strings efficiently; its tree shape shares the common prefixes of the stored keys.
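A minimal trie sketch: words that share a prefix share the nodes for it, which both saves space and makes prefix queries cheap.

```python
class TrieNode:
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}   # char -> TrieNode; shared by all words with this prefix
        self.is_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def contains(self, word: str) -> bool:
        node = self._walk(word)
        return node is not None and node.is_word

    def has_prefix(self, prefix: str) -> bool:
        return self._walk(prefix) is not None

    def _walk(self, s: str):
        node = self.root
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                return None
        return node
```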
  27. Three key points of web-service infrastructure: 1. low cost and high efficiency; 2. design for scalability and responsiveness; 3. web-service requirements change frequently, so the infrastructure must adapt flexibly and development speed must be emphasized. The web changes too fast; respond too slowly and you are eliminated.
  28. Cloud characteristics: low price and excellent scalability (its biggest advantage). Your workload must fit the cloud provider's rules; I/O performance is poor, memory expansion is limited, and the cloud service is not under your full control.
  29. Self-built infrastructure: 1. flexible hardware configuration; 2. flexible response to the service's needs; 3. bottlenecks can be controlled.
  30. The highest goal for a large-scale site is scalability at every layer.
  31. Server redundancy keeps services running continuously; otherwise a single crashed server can trigger an avalanche effect.
  32. Switch over automatically when a server fails; after the server is repaired, it can be pulled back into the cluster automatically.
  33. With multiple master databases, a third party is introduced to arbitrate, and its vote goes to the highly reliable machine. Suppose master A has a standby master B, with C as the arbiter: if both A and B run normally, C votes for A, making A the real master; if A fails, C votes for B, and B becomes the real master. Each master listens to C; whichever machine receives C's vote serves as the real master, and the other serves as its backup.
  34. Develop a file system that meets your company's requirements, using clusters of cheap machines as the storage.
  35. Divide physical machines into multiple virtual machines, so that each virtual machine has similar performance.
  36. Factors affecting system stability: functionality keeps growing, and a newly developed feature becomes very popular, causing a sudden load increase; memory leaks push the load up; a bug that appears only at a specific data volume is hard to troubleshoot; when the data volume grows too large, processing capacity runs out. User access patterns matter too: if an image is hot-linked by a large site such as Google, requests for that file surge and the load rises. Growing data volume destabilizes day-to-day operations. Heavy bulk updates can be handled by batch-update extensions. And some external services you depend on are unstable and break down suddenly.

Hardware also fails: memory, hard disk, and NIC faults.

  1. Eliminate sources of instability: reduce SQL load, eliminate memory leaks, and handle exceptions.
  2. Under a DoS attack, return 403 for requests from the offending IP address.
  3. Stability measure: allow memory and CPU usage to peak at no more than 70%, with typical usage around 25%.
  4. Automatically terminate time-consuming queries or operations.
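A minimal sketch of this deadline idea using a time budget on the caller's side. Note the caveat in the docstring: abandoning the future does not stop the worker thread, so terminating a real database query would additionally require a server-side kill (e.g. via the database's process list):

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def run_with_deadline(fn, seconds, *args):
    """Run fn with a time budget; give up (return None) if it overruns.
    The abandoned call may still run to completion in its thread, so a
    real setup would also cancel the work server-side."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(fn, *args).result(timeout=seconds)
    except TimeoutError:
        return None
    finally:
        pool.shutdown(wait=False)  # do not block waiting for the slow call
```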
  5. A physical machine may have idle CPU but saturated I/O; run a web-server virtual machine on it. If I/O is idle and the CPU is saturated, run a database-server virtual machine instead. This makes full use of the resources.
  6. Virtual machines remove physical restrictions and let resources be changed dynamically. Virtual machine operating systems are easy to migrate and copy, which makes it easy to add servers and ensures scalability.
  7. Virtual machines add overhead: CPU performance drops by a few percent, memory performance by about 10%, network performance by about 50%, and I/O performance by about 5%.
  8. Assembling cheap machine clusters yourself reduces costs; the book describes how to assemble machines yourself while still meeting performance requirements.
  9. One network boundary is 1 Gbps, roughly 300,000 packets per second (300 kpps), the limit of router forwarding performance.
  10. Another boundary is about 500 hosts per subnet. With roughly 500 machines the ARP table exceeds 800 entries; in practice it can hold about 900, beyond which heavy packet loss occurs.
  11. The next boundary is globalization: serving users worldwide exceeds what a single data center can do.
  12. Select a CDN vendor.
  13. If bandwidth exceeds 10 Gbps, you must obtain an AS (Autonomous System) number, connect to an IX (Internet Exchange) to exchange traffic, and use the Border Gateway Protocol (BGP). These are relatively advanced operations: a configuration error affects the wider network and can make your website unreachable from some regions.
  14. Desired properties of a web-service infrastructure: low cost, high scalability, redundancy, efficiency, network stability, and strong O&M.
  15. Job queue system: the client submits a job, the server stores it, and workers execute jobs continuously. Jobs are logged, and the logs are analyzed to measure program performance.
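A toy version of this pattern with Python's standard library: the client side enqueues callables, a worker thread drains the queue continuously, and a sentinel value shuts the worker down. (In a real system the queue would be a persistent server such as a database-backed queue, and each job's execution would be logged with timings.)

```python
import queue
import threading

jobs = queue.Queue()
results = []

def worker():
    """Server side: pull jobs off the queue and execute them continuously."""
    while True:
        job = jobs.get()
        if job is None:           # sentinel: stop the worker
            break
        results.append(job())     # a real system would log the job and its timing here
        jobs.task_done()

t = threading.Thread(target=worker)
t.start()
for n in (1, 2, 3):
    jobs.put(lambda n=n: n * n)   # client side: enqueue work
jobs.put(None)                    # tell the worker to stop
t.join()
# results == [1, 4, 9]
```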
  16. Basis for selecting a storage system: average data size, maximum size, rate of new data, update frequency, deletion frequency, and access frequency. Available systems: conventional relational databases, key-value stores (memcached), distributed file systems, and other storage systems such as NFS.
  17. Reverse proxying with Squid as the cache server. A common Squid problem: the more efficient the cache, the greater the impact when it fails. If one Squid instance dies, the others cannot withstand the transferred load, causing an avalanche.
  18. Large volumes of data are processed in parallel using the MapReduce computing model; Hadoop is an implementation of it.
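The MapReduce model in miniature, using word counting (the canonical example): the map step runs independently on each document (in parallel on a Hadoop cluster), and the shuffle/reduce step merges the intermediate (key, value) pairs.

```python
from collections import defaultdict
from itertools import chain

def map_phase(doc: str):
    """Map: emit a (word, 1) pair for every word; runs per document, in parallel."""
    return [(word, 1) for word in doc.split()]

def reduce_phase(pairs):
    """Reduce: sum the counts for each word across all documents."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["a b a", "b c"]
word_counts = reduce_phase(chain.from_iterable(map_phase(d) for d in docs))
# word_counts == {"a": 2, "b": 2, "c": 1}
```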
  19. All major companies have their own distributed file storage systems.
