Large-scale Web services development skills


Lesson 5: The difficulty of large-scale data processing - memory and disk

Bottleneck analysis on a single Linux server

1. Check the load average

Use the top or uptime command to check the load average.

* If the load average is low but system throughput still cannot be improved, check for software misconfiguration and for network or host failures.

* If the load average is high, check CPU usage and the I/O wait rate with sar or vmstat (see the sketch below).

2. Determine whether the bottleneck is CPU or I/O
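A minimal sketch of these checks from the shell (standard procps/sysstat commands; the sampling interval and count are arbitrary):

    # load average over the last 1/5/15 minutes
    uptime

    # CPU usage split into user/system/iowait, sampled every second, 5 samples
    sar -u 1 5

    # runnable/blocked processes, swap, block I/O and CPU, once per second
    vmstat 1 5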

> If the CPU load is too high:

1. Use sar or top to determine whether the bottleneck is in user programs or in system (kernel) processing.

2. Use ps to check process states and accumulated CPU time, and narrow down the offending process.

3. Once the process is identified, trace it with strace or profile it with oprofile (see the sketch below).

If a runaway program has been ruled out and disk and memory are in good shape, the CPU capacity itself is the limit: add servers, or improve the program's logic and algorithms.
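A small sketch of steps 2 and 3, where the PID 1234 stands in for the suspect process:

    # processes sorted by CPU usage (shows state and accumulated CPU time)
    ps aux --sort=-%cpu | head

    # attach to the suspect process and collect a system-call summary (stop with Ctrl-C);
    # oprofile (or a newer profiler) can then be used to find hot code paths
    strace -c -p 1234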

> If the I/O load is too high:

Either the program itself issues enough I/O requests to drive up the load, or page swapping is causing frequent disk access. Use sar or vmstat to check the swap state (a sketch of the commands follows this list).

If swapping is occurring:

1. Use ps to check whether some process is consuming an unusually large amount of memory.

2. If the excessive memory use comes from a defect in the program, fix the program.

3. If the machine simply lacks memory, add memory; if no more memory can be added, consider distributing the data across servers.

If swapping is not occurring (the cache may just be too small for the data):

1. If memory can be added, add memory to enlarge the cache.

2. If adding memory still does not solve the problem, consider distributing the data or adding cache servers.
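A minimal sketch of checking the swap and memory state (standard procps/sysstat commands):

    # si/so columns: pages swapped in/out per second; non-zero values mean swapping
    vmstat 1 5

    # swapping statistics (pswpin/s, pswpout/s)
    sar -W 1 5

    # memory and swap totals, including how much is used as cache
    free -m

    # processes sorted by resident memory size
    ps aux --sort=-rss | head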

Lesson 6: The key to scalability

Scaling for CPU load (simpler)

1. Add servers with the same configuration and spread requests over them with a load balancer (a sketch follows this list).

2. Examples: web application servers, web crawlers, etc.

Scaling for I/O load (more complex)

1. Usually involves the database.

2. Involves large-scale data.
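As a minimal sketch of dispersing requests with a load balancer, here is LVS configured via ipvsadm; the virtual IP and real-server addresses are placeholder assumptions:

    # define a virtual HTTP service on the VIP with round-robin scheduling
    ipvsadm -A -t 192.168.0.100:80 -s rr

    # register two identically configured web servers as real servers (NAT forwarding)
    ipvsadm -a -t 192.168.0.100:80 -r 192.168.0.11:80 -m
    ipvsadm -a -t 192.168.0.100:80 -r 192.168.0.12:80 -m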

Lesson 7: Fundamentals of handling large-scale data

Three key points for handling large-scale data (tips for writing programs):

* Do as much as possible in memory: this minimizes disk seeks, and it is also the basis for distributing data and exploiting locality.

* Use algorithms that can cope with growing data volumes, e.g. replacing linear search with binary (tree) search: O(n) -> O(log n) (see the worked example after this list).

* Where appropriate, make use of data compression and specialized search (indexing) techniques.
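To make the O(n) -> O(log n) difference concrete: with n = 10^9 records, a linear search may examine up to 10^9 entries, while a binary search needs at most about log2(10^9) ≈ 30 comparisons.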

Three prerequisites before dealing with large-scale data:

* The operating system's cache
* Operating the RDBMS with distribution in mind
* Algorithms and data structures suited to large-scale environments

Lesson 9: Strategies for reducing I/O load

Strategies for reducing I/O load, with caching as the premise:

* If the data is smaller than physical memory, cache all of it.
* If the data is larger than physical memory, consider compressing it (with compression, does decompressing on every read from the cache add too much computational load?).
* If the data is larger than physical memory, it can also be spread over multiple servers. To spread CPU load it is enough to add servers; to spread I/O load, locality has to be taken into account.
* Weigh the economic cost.

Linux page cache behavior (Linux uses free memory as page cache whenever possible):

1. Data is read from disk.
2. If the data is not yet cached and free memory is available,
3. a new cache page is created.
4. If there is no free memory left for the cache, an old cache page is evicted and replaced.
5. When a process needs to allocate memory, that allocation takes priority over the page cache.

Do not put a server into production right after it boots: building up the cache takes time, and without a warm cache, heavy traffic will hit the disk for every read and write and may even bring the server down. After startup, read the database files repeatedly with cat so that they are loaded into the cache first.

How does cat help? Reading a file with cat, even though the output is discarded, pulls its blocks through the page cache, so later reads are served from memory rather than disk.
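A minimal warm-up sketch along those lines; the data directory path and the MyISAM file extensions are placeholder assumptions:

    # read the database files once and throw the output away;
    # the side effect is that their pages now sit in the OS page cache
    cat /var/lib/mysql/mydb/*.MYD > /dev/null
    cat /var/lib/mysql/mydb/*.MYI > /dev/null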

Lesson 10: Distribution that exploits locality

"Locality" here means distributing data according to the access pattern.

My understanding: requests are diverted according to some business rule, so that each server only has to cache the data for its own portion of the rules. But which component performs the request assignment on the application side? Is it LVS?

The technique commonly used for locality-based distribution is partitioning: splitting one database across several machines.

* The simplest way to split is by table: for example, tables A and B go to machine 1 and tables C and D to machine 2. The split is chosen by matching table sizes against each machine's cache capacity. (Does this mean the tables on different machines must be only weakly related, with no need for joins across them? A sketch of the data movement for this kind of split follows this list.)

* Another method is to split the data within a single table, for example by the first letter of the ID: a-c on machine 1, d-f on machine 2.

* A third, more specialized method is to split the data into "islands" according to usage. For example, Hatena Bookmark separates requests by the HTTP User-Agent and URL: ordinary users go to island 1, certain API requests go to island 2, and crawlers such as Googlebot and Yahoo!'s bot go to island 3.

* Locality-based distribution requires corresponding changes in the application, and it has a drawback: if the partitioning granularity needs to change, the data has to be merged and then re-split, which is troublesome.
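A rough sketch of the data-movement side of a table-unit split, with database, table, and host names as placeholder assumptions (query routing itself has to be handled in the application or O/R mapper):

    # copy tables c and d to machine 2, then drop the local copies once the application
    # has been updated to send queries for those tables to db2
    mysqldump mydb c d | mysql -h db2 mydb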

Lesson 11: Proper use of indexes - the premise for distributing MySQL

* When designing tables that will hold large amounts of data, keep each record as compact as possible; even a small mistake in the table structure can inflate the data volume by gigabytes.

* Be careful with redundant columns. Keeping them in the table wastes storage space; moving them out to another table may save space (not necessarily - it has to be evaluated) but increases query complexity, so it is a trade-off between time and space.

* MySQL indexes use a variant of the B-tree. In a B-tree the fan-out can be tuned so that each node is about 4 KB, which means visiting a node costs at most one disk seek; a binary tree has a fixed fan-out of 2 and offers no such tuning.

* In theory, search in a B(+) tree is O(log n), while linear search is O(n).

Rules for MySQL indexing:

* Columns referenced in WHERE, ORDER BY, and GROUP BY clauses can make use of an index.

* When does an index exist? When it is added explicitly, or implicitly through a PRIMARY KEY or UNIQUE constraint.

* To have an index apply to several columns at once, a composite (multi-column) index is required.

* Use EXPLAIN to confirm that an index is actually being used (see the sketch below).
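A minimal sketch of adding a composite index and checking it with EXPLAIN from the shell; the database name and index name are placeholder assumptions, while the bookmark table and its uid/eid columns come from the join example later in these notes:

    # add a composite index over (uid, eid)
    mysql mydb -e "ALTER TABLE bookmark ADD INDEX idx_uid_eid (uid, eid)"

    # EXPLAIN should now show idx_uid_eid in the "key" column for this query
    mysql mydb -e "EXPLAIN SELECT eid FROM bookmark WHERE uid = 169848 ORDER BY eid\G"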

Lesson 12: Distributing MySQL - designing the system with scaling in mind

MySQL replication:

* Master/slave structure.
* Queries (reads) are sent to the slaves, updates (writes) to the master; the O/R mapper controls this routing.
* Put a load balancer such as LVS or MySQL Proxy in front of the slaves so that queries are spread over several servers (a sketch of checking the setup follows this list).
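A quick sketch of verifying the replication setup from the shell; the host names are placeholder assumptions:

    # on the master: current binary log file and position
    mysql -h db-master -e "SHOW MASTER STATUS\G"

    # on a slave: replication state and lag
    # (check Slave_IO_Running, Slave_SQL_Running and Seconds_Behind_Master)
    mysql -h db-slave1 -e "SHOW SLAVE STATUS\G"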

Characteristics of master/slave:

Queries (reads) scale easily: just add slave servers, and give each new slave enough memory beforehand.

The master cannot be scaled the same way. Since more than 90% of web application traffic is reads this is usually acceptable, but when writes do need to scale there are two approaches:

  • Partition the tables: spread the data files over different disks on the same machine, or over different machines.
  • Skip the RDBMS and use a key-value store instead, such as Tokyo Tyrant or Redis.

Lesson 13: MySQL scale-out and partitioning

Designing with partitioning as a premise:

For example, the entry table and the tag table have a one-to-many relationship. To fetch the bookmarks tagged "perl" you would normally join the two tables. But if the entry and tag tables sit on different machines, MySQL cannot join them (MySQL's FEDERATED tables can do it, with limitations); instead you first find the records carrying the "perl" tag and then look up the matching rows in the entry table by eid. Join queries can therefore only be used when you can guarantee that the joined tables will never be partitioned onto different machines.

Using WHERE ... IN (...) to avoid joins:

SELECT url FROM entry INNER JOIN bookmark ON entry.eid = bookmark.eid WHERE bookmark.uid = 169848 LIMIT 5;

is equivalent to the two queries

SELECT eid FROM bookmark WHERE uid = 169848 LIMIT 5;

SELECT url FROM entry WHERE eid IN (0, 4, 5, 6, 7);

Lesson 14: Special-purpose indexes - handling large-scale data

Question: what do you do when the data size exceeds what an RDBMS can handle?

Method: extract the data from the RDBMS with batch jobs, build a dedicated index server (or similar), and let the web application query that index server over RPC.
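A minimal sketch of the batch-extraction side; the database name and the indexer-import command are hypothetical placeholders (the entry table with eid and url columns is taken from the earlier join example), and the RPC service in front of the index is a separate component:

    # nightly batch: dump the records to be indexed as tab-separated text
    mysql --batch --skip-column-names mydb -e "SELECT eid, url FROM entry" > /tmp/entries.tsv

    # feed the dump into the (hypothetical) index server's import tool
    indexer-import /tmp/entries.tsv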

Lesson 30: Cloud vs. self-built infrastructure

Amazon EC2 (Elastic Compute Cloud) does not provide persistent storage by itself; storage is handled by S3 (Amazon Simple Storage Service), so you need a script that restores the database from S3 every time an instance restarts (a sketch follows the list of services below).

Amazon S3 (Simple Storage Service)

Google App Engine

Microsoft Windows Azure
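A hedged sketch of such a restore-on-boot script; the bucket, object key, and database name are placeholder assumptions, and the aws CLI stands in for whatever S3 client is available:

    #!/bin/sh
    # on instance start-up, stream the latest dump from S3 straight into MySQL
    aws s3 cp s3://my-backup-bucket/latest/backup.sql.gz - | gunzip | mysql mydb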

Lesson 31: Layers and scalability

A single server with roughly a 4-core CPU and 8 GB of memory can handle on the order of 1,000,000 to 2,000,000 page views (PV) per month.

Scalability of each layer:

Application servers: identical configuration, hold no state, easy to scale out.

Data sources (database servers, file servers): distributing reads is easy, distributing writes is hard.

Lesson 33: Ensuring redundancy

Application servers:

Add more servers.

Use the load balancer for failover and failback.

Database servers:

Multi-master: this is currently the main way MySQL servers are set up. In this architecture there are usually two servers in an active/standby pair: one is active, the other standby, and writes normally go only to the active one. If the active server goes down, the standby detects this through the VRRP protocol, promotes itself to active, and becomes the new master. The failed machine is repaired by hand and becomes the new standby (or the original roles are restored). So that the outside world can tell which server is active, a virtual IP (VIP) is used: in addition to its real IP address, the active server is assigned a service VIP, and the application servers always connect to that VIP. On failover the VIP is reassigned to the new active server, which makes the master switch transparent.
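A minimal sketch of what the VIP arrangement looks like from the shell; the interface name, addresses, and MySQL account are placeholder assumptions:

    # on the current active server, the service VIP appears as an extra address on the NIC
    ip addr show dev eth0

    # application servers always connect through the VIP; when VRRP (e.g. keepalived)
    # moves the VIP to the standby, this connection string does not change
    mysql -h 192.168.0.100 -u app -p -e "SELECT 1"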

Lesson 38: Network demarcation points

Limit of a PC router: around 1 Gbps, i.e. about 300,000 packets per second (300 kpps); at roughly 300 bytes per packet that is 300,000 x 300 bytes x 8 bits ≈ 720 Mbps, on the order of 1 Gbps.

The limit of a subnet: 500 hosts

A single data center cannot by itself serve a global audience.

CDN (Content Delivery Network): the basic idea is to place servers around the world; once media files are cached on them, users can download from the nearest server.

Example: Amazon CloudFront
