Web Services Architecture


Server Partitioning

For sites with heavy traffic, it is necessary to split the parts of the site onto separate servers, for example serving images separately from the web application. In general, a site is deployed across several types of servers:

File server: general storage for the images and files used across the system, providing unified file access for every subsystem

Proxy server: generally Linux + Nginx acting as a reverse proxy (a configuration sketch follows this list)

Web server: in .NET the most commonly used web server is IIS; Mono deployments generally use Nginx

Application server: provides the various pieces of business logic in the system, such as the user center, settlement center, payment center, etc.

Cache server: provides the memcached caching service

Database server: stores the site's data, generally SQL Server, MySQL, Oracle, etc.
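A minimal sketch of the proxy-server role described above: Nginx as a reverse proxy that sends image requests to the file server and everything else to the web server. The domain and internal hostnames are assumptions for illustration only:

```nginx
# nginx.conf (sketch) -- reverse proxy that separates images from the web application
server {
    listen 80;
    server_name www.example.com;                  # hypothetical public domain

    location /images/ {
        proxy_pass http://img.example.internal;   # hypothetical file/image server
    }

    location / {
        proxy_pass http://web.example.internal;   # hypothetical web server (IIS, or Nginx + Mono)
    }
}
```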

Calculation of bandwidth

Assume the site receives 1 million PV (page views) per day. Estimating the required bandwidth involves two assumptions (peak traffic and average page size), with bandwidth expressed in bps (bits per second):

1. Assume the peak traffic is 5 times the average traffic;

2. Assume the average page size per visit is about 100 KB.

1 B = 8 b, so 1 B/s = 8 b/s (bps means bits per second)

1 KB = 1024 B, so 1 KB/s = 1024 B/s

1 MB = 1024 KB, so 1 MB/s = 1024 KB/s

Spreading 1 million PV evenly over a day gives roughly 12 requests per second (1,000,000 / 86,400 ≈ 11.6). With an average page size of 100 KB, the average throughput is 12 × 100 KB = 1200 KB/s. Since 1 byte = 8 bits, that is 1200 × 8 = 9600 Kb/s, and 9600 / 1024 ≈ 9 Mb/s (9 Mbps). For the site to remain accessible at peak traffic, the real bandwidth should be around 9 Mbps × 5 = 45 Mbps.
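A minimal sketch of the same calculation, using the assumptions above (1 million PV per day, 100 KB average page, peak traffic 5 times the average):

```python
def required_bandwidth_mbps(pv_per_day, avg_page_kb, peak_factor):
    """Estimate the peak bandwidth (in Mbps) a site needs."""
    requests_per_second = pv_per_day / 86400            # seconds in a day
    avg_kbytes_per_second = requests_per_second * avg_page_kb
    avg_kbits_per_second = avg_kbytes_per_second * 8    # 1 byte = 8 bits
    avg_mbps = avg_kbits_per_second / 1024
    return avg_mbps * peak_factor

# 1,000,000 PV/day, 100 KB pages, peak = 5x average -> roughly 45 Mbps
print(round(required_bandwidth_mbps(1_000_000, 100, 5), 1))
```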

Website architecture evolution, part one: a basic site

When a company has just started and the business volume is small, it is often enough to lease virtual hosting space and a database from a hosting provider to build a basic site.

Website architecture evolution, part two: adding caching

As the business volume grows and user traffic increases, pages open more and more slowly and the number of database connections can even hit its maximum limit. At this point some optimization strategies are needed:

    • Reduce HTTP requests; compress CSS, JS, and image sizes
    • Integrate Microsoft Ajax Minifier into the VS2010 build to compress JS and CSS at compile time
    • Add page caching and data caching (see the cache-aside sketch after this list)
    • A full analysis of caching on Cnblogs
    • Purchase your own servers and host them in an IDC
    • Self-purchased servers let you raise the hardware level and control the bandwidth freely; it is generally dedicated bandwidth, which supports more traffic than shared bandwidth
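A minimal cache-aside sketch for the data-caching step. The cache here is an in-process dict with a TTL purely for illustration; in the architecture above it would be a memcached or Redis client, and load_page_from_db stands in for a real database query:

```python
import time

_cache = {}               # illustration only; a real deployment would use memcached/Redis
CACHE_TTL_SECONDS = 60

def load_page_from_db(page_id):
    # stand-in for an expensive database query
    return f"<html>page {page_id}</html>"

def get_page(page_id):
    """Cache-aside: try the cache first, fall back to the database, then populate the cache."""
    entry = _cache.get(page_id)
    if entry is not None:
        value, expires_at = entry
        if time.time() < expires_at:
            return value                       # cache hit
    value = load_page_from_db(page_id)         # cache miss: query the database
    _cache[page_id] = (value, time.time() + CACHE_TTL_SECONDS)
    return value

print(get_page(42))   # first call hits the database
print(get_page(42))   # second call is served from the cache
```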

Website architecture evolution, part three: adding web servers

When the number of visits increases again, the pressure on the web server at peak times rises to a higher level. At this point it is time to add another web server, but adding a web server means setting up the same site on two machines, which raises the following problems:

How do we distribute requests across the two machines? With Nginx load balancing (see the configuration sketch after these questions).

How do we keep state information such as user sessions synchronized?

The usual options are writing sessions to the database, enabling a state server, using cookies, writing them to a cache, and so on.

How do we keep cached data in sync?

With a shared cache server.

How do uploads and similar file-based features continue to work normally?

With unified management on a file server.
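A minimal Nginx load-balancing sketch for the two-web-server setup. The backend addresses are hypothetical; ip_hash is one simple way to keep a client on the same backend while sessions are still stored locally, and it can be removed once sessions move to a shared store:

```nginx
# nginx.conf (sketch) -- distribute requests across two identical web servers
upstream web_servers {
    ip_hash;                      # sticky by client IP; drop once sessions live in a shared cache/DB
    server 192.168.1.11:80;       # web server 1 (hypothetical address)
    server 192.168.1.12:80;       # web server 2 (hypothetical address)
}

server {
    listen 80;
    server_name www.example.com;  # hypothetical domain

    location / {
        proxy_pass http://web_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```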

Website architecture evolution, part four: splitting databases, splitting tables, and distributed caching

After enjoying the fast access gained by adding web servers, the system starts to slow down again. On investigation it turns out that competition for database connection resources during writes and updates is fierce, which is what slows the system down. What to do?

Split the database into multiple databases (sub-library)

Split large tables into multiple tables (sub-table; a sharding sketch follows the comparison below)

Use a Memcached or Redis distributed cache

Horizontal partitioning vs. vertical partitioning

Horizontal partitioning:
    • Storage dependencies: can span databases and physical machines
    • Storage mode: distributed
    • Scalability: scale out (add inexpensive machines)
    • Availability: no single point of failure
    • Price: low
    • Application scenarios: Web 2.0

Vertical partitioning:
    • Storage dependencies: can span tablespaces with different physical properties, but cannot span databases
    • Storage mode: centralized
    • Scalability: scale up (upgrade the hardware)
    • Availability: a single point exists (the database data itself)
    • Price: moderate, even expensive
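A minimal horizontal-partitioning (sharding) sketch: route a user's data to one of several databases and tables by its user id. The database and table counts and the naming scheme are assumptions for illustration only:

```python
DB_COUNT = 2       # number of physical databases (assumed)
TABLE_COUNT = 4    # number of user tables per database (assumed)

def route_user(user_id):
    """Return the (database, table) a user's rows live in, e.g. ('user_db_1', 'user_tab_0')."""
    db_index = user_id % DB_COUNT                      # sub-library: pick the database
    table_index = (user_id // DB_COUNT) % TABLE_COUNT  # sub-table: pick the table inside it
    return f"user_db_{db_index}", f"user_tab_{table_index}"

for uid in (1001, 1002, 1003, 1004):
    print(uid, route_user(uid))
```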

Website architecture evolution, part five: web gardens or more web servers

After the database splitting work, the pressure on the database drops to a fairly low level, and the next bottleneck may appear elsewhere. Looking at the Windows performance counters reveals a large number of blocked requests, so at this point you can set up web gardens or add more web servers. While adding web servers, several issues may arise:

The soft load balancing of a single Nginx server can no longer handle the huge volume of web traffic. This can be solved with hardware load balancing such as F5, or by partitioning the application logically and spreading it across several soft load-balancing clusters.

Some of the earlier solutions for state synchronization, file sharing, and so on may become bottlenecks and need to be improved; at this point it may be necessary to write a distributed file system tailored to the site's needs, and so on.

After doing this, we enter what looks like a perfectly scalable era: whenever site traffic increases, the solution is simply to keep adding web servers.

Website architecture evolution, part six: read-write separation and inexpensive storage solutions

After enjoying the fast access gained by adding web servers, the system starts to slow down again, and once more the culprit is fierce competition for database connection resources during writes and updates. What to do? Separate reads from writes, using publication and subscription (replication).
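A minimal read-write separation sketch: writes go to the primary database, reads go to a replica kept in sync by publication/subscription. The connection strings are hypothetical, and execute is a stand-in for whatever data-access layer is actually used:

```python
import random

PRIMARY = "Server=db-primary;Database=shop"       # hypothetical write connection string
REPLICAS = [
    "Server=db-replica1;Database=shop",           # hypothetical read replicas,
    "Server=db-replica2;Database=shop",           # kept in sync by publication/subscription
]

def pick_connection(sql):
    """Route the statement: writes go to the primary, reads go to a random replica."""
    verb = sql.lstrip().split(None, 1)[0].upper()
    return PRIMARY if verb in ("INSERT", "UPDATE", "DELETE") else random.choice(REPLICAS)

def execute(sql):
    conn = pick_connection(sql)
    print(f"[{conn}] {sql}")      # stand-in for actually running the statement

execute("SELECT * FROM Orders WHERE UserId = 42")          # served by a replica
execute("UPDATE Orders SET Status = 'paid' WHERE Id = 7")  # served by the primary
```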

Inexpensive storage solution: NoSQL

NoSQL ("not only SQL") refers to non-relational databases. With the rise of Web 2.0 sites, traditional relational databases have struggled to cope with them, especially the ultra-large-scale, highly concurrent SNS-type purely dynamic Web 2.0 sites, and many hard problems have been exposed, while non-relational databases have developed very rapidly thanks to their own characteristics.

NoSQL databases are used in large numbers of non-transactional systems, such as microblogging systems.

BigTable

MongoDB (a usage sketch follows the link below)

http://tech.it168.com/topic/2011/10-1/nosqlapp/index.html
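A minimal MongoDB sketch for the kind of non-transactional data mentioned above (e.g., microblog posts). It assumes a local MongoDB instance and the pymongo driver; the database and collection names are made up for illustration:

```python
from datetime import datetime, timezone
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")   # assumes a local MongoDB instance
posts = client["microblog"]["posts"]                # hypothetical database/collection names

# Store a schemaless document; no table definition or transaction is required.
posts.insert_one({
    "user_id": 42,
    "text": "hello from the NoSQL layer",
    "created_at": datetime.now(timezone.utc),
})

# Read the latest post by that user.
latest = posts.find_one({"user_id": 42}, sort=[("created_at", -1)])
print(latest)
```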

Website architecture evolution, part seven: large distributed applications and the era of inexpensive server farms

After this long and painful process, the ideal era finally arrives: ever-increasing traffic can be handled simply by adding more web servers. However, the web application deployed on those servers has grown very large, and once several teams start changing it, things become quite inconvenient. Reusability is poor and each team ends up duplicating work to some degree; deployment and maintenance are also troublesome, because copying the huge application package to N machines and restarting them all takes a lot of time, and problems are hard to trace. An even worse situation is that a bug in one part of the application can take the whole site down, and there are other drawbacks such as the difficulty of tuning (since every machine runs the entire application, targeted tuning is impossible). Based on this analysis, a painful decision is made to split the system by responsibility, and a large distributed application is born. This step usually takes quite a long time, because there are many challenges:

1. After splitting into distributed services, a high-performance, stable communication framework is needed, and it must support a variety of communication and remote-call modes;
2. Splitting a huge application takes a long time and requires reorganizing the business and controlling system dependencies;
3. How to operate this huge distributed application well (dependency management, health management, error tracing, tuning, monitoring and alerting, etc.).

After this step, the architecture of the system enters a relatively stable phase, and large numbers of inexpensive machines can be used to support the huge volume of traffic and data. With this architecture, and the experience accumulated through the evolution so far, various other methods can be adopted to support the ever-increasing traffic.
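A minimal sketch of one remote-call style after the split: other subsystems call a separately deployed user-center service over HTTP instead of linking its code into the monolith. The endpoint URL and JSON shape are assumptions for illustration only:

```python
import json
from urllib import request

USER_CENTER_URL = "http://user-center.internal/api/users"   # hypothetical service endpoint

def get_user(user_id):
    """Remote call to the user-center service; callers no longer embed its business logic."""
    with request.urlopen(f"{USER_CENTER_URL}/{user_id}", timeout=2) as resp:
        return json.load(resp)    # assumed JSON payload, e.g. {"id": 42, "name": "..."}

if __name__ == "__main__":
    print(get_user(42))   # works once a user-center service is actually deployed at that URL
```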

CDN (Content Delivery Network)

What is a CDN?

CDN stands for Content Delivery Network. The goal is to add a new layer of network architecture to the existing Internet and publish the site's content to the network "edge" closest to users, so that users can obtain the content they need nearby. This relieves Internet congestion and improves the responsiveness of the site for users. Technically, it comprehensively addresses the root causes of slow response when users visit a site: limited network bandwidth, large numbers of visitors, and unevenly distributed points of presence.

In the narrow sense, a content delivery network is a new type of network construction: an overlay on top of the traditional IP network, specially optimized for delivering rich media. In the broad sense, a CDN represents a network service model based on quality and order. Simply put, a content delivery network is a strategically deployed overall system comprising four elements: distributed storage, load balancing, redirection of network requests, and content management, with content management and global traffic management at its core. By judging the user's proximity and the server load, a CDN ensures that content serves the user's request in an extremely efficient manner. In general, content is served from cache servers, also known as surrogates (proxy caches), which sit at the edge of the network only one "hop" away from the user. At the same time, the proxy cache is a transparent mirror of the content provider's origin server, which is typically located in the CDN provider's data center. This architecture lets CDN providers deliver the best possible experience to end users on behalf of their customers, the content providers, who cannot tolerate any delay in response time. According to statistics, CDN technology can handle 70%~95% of the page-content accesses of an entire site, reducing the pressure on the servers and improving the site's performance and scalability.

How the CDN works

Before describing how a CDN works, let us first look at the traditional, non-cached access process, so that the difference between CDN-cached access and non-cached access is clear.

The process by which a user accesses a site that does not use a CDN cache is:

1) The user gives the browser a domain name to visit;

2) The browser calls the DNS resolution library to resolve the domain name and obtain the corresponding IP address;

3) Using the resulting IP address, the browser sends a data request to the host serving that domain;

4) The browser renders the web page from the data returned by the host.

In popular terms, a CDN means website acceleration: it addresses slow site loading caused by cross-carrier and cross-region access, insufficient server capacity, insufficient bandwidth, and so on.

Consistent hash algorithm

In a distributed architecture, node failure is unavoidable, and with naive modulo hashing, adding or removing a node invalidates a large amount of hashed data and forces a re-hash: with hash(key) % number_of_servers = server_index, changing the number of servers remaps most keys, so the missing data has to be fetched from the database again. For a high-traffic system this cost is very significant.

Consistent hashing is used to solve this kind of problem.
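A minimal consistent-hash ring sketch in Python (the link below points to a C# Ketama-style implementation). The virtual-node count and the MD5-based hash are illustrative choices:

```python
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, nodes, replicas=100):
        self.replicas = replicas    # virtual nodes per physical node, to smooth the distribution
        self._keys = []             # sorted hash positions on the ring
        self._ring = {}             # hash position -> physical node
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            bisect.insort(self._keys, h)
            self._ring[h] = node

    def remove_node(self, node):
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            self._keys.remove(h)
            del self._ring[h]

    def get_node(self, key):
        """Walk clockwise from the key's hash to the first virtual node on the ring."""
        h = self._hash(key)
        idx = bisect.bisect(self._keys, h) % len(self._keys)
        return self._ring[self._keys[idx]]

ring = ConsistentHashRing(["cache1", "cache2", "cache3"])
print(ring.get_node("user:42"))
ring.remove_node("cache2")    # only the keys that mapped to cache2 move to other nodes
print(ring.get_node("user:42"))
```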

More: a C# implementation of the consistent hash algorithm (Ketama hash)

Reference:

http://www.cnblogs.com/genson/archive/2009/10/22/1587836.html
