Architecture of large-scale ASP.NET application systems: how to achieve high performance and high scalability

Source: Internet
Author: User

Introduction

The previous article, "About the architecture of large-scale ASP.NET application systems: architecture selection", drew enthusiastic replies. Some colleagues asked questions and hoped for examples, some offered suggestions and hoped the next article would go into more detail, and some had different ideas. Thank you all for participating; I will keep working on it. This article analyzes the difference between Layer and Tier, describes the development of each Tier in a 3-tier/N-tier architecture in detail, explains how each Tier is distributed, and discusses the methods and measures needed to achieve high performance, low latency, and high scalability.

The concept of a "large ASP.NET application system"

This means an ASP.NET application system that supports a large number of concurrent online users. How large is large? There is no consensus definition. Personally, I think an application system that can sustain 5,000 concurrent online users over a period of hours deserves to be called large. For example, Microsoft's official website, www.microsoft.com, is visited around the clock by people from all over the world who read MSDN, Microsoft blogs, and product information, or visit the Microsoft forums. The number of people on the site at any one time is far more than 5,000. Myspace is another example: it has tens of millions of users, and its number of concurrent online users is just as striking. These sites can serve so many users because a large system stands behind them.

Distinguishing Layer and Tier

Here we analyze Layer and Tier in light of the comments on the previous article. That article said that a Layered architecture can only be deployed on a single server. Some colleagues disagreed in the comments, saying that a Layered architecture can also be deployed on multiple servers.

A Layer is a logical grouping of an application's functions, while a Tier is a physical division of those functions across multiple computers. Layers are easy to understand: classes with the same function are logically grouped together. For example, all data access classes are placed in one group, in the same namespace and the same assembly. Business logic classes are grouped in the same way, and each group is called in a uniform manner. For example, a business logic class references a data access class and calls its methods to obtain results, and the UI Layer in turn calls classes in the business logic Layer. The business logic Layer therefore both serves the UI Layer and calls the data access Layer, so the Layers are stacked one above another. All of these Layers are divided by function; a Layer is a logical division.

A Tier is a physical division: the functions of an application are placed on different servers. For example, the UI functions occupy some servers and the business logic functions occupy others. Because there is a server boundary between the two, additional components dedicated to distributed calls are needed. In terms of functional logic a Tier also contains Layers, but it has a few more of them than the traditional Layer division, namely those responsible for distributed calls. A Tier is formed by physically separating Layers and then adding the Layers that handle distributed calls. Tier and Layer are therefore related: a Tier is the special case of physically separated Layers.

Physically separated Layers could still be called a Layered architecture, but that is not accurate, because the term Tier is defined for exactly this scenario. Where there is physical separation, Tier is the more precise word. As soon as Layers are physically separated, they become Tiers.

From the deployment perspective, we can try to distinguish the Layered architecture from the 3-tier/N-tier architecture. Because the physically separated scenario has already been defined as Tier, what remains for Layer can only be the scenario without physical separation. Therefore, the Layered architecture refers to deployment on a single server (that is, with no physical separation), and the 3-tier/N-tier architecture refers to the scenario in which the Layers are physically separated. In theory a Layered architecture can be deployed on multiple servers, but the original Layers alone are not enough. Once there are server boundaries, in-process method calls are no longer possible, and new layers for distributed calls must be added before the original Layers can run. Once all of that is done, the name Layered architecture no longer fits; it should be called a 3-tier/N-tier architecture.

Layer and Tier are connected, and the Layered architecture and the 3-tier/N-tier architecture can be converted into each other.

Overall picture

As the previous sections show, each Tier of the application system is served by many servers. The UI Tier, for example, may consist of dozens, hundreds, or even thousands of servers. The number of servers in each Tier is set according to actual need, which is judged from the hardware resource utilization of that Tier's servers: CPU, memory, disk reads and writes, and so on. If utilization is very high, add a new server and deploy the same Tier application to it so that it takes over part of the load. This is what gives the application high scalability. Between Tiers sits a hardware load balancer; behind it is the service interface of the next Tier, and behind the service interface is that Tier's service.

High scalability is not enough; high performance must be ensured as well, which means the application must be well designed. Within each Tier you can take measures that make the application run more efficiently and make full use of the hardware. Typical strategies are caching and reducing the number of round trips to the database. The following figure shows the overall picture of a scalable ASP.NET application system:

A request from a user on the Internet is processed as follows:

1. The hardware load balancer receives the request, selects a Web server to respond to it, and forwards the request to that server.

2. The Web server executes the requested page. The page's code-behind first queries the cache server by calling the cache service interface. If a cached result exists, it is returned directly.

3. If nothing is cached, the business logic service is called through its service interface. When the business logic service needs data from the database, it first checks whether that data is already in the cache. If so, it uses the cached data for its computation; if not, it calls the data access interface.

4. Similarly, the data access service checks the cache and then accesses the appropriate database based on the data required. Read-only requests can be sent to a database server that is kept up to date by log replication; write requests are sent to the master database server.

5. The database server executes the application's SQL request and returns the result to the data access service, which returns it to the business logic service.

6. The business logic service returns its result to the Web server, which generates the page content and returns it to the user on the Internet.

The above process is similar to that of the Layered architecture, except that it passes through several service interfaces. Because the UI Tier, business logic Tier, and data access Tier are on different servers, they run in different .NET runtime instances and cannot call each other directly; they must call each other through these service interfaces. The service interfaces can be built with WCF or .NET Remoting. At present WCF is the better choice.
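The "check the cache first, call the next Tier only on a miss" pattern in the steps above can be sketched as follows. This is a minimal illustration, not the article's implementation: `ICache`, `InMemoryCache`, and `CachedTier` are hypothetical names, and the in-memory cache only stands in for a real distributed cache client so that the sketch is self-contained.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for a distributed cache client (Velocity, memcached, ...).
public interface ICache
{
    bool TryGet(string key, out string value);
    void Set(string key, string value);
}

// In-memory implementation, used only to keep the sketch self-contained.
public sealed class InMemoryCache : ICache
{
    private readonly ConcurrentDictionary<string, string> _items =
        new ConcurrentDictionary<string, string>();

    public bool TryGet(string key, out string value) => _items.TryGetValue(key, out value);
    public void Set(string key, string value) => _items[key] = value;
}

// Each Tier checks its cache first and calls the next Tier only on a miss.
public sealed class CachedTier
{
    private readonly ICache _cache;
    private readonly Func<string, string> _nextTier;

    public CachedTier(ICache cache, Func<string, string> nextTier)
    {
        _cache = cache;
        _nextTier = nextTier;
    }

    public string Handle(string key)
    {
        if (_cache.TryGet(key, out var cached))
            return cached;               // cache hit: the next Tier is not called

        var result = _nextTier(key);     // cache miss: go one Tier down
        _cache.Set(key, result);
        return result;
    }
}
```

Chaining three `CachedTier` instances (Web, business logic, data access) with a delegate that plays the database shows the effect: a second request for the same key is answered by the first Tier's cache and never reaches the database delegate. In a real deployment the call to the next Tier would go through a service interface rather than a delegate.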

 

UI Tier

Technical solution for SessionState

To make the application scalable, every Tier must support load balancing; that is, it must not matter which server in a Tier handles a given user request. This requires a proper solution for user sessions. Many people are against SessionState and think it hurts the performance of ASP.NET applications badly. Some have written that, for requests with the same SessionID, AcquireRequestState takes a lock on the Session object before the Page code runs, which easily causes long delays. Others point out that Session occupies a large amount of server memory and that serializing and deserializing the objects in Session costs CPU. The general conclusion is therefore not to use the Session mechanism that ASP.NET provides. In fact, using SessionState and not using it each have their own characteristics; it is better to weigh them and then make the trade-off.

Not using SessionState at all

To not use SessionState at all, disable it in Web.config with <sessionState mode="Off"/> or <pages enableSessionState="false"/>, so that no page in the application uses SessionState. This alone is not complete, however: in the HTTP request processing pipeline a default system HttpModule still handles SessionState. You must also add the following to Web.config:

<httpModules>
<remove name="Session"/>
</httpModules>

Even when the SessionState mechanism provided by ASP.NET is not used, the application still needs something like a Session. Take a shopping cart: the items the user has selected must be remembered and are processed only when the user places the order. If you do not use ASP.NET's own SessionState, you must implement a Session mechanism yourself. For example, a database table can record the custom Session data. If the browser supports cookies, a cookie can store the custom Session ID, which is then used to look up the stored Session data in the database. If the browser does not support cookies, the custom Session ID can be kept in a hidden field on the page or passed as a URL parameter. The result is a self-managed Session mechanism: you are responsible for creating sessions, expiring them, querying Session data, and deleting old sessions.

Because this custom Session mechanism stores Session data in the database, it does not depend on any particular server, which is what makes it scalable.
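The responsibilities listed above (create, expire, query, delete) can be sketched in a few lines. This is only an illustration under stated assumptions: `CustomSessionStore` and its members are hypothetical names, and a `Dictionary` stands in for the database table so that the sketch is self-contained. The current time is passed in explicitly to keep the expiry logic easy to follow.

```csharp
using System;
using System.Collections.Generic;

// Self-managed session store. A Dictionary stands in for the database
// table described in the text; all names here are hypothetical.
public sealed class CustomSessionStore
{
    private sealed class Entry
    {
        public Dictionary<string, string> Data = new Dictionary<string, string>();
        public DateTime ExpiresUtc;
    }

    private readonly Dictionary<string, Entry> _rows = new Dictionary<string, Entry>();
    private readonly TimeSpan _timeout;
    private readonly object _gate = new object();

    public CustomSessionStore(TimeSpan timeout) { _timeout = timeout; }

    // Creates a session and returns the ID to put in a cookie, hidden field, or URL.
    public string Create(DateTime nowUtc)
    {
        var id = Guid.NewGuid().ToString("N");
        lock (_gate) { _rows[id] = new Entry { ExpiresUtc = nowUtc + _timeout }; }
        return id;
    }

    // Returns the session data and slides the expiry, or null if unknown or expired.
    public Dictionary<string, string> Find(string id, DateTime nowUtc)
    {
        lock (_gate)
        {
            Entry entry;
            if (!_rows.TryGetValue(id, out entry) || entry.ExpiresUtc <= nowUtc)
                return null;
            entry.ExpiresUtc = nowUtc + _timeout;
            return entry.Data;
        }
    }

    // Deletes expired sessions and returns how many were removed.
    public int RemoveExpired(DateTime nowUtc)
    {
        lock (_gate)
        {
            var expired = new List<string>();
            foreach (var pair in _rows)
                if (pair.Value.ExpiresUtc <= nowUtc) expired.Add(pair.Key);
            foreach (var id in expired) _rows.Remove(id);
            return expired.Count;
        }
    }
}
```

With a real database behind it, each method would become a query or stored procedure call, and `RemoveExpired` would typically run as a scheduled job. The returned dictionary is handed out without further locking, which is acceptable in a sketch but would need care under concurrent requests for the same session.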

Using SessionState

SessionState is the default ASP.NET mechanism. ASP.NET SessionState has several modes: InProc, StateServer, SQLServer, and Custom. InProc does not support load-balanced scenarios; StateServer and SQLServer do. Custom mode lets you implement your own Session persistence, for example storing Session data in an Oracle or MySQL database, and it also supports load balancing. In StateServer and SQLServer modes, the data in the Session must be serializable. We recommend using the Session mechanism in SQLServer mode. The configuration is as follows:

<system.web>
<sessionState mode="Off|InProc|StateServer|SQLServer"
cookieless="true|false"
timeout="number of minutes"
stateConnectionString="tcpip=server:port"
sqlConnectionString="sql connection string"
stateNetworkTimeout="number of seconds"/>
</system.web>

When the Session uses SQLServer mode, all Session data is serialized and stored in a SQL Server database. In this mode a session can be handled by any server in the UI Tier, because the Session data lives in a dedicated database. If you use this mode, it is best to have a dedicated database server for Session data. With this arrangement, ASP.NET applications can achieve load balancing and scalability.

Once ASP.NET SessionState is in use, different page requests under the same Session ID are subject to certain restrictions. Note that this concerns different pages under the same Session ID. The behavior resembles a database lock. By default an ASP.NET page can both read and write the Session object. If two requests with the same Session ID access two different pages, the Session object is locked and one request is blocked until the other has finished.

Some colleagues may wonder how one Session ID can request two pages at once. This comes from iframe, frameset, and AJAX techniques. If the page that contains an iframe or frameset accesses the Session, and the page inside the iframe or frameset accesses it too, then under the same Session ID the later page is blocked by the earlier page until the earlier page finishes and releases the Session lock. AJAX requests have the same problem. In small ASP.NET applications the latency caused by this default mechanism can be ignored, but in large ASP.NET applications it has to be addressed.

The only remedy is to narrow, from the application side, the set of pages that write the Session. Determine which pages need to read and write Session data, which pages only need to read it, and which pages have nothing to do with Session data at all. For pages that read and write the Session, write <%@ Page EnableSessionState="true" %>. For pages that only read it, write <%@ Page EnableSessionState="ReadOnly" %>. For pages that do not need the Session, write <%@ Page EnableSessionState="false" %>. Among the pages related to an iframe, avoid having every page read and write the Session; this avoids the latency caused by Session lock contention. The same applies to pages involved in AJAX calls. The fewer pages read and write the Session, the less latency lock contention causes, and the fewer the locks, the greater the processing capacity of the whole UI Tier.

 

Technical solution for ViewState

ViewState lets server controls restore their property values across a round trip without the programmer writing any code. These values include visible ones, such as the Text property, and invisible ones, such as the ControlState of some controls. ControlState is a special kind of content that is always stored in the __VIEWSTATE field. EnableViewState="false" disables ViewState, but the __VIEWSTATE field still contains some content, which is the ControlState.

Many people complain that ViewState is large, sometimes several hundred KB, so that a large part of a page's HTML is taken up by ViewState. Microsoft's own articles also say to disable ViewState where it is not needed. It is therefore reasonable to decide deliberately where the application needs ViewState; after all, ViewState does make life easier for programmers. ViewState can be disabled at the application level, the page level, and the control level. For the entire application: <pages enableViewState="false" enableViewStateMac="false" enableEventValidation="false"></pages>. Only enableViewState is needed to turn ViewState off; the other two attributes switch off security checks (ViewState MAC validation and event validation), so consider the security implications before disabling them. At the page level: <%@ Page EnableViewState="false" %>. At the control level: <asp:DataGrid EnableViewState="false" DataSource="..." runat="server"/>. After ViewState is disabled, the __VIEWSTATE field in the page becomes much smaller, but it still exists. As mentioned above, what remains in the __VIEWSTATE field is ControlState. If you want the __VIEWSTATE field to have no content at all, you can override the following two methods of the Page class:

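protected override void SavePageStateToPersistenceMedium(object viewState)
{
    // Intentionally empty: nothing is written to the __VIEWSTATE field.
}

protected override object LoadPageStateFromPersistenceMedium()
{
    // Nothing was saved, so there is no state to restore.
    return null;
}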

In this way, the __VIEWSTATE field has no content. We can also use these two methods to implement our own persistence of the ViewState content, for example to a cache or to SQL Server, so that the ViewState no longer has to be sent to the user's browser. The above describes how to disable ViewState at various levels; it is up to the developers to decide which pages or controls need ViewState, or whether it is needed at all. The ViewState mechanism cuts both ways: it is convenient for programmers, but it can hurt performance, so use it with care.

 

Reduce interactions with the server and unnecessary server-side processing

Page.IsPostBack

Page.IsPostBack tells you whether the request is a form submission. The first visit to a page can then be handled differently from a postback, which avoids unnecessary server-side processing.

AutoPostBack attributes

Many server-side controls have an AutoPostBack property. Disable it wherever it is not needed.

Perform more client data verification

Validate the user's input in the browser with client-side JavaScript as far as possible, and submit it to the server only after it passes. This reduces the number of requests sent to the server.

Control the AJAX request volume

AJAX makes for impressive effects, but keep the number of AJAX calls down, for example by checking whether several calls can be combined into one.

Use Server.Transfer instead of Response.Redirect

Server.Transfer happens on the server, whereas Response.Redirect goes through the user's browser and costs one more HTTP request.

 

Remove unnecessary default httpModules

For example, if you do not use SessionState, WindowsAuthentication, PassportAuthentication, and so on, remove the corresponding modules:

<httpModules>
<remove name="Session"/>
<remove name="WindowsAuthentication"/>
<remove name="PassportAuthentication"/>
<remove name="AnonymousIdentification"/>
<remove name="UrlAuthorization"/>
<remove name="FileAuthorization"/>
</httpModules>

 

Set processModel

Manually set the maxWorkerThreads and maxIoThreads attributes of the processModel element, and tune them by observing the effect. If machine resources allow, the values can be raised somewhat.

 

Set up a Web garden

If server resources permit, you can set up a Web garden, that is, run several worker processes on the same server. A 32-bit Windows process can use only 2 GB to 3 GB of memory (the remaining 2 GB or 1 GB of address space is reserved for the Windows system). On 64-bit Windows a process can use far more memory than on 32-bit, but when the server has more than 100 GB of memory you can still run several worker processes as appropriate. This increases the processing capacity of a single server. To set up a Web garden, find the application pool in IIS Manager, open its advanced settings, and set the maximum number of worker processes, as shown in the figure.

Cache

The caches available in ASP.NET mainly include the page-level cache, the control-level cache, System.Web.Caching.Cache, and distributed caches such as Velocity and memcached.

The page-level cache is enabled on an ASPX page with <%@ OutputCache Duration="10" VaryByParam="none" %>. In a user control you can use <%@ OutputCache Duration="10" VaryByParam="none" VaryByControl="" %>, which has the VaryByControl parameter in addition to those of the page-level cache. Note that page-level and control-level caches are stored on a specific Web server. The first request from a user is processed by one server, which then holds the cached page; if the load-balancing hardware sends the user's later requests to other servers, the cache built by the first request is useless. Only when the load balancer is specially configured to remember which server processed a user's request and to keep forwarding that user's HTTP requests to the same server (often called session affinity or sticky sessions) do the cached pages and controls pay off. Without such a configuration, page-level and control-level caches are of little value.

System.Web.Caching.Cache is a good caching mechanism that programmers can use to cache arbitrary content. Unfortunately it is not distributed: its storage is limited to a specific server, so it does not support load balancing. To support load balancing you must use a distributed cache such as Velocity or memcached. The content cached in the UI Tier can be database query results. If you manage the Session mechanism yourself, the distributed cache can serve as the Session store, holding all the objects in the Session. The same goes for ViewState: if you want to keep using ViewState without sending it to the client browser, you can override the SavePageStateToPersistenceMedium and LoadPageStateFromPersistenceMedium methods of the Page class and store the ViewState in the distributed cache.

 

Pre-compilation considerations

Compile all ASP.NET pages in advance. This avoids the latency caused by ASP.NET compiling a page when it is first accessed.

Disable debugging mode in production environment

Compile for the production environment in Release mode, which makes the program run somewhat faster.

Avoid exceptions whenever possible

Do not use exceptions for normal program control flow. Exceptions have a significant performance cost, so the program should check for likely conditions up front, such as whether an object is null. This applies to the other Tiers as well.

Avoid resource lock whenever possible

In multi-threaded scenarios, avoid locking shared resources as far as possible, and try to give each thread its own private resources. This applies to the other Tiers as well.

Compress pages and Related Files

For example, enable gzip compression in IIS, or use a custom HTTP module to compress the page's HTML and .js files. Remove line breaks and whitespace that are not displayed. Compress as much of the data as possible.
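The difference between relying on an exception and checking the condition first can be illustrated with integer parsing. This is a small sketch, and `QuantityParser` is a hypothetical name. Both methods return the same result, but the second never throws on bad input.

```csharp
using System;

public static class QuantityParser
{
    // Exception-driven version: every bad input pays for a throw and a catch.
    public static int ParseWithException(string text, int fallback)
    {
        try { return int.Parse(text); }
        catch (FormatException) { return fallback; }
        catch (OverflowException) { return fallback; }
        catch (ArgumentNullException) { return fallback; }
    }

    // Check-first version: TryParse reports failure through its return value.
    public static int ParseWithCheck(string text, int fallback)
    {
        int value;
        return int.TryParse(text, out value) ? value : fallback;
    }
}
```

On a busy page that handles a lot of malformed input, the check-first form keeps the failure path as cheap as the success path.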

 

Business Logic Tier

Business Logic Service Interface

As mentioned above, the service interface can be built with technologies such as WCF or Remoting. WCF is currently the better choice, because it supports transactions and multiple communication modes. If the business logic services must be exposed on the Internet, WCF can use a Web-service-based communication method, which many external systems support. If the services are used only internally, TCP/IP socket communication can be used. The business logic service interface is really a wrapper around the business logic service: each method the service provides has a corresponding interface method.

Business Logic

Transaction Control

Transactions should be controlled here, in the business logic. This fits with the fact that the WCF interface supports transactions.

Prefetch and Cache

For example, when a user requests the first page of a result set, fetch five pages and cache them. When the user then pages forward or back within those pages, the database does not have to be queried again, which reduces the number of database queries. Data that is queried very frequently can be stored directly in the distributed cache, and the database is queried only when the data is not in the cache.

Database access is also called in a distributed manner

As shown in the figure above, database access is also performed through distributed calls. Query results are passed back as collections of custom objects.
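The prefetch idea can be sketched as a pager that loads several consecutive pages in one query on a cache miss. This is a minimal sketch under assumptions: `PrefetchingPager` is a hypothetical name, the `loadRows` delegate stands in for the database query, and a local `Dictionary` stands in for the distributed cache. It is not thread-safe as written.

```csharp
using System;
using System.Collections.Generic;

// Prefetching pager: on a miss it loads several consecutive pages at once.
// The loader delegate stands in for the database query; names are hypothetical.
public sealed class PrefetchingPager<T>
{
    private readonly Func<int, int, IList<T>> _loadRows; // (offset, count) -> rows
    private readonly int _pageSize;
    private readonly int _pagesPerFetch;
    private readonly Dictionary<int, List<T>> _cache = new Dictionary<int, List<T>>();

    public PrefetchingPager(Func<int, int, IList<T>> loadRows, int pageSize, int pagesPerFetch)
    {
        _loadRows = loadRows;
        _pageSize = pageSize;
        _pagesPerFetch = pagesPerFetch;
    }

    // pageIndex is zero-based.
    public IList<T> GetPage(int pageIndex)
    {
        List<T> page;
        if (_cache.TryGetValue(pageIndex, out page))
            return page;                               // served from the cache

        // One query covers this page and the ones that follow it.
        var rows = _loadRows(pageIndex * _pageSize, _pageSize * _pagesPerFetch);
        for (int p = 0; p < _pagesPerFetch; p++)
        {
            var slice = new List<T>();
            int end = Math.Min((p + 1) * _pageSize, rows.Count);
            for (int i = p * _pageSize; i < end; i++)
                slice.Add(rows[i]);
            _cache[pageIndex + p] = slice;
        }
        return _cache[pageIndex];
    }
}
```

With a page size of 10 and five pages per fetch, requesting page 0 issues one query, and pages 1 through 4 are then served without touching the database. A production version would also need an expiry policy so that cached pages do not go stale.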

Use custom objects as the processing objects of business logic

These custom objects are in effect an in-memory reflection of the data in the database. It is best to process business logic with custom objects rather than with DataSet.

The business logic Tier should be stateless

This Tier should preferably be stateless. Business-related data is stored in the distributed cache, and server memory does not hold business-related data for long. That way a request to the business logic can be processed by any server in the business logic Tier, which makes load balancing possible.

Hand long-running computing tasks to other systems for background processing

Computing-intensive tasks should be handed over to other systems that run them in the background. Interaction with those systems should be limited to exchanging data files.

 

Data Access Tier

Data Access Service Interface

As with the business logic service interface, the data access service interface can be built with technologies such as WCF or Remoting. WCF is currently the better choice, because it supports transactions and multiple communication modes. You can use either a Web-service-based communication method or TCP/IP socket communication. The data access service interface is really a wrapper around the data access service.

Data Access

Transaction support

As mentioned above, the business logic controls transactions, and the data access Tier only takes part in the transactions that the business logic controls. The data access Tier contains many database operations such as queries and updates, and we recommend using stored procedures for all of them. These operations exist only to query or store data on behalf of the business logic service, so do not implement any business logic in stored procedures or in the data access Tier.

 

Database read/write splitting support

As shown in the preceding figure, some database servers run in read-only mode. Read requests can be distributed to the read-only database servers, and only write requests are routed to the master database server. The data access Tier must therefore support different connections for reads and writes.

Connection Pool Management

Each database server allows only a limited number of connections, so the database connections used by the data access services must be managed. Manage the connection pool on each data access server.

Use SqlDataReader during reading

When reading data, use SqlDataReader, which reads the results as a fast, forward-only stream.
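One way to support different connections is a small router that hands out the master connection string for writes and rotates through the read-only servers for reads. This is a sketch under assumptions: `ConnectionRouter` is a hypothetical name and the round-robin policy is only one possible choice.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Chooses a connection string per request: writes go to the master,
// reads rotate across the read-only servers. Names are hypothetical.
public sealed class ConnectionRouter
{
    private readonly string _master;
    private readonly IReadOnlyList<string> _replicas;
    private int _next = -1;

    public ConnectionRouter(string master, IReadOnlyList<string> replicas)
    {
        _master = master;
        _replicas = replicas;
    }

    public string ForWrite() => _master;

    public string ForRead()
    {
        if (_replicas.Count == 0)
            return _master;                    // no read-only server: use the master

        // Interlocked keeps the round-robin counter safe across threads;
        // the unsigned cast keeps the index valid after the counter wraps.
        uint ticket = unchecked((uint)Interlocked.Increment(ref _next));
        return _replicas[(int)(ticket % (uint)_replicas.Count)];
    }
}
```

Because log replication is not instantaneous, a read issued right after a write may not yet see the new data on a read-only server. Reads that must see the latest write can use `ForWrite()` instead.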

Cache

Cache the content obtained by database access to the distributed cache server.

 

Database Design and Arrangement

Read/write splitting

The primary database server is a clustered database server. SQL Server 2008 R2 on Windows Server 2008 supports clusters of up to 16 servers. You can set up some database servers in read-only mode and replicate all the logs of the primary database to them, so that their content stays consistent with the primary database server. These read-only database servers share the read load.

Database and table separation

From the application's point of view, data can be spread across multiple databases. For example, Myspace has more than 70 million users and stores every 1 million users in a separate database. Each database is then much smaller and queries are faster, but the program becomes more complex. The separate databases can be placed on different servers or on the same server, depending on the actual situation.
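The "1 million users per database" rule amounts to mapping a user ID to a database by fixed-size ranges. The sketch below assumes integer user IDs starting at 1 and a made-up naming scheme (`Users_000`, `Users_001`, ...); neither detail comes from the Myspace example.

```csharp
using System;

// Maps a user ID to a database by fixed-size ranges, as in the
// "1 million users per database" example. The naming scheme is hypothetical.
public static class UserShardMap
{
    public const long UsersPerDatabase = 1000000;

    public static int ShardIndex(long userId)
    {
        if (userId < 1)
            throw new ArgumentOutOfRangeException(nameof(userId), "User IDs start at 1.");
        return (int)((userId - 1) / UsersPerDatabase);
    }

    public static string DatabaseName(long userId)
    {
        return "Users_" + ShardIndex(userId).ToString("D3");
    }
}
```

The data access Tier would use the resulting name to pick a connection string. Queries that span users in different databases have to be issued per database and merged in the application, which is where the extra program complexity comes from.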

Table Design

Tables should be in 3NF and BCNF; there is not much more to say about that. This section focuses on clustered indexes. The clustered index is the key index of a table. From the application's point of view, work out which queries are run most often and design the clustered index around them. In general, choose columns of basic data types for the clustered index, such as an integer, a fixed-length text column, or a date column. The column should also be monotonically increasing, as dates and auto-increment columns are. A well-designed clustered index improves the performance of the most frequent queries and also helps inserts and updates. On insert, a new record is appended at the physical end of the table, which causes little disk I/O. On update, the record is found quickly through the index. Deletion efficiency must be considered as well: if possible, do not physically delete records; only mark them as deleted.

Besides the clustered index there are ordinary (nonclustered) indexes, and suitable ones also help query performance. Analyze the high-priority queries, look at the columns they use as search conditions, and create ordinary indexes on those columns as appropriate.

Create Table partitions

Table records are stored in different data files according to certain rules. The partitioning column should also be of a basic type, such as a date or text column. A partitioned table can be read and written by multiple threads at the same time, which improves I/O throughput.

Rational use of views

Creating a suitable number of views can help query performance.

 

The fewer distributed calls, the better?

In the comments on the previous article, "About the architecture of large-scale ASP.NET application systems: architecture selection", some colleagues argued that the fewer distributed calls, the better. Let us discuss that here. On a single server, distributed calls are slower than non-distributed calls because they pass through more intermediate interfaces. But can non-distributed calls support access by so many people at once? Can they let any server handle any user request without problems? If a server fails, will the users on that server lose their sessions and data? These questions deserve consideration.

Of course, it is also possible to use distributed calls in some parts of the system and non-distributed calls in others. For example, the business logic service and the data access service need not be separated by a distributed call. The picture of the entire system is then as follows:

This approach has advantages and disadvantages. The advantage is that calls from business logic to data access are faster than in a fully distributed system. The disadvantage is that once the number of business logic servers reaches a certain level, the number of database connections cannot be increased any further, and it is difficult to schedule database connections in a unified way. Business logic also becomes tightly coupled with data access.

 

Conclusion

For large ASP.NET systems, load balancing and scalability should be ensured first, and then the performance of each server should be maximized. Maximizing the service capacity of the whole system takes every available software and hardware technique. This article covers only some of these aspects and is not comprehensive.
