[Reprint] Architecture of large ASP.NET application systems: how to achieve high performance and scalability

Http://www.cnblogs.com/mikelij/archive/2010/11/30/1892261.html

Architecture of large ASP.NET application systems: how to achieve high performance and scalability

Brief introduction

After the previous article, "Architecture of large ASP.NET application systems: architecture selection", was published, many colleagues replied enthusiastically. Some asked questions and hoped for examples, some made suggestions and asked for more detail in the next article, and some put forward different views. Thank you all for participating; I will keep working at it. This article analyzes the difference between layer and tier, describes in detail how each tier of a 3-tier/n-tier architecture is built and how each tier is distributed, and covers the methods needed to achieve high performance, low latency, and high scalability.

About the concept of "large ASP.NET application system"

A large ASP.NET application system is one that can support a large number of simultaneous online users. There is no agreed definition of how many users counts as large. In my personal view, a system that runs 7x24 with no fewer than 5000 simultaneous online users should be called a large application system. Microsoft's official website, www.microsoft.com, is one example: around the clock, people from all over the world visit MSDN, the Microsoft blogs, product information pages, the Microsoft forums, and so on, and the number of simultaneous visitors is far more than 5000. MySpace is another: it has tens of millions of users in total, and its number of online users is also striking. Sites like these can serve so many users because there is a huge system behind them.

Analysis of layer and tier

This section analyzes layer and tier in response to comments on the previous article. That article said that a layered architecture can only be deployed on a single server, and some colleagues disagreed, saying that a layered architecture can also be deployed on multiple servers. A layer is a logical grouping of an application's functions, while a tier is a physical division of those functions across multiple computers.

Layer is easy to understand: classes with the same kind of function are logically grouped together. For example, the data access classes are placed together in the same namespace and the same assembly, and the business logic classes are grouped in the same way. The groups call each other in a uniform way: a business logic class references a data access class, calls its methods, and gets the results back, and the UI layer in turn calls the classes of the business logic layer. The business logic layer therefore both serves the UI layer and calls the data access layer; it is a connecting layer. These layers are divided by function, so layer is a logical division.

Tier refers specifically to physical division: the functions of the application are placed on different servers. For example, the UI functions occupy some servers and the business logic functions occupy others. There is a server boundary between the two, and some component must be specifically responsible for the distributed calls across it. Viewed purely in terms of logical function, a tier is also a layer, with some extra layers for distributed calls added to the traditional division. In other words, a tier is formed by physically separating the layers and then adding layers responsible for distributed calls. Tier and layer are thus related: a tier is the special case of a physically separated layer. A physically separated deployment could still be called a layered architecture, but that is not accurate, because tier is defined precisely for this scenario. Once layers are physically separated, it is more accurate to call them tiers.

From a deployment perspective, we can distinguish the layered architecture from the 3-tier/n-tier architecture. Since physically separated deployments are already defined as tiers, what remains for the layered architecture is the scenario without physical separation. So "layered architecture" refers specifically to deployment on a single server (no physical separation), and "3-tier/n-tier architecture" refers specifically to deployments where the layers are physically separated. Deploying a layered architecture to multiple servers is possible in theory, but the original layers are not enough: with a server boundary between them, the original in-process method calls no longer work, and layers for distributed calls must be added before the original layers can run. Once that is done, it is no longer appropriate to call the result a layered architecture; it should be called a 3-tier/n-tier architecture.

Layer and tier are related, and a layered architecture and a 3-tier/n-tier architecture can be transformed into each other.

Overall picture

As the previous description shows, each tier of the application system is served by many servers. The UI tier, for example, can be dozens, hundreds, or even thousands of servers. The number of servers in each tier is configured according to actual need, which is judged from the hardware resource utilization of that tier's servers: CPU, memory, disk reads and writes, and so on. If utilization is quite high, a new server must be added and the same tier of the application deployed to it, so that the new server shares some of the load. This is what gives the application high scalability. Between one tier and the next there is a hardware load balancer, followed by the service interface of the next tier; the tier's service sits behind its service interface.

Besides high scalability, high performance must also be guaranteed, which means the application must be well designed. Within each tier, measures can be taken to make the application execute as efficiently as possible and to make full use of the hardware resources, for example caching and reducing the number of database accesses. The following is an overall diagram of a scalable ASP.NET application system:

A request from a user on the Internet is processed as follows:

1. The hardware load balancer handles the request first: it selects a web server to respond and hands the request to that server.

2. The web server executes the requested page. The page's code-behind queries the cache server by calling the cache service interface to check whether a cached result already exists; if so, the cached result is returned directly.

3. If the result is not in the cache, the business logic service interface is called, which invokes the business logic service. When the business logic service needs data from the database, it first checks the cache for cached database content and, if found, uses it in the business logic calculation. If nothing is cached, it calls the data access service to get the data.

4. Similarly, the data access service checks the cache first and then accesses the appropriate database according to the requested data. A read-only request can be sent to a database server that is kept up to date by log replication; a write request is sent to the primary database server.

5. The database server executes the application's SQL request and returns the result to the data access service, which returns it to the business logic service.

6. The business logic service returns its result to the web server, and the web server generates the page content and returns it to the user on the Internet.
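The "check the cache first, otherwise call the next tier" pattern in steps 2 to 4 can be sketched roughly as follows. This is only an illustration under stated assumptions: ICacheClient, InMemoryCache, and CacheAside are hypothetical names, and the in-memory dictionary merely stands in for a real distributed cache so that the sketch can run on its own.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a distributed cache client.
public interface ICacheClient
{
    bool TryGet(string key, out object value);
    void Set(string key, object value);
}

// In-memory implementation used only to make the sketch runnable.
// It is not thread-safe and not distributed.
public sealed class InMemoryCache : ICacheClient
{
    private readonly Dictionary<string, object> _items = new Dictionary<string, object>();

    public bool TryGet(string key, out object value)
    {
        return _items.TryGetValue(key, out value);
    }

    public void Set(string key, object value)
    {
        _items[key] = value;
    }
}

public static class CacheAside
{
    // Returns the cached value if present; otherwise calls the next tier
    // (the loader), stores the result in the cache, and returns it.
    public static T GetOrLoad<T>(ICacheClient cache, string key, Func<T> loadFromNextTier)
    {
        object cached;
        if (cache.TryGet(key, out cached))
        {
            return (T)cached;
        }

        T loaded = loadFromNextTier();
        cache.Set(key, loaded);
        return loaded;
    }
}
```

Each tier would pass its own loader: the web server passes a call to the business logic service, and the business logic service passes a call to the data access service.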

This process is similar to that of a layered architecture, except that it has a few more service interfaces than the layered architecture does. Without these service interfaces, the UI tier, the business logic tier, and the data access tier could not talk to each other at all, because they are on different servers and run in different .NET runtime instances. They must call each other through the service interfaces. The service interfaces can be implemented with WCF, .NET Remoting, and so on; at present WCF is the best choice.

UI Tier

Technical solutions for SessionState

For the application to be scalable, every tier must support load balancing, which means that a user's request can be handled by any server in the same tier without problems. This requires a proper solution for handling user sessions. Many people disagree with using SessionState because they feel it hurts ASP.NET performance. Some have written that, for requests with the same session ID, AcquireRequestState takes a lock on the session object before the page code runs, which can easily cause long delays and a significant performance impact. Others point out that sessions take up server memory and that serializing and deserializing the objects in the session costs CPU. A fairly common view, therefore, is not to use the session mechanism provided by ASP.NET. In fact, using SessionState and not using it each have their own characteristics, and it is better to make the trade-off after understanding them.

Not using SessionState at all

SessionState can be disabled in Web.config with <sessionState mode="Off"/> or <pages enableSessionState="false"/>, after which no page of the application uses SessionState. This is not complete, however: a default HttpModule still handles SessionState in the HTTP request processing cycle. To remove it, also add the following line to the <httpModules> section of Web.config:

<remove name="Session"/>

The application then does not use the SessionState mechanism provided by ASP.NET, yet the requirements may still call for something like a session. A shopping cart is an example: the system must remember which products the user has chosen and process them when the user checks out. Without the ASP.NET SessionState mechanism, you must implement a session mechanism yourself. For example, a table in the database can record the custom session data. If the user's browser supports cookies, a cookie can store a custom session ID, which is used to look up the stored session data in the database. If the browser does not support cookies, the custom session ID can be stored in a hidden field on the page, or passed as a parameter in the URL. A session mechanism built this way is entirely self-managed: creating sessions, expiring them, querying session data, and deleting old sessions all have to be handled by the application.

Such a custom session mechanism stores session data in a database, so it does not depend on any particular server and is therefore scalable.
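The responsibilities of a self-managed session mechanism (creating sessions, expiring them, querying their data, and deleting old ones) might look roughly like the sketch below. CustomSessionStore is a hypothetical class, and a Dictionary stands in for the database table so that the sketch is self-contained; a real implementation would read and write that table instead.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical custom session store. The Dictionary stands in for the
// database table; it is not thread-safe and not shared between servers.
public sealed class CustomSessionStore
{
    private sealed class Entry
    {
        public Dictionary<string, string> Data = new Dictionary<string, string>();
        public DateTime ExpiresUtc;
    }

    private readonly Dictionary<string, Entry> _sessions = new Dictionary<string, Entry>();
    private readonly TimeSpan _timeout;

    public CustomSessionStore(TimeSpan timeout)
    {
        _timeout = timeout;
    }

    // Creates a session and returns its ID, to be sent to the browser in a
    // cookie, a hidden field, or a URL parameter.
    public string Create(DateTime nowUtc)
    {
        string id = Guid.NewGuid().ToString("N");
        _sessions[id] = new Entry { ExpiresUtc = nowUtc + _timeout };
        return id;
    }

    // Returns the session data, or null if the session is unknown or expired.
    // Each successful access extends the expiry (sliding expiration).
    public Dictionary<string, string> Get(string id, DateTime nowUtc)
    {
        Entry entry;
        if (!_sessions.TryGetValue(id, out entry))
        {
            return null;
        }
        if (entry.ExpiresUtc <= nowUtc)
        {
            _sessions.Remove(id);
            return null;
        }
        entry.ExpiresUtc = nowUtc + _timeout;
        return entry.Data;
    }

    // Deletes all expired sessions and returns how many were removed.
    public int PurgeExpired(DateTime nowUtc)
    {
        var expired = new List<string>();
        foreach (KeyValuePair<string, Entry> pair in _sessions)
        {
            if (pair.Value.ExpiresUtc <= nowUtc)
            {
                expired.Add(pair.Key);
            }
        }
        foreach (string id in expired)
        {
            _sessions.Remove(id);
        }
        return expired.Count;
    }
}
```

Passing the current time in as a parameter is a design choice made here only to keep the expiry logic easy to test; it is not something the article prescribes.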

Using SessionState

SessionState is the default mechanism in ASP.NET. It has several modes: InProc, StateServer, SQLServer, and Custom. InProc does not support load-balanced scenarios; of the built-in modes, only StateServer and SQLServer do. Custom mode means persisting the session data ourselves, for example in an Oracle or MySQL database, and it can also support load balancing. In StateServer and SQLServer modes, the data placed in the session must be serializable. SQLServer mode is recommended. The configuration looks like this:

<system.web>
  <sessionState mode="Off|InProc|StateServer|SQLServer"
                cookieless="true|false"
                timeout="number of minutes"
                stateConnectionString="tcpip=server:port"
                sqlConnectionString="sql connection string"
                stateNetworkTimeout="number of seconds"/>
</system.web>

With SQLServer mode, all session data is serialized and stored in a SQL Server database. A session can then be handled by any UI tier server, because the session data lives in a dedicated database. When this mode is used, it is best to dedicate a database server to session data. With this arrangement, the ASP.NET application gains the ability to load balance and scale.

Once ASP.NET SessionState is in use, requests for different pages under the same session ID are subject to a restriction that works much like a database lock. Note that this concerns different pages under the same session ID. By default, an ASP.NET page can both read and write the session object. If two requests with the same session ID access two different pages, the session object is locked, and one request may be blocked for a long time while the other is processed. Some colleagues may wonder how two different pages can be requested under the same session ID. This happens with iframes, framesets, and Ajax. If a page containing an iframe or frameset accesses the session, and the pages inside the iframe or frameset access it too, the requests arrive one after another with the same session ID; the later page is blocked by the earlier one until the earlier page finishes and releases its lock on the session. Ajax has the same problem. The delay caused by this default mechanism can be ignored in small ASP.NET applications, but in large ASP.NET applications it must be solved.

The only way to solve it is to narrow the scope of the session at the application level: determine clearly which pages need to read and write session data, which pages only need to read it, and which pages do not touch session data at all. For pages that read and write the session, declare <%@ Page EnableSessionState="true" %> explicitly. For pages that only read the session, write <%@ Page EnableSessionState="ReadOnly" %>. For pages that do not need the session, write <%@ Page EnableSessionState="false" %>. When the pages involved in an iframe do not all read and write the session, the delay caused by contention for the session lock is avoided. The same applies to pages involved in Ajax: the fewer of them read and write the session, the less latency session contention causes. The less lock contention there is, the greater the processing capacity of the whole UI tier.

Technical solutions for ViewState

ViewState lets server controls repopulate their property values across round trips without the programmer writing any code. These values include visible properties, such as the Text property, and invisible ones, such as the ControlState of some controls. ControlState is a special case: it is always stored in the __VIEWSTATE field. Even if ViewState is disabled with EnableViewState="false", the __VIEWSTATE field still has some content, and that content is ControlState.

Many people complain that ViewState is large; sometimes the ViewState alone is hundreds of kilobytes, and a large part of a page's HTML is taken up by it. Microsoft's own articles also say to disable ViewState wherever it is not needed. So it is worth deciding carefully which parts of the application need ViewState; after all, ViewState does bring the programmer some convenience. ViewState can be disabled at the level of the whole application, the page, or the individual control. For the whole application: <pages enableViewState="false" enableViewStateMac="false" enableEventValidation="false"></pages>. For a page: <%@ Page EnableViewState="false" %>. For a control: <asp:DataGrid EnableViewState="false" DataSource="..." runat="server"/>. After ViewState is disabled, the __VIEWSTATE field in the page is much smaller but still exists. As mentioned above, the remaining content is ControlState. If you want the __VIEWSTATE field to have no content at all, you can override these two methods of the Page class:

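// Discard the page state instead of writing it into the __VIEWSTATE field.
protected override void SavePageStateToPersistenceMedium(object viewState)
{
}

// Nothing was saved, so there is no state to restore.
protected override object LoadPageStateFromPersistenceMedium()
{
    return null;
}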

This way the __VIEWSTATE field is completely empty. These two methods can also hold our own logic for persisting ViewState, for example to a cache or to SQL Server, so that the ViewState content no longer needs to be sent to the user's browser. The above are ways to disable ViewState in various places; it is up to the developers and users to decide which pages or controls need ViewState, or whether to do without it altogether. The ViewState mechanism has two sides: it is convenient for the programmer, but it may hurt performance. So use it with care.

Reduce the number of interactions with servers and unnecessary server-side processing

Page.IsPostBack

Page.IsPostBack tells you whether the request is a form postback. The processing on the first visit is different from the processing on a postback, so checking it avoids unnecessary server-side processing.

AutoPostBack Property

Many server controls have an AutoPostBack property; disable it where it is not needed.

Do more client-side data validation

Validate user input in the browser first with client-side JavaScript wherever possible, and only then submit it to the server. This reduces the number of requests submitted to the server.

Ajax Request Volume Control

Ajax is very useful, but the number of Ajax calls can be reduced where appropriate, for example by merging several Ajax calls into one.

Use Server.Transfer instead of Response.Redirect

Server.Transfer happens on the server, whereas Response.Redirect goes through the user's browser and costs one more HTTP request.

Remove unnecessary default HttpModule

If the application does not use SessionState, Windows authentication, Passport authentication, and so on, remove the corresponding modules:

<remove name="Session"/>
<remove name="WindowsAuthentication"/>
<remove name="PassportAuthentication"/>
<remove name="AnonymousIdentification"/>
<remove name="UrlAuthorization"/>
<remove name="FileAuthorization"/>

Set processModel

Manually set the maxWorkerThreads and maxIoThreads attributes of the processModel element and tune them by observing the effect. If the machine's resources allow, they can be set somewhat higher.
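As a rough illustration of where these attributes live, the fragment below assumes classic ASP.NET, where the processModel element sits under system.web in machine.config and, to my understanding, autoConfig has to be switched off before manual values take effect. The numbers are placeholders for illustration only, not recommendations; the thread limits apply per CPU.

```xml
<!-- machine.config fragment; the values are illustrative only -->
<system.web>
  <processModel autoConfig="false"
                maxWorkerThreads="100"
                maxIoThreads="100" />
</system.web>
```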

Set up Web garden

As long as the server's resources allow, you can set up a Web garden, that is, run several worker processes on the same server. A process on 32-bit Windows can generally use only 2 GB to 3 GB of memory, because the upper 2 GB or 1 GB of the address space is reserved for Windows itself. A process on 64-bit Windows can use more memory than on 32-bit, but if the server has more than 100 GB of memory it is still reasonable to run several worker processes. This increases the processing capacity of a single server. To set up a Web garden, find the corresponding application pool in IIS Manager, open its advanced settings, and change the Maximum Worker Processes setting, as shown in the figure.

Cache

The main caches available in ASP.NET are page-level caching, control-level caching, System.Web.Caching.Cache, and distributed caches such as Velocity and memcached. Page-level caching is enabled on an ASPX page with <%@ OutputCache Duration="..." VaryByParam="none" %>. In a user control, use <%@ OutputCache Duration="..." VaryByParam="none" VaryByControl="..." %>, which has an additional VaryByControl parameter compared with the page-level directive.

Note that page-level and control-level caches are stored on one particular web server. Unless the load-balancing hardware is specially configured, these caches are of little use: the first request is processed by one server, which then holds the page cache, but if the load balancer hands a later request to another server, the page and control caches built for the first request are wasted. Only when the load balancer is specially configured, so that it knows which server processed a user's earlier request and keeps forwarding that user's HTTP requests to the same server, do the page and control caches pay off.

System.Web.Caching.Cache is a good caching mechanism that programmers can use to cache arbitrary content. Unfortunately it is not distributed: its storage is limited to one server, so it does not support load balancing. To support load balancing, use a distributed cache such as Velocity or memcached. The content cached in the UI tier can be database query results. If the application manages its own session mechanism, the distributed cache can serve as the session store, holding all the objects in the session. As for ViewState, if you want to keep using ViewState without the browser downloading it, you can override the SavePageStateToPersistenceMedium and LoadPageStateFromPersistenceMedium methods of the Page class and store the ViewState in the distributed cache there.

Consider precompiling

Precompile all the ASP.NET pages. This removes the delay caused by ASP.NET compiling a page on its first visit.

Disabling debug mode in a production environment

Compiling in release mode for the production environment makes the program run somewhat faster.

Try to avoid exceptions

Exceptions are not part of normal program flow, and throwing many of them has a relatively large effect on performance. So have the program check for likely error conditions up front, for example whether an object is null. The same applies to the other tiers.
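A minimal sketch of checking up front instead of relying on exceptions is shown below. InputChecks and its methods are hypothetical names; int.TryParse is the standard .NET method that reports failure through its return value rather than by throwing.

```csharp
using System;

public static class InputChecks
{
    // int.TryParse returns false for malformed or null input instead of
    // throwing, so bad input costs no exception. Falls back to 0.
    public static int ParseQuantity(string text)
    {
        int quantity;
        return int.TryParse(text, out quantity) ? quantity : 0;
    }

    // Checks for null before using the object, instead of catching a
    // NullReferenceException afterwards.
    public static int NameLength(string name)
    {
        return name == null ? 0 : name.Length;
    }
}
```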

Try to avoid locking resources

In a multi-threaded scenario, try to avoid locking resources as much as possible. Try to use private resources for each thread. The same applies to other tiers.

Compress pages and related files

For example, enable gzip compression in IIS, or use a homemade HTTP module to compress the page's HTML and .js files. Remove carriage returns and spaces that are not displayed. Compress as much as possible.

Business logic Tier

Business Logic Service Interface

As mentioned earlier, the service interface can be built with WCF, .NET Remoting, and other technologies. WCF is currently the best choice, because it supports transactions and multiple communication modes. Business logic services sometimes need to be exposed on the Internet, in which case WCF can use a Web service based communication mode, which lets more external systems call them. If the business logic service is only used internally, TCP/IP socket communication can be used. The business logic service interface is really a wrapper around the business logic service behind it: the service provides the methods that correspond to the interface.

Business logic

Control of transactions

Transactions should be controlled in the business logic. This matches the transaction support of the WCF interface.

Prefetching and caching

Take paging as an example: when the user requests the first page, fetch 5 pages at once and cache them, so that when the user turns to the next few pages the database does not have to be queried again. This reduces the number of database queries. The results of certain specific queries can be stored directly in the distributed cache, and the database is queried only when the cache does not have them.
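The paging prefetch described above could be sketched along these lines. PrefetchingPager is a hypothetical helper, the query delegate stands in for the real database call, and the Dictionary stands in for the distributed cache; none of these names come from the article.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical paging helper: on a cache miss it loads a window of several
// pages with one query and caches each page separately.
public sealed class PrefetchingPager<T>
{
    private readonly Func<int, int, IList<T>> _query; // (skip, take) -> rows
    private readonly int _pageSize;
    private readonly int _pagesPerFetch;
    // Stand-in for the distributed cache; not thread-safe.
    private readonly Dictionary<int, IList<T>> _pageCache = new Dictionary<int, IList<T>>();

    public PrefetchingPager(Func<int, int, IList<T>> query, int pageSize, int pagesPerFetch)
    {
        _query = query;
        _pageSize = pageSize;
        _pagesPerFetch = pagesPerFetch;
    }

    // pageIndex is zero-based.
    public IList<T> GetPage(int pageIndex)
    {
        IList<T> page;
        if (_pageCache.TryGetValue(pageIndex, out page))
        {
            return page;
        }

        // One query fetches the requested page plus the following ones.
        IList<T> rows = _query(pageIndex * _pageSize, _pageSize * _pagesPerFetch);
        for (int i = 0; i < _pagesPerFetch; i++)
        {
            var slice = new List<T>();
            int end = Math.Min((i + 1) * _pageSize, rows.Count);
            for (int j = i * _pageSize; j < end; j++)
            {
                slice.Add(rows[j]);
            }
            _pageCache[pageIndex + i] = slice;
        }
        return _pageCache[pageIndex];
    }
}
```

With a page size of 5 and 5 pages per fetch, a request for the first page issues one query for 25 rows, and the next four pages are then served from the cache.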

Access to the database can also be a distributed call

As the diagram above shows, access to the database is made through a distributed call. The results of a database query are passed back as a collection of custom objects.

Use a custom object as a processing object for business logic

These custom objects are in effect an in-memory representation of data in the database. The business logic should preferably work with custom objects rather than DataSets.

Business logic tier is best to be stateless

This tier should preferably be stateless. Business-related data is stored in the distributed cache, and server memory does not hold business-related data for long. A request for business logic can then be handled by any server in the business logic tier, which makes load balancing possible.

Long-running computational tasks are best handed to other systems to process in the background

Computationally intensive tasks are best left to other systems that run in the background. Interaction with such a compute-intensive system takes place only through data files.

Data access Tier

Data Access Service Interface

As with the business logic service interface, the data access service interface can be built with WCF, .NET Remoting, and other technologies. WCF is currently the best choice, because it supports transactions and multiple communication modes; either the Web service based mode or TCP/IP socket communication can be chosen. The data access service interface is really a wrapper around the data access service behind it.

Data access

Support for transactions

As mentioned earlier, the business logic controls transactions, and the data access tier only takes part in a transaction controlled by the business logic. The data access tier performs many database operations, such as queries and updates, and it is recommended that all of them be implemented as stored procedures. These operations simply query the database or store data in it on behalf of the business logic service, as part of a transaction that the business logic controls. So do not implement any business logic in a stored procedure or in the data access tier.

Support for database read/write separation

As the earlier diagram shows, some database servers are read-only. Read requests can be diverted to a read-only database server, and only write requests are sent to the primary database server. The data access tier must therefore support separate connections for the two.
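One way to picture the separate connections is a small router that picks a connection string per request. This is a sketch under assumptions: ConnectionRouter is a hypothetical class, and the round-robin choice among read-only servers is my own illustration, not something the article specifies.

```csharp
using System;
using System.Threading;

// Hypothetical router: writes go to the primary server, reads are spread
// round-robin over the read-only servers.
public sealed class ConnectionRouter
{
    private readonly string _primary;
    private readonly string[] _replicas;
    private int _next = -1;

    public ConnectionRouter(string primaryConnectionString, string[] replicaConnectionStrings)
    {
        _primary = primaryConnectionString;
        _replicas = replicaConnectionStrings ?? new string[0];
    }

    public string GetConnectionString(bool isReadOnly)
    {
        // Writes, and reads when no read-only server exists, use the primary.
        if (!isReadOnly || _replicas.Length == 0)
        {
            return _primary;
        }

        // Interlocked keeps the counter correct under concurrent requests;
        // the unchecked uint cast keeps the index non-negative after overflow.
        uint ticket = unchecked((uint)Interlocked.Increment(ref _next));
        return _replicas[ticket % (uint)_replicas.Length];
    }
}
```

The data access code would pass the returned string to whatever connection type it uses, so that each connection string gets its own connection pool.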

Management of connection Pools

Each database server allows only a limited number of connections, so the database connections of the data access service need to be managed, including the connection pool on each data access service server.

Use SqlDataReader when reading.

When reading data, SqlDataReader provides a fast, forward-only stream of rows.
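The sketch below shows forward-only reading into custom objects. It is written against the IDataReader interface, which SqlDataReader implements, so that it can be tried without a database. Product, ProductReader, and the column names Id and Name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical custom object for one row of a product query.
public sealed class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ProductReader
{
    // Reads rows forward-only from any IDataReader and maps each row to a
    // custom object. Assumes an integer Id column and a non-null Name column.
    public static List<Product> ReadAll(IDataReader reader)
    {
        var products = new List<Product>();
        int idColumn = reader.GetOrdinal("Id");
        int nameColumn = reader.GetOrdinal("Name");
        while (reader.Read())
        {
            products.Add(new Product
            {
                Id = reader.GetInt32(idColumn),
                Name = reader.GetString(nameColumn)
            });
        }
        return products;
    }
}
```

Against SQL Server, the reader would come from SqlCommand.ExecuteReader(); for a quick local check, DataTable.CreateDataReader() yields a reader over in-memory rows.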

Cache

Cache the results of database access on the distributed cache server.

Design and arrangement of database

Read/write separation

The primary database server is a clustered database server. SQL Server 2008 R2 on Windows Server 2008 supports clusters of up to 16 servers. Some read-only database servers can also be set up, with log replication copying all logs from the primary database to them, so that their content stays consistent with the primary database server. These read-only servers can then share the read load.

Splitting databases and tables

From the application's point of view, some data can be split across multiple databases. MySpace, for example, has more than 70 million users and stores each group of 1 million users in its own database. Each database is then much smaller and queries are faster, although the program becomes somewhat more complicated to design. The separate databases can be placed on different servers or on the same server, depending on the actual situation.
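The extra complexity in the program is mainly the step of deciding which database holds a given user. A minimal sketch follows, assuming that user IDs start at 1 and that each database holds a contiguous range of 1 million IDs; UserShardMap and the database naming scheme are hypothetical and not taken from MySpace.

```csharp
using System;

public static class UserShardMap
{
    // Assumed layout: user IDs start at 1 and each database holds a
    // contiguous range of 1,000,000 users.
    public const int UsersPerDatabase = 1000000;

    // Returns the zero-based index of the database that stores the user.
    public static int GetDatabaseIndex(long userId)
    {
        if (userId < 1)
        {
            throw new ArgumentOutOfRangeException("userId");
        }
        return (int)((userId - 1) / UsersPerDatabase);
    }

    // Hypothetical naming scheme: Users000, Users001, ...
    public static string GetDatabaseName(long userId)
    {
        return "Users" + GetDatabaseIndex(userId).ToString("D3");
    }
}
```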

Design of the table

3NF or BCNF should certainly be achieved, so there is not much to say about that. The main topic here is the clustered index, which is the critical index of a table. Consider from the application's point of view what the most frequent queries look like, and design the clustered index around them. In general, the clustered index should be on a short field of a basic data type, such as an integer, fixed-length text, or a date, and the field should increase monotonically, as a date or an auto-increment field does. A well-designed clustered index greatly improves the performance of the most frequent queries and also helps inserts and updates: a new record is appended at the physical end of the table, which keeps disk I/O small, and a record to be updated can be found quickly through the index. The efficiency of deletion must be considered as well. If possible, do not physically delete records; only mark the records that need to be deleted as deleted.

Besides the clustered index there are ordinary (nonclustered) indexes, and suitable ordinary indexes also help query performance. Again, analyze the queries the application is likely to run, starting with the highest-priority ones, to find which fields are mainly used as search conditions, and then build ordinary indexes accordingly. Both the clustered index and the ordinary indexes help query performance.

Create a table partition

The records of a table are stored in different data files according to certain rules. The partitioning field should also be of a basic type, such as a date or text. With a partitioned table, multiple threads can read and write the different data files at the same time, which improves I/O.

Rational Use of views

Creating a certain number of views can help query performance.

The fewer distributed calls the better?

The previous post, "Architecture of large ASP.NET application systems: architecture selection", put forward the idea that the fewer distributed calls the better. Let us look at that here. On a single server, a distributed call is certainly slower than a non-distributed call, because the distributed call involves extra interface processing in between. But can non-distributed calls support so many simultaneous users? Can a user's request be sent to any server for processing without problems? If one server fails, will the users on that server lose their sessions and data? Those are the questions to weigh.

Of course, it is also possible to use distributed calls in some parts of the system and non-distributed calls in others. For example, the business logic service could call the data access service without a distributed call. The whole system diagram would then look like this:

This approach has its advantages and disadvantages. The advantage is that calls from business logic to data access are faster than in the fully distributed design. The disadvantage is that once the number of business logic servers grows to a certain point, the number of database connections cannot be increased any further, and scheduling database connections centrally becomes very difficult. Business logic and data access are also more tightly coupled.

Conclusion

For a large ASP.NET application system, the first priority is to ensure load balancing and scalability, and the next is to get the most performance out of each server. Maximizing the service capacity of the whole system requires using every available means in both software and hardware. This article covers only some aspects and is not comprehensive.
