Optimize ORM Performance

Some developers refuse to use object-relational mapping (ORM) technology because of poor performance experiences. Like any abstraction, an ORM framework adds some overhead, but in practice a properly tuned ORM performs comparably to hand-written native data-access code. More importantly, a good ORM framework is much easier to tune and optimize than hand-written data-access code.

The examples in this article are based on Mindscape's LightSpeed ORM. We will walk through common problems and their solutions using these examples.

The N+1 Problem

Let's look at an overdue-order list page in a web application to understand the problem. Suppose we want to display not only each order but also the customer information for each order. Without much analysis, we might write this code:

var overdues = unitOfWork.Orders.Where(o => o.DueDate < today);
foreach (var o in overdues) // 1
{
    var customer = o.Customer; // 2
    DisplayOverdueOrderInfo(o.Reference, customer.Name);
}

This code hides the so-called N+1 problem. Obtaining the order list (note 1 in the code) requires one database query. The code then fetches the customer for each order in the list (note 2), and each of those fetches requires another query! So with 100 overdue orders, the code executes 101 database queries: one to load the set of overdue orders, and 100 more to load each order's customer. In general, N orders require N+1 queries, which is where the name comes from.

Obviously, this is slow and inefficient. We can solve this performance problem with eager loading. If all the associated customers are loaded as part of the order query, in the same database round trip, then accessing customer information is just reading an object property. No further queries are needed, so there is no N+1 problem.
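To make the query arithmetic concrete, here is a language-agnostic sketch of the N+1 problem using an in-memory fake database that counts round trips. The `FakeDb` class and its methods are illustrative inventions, not LightSpeed code.

```python
# An in-memory fake database that counts round trips; illustrative only.
class FakeDb:
    def __init__(self):
        self.queries = 0
        self.customers = {i: f"Customer {i}" for i in range(100)}
        self.orders = [{"id": i, "customer_id": i} for i in range(100)]

    def load_overdue_orders(self):
        self.queries += 1
        return list(self.orders)

    def load_customer(self, customer_id):
        self.queries += 1
        return self.customers[customer_id]

    def load_overdue_orders_with_customers(self):
        # Eager loading: orders and their customers come back in one batch.
        self.queries += 1
        return [dict(o, customer=self.customers[o["customer_id"]])
                for o in self.orders]

# Naive pattern: one query for the list, one more per order = N + 1.
db = FakeDb()
for order in db.load_overdue_orders():
    _ = db.load_customer(order["customer_id"])
naive_queries = db.queries            # 101

# Eager loading collapses the whole thing into a single round trip.
db = FakeDb()
for order in db.load_overdue_orders_with_customers():
    _ = order["customer"]
eager_queries = db.queries            # 1

print(naive_queries, eager_queries)   # 101 1
```

The counts make the cost visible: 100 overdue orders cost 101 round trips the naive way, but only one with eager loading.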

With LightSpeed, we can pre-load associated data by setting the association's EagerLoading option to true (or by applying EagerLoadAttribute to a hand-written business entity). When LightSpeed queries an entity whose association is marked for eager loading, it generates additional SQL statements to fetch the associated entities alongside the entity itself.


In the preceding example, when we query order entities, LightSpeed generates SQL statements to load both the Order and Customer entities and executes them all in the same batch. With just a small change, the number of database round trips drops to one.

ORM Mapping vs. Native Data Access Code

In general, this shows why ORM frameworks have an advantage when it comes to performance tuning. Suppose the overdue-order page used hand-written SQL and manually copied data from the data-access layer into objects. To fix an N+1 problem, you would have to update not only the SQL statement but also the mapping code, to process multiple result sets and manage the object relationships. For our simple example that is not much work, but it would be if the page needed data from many database tables. It is far more complicated than flipping an option or applying an attribute!

Lazy Loading

The order page above has another potential problem. Suppose the Customer entity has a Photo property containing a large image (do a requirements investigation with the sales department and you will find this is reasonable). The overdue-order page never accesses the Customer.Photo property, but the photo is loaded along with the rest of the customer's properties. If the photo is large, this consumes a lot of memory and takes a long time to pull all the photo data out of the database, and all of that time is wasted.

The solution is to lazy-load the Photo property. Specifically, its data is loaded only when the property is accessed, rather than when the Customer object is loaded. Because the overdue-order page never accesses the Photo property, it no longer loads unwanted images. Pages that do need photos, such as the customer profile page, can still access the Photo property directly.
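The lazy-loading pattern can be sketched as follows. The `photo_loader` callable is a hypothetical stand-in for the database access that fetches the heavyweight column; this is illustrative, not LightSpeed code.

```python
# A minimal sketch of lazy loading; photo_loader stands in for the database
# access that fetches the heavyweight column. Illustrative only.
class Customer:
    def __init__(self, name, photo_loader):
        self.name = name
        self._photo_loader = photo_loader
        self._photo = None
        self.photo_loaded = False

    @property
    def photo(self):
        # The heavyweight column is fetched only on first access.
        if not self.photo_loaded:
            self._photo = self._photo_loader()
            self.photo_loaded = True
        return self._photo

customer = Customer("Acme Corp", photo_loader=lambda: b"<large image bytes>")
print(customer.photo_loaded)   # False: list pages never pay for the photo
_ = customer.photo             # the profile page touches the property...
print(customer.photo_loaded)   # True: ...and only then is the data loaded
```

Pages that never read `photo` never pay its cost, while pages that do read it still get the data transparently.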

There is no simple flag for marking a property as lazy-loaded. However, you can make a property part of a named aggregate (by entering a name in the property's Aggregates setting), and such a property is lazy-loaded by default. We discuss this technique in detail in the next section.


If we set the Aggregates setting of the Photo property to "WithPhoto", customer photos will no longer be loaded on the overdue-order page. This avoids wasted memory, reduces the amount of data loaded, and speeds up page rendering.
Named Aggregates

The fixes above for the N+1 problem and the heavyweight-property problem have made our overdue-order page much snappier. However, they may hurt other pages of the site. Consider the order details page: because Order.Customer is now eagerly loaded, the order details page is dragged down by a Customer entity it does not need. It seems that whether or not we eager-load the customer, some page suffers!

Ideally, the Order.Customer association would be eagerly loaded on the overdue-order list page and lazily loaded on the order details page. We can achieve this by making the Customer association part of a named aggregate.

A named aggregate identifies a slice of the object graph: a set of associations and properties that are eagerly loaded on demand. If a query requests the aggregate, its members are eagerly loaded; otherwise they are lazily loaded. (Named aggregate is LightSpeed's term; some ORM frameworks offer a similar feature under other names, such as includes.)
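The key property of a named aggregate is that the same entity definition serves both pages: each call site opts in or out of the eager load. Here is a sketch with a hypothetical `with_aggregate` API (LightSpeed's real API differs, as shown later):

```python
# Sketch of per-query opt-in eager loading via a named aggregate.
# Hypothetical API, not LightSpeed code.
class FakeDb:
    def __init__(self):
        self.round_trips = 0

class OrderQuery:
    def __init__(self, db):
        self.db = db
        self.aggregates = set()

    def with_aggregate(self, name):
        self.aggregates.add(name)
        return self

    def execute(self):
        self.db.round_trips += 1                 # one batched round trip
        orders = [{"id": i, "customer": None} for i in range(3)]
        if "WithCustomer" in self.aggregates:
            # Aggregate requested: customers ride along in the same batch.
            for o in orders:
                o["customer"] = {"name": f"Customer {o['id']}"}
        return orders

db = FakeDb()
listing = OrderQuery(db).with_aggregate("WithCustomer").execute()
details = OrderQuery(db).execute()   # details page skips the aggregate
print(listing[0]["customer"])        # {'name': 'Customer 0'}
print(details[0]["customer"])        # None (lazily loaded if ever touched)
```

Each page pays only for the data it actually uses, and both still cost one round trip for the order query itself.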

To make Order.Customer part of a named aggregate, we first set its EagerLoading option back to false, which lets the order details page run efficiently again. Then, to make the overdue-order list page efficient, we add "WithCustomer" to the Aggregates box of Order.Customer.


Now let's modify the overdue-order list page to request the WithCustomer aggregate in the order LINQ query. This is simple: just add a WithAggregate call to the LINQ query:

var overdues = unitOfWork.Orders
                         .Where(o => o.DueDate < today)
                         .WithAggregate("WithCustomer");

The same technique applies to ordinary (non-association) properties. Recall that to lazy-load the Customer.Photo property we made it part of the "WithPhoto" aggregate, which is inefficient on the customer profile page that does need photos. But adding WithAggregate("WithPhoto") to the customer query on the profile page makes that page efficient again.

Named aggregates give you fine-grained control over performance without worrying about the complicated details hidden behind a simple string setting. Tuning the aggregates on heavyweight or high-traffic pages, as needed, can greatly improve performance.

Batching

Now consider the order entry page. An order includes not only order-level properties such as a reference number, but also order lines. When the user submits the order entry page, the application creates an Order entity and several OrderLine entities, and then inserts the data from all of these entities into the database.

The potential problem is similar to the N+1 problem above, with the data flowing in the opposite direction: with 100 order lines, you would perform 101 database insert operations. We certainly don't want to make 101 round trips to the database!

LightSpeed solves this with batching. Instead of executing each INSERT (or UPDATE or DELETE) as a separate command, LightSpeed groups commands into batches of ten and executes each batch in one round trip. For large update operations, this cuts the number of database round trips to roughly one-tenth of the naive approach.

Best of all, we don't have to do anything to enable batching. LightSpeed batches create/update/delete operations by default, so the order entry page persists its data quickly out of the box.
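The arithmetic of batching can be sketched in a few lines, assuming the batch size of ten described above (the function name and structure are illustrative, not LightSpeed internals):

```python
# Sketch of command batching: each batch of commands costs one round trip.
# Assumes the batch size of 10 described above; illustrative only.
def batched_round_trips(commands, batch_size=10):
    trips = 0
    for start in range(0, len(commands), batch_size):
        _batch = commands[start:start + batch_size]  # sent as one command
        trips += 1
    return trips

# 1 Order + 100 OrderLine inserts: 101 commands.
inserts = ["INSERT Order"] + [f"INSERT OrderLine {i}" for i in range(100)]
print(batched_round_trips(inserts, batch_size=1))   # 101 (naive, one each)
print(batched_round_trips(inserts))                 # 11  (batches of ten)
```

One hundred and one commands become eleven round trips, which is where the "roughly one-tenth" figure comes from.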

Level 1 Cache

Now let's look at the performance of pages involving users and their permissions. Suppose there is a User entity that is associated with permissions, has properties such as the user name, and also has a personalization property that tells application components how to display data. A single page may need to load the current user object in several places: the controller checks the user's permissions, the title bar displays the user name, and a data component reads the user's display preferences. Here is the performance question: if the object could be cached in memory and reused, that would be much faster than querying the database repeatedly.

Although a good framework such as MVC can help reduce loading the same object in multiple places, the more general solution is a first-level cache. LightSpeed is built around the unit-of-work pattern, and its UnitOfWork type provides a first-level cache. Applications that follow the one-unit-of-work-per-request pattern therefore get a first-level cache scoped to each page request. Specifically, during a page request, whenever data is queried by ID (including the queries issued when accessing lazily loaded associations), if the unit of work already contains an object with that ID, LightSpeed bypasses the database query and returns the existing object. Nothing is faster than that!
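The first-level cache is essentially an identity map scoped to the unit of work. Here is a sketch with a hypothetical `find_user` API; it is not LightSpeed code:

```python
# Sketch of a first-level cache (identity map) scoped to a unit of work.
# Hypothetical API, not LightSpeed code.
class FakeDb:
    def __init__(self):
        self.queries = 0

class UnitOfWork:
    def __init__(self, db):
        self.db = db
        self._identity_map = {}

    def find_user(self, user_id):
        if user_id in self._identity_map:
            return self._identity_map[user_id]   # cache hit: no query
        self.db.queries += 1                     # cache miss: hit the database
        user = {"id": user_id, "name": f"user-{user_id}"}
        self._identity_map[user_id] = user
        return user

db = FakeDb()
uow = UnitOfWork(db)
a = uow.find_user(42)   # the controller checks permissions
b = uow.find_user(42)   # the title bar shows the name
c = uow.find_user(42)   # a data component reads display preferences
print(db.queries)       # 1: only the first lookup touched the database
print(a is b is c)      # True: all three places share the same object
```

A side benefit of the identity map is object identity: every part of the request sees the very same instance, not three copies that could drift apart.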

Most full-featured ORM frameworks include a similar feature; for example, NHibernate's session object provides a first-level cache. Many micro-ORMs (lightweight ORM frameworks), however, do not: they focus only on how efficiently entities are loaded. A full-featured ORM framework not only tries to query as efficiently as possible, it first tries not to query the database at all.

The first-level cache is managed automatically by the ORM. Our user object is reused throughout the unit of work (one page request) without any code on our part.

Level 2 Cache

Suppose our order management system handles multiple currencies: orders can be placed in dollars, euros, or yen. To display monetary amounts properly, we need currency information such as the currency name (US dollar), code (USD), and symbol ($). Let's define a Currency entity type and use it to explore the second-level cache.

Eager loading and the first-level cache already get us a long way, but there is still more performance to be had. Because a first-level cache is scoped to its unit of work, and a unit of work lives for one page request, the application queries the currency table every time it serves a page that needs currency information. But currency information is reference data: it almost never changes. There is no need to query the database on every request just to get the "latest" values. It is far more efficient to query the database once, cache the reference data, and serve every subsequent page request from the cache.

This is what the second-level cache does. LightSpeed's second-level cache has a longer lifetime than any single UnitOfWork, and you decide how long cached objects survive (the Expiry setting controls the cache duration). LightSpeed ships with two second-level cache implementations: one uses the ASP.NET cache mechanism, and the other uses memcached, a powerful open-source cache that can span several servers. Some other ORM frameworks also provide a second-level cache, but most do not.

By letting LightSpeed keep Currency objects in the second-level cache, we reuse the currency data and avoid the overhead of repeated database queries. To cache Currency objects in the second-level cache, specify a cache implementation in the configuration, then select the Currency entity and set its Cached option to true.
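The behavior of an expiring second-level cache can be sketched as follows. The TTL logic and API are illustrative, not LightSpeed's implementation; the injectable clock just makes expiry easy to demonstrate.

```python
# Sketch of a second-level cache with expiry, shared across units of work.
# Illustrative only; not LightSpeed's implementation.
import time

class SecondLevelCache:
    def __init__(self, expiry_seconds, clock=time.monotonic):
        self.expiry = expiry_seconds
        self.clock = clock
        self._store = {}

    def get_or_load(self, key, loader):
        entry = self._store.get(key)
        now = self.clock()
        if entry is not None and now - entry[1] < self.expiry:
            return entry[0]                 # still fresh: no database query
        value = loader()                    # missing or expired: reload
        self._store[key] = (value, now)
        return value

queries = 0
def load_currencies():
    global queries
    queries += 1
    return [("USD", "$"), ("EUR", "€"), ("JPY", "¥")]

fake_now = [0.0]
cache = SecondLevelCache(expiry_seconds=300, clock=lambda: fake_now[0])
cache.get_or_load("currencies", load_currencies)   # first request: 1 query
cache.get_or_load("currencies", load_currencies)   # later requests: cached
print(queries)                                     # 1
fake_now[0] = 301.0                                # past the expiry window
cache.get_or_load("currencies", load_currencies)   # reloads after expiry
print(queries)                                     # 2
```

Reference data like currencies is queried once per expiry window instead of once per page request, no matter how many requests arrive in between.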


Compiled Queries

The examples above have ignored one source of overhead: translating a C# LINQ expression into the SQL statement that is ultimately sent to the database. This cost is paid on every LINQ query, although it is usually insignificant compared with the cost of the database query itself. If you really want to squeeze out the last bit of performance on your server, you can reduce this translation overhead. You might be tempted to drop down to native SQL, but in recent ORMs (including LightSpeed) there is a way to reduce the translation overhead while keeping the convenience of LINQ.

LightSpeed eliminates the translation overhead with compiled queries. A compiled query is built from an ordinary LINQ query: LightSpeed translates the LINQ statement into an immediately executable form once and saves it, so every subsequent execution reuses the translated form instead of translating again. You get the performance without writing and maintaining native SQL.
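The payoff is easy to see in a sketch: the expensive translation step runs once at compile time, and every execution reuses the saved result. The translation function and classes here are stand-ins, not LightSpeed internals.

```python
# Sketch of why compiling a query helps: translation runs once, then every
# execution reuses the saved SQL. Illustrative only.
translations = 0

def translate_to_sql(predicate_description):
    """Stand-in for the LINQ-to-SQL translation step."""
    global translations
    translations += 1
    return f"SELECT * FROM Orders WHERE {predicate_description}"

class CompiledQuery:
    def __init__(self, predicate_description):
        # Translation happens exactly once, at compile time.
        self.sql = translate_to_sql(predicate_description)

    def execute(self, **params):
        # Executions reuse self.sql; only the parameter values change.
        return (self.sql, params)

query = CompiledQuery("CustomerId = @id")
for customer_id in range(1000):
    query.execute(id=customer_id)
print(translations)   # 1: a thousand executions, one translation
```

This is also why the compiled query must be parameterized: the saved SQL is fixed, so anything that varies between executions has to be passed in as a parameter.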

In fact, counter-intuitively, a compiled query can even outperform hand-written SQL. When LightSpeed executes hand-written SQL, it cannot infer in advance what the result set will look like. When it executes a compiled query, it can (because it built the SQL itself), so it can optimize how it materializes the query results into entity objects.

Compiling a LINQ query is more fiddly than the techniques discussed so far (some ORM developers are working on making the process more convenient). The reason is that you must store the compiled query and reuse it, and it must be parameterized, with dynamic parameter values supplied through the relevant APIs at compile and execution time.

Let's look at a query that fetches a customer's orders:

int id = /* get the customer ID from somewhere */;
var customerOrders = unitOfWork.Orders.Where(o => o.CustomerId == id);

If this LINQ query runs many times and we want the highest possible performance, we can compile it with the Compile() extension method. We also need to replace the local variable id with a parameter whose value is supplied when the compiled query executes. The compiled version looks like this:

var customerOrdersQuery = unitOfWork.Orders
                                    .Where(o => o.CustomerId == CompiledQuery.Parameter("id"))
                                    .Compile();

We replaced the local variable id with CompiledQuery.Parameter("id") and then called the Compile() extension method. The result is a CompiledQuery object, which is typically stored in a long-lived object or as a member of a static class. We can now execute the compiled query like this:

int id = /* get customer ID from somewhere */;
var results = customerOrdersQuery.Execute(unitOfWork, CompiledQuery.Parameters(new { id }));

(If you are really determined, you can also tune how the parameter values are resolved to squeeze out a little more query performance; see the LightSpeed documentation for details.)

Conclusion

Many developers assume that object-relational mapping trades performance for programming convenience. But modern ORM frameworks package up eager loading, batched updates, and other techniques that are complicated to implement in a hand-written data-access layer. As the techniques above show, ORM code can be as efficient as hand-written data-access code, and you no longer have to manage and maintain complex SQL and mapping code. With an ORM framework, fixing an N+1 problem means flipping a flag or editing a mapping file, which is far easier than rewriting nested SQL and updating mapping code to process multiple result sets.

Not every ORM framework provides all the features discussed in this article, but most modern ORM frameworks support at least some of them. Identify the database-related performance bottlenecks in your application, then use an ORM framework that supports these features, and you can eliminate most database performance bottlenecks and greatly improve application performance with a modest amount of effort.
