Some performance issues when using Entity Framework

Source: Internet
Author: User
Tags what sql


Since I used EF, I have been very concerned about whether there are potential performance problems every time. So every time I write a LINQ query, I will use SQL profiler to check the actually generated SQL statements to find potential performance problems. We strongly recommend that you do this to avoid problems with the software in the future.

1. Select only one or more columns

Sometimes, writing LINQ in C # Looks comfortable, but the performance is not necessarily good, so it is necessary to make some adjustments. For example:

I need to know the number of clicks in an article. I may write:

context.Post.FirstOrDefault(p => p.Id == postId).Hits;

Or:

context.Post.Find(postId).Hits;

I expect that they will only filter the data in the hits column in the database. However, through SQL profiler, we will find that these two statements actually gave all the columns to the SELECT statement, the access to hits is actually performed in the memory.

Although small tables cannot see performance problems, in case a column in your table stores file byte streams, such operations may be slow and consume additional network transmission, therefore, we cannot ignore this issue.

In fact, I only need to make a slight adjustment to avoid this problem, but the LINQ statement is ugly:

context.Post.Where(p => p.Id == postId).Select(p => p.Hits).FirstOrDefault();

The final native SQL generated by LINQ to SQL is as follows:

exec sp_executesql N‘SELECT TOP (1) [Extent1].[Hits] AS [Hits]FROM [dbo].[Post] AS [Extent1]WHERE [Extent1].[Id] = @p__linq__0‘,N‘@p__linq__0 uniqueidentifier‘,@p__linq__0=‘850C3A86-6C3D-408B-8099-61EDA559F804‘

Only one hits field is selected.

Ii. tolist () Problems

In fact, many EF performance problems are related to the query execution time. Our general intention is to first create a query expression, instead of execute, instead of build. Execution is performed only when the expression result is used.

When I was in the company code program, I saw many colleagues using EF. After I wrote the query, I liked to directly call the tolist () method. Sometimes this will cause a lot of performance problems. Because the declaration of a LINQ expression does not immediately execute the SQL query, however, once tolist () is added to the end, it will be executed immediately. If you only want to select some of the data based on the conditions, the tolist () will be filtered later, that is, it will be executed from the memory, it does not convert your condition into the SQL where statement for execution.

VaR query = from... // create a query, but do not execute... var result = query. tolist (); // execute the query immediately

Therefore, you should try to avoid looking for the desired element from the result of tolist.

Iii. iqueryable and ienumerable

I prefer iqueryable in the selection of these two interfaces. Most of the time, these two interfaces are consistent in usage, but if you want to perform an uncertain query, it means that the query expression is not one-time determined, for its results, it may be decided by other classes to select the items. In this case, iqueryable is used.

For example, I have a data layer method:

public IEnumerable<EdiBlog.Core.Entities.Post> GetAllPost(){    return context.Post;}

Obviously, it will be called by other methods in the system, and these callers expect different results. The common operation is to merge a where statement:

var myResult = postDa.GetAllPost().Where(...)

However, unfortunately, the conditions in the where statement are not converted to native SQL and are filtered in the memory. This is a relatively negative performance issue. So at the beginning of this article, I suggest you use SQL profiler to see how your own LINQ is executed.

If you change the return type to iqueryable, your where statement can be converted to SQL Execution.

public IQueryable<EdiBlog.Core.Entities.Post> GetAllPost(){    return context.Post;}

There is a good post on the two interfaces on stackoverflow at the diaosi Forum. You can take a look at them by yourself:

Http://stackoverflow.com/questions/252785/what-is-the-difference-between-iqueryablet-and-ienumerablet

"Ienumerable: ienumerable is best suitable for working with in-memory collection. ienumerable doesn't move between items, it is forward only collection.

Iqueryable: iqueryable best suits for remote data source, like a database or web service. iqueryable is a very powerful feature that enables a variety of interesting deferred execution scenarios (like paging and composition based queries )."

There is also an intuitive answer on the msdn Forum:

Iqueryable returns a "queryable" that is a query you cocould still be enriched before really sending it to the server.

Ienumerable returns a list that is the actual querying took place and you get the results. tolist is isued to force running the query and returning these enumerable results...

So in short:
-Use iqueryable if you want to return a base query that cocould be further enhanced before running it server side (by enumerating its items )..
-Use ienumerable/tolist if you want to return a list that has been retrieved from the DB

4. Number of computations, count () and count

This is the most vulnerable and a very serious performance issue. When we need to count the number of records that meet a certain condition, we want the SQL statement to be in the form of select count. However, the following seemingly natural statement will lead to undesirable results:

context.Category.FirstOrDefault(p => p.Name == categoryName).Posts.Count;

This is the statement used in my blog to count the number of articles under a certain category. Of course, this is not the case because of performance problems. The SQL statement is not select count, but divided into two. The following is what SQL profiler caught:

exec sp_executesql N‘SELECT TOP (1) [Extent1].[Id] AS [Id], [Extent1].[Name] AS [Name], [Extent1].[DisplayName] AS [DisplayName]FROM [dbo].[Category] AS [Extent1]WHERE [Extent1].[Name] = @p__linq__0‘,N‘@p__linq__0 nvarchar(4000)‘,@p__linq__0=N‘ASPNET‘exec sp_executesql N‘SELECT [Extent2].[Id] AS [Id], [Extent2].[Title] AS [Title], [Extent2].[Slug] AS [Slug], [Extent2].[PubDate] AS [PubDate], [Extent2].[PostContent] AS [PostContent], [Extent2].[Author] AS [Author], [Extent2].[CommentEnabled] AS [CommentEnabled], [Extent2].[IsPublished] AS [IsPublished], [Extent2].[Hits] AS [Hits], [Extent2].[Rators] AS [Rators], [Extent2].[Rating] AS [Rating], [Extent2].[ExposedToSiteMap] AS [ExposedToSiteMap], [Extent2].[DisplayFrom] AS [DisplayFrom], [Extent2].[DisplayTill] AS [DisplayTill], [Extent2].[LastModifyOn] AS [LastModifyOn], [Extent2].[PublishToRss] AS [PublishToRss]FROM  [dbo].[PostCategory] AS [Extent1]INNER JOIN [dbo].[Post] AS [Extent2] ON [Extent1].[PostId] = [Extent2].[Id]WHERE [Extent1].[CategoryId] = @EntityKeyValue1‘,N‘@EntityKeyValue1 uniqueidentifier‘,@EntityKeyValue1=‘3FEB11A2-6E36-4DCE-8C02-614BEF7ACC62‘

We can see that EF has done two things. The first thing is to find the category whose name is "ASPnet", and then use the ID of this category to find all its posts, the last count is actually. net in the memory. This clearly gives select all the information we don't need. We only need one count. Is it so complicated for Mao?

Review the first article I mentioned. It is not difficult to find out. Attributes accessed after firstordefault (...) are all in the memory. Therefore, when we access category. firstordefault (P => P. Name = categoryname), the first SQL statement is generated. Followed by ". Posts" is the navigation attribute of the category object. EF will use lazy load to load all the posts of this category, so the second SQL statement is generated. Then the count following it is executed in the memory.

If you want the code to generate the LINQ to SQL statement as much as possible, there is a very simple principle, that is, try to use the LINQ and lambda expressions so that EF can help us translate. There are two types of count in C. Enumerable. Count () is a method, and list. Count is an attribute. Once a thing becomes a list, you can count it again, and it must be in the memory.

Therefore, in EF, to perform the Count operation, write as follows:

context.Post.Count(p => p.Categories.Any(q => q.Name == categoryName));

In this case, count () accepts a Lambda expression, so that the "select count" clause can be accurately translated into "select count:

SELECT [GroupBy1].[A1]  AS [C1]FROM   (           SELECT COUNT(1)      AS [A1]           FROM   [dbo].[Post]  AS [Extent1]           WHERE  EXISTS (                      SELECT 1 AS [C1]                      FROM   [dbo].[PostCategory] AS [Extent2]                             INNER JOIN [dbo].[Category] AS [Extent3]                                  ON  [Extent3].[Id] = [Extent2].[CategoryId]                      WHERE  ([Extent1].[Id] = [Extent2].[PostId])                             AND ([Extent3].[Name] = ‘ASPNET‘)                  )       )                AS [GroupBy1]

The performance is much better now ~

Some performance issues when using Entity Framework

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.