Analysis of High Memory Usage of ASP.NET Core Applications on Kubernetes


Original article: https://blog.markvincze.com/troubleshooting-high-memory-usage-with-asp-net-core-on-kubernetes/

P.S.: This is not a literal, word-for-word translation of the original article; I have tried to make it a bit more accessible. If anything is wrong or unclear, please let me know. Thank you.

In production, we run our ASP.NET Core API applications on Google Cloud (Google Container Engine, GKE) via Kubernetes. We noticed that the memory usage of most of these .NET Core applications was unreasonably high. We had set the memory limit of each application to 500 MB, and many API instances were being restarted by Kubernetes again and again because they exceeded that limit (the container is killed when it goes over the limit, and Kubernetes restarts it).

The following two charts show two of our APIs: memory usage keeps climbing until it reaches the limit, at which point Kubernetes restarts the container.

[Figure: memory usage of two APIs climbing until the containers hit the memory limit and are restarted]

We spent a lot of time investigating this issue, including capturing memory dumps for analysis, but we could not find the cause.

We also tried to reproduce the problem in several environments:

  • In the dev configuration, running locally
  • On Windows with a production build
  • On Ubuntu with a production build
  • In Docker, using the actual production image

However, in none of these environments could we reproduce the issue: memory usage never came close to the limit, and in every case it leveled off at a modest, stable value.

In the meantime, we raised the memory limit from 500 MB to 1000 MB so the containers would stop being restarted so often. Interestingly, the memory usage then looked like this:

[Figure: memory usage after raising the limit to 1000 MB, plateauing instead of hitting the limit]

With the higher limit, memory usage no longer grew without bound; it stabilized at a value above the old 500 MB limit but comfortably below the new one. That number was almost identical across different container instances and across restarts.

This strongly suggested that the application in the container was not leaking memory; rather, it allocated a certain amount of memory and simply never released it. So I started to focus on how a .NET program running in Kubernetes is told about its memory limit.

Kubernetes ultimately runs the program in a Docker container, and a Docker container's memory can be capped with the docker run --memory parameter. So I suspected that Kubernetes was not passing any memory-related parameter to the container, and that the .NET runtime therefore assumed it had all of the machine's memory available.

However, that is not the case: the documentation says the opposite of what the author suspected.


The spec.containers[].resources.limits.memory value is converted to an integer and used as the value of the --memory flag in the docker run command.

(In other words, Kubernetes automatically passes the limit set in spec.containers[].resources.limits.memory to the container as the docker run --memory value.)
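For illustration only, here is roughly what such a limit looks like in a pod spec. The pod, container, and image names below are made up for this sketch; only the resources.limits.memory field comes from the documentation quoted above:

    apiVersion: v1
    kind: Pod
    metadata:
      name: aspnet-api                        # hypothetical name
    spec:
      containers:
        - name: api                           # hypothetical name
          image: example/aspnet-core-api:1.0  # hypothetical image
          resources:
            limits:
              memory: "500Mi"                 # this value is what Kubernetes hands to the container
                                              # runtime, roughly equivalent to docker run --memory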

This looked like another dead end. I also ran the API in Docker on my own machine and passed various limits via the --memory parameter, but: 1) I could not reproduce the excessive memory usage we saw in production; memory stayed at the same modest level as before. 2) The container never ran beyond its memory limit; even when I passed a --memory value smaller than what the app normally used, the container still ran fine within that smaller limit.
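As an aside, on newer .NET versions (this API did not exist when the original article was written) you can ask the runtime directly how much memory it thinks it is allowed to use, which makes this kind of experiment easier to interpret. A minimal sketch, assuming .NET Core 3.0 or later:

    using System;

    class RuntimeMemoryInfo
    {
        static void Main()
        {
            // Requires .NET Core 3.0+; not available to the original author.
            // When running in a container with a memory limit, this reflects that limit.
            var info = GC.GetGCMemoryInfo();
            Console.WriteLine($"TotalAvailableMemoryBytes: {info.TotalAvailableMemoryBytes}");
        }
    }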


I also found a GitHub issue about a suspected memory leak related to Kestrel (the libuv-based ASP.NET Core web server). There, Tim Seaward made an interesting suggestion: check how many CPUs the application reports in each environment, because the CPU count is a big factor in memory usage.

I printed this value with Environment.ProcessorCount in each environment (a minimal sketch of the check follows the list below):

  • On my machine, just doing dotnet run, the value was 4.
  • On my machine, in Docker, it was 1.
  • On Google Cloud Kubernetes, it was 8.
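Here is a minimal console sketch of that check (my own illustration, not code from the original article):

    using System;

    class ProcessorCountCheck
    {
        static void Main()
        {
            // Number of logical processors the .NET runtime believes it can use.
            // Server GC sizes its heaps and GC threads based on this value.
            Console.WriteLine($"ProcessorCount: {Environment.ProcessorCount}");
        }
    }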

This finally pointed to an explanation, because the number of CPUs really does affect how much memory is used: with Server GC, the runtime sets up a separate heap and GC resources per logical CPU, so more cores means a larger memory footprint. (At the time the author did not know about the relationship between the GC mode, the number of CPU cores, and a .NET program's memory usage, although the post he linked contains the relevant GC information.)

The final suggestion was to switch the GC from Server GC to Workstation GC, which trades some throughput for a much lower memory footprint. All you need to do is add the following to the csproj project file:

    <PropertyGroup>
      <ServerGarbageCollection>false</ServerGarbageCollection>
    </PropertyGroup>
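If you want to confirm at runtime which mode actually took effect, the runtime exposes this through GCSettings.IsServerGC; this quick check is my addition, not part of the original article:

    using System;
    using System.Runtime;

    class GcModeCheck
    {
        static void Main()
        {
            // true  -> Server GC (the default for ASP.NET Core applications)
            // false -> Workstation GC (what <ServerGarbageCollection>false</ServerGarbageCollection> selects)
            Console.WriteLine(GCSettings.IsServerGC
                ? "Running with Server GC"
                : "Running with Workstation GC");
        }
    }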

After this change, I redeployed my API; the result is shown in the chart below (blue line):


Workstation GC makes the application much more conservative with memory: usage dropped from several hundred MB to roughly 100 MB. Presumably this comes at the cost of some performance and throughput (according to the official documentation, Server GC performs better in some scenarios), but so far I have not noticed any degradation in the API's speed or throughput, although this API is not particularly performance-sensitive anyway.

 

The conclusion of this story is that the OS, the available memory, and the number of CPU cores are key factors when diagnosing memory problems, because they strongly influence how the .NET GC behaves. And if you get stuck on a problem, don't hesitate to ask about it publicly; there are many excellent people in the .NET community who are glad to help.

==========================================================================

Our company's projects ran into the same problem quite a long time ago. Why did it only show up then? On Windows the memory usage was never obvious enough to stand out, because the GC behaved differently there; but once we ran .NET Core on Docker + Linux, the symptom appeared and at the time we suspected Docker itself was the problem. Back then we did not analyze it as systematically as the original author did. Here is what I took away from this issue:

1. The troubleshooting path is worth learning from: from the symptom on Kubernetes, to reproducing it in multiple environments, to the GitHub issue, to the author's own analysis, and finally to a solution.

2. Supplementary GC background: see the post linked in the original article.

3. Many useful details are also hidden in the suggestions on that GitHub issue (worth a closer look).

4. In my earlier write-up of the official documentation, the section on the project configuration file already touches on this GC setting.


At first (before knowing that there are two GC modes), a normal person reads this setting as "true turns garbage collection on for the application". Without the original author's investigation I would not have realized that true actually means "use the Server GC mode", and that false does not mean "no GC" but "use the Workstation GC mode". It would help if Microsoft spelled out the real difference between true and false (admittedly, our own understanding of .NET was not deep enough either). Then people like me, and like the original author, would not end up asking why an ASP.NET Core application running in a Docker container uses so much memory.

P.S.: Our production configuration has been changed to false. Of course true also works, but with Server GC the server memory gets eaten up and never given back.

 

Finally, I hope you will keep supporting .NET Core, and let me put in a plug for Zhang Da: search for the "opendotnet" public account on WeChat and share your knowledge there.

