How to Improve Ruby on Rails Performance
1. Introduction
People often say that Rails is slow, and the complaint has become commonplace in the Ruby and Rails communities. It is not true. As long as you use Rails correctly, it is not hard to make your application run 10 times faster. So how do you optimize your application? Read on.
1.1 How to optimize a Rails app
A Rails application is usually slow for one of two reasons:
- Ruby and Rails are the wrong tool for the job (they are being used for something they are not good at).
- Excessive memory consumption leads to a large amount of time spent in garbage collection.
Rails is a pleasant framework, and Ruby is a concise and elegant language. But when they are misused, performance suffers badly. Plenty of jobs are a poor fit for Ruby and Rails, and you are better off using other tools for them. For example, databases have a clear advantage at processing large amounts of data, and the R language is particularly well suited to statistics.
Memory is the primary reason Ruby applications are slow. The 80-20 rule of Rails performance optimization goes like this: 80% of the speedup comes from memory optimization, and the remaining 20% from everything else. Why does memory consumption matter so much? Because the more memory you allocate, the more work the Ruby GC (Ruby's garbage collector) has to do. Rails already has a large memory footprint: on average, an application takes nearly 100 MB right after startup. If you do not keep memory under control, your process can easily grow past 1 GB. With that much memory to collect, it is no wonder that GC takes up most of the execution time.
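To make the link between allocation and GC work concrete, here is a rough sketch that uses only Ruby's built-in GC.count and GC.stat. The allocate_rows helper and the workload sizes are made up for illustration, and the exact numbers will vary between Ruby versions and machines.

```ruby
# Hypothetical helper that allocates one short string per "row".
def allocate_rows(count)
  Array.new(count) { |i| "row number #{i}" }
end

# Runs the block and reports how many GC runs and object allocations it caused.
def measure
  runs_before = GC.count
  objects_before = GC.stat(:total_allocated_objects)
  yield
  {
    gc_runs: GC.count - runs_before,
    allocated: GC.stat(:total_allocated_objects) - objects_before
  }
end

small = measure { allocate_rows(1_000) }
large = measure { 20.times { allocate_rows(200_000) } }

puts "small workload: #{small[:allocated]} objects, #{small[:gc_runs]} GC runs"
puts "large workload: #{large[:allocated]} objects, #{large[:gc_runs]} GC runs"
```

The larger workload allocates millions of objects and forces the collector to run repeatedly, which is the effect described above.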
2. How can we make a Rails application run faster?
There are three ways to make your application faster: scaling, caching, and code optimization.
Scaling is easy these days. Heroku basically does it for you, and Hirefire automates the process further. You can learn more about autoscaling here. Other hosting environments offer similar solutions. By all means, use it if you can. But keep in mind that scaling is not a silver bullet for performance. If your application takes five minutes to serve a single request, scaling will not help. Also, with Heroku + Hirefire it is easy to overdraw your bank account. I have seen Hirefire scale an application up to 36 instances, and I paid $3100 for it. I immediately removed two instances and optimized the code.
Caching in Rails is also easy to implement. Fragment caching in Rails 4 is very good. The Rails documentation is an excellent source of knowledge about caching. There is also an article on Rails performance by Cheyne Wallace that is worth reading. Memcached is easy to set up these days. But, like scaling, caching is not the ultimate solution to performance problems. If your code is not working optimally, you will find yourself spending more and more resources on caching, until caching can no longer speed anything up.
The only reliable way to make your Rails application faster is code optimization. In Rails, that means memory optimization. Naturally, if you take my advice and avoid using Rails for things it was not designed for, you will have less code to optimize.
2.1 Avoid memory-intensive Rails features
Some Rails features consume a lot of memory and cause extra garbage collection. Here is the list.
2.1.1 Serializers
A serializer is a convenient way to represent strings from the database as Ruby data types.
class Smth < ActiveRecord::Base
  serialize :data, JSON
end
Smth.find(...).data
Smth.find(...).data = { ... }
But the convenience comes with a 3x memory overhead. If you store 100 MB in the data column, expect to allocate 300 MB just to read it from the database.
It is more memory efficient to do the serialization yourself, like this:
class Smth < ActiveRecord::Base
  def data
    JSON.parse(read_attribute(:data))
  end

  def data=(value)
    write_attribute(:data, value.to_json)
  end
end
This has only a 2x memory overhead. Some people, including me, have seen Rails' JSON serializers leak memory, about 10% of the data size per request. I do not understand the reason behind this, and I do not know whether there is a reproducible case. If you have seen this yourself or know how to reduce the memory use, please let me know.
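One way to check the overhead on your own data is to count what a single parse allocates. The sketch below uses only the standard json library and GC.stat. The payload is synthetic (an assumption, not a real serialized column), so treat the output as a way of measuring rather than as representative numbers.

```ruby
require 'json'

# Synthetic payload standing in for the contents of a serialized column.
payload = {
  'items' => Array.new(10_000) { |i| { 'id' => i, 'name' => "item #{i}" } }
}
raw = payload.to_json

# Count the objects created while turning the raw string back into Ruby data.
before = GC.stat(:total_allocated_objects)
parsed = JSON.parse(raw)
allocated = GC.stat(:total_allocated_objects) - before

puts "raw JSON string: #{raw.bytesize} bytes"
puts "objects allocated by JSON.parse: #{allocated}"
```

Every hash and string in the parsed structure is a separate Ruby object, which is where the memory beyond the raw string size goes.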
2.1.2 ActiveRecord
It is easy to manipulate data with ActiveRecord. But ActiveRecord is essentially a wrapper around your data. If you have 1 GB of data in a table, its ActiveRecord representation will take 2 GB, and in some cases more. Yes, in 90% of cases that overhead buys you extra convenience. But sometimes you do not need it. Batch updates are one case where you can avoid ActiveRecord's overhead. The following code neither instantiates any models nor runs validations and callbacks.
Book.where('title LIKE ?', '%Rails%').update_all(author: 'David')
Behind the scenes, it executes just one SQL UPDATE statement:
update books
  set author = 'David'
  where title LIKE '%Rails%'
Another example is iteration over a large dataset. Sometimes you need only the data. No typecasting, no updates. This snippet just runs the query and avoids ActiveRecord altogether:
# execute is called on the database connection, not on ActiveRecord::Base itself
result = ActiveRecord::Base.connection.execute('select * from books')
result.each do |row|
  # do something with row.values_at('col1', 'col2')
end
2.1.3 String callbacks
Rails callbacks such as before/after save and before/after action are heavily used. But the way you write them can affect performance. There are three ways to write, for example, a before-save callback:
before_save :update_status

before_save do |model|
  model.update_status
end

before_save "self.update_status"
The first two ways work well, but the third does not. Why? Because to execute a string callback, Rails has to store the execution context (variables, constants, global instances, and so on) that exists at the time of the callback. If your application is large, you end up copying a lot of data in memory. And because the callback can be executed at any time, that memory cannot be reclaimed until your program ends.
Writing callbacks as symbols saved me 0.6 seconds per request.
2.2 Write less Ruby
This is my favorite step. My university computer science professor liked to say that the best code is the code that does not exist. Sometimes the task at hand is better done with other tools, most commonly the database. Why? Because Ruby is bad at processing large datasets. Very, very bad. Remember, Ruby has a very large memory footprint. For example, to process 1 GB of data you may need 3 GB of memory or more, and garbage collecting those 3 GB takes dozens of seconds. A good database can process the same data in one second. Let me give you some examples.
2.2.1 Attribute preloading
Sometimes a model's attributes are stored in another database table. For example, imagine we are building a TODO list made up of tasks. Each task can have one or more tags. The normalized data model looks like this:
Tasks
  id
  name
Tags
  id
  name
Tasks_Tags
  tag_id
  task_id
To load tasks together with their tags in Rails, you would do this:
tasks = Task.find(:all, :include => :tags)
> 0.058 sec
There is a problem with this code. It creates an object for every tag, which takes a lot of memory. The alternative is to preload the tags in the database.
tasks = Task.select <<-END
  *,
  array(
    select tags.name from tags inner join tasks_tags on (tags.id = tasks_tags.tag_id)
    where tasks_tags.task_id = tasks.id
  ) as tag_names
END
> 0.018 sec
This requires memory only for one additional column holding an array of tags. No wonder it is 3x faster.
2.2.2 Data aggregation
By data aggregation I mean any code that summarizes or analyzes datasets. These operations can be simple sums or something more complex. Take group rank as an example. Suppose we have a dataset of employees, departments, and salaries, and we want to calculate each employee's salary rank within their department.
SELECT * FROM empsalary;

  depname  | empno | salary
-----------+-------+--------
 develop   |     6 |   6000
 develop   |     7 |   4500
 develop   |     5 |   4200
 personnel |     2 |   3900
 personnel |     4 |   3500
 sales     |     1 |   5000
 sales     |     3 |   4800
You can use Ruby to calculate the ranking:
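# Load the records into an array; this assumes Empsalary defines attr_accessor :rank.
salaries = Empsalary.all.to_a
# Sort by department, then by salary from highest to lowest,
# so that rank 1 is the top salary, as in the SQL version.
salaries.sort_by! { |s| [s.depname, -s.salary] }
key, counter = nil, nil
salaries.each do |s|
  if s.depname != key
    key, counter = s.depname, 0
  end
  counter += 1
  s.rank = counter
end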
With 100K rows in the empsalary table, this program finishes in 4.02 seconds. The alternative PostgreSQL query uses a window function to do the same job more than 4 times faster, in 1.1 seconds:
SELECT depname, empno, salary, rank ()
OVER (partition by depname order by salary DESC)
FROM empsalary;
  depname  | empno | salary | rank
-----------+-------+--------+------
 develop   |     6 |   6000 |    1
 develop   |     7 |   4500 |    2
 develop   |     5 |   4200 |    3
 personnel |     2 |   3900 |    1
 personnel |     4 |   3500 |    2
 sales     |     1 |   5000 |    1
 sales     |     3 |   4800 |    2
A 4x speedup is impressive, and sometimes you get more, up to 20x. Here is an example from my own experience. I had a three-dimensional OLAP cube with 600K data rows, and my program did slicing and aggregation. In Ruby, it took about 90 seconds and 1 GB of memory. The equivalent SQL query finished in 5 seconds.
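For readers without a database at hand, the Ruby ranking loop from this section can be tried on the sample rows using plain structs. This is only a sketch: the Salary struct stands in for the ActiveRecord model (an assumption made so the code runs without Rails), and salaries are sorted from highest to lowest so that the ranks match the window function output.

```ruby
# Plain struct standing in for the Empsalary model (hypothetical).
Salary = Struct.new(:depname, :empno, :salary, :rank)

rows = [
  Salary.new('sales', 3, 4800),
  Salary.new('develop', 7, 4500),
  Salary.new('personnel', 4, 3500),
  Salary.new('develop', 6, 6000),
  Salary.new('sales', 1, 5000),
  Salary.new('develop', 5, 4200),
  Salary.new('personnel', 2, 3900)
]

# Assigns a 1-based rank within each department, highest salary first.
def rank_within_department(rows)
  sorted = rows.sort_by { |s| [s.depname, -s.salary] }
  key = nil
  counter = 0
  sorted.each do |s|
    if s.depname != key
      key = s.depname
      counter = 0
    end
    counter += 1
    s.rank = counter
  end
  sorted
end

ranked = rank_within_department(rows)
ranked.each do |s|
  puts format('%-10s %5d %7d %5d', s.depname, s.empno, s.salary, s.rank)
end
```

One difference from SQL is worth noting: rank() gives tied salaries the same rank, while this loop numbers every row separately.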
2.3 Unicorn optimization
If you are using Unicorn, the following optimization tips apply. Unicorn is the fastest web server for Rails, but you can still make it run faster.
2.3.1 Preload the app
Unicorn can preload the Rails application before forking worker processes. This has two advantages. First, the master process can share memory with its workers thanks to the copy-on-write friendly GC (Ruby 2.0 and above). The operating system transparently copies the shared data only when a worker modifies it. Second, preloading reduces worker startup time. Restarting Rails workers is quite common (more on this later), so the faster the workers restart, the better the performance.
To enable application preloading, add one line to the Unicorn configuration file:
preload_app true
2.3.2 GC between requests
Remember that GC can take up to 50% of your application's run time. That is not the only problem. GC is also unpredictable and gets triggered exactly when you do not want it to run. So what can you do?
First, what would happen if we disabled GC completely? That seems like a bad idea. Your application could quickly fill 1 GB of memory before you even notice. If your server runs several workers at the same time, it will run out of memory very soon, even on a self-hosted server, not to mention Heroku with its 512 MB memory limit.
There is a better way. If GC cannot be avoided, we can try to make the time at which it runs predictable, and run it when the worker is idle. For example, run GC between two requests. This is easy to achieve by configuring Unicorn.
For versions earlier than Ruby 2.1, there is a unicorn module called OobGC:
require 'unicorn/oob_gc'
use(Unicorn::OobGC, 1) # "1" means "force GC to run after every 1 request"
For Ruby 2.1 and later versions, it is best to use gctools (https://github.com/tmm1/gctools):
require 'gctools/oobgc'
use(GC::OOB::UnicornMiddleware)
However, running GC between requests comes with caveats. Most importantly, this technique improves only perceived performance. That is, users will clearly notice the speedup, but the server actually does more work. Instead of running GC only when necessary, this technique makes the server run GC frequently. So make sure that your server has enough resources to run GC, and that there are enough workers to handle user requests while other workers are busy with GC.
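The idea of collecting between requests can be illustrated with a toy request loop. This is a simplified sketch, not how Unicorn::OobGC or gctools are implemented: the OutOfBandGcLoop class, the lambda app, and the request paths are all hypothetical. The point is only the ordering, where the response is handed over first and GC is forced afterwards.

```ruby
# Toy request loop (hypothetical): deliver the response, then force GC.
class OutOfBandGcLoop
  def initialize(app, interval: 1)
    @app = app            # any object responding to call(env)
    @interval = interval  # force a GC run after this many requests
    @served = 0
  end

  def serve(env)
    response = @app.call(env)
    yield response if block_given? # "send" the response before collecting
    @served += 1
    GC.start if (@served % @interval).zero?
    response
  end
end

# Hypothetical app that creates some garbage on every request.
app = lambda do |env|
  rows = Array.new(1_000) { |i| "#{env[:path]} row #{i}" }
  [200, {}, [rows.size.to_s]]
end

server = OutOfBandGcLoop.new(app, interval: 2)
bodies = []
gc_before = GC.count
6.times do |i|
  server.serve({ path: "/tasks/#{i}" }) { |response| bodies << response[2].first }
end
gc_runs = GC.count - gc_before

puts "GC runs while serving 6 requests: #{gc_runs}"
```

With an interval of 2, at least three collections are forced while serving six requests, none of them before a response has been delivered. That extra work is the cost the caveat above refers to.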
2.4 Limit growth
I have shown you several examples of an application growing to 1 GB of memory. If you have memory to spare, taking such a large chunk is not a big problem in itself. However, Ruby may never return that memory to the operating system. Let me explain why.
Ruby allocates memory from two heaps. All Ruby objects are stored in Ruby's own heap, where each object takes 40 bytes (on a 64-bit system). When an object needs more memory than that, it allocates the extra space on the operating system's heap. When the object is garbage collected and freed, the space it took on the operating system's heap is returned to the operating system, but the space it took in Ruby's own heap is simply marked as free and is not returned.
This means that Ruby's heap only grows and never shrinks. Imagine reading 1 million rows with 10 columns each from the database. You need to allocate at least 10 million objects to store that data. A Ruby worker typically takes about 100 MB of memory after startup. To accommodate the data, the worker needs an additional 400 MB (10 million objects at 40 bytes each). Even after those objects are eventually collected, the worker still holds on to about 500 MB.
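The arithmetic behind this example can be worked through in a few lines. The figures are the ones used in this article: 40 bytes per object slot on a 64-bit system, and the startup footprint of nearly 100 MB mentioned in the introduction. Megabytes are decimal here, and the result is an estimate rather than a measurement.

```ruby
ROWS = 1_000_000
COLUMNS = 10
BYTES_PER_OBJECT = 40  # size of one object slot, as quoted above
STARTUP_MB = 100       # approximate worker size right after startup

objects = ROWS * COLUMNS                            # one object per column value
extra_mb = objects * BYTES_PER_OBJECT / 1_000_000   # heap growth in MB
total_mb = STARTUP_MB + extra_mb

puts "objects to allocate: #{objects}"
puts "additional heap:     #{extra_mb} MB"
puts "worker size after:   #{total_mb} MB"
```

This counts only the fixed-size slots in Ruby's own heap. Long strings and other large values take additional space on the operating system's heap.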
To be fair, Ruby GC is able to shrink the heap. However, I have yet to see that happen in practice. In production, the conditions that trigger heap shrinking rarely occur.
If your workers can only grow, the obvious solution is to restart a worker whenever it takes too much memory. Some hosting services, such as Heroku, do this for you. Let's look at other ways to achieve the same thing.
2.4.1 Internal memory control
Trust in God, but lock your car. There are two ways for your application to limit its own memory. I call them kind and hard.
The kind memory limit checks the memory size after each request. If the worker has grown too large, it exits, and Unicorn creates a new worker. That is why I call it "kind": it does not interrupt your application.
To get the memory size of the process, use the RSS metric on Linux and MacOS, or the OS gem on Windows. Here is how to implement this limit in the Unicorn configuration file:
class Unicorn::HttpServer
  KIND_MEMORY_LIMIT_RSS = 150 # MB

  alias process_client_orig process_client
  undef_method :process_client

  def process_client(client)
    process_client_orig(client)
    # ps reports RSS in kilobytes; convert it to megabytes
    rss = `ps -o rss= -p #{Process.pid}`.chomp.to_i / 1024
    exit if rss > KIND_MEMORY_LIMIT_RSS
  end
end
The hard memory limit asks the operating system to kill your worker process if it grows too large. On Unix, you can call setrlimit to set the RSS limit. As far as I know, this only works on Linux; the MacOS implementation is broken. I would be grateful for any new information.
This snippet from a Unicorn configuration file sets the hard limit:
after_fork do |server, worker|
  worker.set_memory_limits
end

class Unicorn::Worker
  HARD_MEMORY_LIMIT_RSS = 600 # MB

  def set_memory_limits
    # setrlimit expects bytes, so convert the limit from megabytes
    Process.setrlimit(Process::RLIMIT_AS, HARD_MEMORY_LIMIT_RSS * 1024 * 1024)
  end
end
2.4.2 External memory control
Self-control will not save you from the occasional OOM (out of memory) error. You should usually set up some external tools as well. On Heroku there is no need, because they have their own monitoring. But if you host the application yourself, it is a good idea to use monit, god, or another monitoring solution.
2.5 Ruby GC optimization
In some cases you can tune the Ruby GC to improve its performance. That said, these GC tweaks are becoming less and less important. The default settings in Ruby 2.1 already work well for most people.
To tune the GC, you need to know how it works. That is a separate topic beyond the scope of this article. To learn more, read Sam Saffron's article Demystifying the Ruby GC. My upcoming book on Ruby performance digs deeper into the details of Ruby GC. Subscribe to it, and I will send you an email once the beta version of the book is ready.
My advice is not to change the GC settings unless you know exactly what you want to achieve and have enough theoretical knowledge of how the change will improve performance. This is especially true for users of Ruby 2.1 or later.
I know of only one case where GC tuning improves performance: when you need to load a large amount of data at once. You can reduce how often GC runs by changing the following environment variables: RUBY_GC_HEAP_GROWTH_FACTOR, RUBY_GC_MALLOC_LIMIT, RUBY_GC_MALLOC_LIMIT_MAX, RUBY_GC_OLDMALLOC_LIMIT, and RUBY_GC_OLDMALLOC_LIMIT_MAX.
Note that these variables only apply to Ruby 2.1 and later. In versions earlier than 2.1, some of them may be missing or go by different names.
RUBY_GC_HEAP_GROWTH_FACTOR defaults to 1.8. It determines by how much the Ruby heap grows each time it runs out of space for new allocations. When you need to work with a large number of objects, you want the heap to grow faster. In that case, increase the factor.
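To get a feel for what the growth factor changes, the sketch below counts how many times a heap has to grow before it can hold a given number of object slots. It is a deliberately simplified model: the starting size and the alternative factor of 4.0 are made-up figures, and the real heap is also subject to other settings.

```ruby
# Counts how many growth steps are needed to reach target_slots when the
# heap is multiplied by `factor` on every step (simplified model).
def growth_steps(start_slots, target_slots, factor)
  steps = 0
  slots = start_slots.to_f
  while slots < target_slots
    slots *= factor
    steps += 1
  end
  steps
end

start_slots = 10_000        # hypothetical initial heap size
target_slots = 10_000_000   # e.g. 1 million rows with 10 columns each

default_steps = growth_steps(start_slots, target_slots, 1.8)
larger_steps = growth_steps(start_slots, target_slots, 4.0)

puts "factor 1.8: #{default_steps} growth steps"
puts "factor 4.0: #{larger_steps} growth steps"
```

Fewer growth steps means fewer occasions on which the heap runs out of room during a big load, at the price of a heap that overshoots the size actually needed.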
The malloc limits define how often GC is triggered when you allocate space on the operating system's heap. For Ruby 2.1 and later versions, the default limits are:
New generation malloc limit          RUBY_GC_MALLOC_LIMIT         16M
Maximum new generation malloc limit  RUBY_GC_MALLOC_LIMIT_MAX     32M
Old generation malloc limit          RUBY_GC_OLDMALLOC_LIMIT      16M
Maximum old generation malloc limit  RUBY_GC_OLDMALLOC_LIMIT_MAX  128M
Let me briefly explain what these values mean. With the settings above, Ruby runs GC every time between 16M and 32M is allocated by new objects, and every time between 16M and 128M is allocated by old objects (an old object is one that has survived at least one garbage collection). Ruby dynamically adjusts the current limit within those bounds based on your memory usage pattern.
So when you have only a few objects that take a lot of memory (for example, when reading a large file into a single string), you can increase the limits to make GC trigger less often. Remember to increase all four limits together, preferably by a multiple of their default values.
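Before changing the limits, it can help to look at what the running process currently reports. The sketch below reads the malloc counters from GC.stat. The key names are the ones used by recent MRI versions, which is an assumption: other Ruby versions name them differently, so a missing key is reported instead of raising an error.

```ruby
# GC.stat keys for the malloc counters and their current limits
# (names as in recent MRI versions; treated as optional).
LIMIT_KEYS = %i[
  malloc_increase_bytes
  malloc_increase_bytes_limit
  oldmalloc_increase_bytes
  oldmalloc_increase_bytes_limit
].freeze

# Returns a hash of key => human-readable value for the given stats hash.
def malloc_limit_report(stats = GC.stat)
  LIMIT_KEYS.map do |key|
    value = stats[key]
    text = if value
             format('%.1f MB', value / (1024.0 * 1024.0))
           else
             'not reported by this Ruby'
           end
    [key, text]
  end.to_h
end

malloc_limit_report.each { |key, text| puts "#{key}: #{text}" }
```

The limits themselves are set through the environment variables listed above, with values given in bytes, before the Ruby process starts.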
My advice may differ from that of others. What works for me may not work for you. These articles describe what works for Twitter and what works for Discourse.
2.6 Profile
Sometimes this advice will not apply as is, and you need to find out what your own problem is. For that, use a profiler. ruby-prof is the tool every Ruby developer uses.
Want to know