The Ruby language is often praised for its flexibility. As Dick Sites put it, you can "write programs to write programs." Ruby on Rails extends the core Ruby language, but it is Ruby itself that makes this extension possible. Rails uses the language's flexibility to make it easy to write highly structured programs without much boilerplate or extra code: you get a lot of standard behavior with no extra work. This free behavior is not always perfect, but it gives you a good architecture for very little effort.
For example, Ruby on Rails is based on the Model-View-Controller (MVC) pattern, which means that most Rails applications divide cleanly into three parts. The model contains the behavior needed to manage the application's data. Typically, in a Ruby on Rails application, models map 1:1 to database tables; ActiveRecord, the default Ruby on Rails object-relational mapper (ORM), manages the interaction between the model and the database, which means that a typical Ruby on Rails program contains little SQL code, if any. The second part, the view, contains the code needed to create the output sent to the user, usually HTML, JavaScript, and so on. The last part, the controller, translates input from the user into calls on the correct models and then renders the response with the appropriate view.
Rails advocates often credit its ease of use to the MVC paradigm, along with other features of both Ruby and Rails, and claim that few programmers using other tools can create as much functionality in as short a time. That means the money invested in software development yields more business value, so Ruby on Rails development is becoming more popular.
However, initial development cost is not the whole story. There are ongoing costs to consider, such as maintenance and the hardware needed to run the application. Ruby on Rails developers typically use tests and other agile development techniques to reduce maintenance costs, but it is easy to overlook how efficiently a Rails application runs with large amounts of data. Although Rails makes database access simple, it does not always make it efficient.
Why do Rails applications run slowly?
There are several basic reasons why Rails applications run slowly. The first is simple: Rails makes assumptions to speed up your development. These assumptions are usually correct and helpful. However, they are not always good for performance, and they can lead to inefficient use of resources, especially database resources.
For example, ActiveRecord selects all the fields in a query by default, using the SQL equivalent of SELECT *. In tables with many columns, especially when some are huge VARCHAR or BLOB fields, this behavior can be a problem for both memory usage and performance.
Another notable challenge is the n+1 problem, which this article discusses in detail. It results in many small queries being executed instead of one large query. For example, ActiveRecord cannot know ahead of time for which of a set of parent records a child record will be requested, so it generates one child-record query per parent record. Because every query carries overhead, this behavior can cause significant performance problems.
Other challenges have more to do with the development habits and attitudes of Ruby on Rails developers. Because ActiveRecord makes so many tasks easy, Rails developers often adopt an "SQL is bad" attitude and avoid SQL even where it is the better fit. Creating and manipulating large numbers of ActiveRecord objects is slow, so in some cases it is faster to write an SQL query directly and instantiate no objects at all.
Because Ruby on Rails is often used to reduce the size of the development team, and because Ruby on Rails developers typically perform some of the system administration tasks needed to deploy and maintain applications in production, problems are likely when little is known about the application's environment. The operating system and database may not be configured correctly. For example, MySQL my.cnf settings are often left at their defaults in Ruby on Rails deployments, even though the defaults are not optimal. In addition, there may be no adequate monitoring and benchmarking tools in place to provide a long-term picture of performance. This is not a criticism of Ruby on Rails developers; it is a consequence of not specializing, and some Rails developers are experts in both areas.
The final issue is that Ruby on Rails encourages developers to develop in a local environment. This has several benefits, such as reduced development latency and increased distribution, but it does mean that you may work with only a limited data set because a workstation is smaller than a server. The difference between where you develop and where the code is deployed can be a big problem. Even if an application has performed well for a long time on a small, lightly loaded local server, you may find that it has significant performance problems with large data sets on a congested server.
Of course, there may be many reasons why a Rails application has performance problems. The best way to detect potential performance problems is to use diagnostic tools that give you repeatable, accurate metrics.
Detecting performance problems
One of the best tools is the Rails development log, typically located at log/development.log on each development machine. It contains a range of metrics: the total time taken to respond to a request, the percentage of that time spent in the database, the percentage spent generating the view, and so on. There are also tools for analyzing this log file, such as development-log-analyzer.
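As a rough illustration of the kind of analysis such tools perform, the following plain-Ruby sketch extracts total and database times from request-completion lines. The line format in SAMPLE_LOG, the COMPLETED_RE pattern, and the summarize helper are all assumptions made for this example; the real log format differs between Rails versions, so the pattern would need adjusting to match your own log.

```ruby
# Hypothetical sample of request-completion lines; real formats vary by Rails version.
SAMPLE_LOG = <<~LOG
  Processing PostsController#index (for 127.0.0.1) [GET]
  Completed in 120ms (View: 40, DB: 60) | 200 OK [http://localhost/posts]
  Processing PostsController#show (for 127.0.0.1) [GET]
  Completed in 30ms (View: 20, DB: 5) | 200 OK [http://localhost/posts/1]
LOG

# Assumed pattern: total time, view time, and database time, all in milliseconds.
COMPLETED_RE = /Completed in (\d+)ms \(View: (\d+), DB: (\d+)\)/

# Returns one hash per completed request, including the share of time spent in the database.
def summarize(log_text)
  log_text.each_line.map { |line| COMPLETED_RE.match(line) }.compact.map do |m|
    total, view, db = m.captures.map(&:to_i)
    { total_ms: total, view_ms: view, db_ms: db, db_percent: (100.0 * db / total).round(1) }
  end
end

summarize(SAMPLE_LOG).each do |r|
  puts "total=#{r[:total_ms]}ms db=#{r[:db_ms]}ms (#{r[:db_percent]}% in database)"
end
```

Requests with a high database percentage are the ones to examine first with the techniques described below.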
In production, you can find a lot of valuable information by examining the MySQL slow query log (mysql_slow_log). A full description is beyond the scope of this article; see the Resources section for more information.
One of the most powerful and useful tools is the query_reviewer plugin (see Resources). This plugin shows how many queries are executed on a page and how long the page takes to generate. It also automatically analyzes the SQL code that ActiveRecord generates to identify potential problems. For example, it can find queries that do not use a MySQL index, so if you forget to index an important column and that causes a performance problem, you can find the column easily (for more information on MySQL indexes, see Resources). The plugin displays all of this information in a pop-up <div> that is visible only in development mode.
Finally, don't forget tools such as Firebug, YSlow, ping, and tracert for determining whether a performance problem comes from the network or from asset loading.
Next, let's look at some specific Rails performance issues and their solutions.
n+1 query problem
The n+1 query problem is one of the biggest problems in Rails applications. For example, how many queries does the code in Listing 1 generate? The code is a simple loop that iterates over all the posts in a hypothetical posts table and displays each post's category and body.
Listing 1. Unoptimized Post.all code
<% @posts = Post.all
   @posts.each do |p| %>
  <%= p.category.name %>: <%= p.body %>
<% end %>
Answer: The code above generates one query plus one more query per row in @posts. Because every query carries overhead, this can be a big problem. The culprit is the call to p.category.name. That call applies only to that particular post object, not to the entire @posts array. Fortunately, you can fix this with eager loading.
Eager loading means that Rails automatically executes the queries needed to load the child objects you specify. Rails uses either a JOIN SQL statement or a strategy that executes multiple queries. Either way, as long as you specify all of the child objects you will use, you never end up in an n+1 situation, in which each iteration of a loop generates an additional query. Listing 2 is a revision of the code in Listing 1 that uses eager loading to avoid the n+1 problem.
Listing 2. Post.all code optimized with eager loading
<% @posts = Post.find(:all, :include => [:category])
   @posts.each do |p| %>
  <%= p.category.name %>: <%= p.body %>
<% end %>
This code generates at most two queries, no matter how many rows are in the posts table.
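To make the query counts concrete without a database, the following plain-Ruby sketch simulates both access patterns. FakeDatabase and its methods are hypothetical stand-ins invented for this illustration, not ActiveRecord APIs; each public method call represents one SQL round trip.

```ruby
# Hypothetical in-memory stand-in for a database; every public lookup counts as one query.
class FakeDatabase
  attr_reader :query_count

  def initialize
    @query_count = 0
    @categories = { 1 => "News", 2 => "Sports" }
    @posts = (1..10).map { |i| { id: i, category_id: (i % 2) + 1 } }
  end

  # One query, comparable to: SELECT * FROM posts
  def all_posts
    @query_count += 1
    @posts
  end

  # One query, comparable to: SELECT * FROM categories WHERE id = ?
  def category(id)
    @query_count += 1
    @categories[id]
  end

  # One query, comparable to: SELECT * FROM categories WHERE id IN (...)
  def categories(ids)
    @query_count += 1
    @categories.select { |id, _| ids.include?(id) }
  end
end

# n+1 pattern: one query for the posts, then one more per post.
def names_lazily(db)
  db.all_posts.map { |post| db.category(post[:category_id]) }
end

# Eager pattern: one query for the posts, one for all needed categories.
def names_eagerly(db)
  posts = db.all_posts
  by_id = db.categories(posts.map { |post| post[:category_id] }.uniq)
  posts.map { |post| by_id[post[:category_id]] }
end

lazy_db = FakeDatabase.new
eager_db = FakeDatabase.new
names_lazily(lazy_db)
names_eagerly(eager_db)
puts "n+1 pattern: #{lazy_db.query_count} queries"
puts "eager pattern: #{eager_db.query_count} queries"
```

With ten posts, the first pattern issues eleven simulated queries and the second issues two, and both return the same category names.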
Of course, not all cases are so simple. More work is required to handle complex n+1 queries. Is it worthwhile to do so much? Let's do some quick tests.
Testing n+1
Using the script in Listing 3, you can see how slow, or how fast, these queries can be. Listing 3 shows how to use ActiveRecord in a standalone script to establish a database connection, define tables, and load data. You can then use Ruby's built-in Benchmark library to see which approach is faster, and by how much.
Listing 3. Eager loading benchmark script
require 'rubygems'
require 'faker'
require 'active_record'
require 'benchmark'

# This call creates a connection to our database.
ActiveRecord::Base.establish_connection(
  :adapter  => "mysql",
  :host     => "127.0.0.1",
  :username => "root", # Note that while this is the default setting for MySQL,
  :password => "",     # a properly secured system would have a different MySQL
                       # username and password, and if so, you'll need to
                       # change these settings.
  :database => "test")

# First, set up our database...
class Category < ActiveRecord::Base
end

unless Category.table_exists?
  ActiveRecord::Schema.define do
    create_table :categories do |t|
      t.column :name, :string
    end
  end
end

Category.create(:name => 'Sara Campbell\'s Stuff')
Category.create(:name => 'Jake Moran\'s Possessions')
Category.create(:name => 'Josh\'s Items')
number_of_categories = Category.count

class Item < ActiveRecord::Base
  belongs_to :category
end

# If the table doesn't exist, we'll create it.
unless Item.table_exists?
  ActiveRecord::Schema.define do
    create_table :items do |t|
      t.column :name, :string
      t.column :category_id, :integer
    end
  end
end

puts "Loading data..."

item_count = Item.count
item_table_size = 10000

if item_count < item_table_size
  (item_table_size - item_count).times do
    Item.create!(:name => Faker::Name.name,
                 :category_id => (1 + rand(number_of_categories.to_i)))
  end
end

puts "Running tests..."

Benchmark.bm do |x|
  [100, 1000, 10000].each do |size|
    x.report "size:#{size}, with n+1 problem" do
      @items = Item.find(:all, :limit => size)
      @items.each do |i|
        i.category
      end
    end
    x.report "size:#{size}, with :include" do
      @items = Item.find(:all, :include => :category, :limit => size)
      @items.each do |i|
        i.category
      end
    end
  end
end
This script tests the speed of looping over 100, 1,000, and 10,000 objects with and without eager loading via the :include clause. To run it, you may need to replace the database connection parameters at the top of the script with values appropriate for your local environment. You also need to create a MySQL database named test. Finally, you need the ActiveRecord and Faker gems, which you can install by running gem install activerecord faker.
Running the script on my machine produced the results shown in Listing 4.
Listing 4. Eager loading benchmark script output
-- create_table(:categories)
   -> 0.1327s
-- create_table(:items)
   -> 0.1215s
Loading data...
Running tests...
                                     user     system      total        real
size:100, with n+1 problem       0.030000   0.000000   0.030000 (  0.045996)
size:100, with :include          0.010000   0.000000   0.010000 (  0.009164)
size:1000, with n+1 problem      0.260000   0.040000   0.300000 (  0.346721)
size:1000, with :include         0.060000   0.010000   0.070000 (  0.076739)
size:10000, with n+1 problem     3.110000   0.380000   3.490000 (  3.935518)
size:10000, with :include        0.470000   0.080000   0.550000 (  0.573861)
In every case, the test using :include was faster: 5.02, 4.52, and 6.86 times faster, respectively. The exact numbers depend on your particular situation, but eager loading can clearly produce significant performance improvements.
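The speedup factors are simply the ratio of the two wall-clock ("real") times for each size. The short sketch below reproduces that arithmetic from the figures in Listing 4; the REAL_SECONDS constant and the speedup helper are names chosen for this illustration only.

```ruby
# Wall-clock ("real") seconds copied from Listing 4: [with n+1 problem, with :include].
REAL_SECONDS = {
  100    => [0.045996, 0.009164],
  1_000  => [0.346721, 0.076739],
  10_000 => [3.935518, 0.573861]
}

# Speedup is the n+1 time divided by the eager-loading time.
def speedup(slow, fast)
  (slow / fast).round(2)
end

REAL_SECONDS.each do |size, (slow, fast)|
  puts "size #{size}: #{speedup(slow, fast)}x faster with :include"
end
```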
Nested eager loading
What if you want to reference a nested relationship, that is, a relationship of a relationship? Listing 5 shows a common scenario: looping over all the posts and displaying each author's image, where the author and image associations are both belongs_to relationships.
Listing 5. Nested eager loading use case
@posts = Post.all
@posts.each do |p|
  ... snip ...
end
This code suffers from the same n+1 problem as before, but the syntax of the fix is less obvious because it involves a relationship of a relationship. So how do you eager load nested relationships?
The answer is to use the hash syntax of the :include clause. Listing 6 shows nested eager loading with the hash syntax.
Listing 6. Nested eager loading solution
@posts = Post.find(:all, :include => { :category => [],
                                       :author => { :image => [] } })
@posts.each do |p|
  ... snip ...
end
As you can see, you can nest hash and array literals. Note that the only difference between a hash and an array in this context is that a hash can contain nested child entries, whereas an array cannot. Otherwise, the two are equivalent.
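One way to read a nested :include specification is as a list of association paths to load. The plain-Ruby sketch below expands a specification into dotted paths; association_paths is a hypothetical helper written for this illustration and is not how Rails itself processes the option.

```ruby
# Expands a nested include specification into dotted association paths.
# Symbols, arrays, and hashes may be mixed; only hashes can carry nested children.
def association_paths(spec, prefix = nil)
  case spec
  when Symbol
    [[prefix, spec].compact.join(".")]
  when Array
    spec.flat_map { |entry| association_paths(entry, prefix) }
  when Hash
    spec.flat_map do |name, children|
      path = [prefix, name].compact.join(".")
      [path] + association_paths(children, path)
    end
  else
    raise ArgumentError, "unsupported include entry: #{spec.inspect}"
  end
end

spec = { :category => [], :author => { :image => [] } }
puts association_paths(spec).inspect
```

For the specification used in Listing 6, this yields the paths category, author, and author.image, and the array form [:category, { :author => [:image] }] expands to the same list.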
Indirect eager loading
Not all n+1 problems are easy to spot. For example, how many queries does Listing 7 generate?
Listing 7. Indirect eager loading example use case
<% @user = User.find(5)
   @user.posts.each do |p| %>
  <%= render :partial => 'posts/summary', :locals => { :post => p } %>
<% end %>
Of course, determining the number of queries requires knowing what the posts/summary partial does. That partial is shown in Listing 8.
Listing 8. Indirect eager loading partial: posts/_summary.html.erb
<%= post.user.name %>
Unfortunately, the answer is that Listing 7 and Listing 8 generate an extra query for each of the user's posts to look up the user's name, even though the Post objects were generated by ActiveRecord from a User object that is already in memory. In short, Rails does not link child records back to their parent record.
The fix is to use self-referential eager loading. Because Rails reloads the parent record for each child record generated from it, you need to eager load the parent records as if the parent and child were entirely separate relationships. The code looks like Listing 9.
Listing 9. Indirect eager loading solution
<% @user = User.find(5, :include => { :posts => [:user] })
... snip ...
Although counterintuitive, this technique works much like the ones above. However, it is easy to nest too deeply with it, especially if the hierarchy is complex. Simple use cases such as Listing 9 are fine, but heavy nesting can cause problems of its own. In some cases, loading too many Ruby objects can be slower than living with the n+1 problem, especially when the full tree is not traversed for every object. In that case, other solutions to the n+1 problem may be more appropriate.
One approach is caching. Rails V2.1 has simple cache access built in. With Rails.cache.read, Rails.cache.write, and related methods, you can easily build your own simple caching mechanism, backed by a simple in-memory store, a file-based store, or a distributed cache server. See the Resources section for more information about the caching support built into Rails. You don't have to create your own caching solution, though; you can use an existing Rails plugin, such as Nick Kallen's Cache Money plugin. This plugin provides write-through caching and is based on code used at Twitter. See Resources for more information.
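The read-then-write pattern behind such a home-grown cache can be sketched in plain Ruby. SimpleCache, load_category_name, and cached_category_name are hypothetical names for this illustration; SimpleCache only imitates the read and write calls mentioned above and is not the Rails cache itself.

```ruby
# Hypothetical in-memory cache exposing read/write, loosely modeled on the
# read and write calls mentioned above; it is not the Rails cache itself.
class SimpleCache
  def initialize
    @store = {}
  end

  def read(key)
    @store[key]
  end

  def write(key, value)
    @store[key] = value
  end
end

CACHE = SimpleCache.new
LOOKUPS = Hash.new(0) # counts how often the slow path runs, per id

# Stand-in for an expensive database query.
def load_category_name(id)
  LOOKUPS[id] += 1
  "Category #{id}"
end

# Read-through helper: return the cached value, or compute and store it on a miss.
def cached_category_name(id)
  key = "category_name/#{id}"
  CACHE.read(key) || CACHE.write(key, load_category_name(id))
end

[1, 2, 1, 1, 2].each { |id| cached_category_name(id) }
puts LOOKUPS.inspect
```

Five lookups trigger the slow path only once per distinct id. A real cache also needs an invalidation strategy for when the underlying rows change, which this sketch leaves out.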
Of course, not all Rails issues are related to the number of queries.
Rails grouping and aggregate calculations
One problem you may encounter is work being done in Ruby that should be done by the database. This is a testament to how powerful Ruby is. It's hard to imagine people voluntarily re-implementing parts of their database code in C without a significant incentive, yet it is easy to perform similar calculations on groups of ActiveRecord objects in Rails. For this kind of work, however, Ruby is almost always slower than the database. So do not perform calculations in pure Ruby, as shown in Listing 10.
Listing 10. Incorrect way to perform group calculations
all_ages = Person.find(:all).group_by(&:age).keys.uniq
oldest_age = Person.find(:all).map(&:age).max
Instead, Rails provides a series of grouping and aggregation functions. You can use them as shown in Listing 11.
Listing 11. The correct way to perform group calculations
all_ages = Person.find(:all, :group => 'age')
oldest_age = Person.calculate(:max, :age)
ActiveRecord::Base#find has a number of options that mirror SQL. You can find more information in the Rails documentation. Note that the calculate method works with any valid aggregate function supported by your database, such as :min, :sum, and :avg. In addition, calculate accepts several options, such as :conditions. Consult the Rails documentation for details.
However, not everything you can do in SQL can be done through Rails. If the built-in methods are not enough, you can use custom SQL.
Custom SQL with Rails
Suppose you have a table that records each person's profession, age, and the number of accidents they were involved in over the past year. You can use a custom SQL statement to retrieve summary information, as shown in Listing 12.
Listing 12. Example of custom SQL with ActiveRecord
sql = "SELECT profession,
              AVG(age) AS average_age,
              AVG(accident_count) AS average_accident_count
       FROM persons
       GROUP BY profession"

Person.find_by_sql(sql).each do |row|
  puts "#{row.profession}, " <<
       "avg. age: #{row.average_age}, " <<
       "avg. accidents: #{row.average_accident_count}"
end
The script should be able to generate the results shown in Listing 13.
Listing 13. Output of custom SQL with ActiveRecord
Programmer, avg. age: 18.010, avg. accidents: 9
System Administrator, avg. age: 22.720, avg. accidents: 8
This is, of course, a very simple example. You can imagine extending the SQL in this example into a statement of whatever complexity you need. You can also use the ActiveRecord::Base.connection.execute method to run other types of SQL statements, such as ALTER TABLE statements, as shown in Listing 14.
Listing 14. Custom non-finder SQL with ActiveRecord
ActiveRecord::Base.connection.execute "ALTER TABLE some_table CHANGE COLUMN ..."
Most schema operations, such as adding and removing columns, can be done with Rails' built-in methods. But the ability to execute arbitrary SQL code is there if you need it.
Conclusion
As with all frameworks, Ruby on Rails can suffer from performance problems if you do not give it proper care and attention. Fortunately, the techniques for monitoring and fixing these problems are relatively simple and easy to learn, and even complex problems can be solved with patience and an understanding of where performance problems come from.