How to optimize ActiveRecord in Ruby on Rails

Last Update:2015-04-24 Source: Internet

Author: User

Programming in Ruby on Rails tends to spoil you. This evolving framework frees you from the tedium of other frameworks, and you can express your intent in a few lines of familiar code. And then there is ActiveRecord.

For an old Java programmer like me, ActiveRecord felt a bit foreign. With Java frameworks, I usually built a mapping between an independent model and schema. A framework like that is a mapping framework. With ActiveRecord, I define only the database schema, either in SQL or in a Ruby class called a migration. Frameworks that base the object model design on the database structure are called wrapping frameworks. Unlike most wrapping frameworks, Rails discovers the characteristics of the object model by querying the database tables. Rather than building complex queries, I use the model to traverse relationships in Ruby instead of SQL. In this way, I get the simplicity of a wrapping framework and most of the power of a mapping framework. ActiveRecord is easy to use and extend. Sometimes it is even too easy.

Like any database framework, ActiveRecord makes it easy to get into a lot of trouble. I can fetch too many columns, and it is easy to overlook important structural database features such as indexes or null constraints. I'm not saying ActiveRecord is a bad framework. It's just that if you need to scale, you need to know how to harden your application. In this article, I will walk you through some of the important optimizations you might need when using Rails' distinctive persistence framework.
Basic management

Generating a schema-backed model is extremely easy and requires very little code: script/generate model model_name. As you know, this command generates a model, a migration, a unit test, and even a default fixture. It is tempting to fill in a few data columns in the migration, enter some test data, write a few tests, add a few validations, and call it done. But think twice. You should consider the overall database design, paying special attention to these points:

Rails does not free you from basic database performance issues. Databases need information, often in the form of indexes, to perform well.
Rails does not free you from data integrity issues. Although most Rails developers don't like keeping constraints in the database, you should still consider things like nullable columns.
Rails provides convenient defaults for many elements. Sometimes a default, such as the length of a text field, is too large for most practical applications.
Rails does not force you to create an effective database design.
Before you continue and dig deeper into ActiveRecord, make sure you have a good foundation. Make sure the index structure works for you. If a given table is large, if you will search on columns other than the id, and if an index would help (see your database manager's documentation for details, because different databases use indexes differently), then you need to create an index. You don't have to drop into SQL to do so: you can create indexes with migrations. You can create them in the same migration as create_table, or you can write an additional migration just for the indexes. Here is an example migration that creates indexes for ChangingThePresent.org:
Listing 1. Creating indexes during migration

class AddIndexesToUsers < ActiveRecord::Migration
  def self.up
    add_index :members, :login
    add_index :members, :email
    add_index :members, :first_name
    add_index :members, :last_name
  end

  def self.down
    remove_index :members, :login
    remove_index :members, :email
    remove_index :members, :first_name
    remove_index :members, :last_name
  end
end

ActiveRecord takes care of the index on the id. I explicitly added indexes for the columns used in various searches, because the table is very large, is updated infrequently, and is searched often. Usually, we wait until we have measured a problem with a given query before taking action. That strategy saves us from second-guessing the database engine. But in the case of users, we knew that the table would soon hold millions of rows and that, without indexes on the frequently searched columns, it would perform very poorly.
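To see why an index on a frequently searched column matters, here is a minimal plain-Ruby sketch. It is not ActiveRecord code and not how a database implements indexes internally. It only contrasts examining rows one by one with looking a row up through a prebuilt map, which is roughly the trade an index makes. The member data and the names scan_by_login and build_index are made up for illustration.

```ruby
# Hypothetical in-memory "table" of members; not real ActiveRecord objects.
MEMBERS = (1..10_000).map do |i|
  { id: i, login: "user#{i}", email: "user#{i}@example.com" }
end

# Without an index: examine rows one by one until a match is found.
def scan_by_login(rows, login)
  examined = 0
  match = rows.find do |row|
    examined += 1
    row[:login] == login
  end
  [match, examined]
end

# With an index: build a column-value => row map once, then look rows up directly.
def build_index(rows, column)
  rows.each_with_object({}) { |row, index| index[row[column]] = row }
end

LOGIN_INDEX = build_index(MEMBERS, :login)

scanned, examined = scan_by_login(MEMBERS, "user9000")
indexed = LOGIN_INDEX["user9000"]

puts "scan examined #{examined} rows"
puts "both lookups found id #{indexed[:id]}" if scanned == indexed
```

The map has to be rebuilt or updated whenever rows change, which mirrors the reason indexes suit tables that are searched often and updated rarely.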

Two other common issues also relate to migrations. If a string or any other column should never be null, make sure the migration says so. Most DBAs (database administrators) would say that Rails picks the wrong default here: columns are nullable unless you say otherwise. If you want a column that cannot be null, you must explicitly add the parameter :null => false. And if you have a string column, be sure to set a limit that suits your application. By default, Rails migrations create string columns as varchar(255). Usually, that is too large. You should try to maintain a database structure that accurately reflects your application. Rather than allowing logins of unlimited length, if your application restricts logins to 10 characters, you should define the column accordingly, as shown in Listing 2:
Listing 2. Writing migrations with limits and non-empty columns

t.column :login, :string, :limit => 10, :null => false
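As a rough illustration of what that column definition asks for, the following plain-Ruby sketch checks the same two rules by hand. It is only a conceptual mirror: how a particular database reacts to a value that breaks a constraint is not modeled here, and the names LOGIN_LIMIT and login_errors are hypothetical.

```ruby
# The same limit as :limit => 10 in the column definition.
LOGIN_LIMIT = 10

# Returns a list of rule violations for a candidate login value.
def login_errors(login)
  errors = []
  errors << "login cannot be null" if login.nil?
  errors << "login is longer than #{LOGIN_LIMIT} characters" if login && login.length > LOGIN_LIMIT
  errors
end

puts login_errors("bruce").inspect
puts login_errors(nil).inspect
puts login_errors("a_very_long_login").inspect
```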

In addition, you should consider default values and any other information that you can safely provide. A little preparation now can save a lot of time tracking down data integrity problems later. While you are thinking about the database foundation, also note which pages are static and easy to cache. Between optimizing queries and caching pages, caching pages brings the bigger payoff if you can afford the complexity. Sometimes pages or fragments are purely static, such as a list of states or a set of frequently asked questions. In those cases, caching is clearly the better choice. At other times, you may decide to work on database performance instead, to keep complexity down. For ChangingThePresent, we did both, depending on the problem and the circumstances. If you also decide to work on query performance, read on.
The N + 1 problem

ActiveRecord relationships are lazy by default. This means that the framework waits until the relationship is actually accessed before loading it. For example, suppose each member has an address. You can open a console and enter the following command: member = Member.find 1. The log then shows the entries in Listing 3:
Listing 3. Log from Member.find(1)

Member Columns (0.006198)   SHOW FIELDS FROM members
Member Load (0.002835)   SELECT * FROM members WHERE
(members.`id` = 1)

Member has a relationship to an address, defined by the macro has_one :address, :as => :addressable, :dependent => :destroy. Note that when ActiveRecord loads the Member, no address fields appear in the log. But if you type member.address in the console, you will see the contents of Listing 4 in development.log:
Listing 4. Accessing the relationship forces a database access

./vendor/plugins/paginating_find/lib/paginating_find.rb:98:in `find'
Address Load (0.252084)   SELECT * FROM addresses WHERE
(addresses.addressable_id = 1 AND addresses.addressable_type = 'Member') LIMIT 1
./vendor/plugins/paginating_find/lib/paginating_find.rb:98:in `find'

So ActiveRecord does not run the query for the address relationship until you actually access member.address. Usually this lazy design works well, because the persistence framework does not have to move as much data to load a member. But suppose you want to access many members along with all of their addresses, as shown in Listing 5:
Listing 5. Retrieving multiple members with addresses

Member.find([1,2,3]).each {|member| puts member.address.city}

Because you should see one query for each address, the performance is disappointing. Listing 6 shows the whole problem:
Listing 6. N + 1 queries

Member Load (0.004063)   SELECT * FROM members WHERE
(members.`id` IN (1,2,3))
./vendor/plugins/paginating_find/lib/paginating_find.rb:98:in `find'
Address Load (0.000989)   SELECT * FROM addresses WHERE
(addresses.addressable_id = 1 AND addresses.addressable_type = 'Member') LIMIT 1
./vendor/plugins/paginating_find/lib/paginating_find.rb:98:in `find'
Address Columns (0.073840)   SHOW FIELDS FROM addresses
Address Load (0.002012)   SELECT * FROM addresses WHERE
(addresses.addressable_id = 2 AND addresses.addressable_type = 'Member') LIMIT 1
./vendor/plugins/paginating_find/lib/paginating_find.rb:98:in `find'
Address Load (0.000792)   SELECT * FROM addresses WHERE
(addresses.addressable_id = 3 AND addresses.addressable_type = 'Member') LIMIT 1
./vendor/plugins/paginating_find/lib/paginating_find.rb:98:in `find'

The results are as bad as I expected: one query for all of the members, and one query for each address. We retrieved three members, so we ran four queries. With N members, there will be N + 1 queries. This is the dreaded N + 1 problem. Most persistence frameworks solve it with eager association, and Rails is no exception. If you know you will need a relationship, you can choose to include it in the initial query. ActiveRecord uses the :include option for this purpose. If you change the query to Member.find([1,2,3], :include => :address).each {|member| puts member.address.city}, the result is noticeably better:
Listing 7. Solving the N + 1 problem

Member Load Including Associations (0.004458)
SELECT members.`id` AS t0_r0, members.`type` AS t0_r1,
members.`about_me` AS t0_r2, members.`about_philanthropy`

...

addresses.`id` AS t1_r0, addresses.`address1` AS t1_r1,
addresses.`address2` AS t1_r2, addresses.`city` AS t1_r3,

...

addresses.`addressable_id` AS t1_r8 FROM members
LEFT OUTER JOIN addresses ON addresses.addressable_id
= members.id AND addresses.addressable_type =
'Member' WHERE (members.`id` IN (1,2,3))
./vendor/plugins/paginating_find/lib/paginating_find.rb:98:in `find'

The query will also be faster. One query retrieves all of the members and their addresses. This is how eager association works.
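The difference in query counts can be reproduced with a small plain-Ruby sketch. It does not use ActiveRecord: FakeDb is a hypothetical stand-in that only counts how many times it is asked for data. Note one deliberate simplification: the log above shows Rails fetching everything in a single JOIN, whereas this sketch fetches the addresses in one separate batched lookup. The point is the same either way, because the number of queries stops growing with the number of members.

```ruby
# A toy data store that counts "queries". All names here are hypothetical.
class FakeDb
  attr_reader :query_count

  def initialize
    @query_count = 0
    @members = { 1 => "Ann", 2 => "Bob", 3 => "Cy" }
    @addresses = { 1 => "Austin", 2 => "Boston", 3 => "Chicago" } # keyed by member id
  end

  # One query that returns all requested members.
  def find_members(ids)
    @query_count += 1
    ids.map { |id| { id: id, name: @members[id] } }
  end

  # One query per call: what lazy loading does for each member.
  def find_address_for(member_id)
    @query_count += 1
    @addresses[member_id]
  end

  # One query for many addresses: a batched stand-in for eager loading.
  def find_addresses_for(member_ids)
    @query_count += 1
    member_ids.each_with_object({}) { |id, found| found[id] = @addresses[id] }
  end
end

# Lazy: 1 query for the members plus 1 per address (N + 1).
lazy_db = FakeDb.new
lazy_cities = lazy_db.find_members([1, 2, 3]).map { |m| lazy_db.find_address_for(m[:id]) }

# Eager: a fixed number of queries, however many members there are.
eager_db = FakeDb.new
members = eager_db.find_members([1, 2, 3])
cities_by_id = eager_db.find_addresses_for(members.map { |m| m[:id] })
eager_cities = members.map { |m| cities_by_id[m[:id]] }

puts "lazy:  #{lazy_db.query_count} queries"
puts "eager: #{eager_db.query_count} queries"
```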

With ActiveRecord, you can also nest :include options, but only one level deep. Take, for example, a Member with multiple contacts and a Contact with an address. If you want to show all of the cities for a member's contacts, you can use the code shown in Listing 8:
Listing 8. Getting cities for a member's contacts

member = Member.find(1)
member.contacts.each {|contact| puts contact.address.city}

The code works, but it has to query for the member, the contacts, and each contact's address. You can improve performance slightly by including :contacts with :include => :contacts, as shown in Listing 9:
Listing 9. Getting cities for a member's contacts, with contacts included

member = Member.find(1, :include => :contacts)
member.contacts.each {|contact| puts contact.address.city}

You can improve it further by including both relationships with a nested :include option:

member = Member.find(1, :include => {:contacts => :address})
member.contacts.each {|contact| puts contact.address.city}

This nested include tells Rails to eagerly load both the contacts and the address relationships. Whenever you know you will use a relationship in a given query, you can use eager loading. It is one of the performance optimizations we use most often at ChangingThePresent.org, but it has limits. When you need to join more than two tables, it is better to use SQL. If you need reporting, it is usually best to simply grab a database connection and bypass ActiveRecord with ActiveRecord::Base.connection.execute("SELECT * FROM ..."). In general, though, eager association is enough to solve the problem. Now I will change the subject and discuss another thorny issue that Rails developers care about: inheritance.
Inheritance and Rails

When most Rails developers first encounter inheritance in Rails, they are immediately enamored. It is almost too simple. You just create a type column on the database table and then derive the subclasses from the parent class. Rails takes care of the rest. For example, a class named Customer can inherit from a class named Person. A customer has all of the columns of Person, plus a loyalty number and an order history. Listing 10 shows the simplicity of this solution. The single table holds all of the columns of the parent and child classes.
Listing 10. Implementing inheritance

create_table "people" do |t|
  t.column "type", :string
  t.column "first_name", :string
  t.column "last_name", :string
  t.column "loyalty_number", :string
end

class Person < ActiveRecord::Base
end

class Customer < Person
  has_many :orders
end

In many ways, this solution works well. The code is simple and free of repetition. The queries are simple and perform well: you don't need any joins to access multiple subclasses, and ActiveRecord can use the type column to decide which records to return.
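The role of the type column can be sketched in plain Ruby. This is a simplified illustration rather than how ActiveRecord is implemented: the Person and Customer classes below are ordinary Ruby classes, the rows are hand-written hashes, and the instantiate helper is hypothetical.

```ruby
# Plain-Ruby stand-ins for the Person and Customer models (no ActiveRecord).
class Person
  attr_reader :first_name, :last_name

  def initialize(row)
    @first_name = row["first_name"]
    @last_name = row["last_name"]
  end
end

class Customer < Person
  attr_reader :loyalty_number

  def initialize(row)
    super
    @loyalty_number = row["loyalty_number"]
  end
end

# Rows as they might come back from the single "people" table.
ROWS = [
  { "type" => "Person", "first_name" => "Pat", "last_name" => "Lee", "loyalty_number" => nil },
  { "type" => "Customer", "first_name" => "Sam", "last_name" => "Roy", "loyalty_number" => "L-42" }
]

# Pick the class named in the type column and build the object from the row.
def instantiate(row)
  Object.const_get(row["type"]).new(row)
end

people = ROWS.map { |row| instantiate(row) }

# Restricting a query to customers corresponds to filtering on the type column.
customers = ROWS.select { |row| row["type"] == "Customer" }.map { |row| instantiate(row) }

people.each { |person| puts "#{person.class}: #{person.first_name} #{person.last_name}" }
```

Notice that the Person row carries a loyalty_number column it never uses. With many subclasses, those unused columns pile up in the one shared table.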

In other ways, ActiveRecord inheritance is very limited. If the inheritance hierarchy is very wide, inheritance breaks down. For example, ChangingThePresent has many types of content, and each type has its own name, a short or long description, some common presentation attributes, and several custom attributes. We would like causes, nonprofits, gifts, members, drives, registries, and other types of objects to inherit from a common base class so that we could handle all types of content in the same way. But we can't, because the Rails model would put the columns of our entire object model into a single table, which is not a viable solution.
Explore other options

We tried three solutions to this problem. First, we placed each class in its own table and used views to build a common table for the content. We quickly abandoned this solution because Rails doesn't handle database views well.

Our second solution was to use simple polymorphism. With this strategy, each subclass has its own table, and we push the common columns down into each table. For example, say I need a superclass called Content that contains only a name attribute, along with Gift, Cause, and Nonprofit subclasses. Gift, Nonprofit, and Cause can each have their own name attribute. Because Ruby is dynamically typed, these classes do not need to inherit from a common base class. They only need to respond to the same set of methods. ChangingThePresent uses polymorphism in several places to provide common behavior, especially when processing images.
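A minimal sketch of this duck-typing approach follows. The classes are plain Ruby Structs rather than ActiveRecord models, and the second attribute on each (price, region, tax_id) and the content_names helper are invented for illustration.

```ruby
# Three unrelated classes: no shared base class, only a shared name method.
Gift      = Struct.new(:name, :price)
Cause     = Struct.new(:name, :region)
Nonprofit = Struct.new(:name, :tax_id)

# Works with any object that responds to name, whatever its class.
def content_names(items)
  items.map do |item|
    raise ArgumentError, "#{item.class} has no name" unless item.respond_to?(:name)
    item.name
  end
end

items = [
  Gift.new("Mosquito net", 10),
  Cause.new("Clean water", "East Africa"),
  Nonprofit.new("Example Fund", "00-0000000")
]

names = content_names(items)
puts names.join(", ")
```

The cost of this approach is that the common columns, here name, are repeated in every table.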

The third approach is to provide the common functionality through association instead of inheritance. ActiveRecord has a feature called polymorphic association, which is great for attaching common behavior to classes without any inheritance at all. You have already seen an example of a polymorphic association earlier, with Address. I can use the same technique, instead of inheritance, to attach common attributes for content management. Consider a class named ContentBase. Normally, to associate this class with another class, you would use a has_one relationship and a simple foreign key. But you might want ContentBase to work with multiple classes. In that case, you need a foreign key plus a column that identifies the type of the target class. That is exactly what ActiveRecord polymorphic associations do well. See Listing 11.
Listing 11. Two aspects of the site content relationship

class Cause < ActiveRecord::Base
  has_one :content_base, :as => :displayable, :dependent => :destroy
  ...
end

class Nonprofit < ActiveRecord::Base
  has_one :content_base, :as => :displayable, :dependent => :destroy
  ...
end

class ContentBase < ActiveRecord::Base
  belongs_to :displayable, :polymorphic => true
end

Usually, a belongs_to relationship points to only one class, but the relationship in ContentBase is polymorphic. The foreign key has not only an identifier for the record but also a type that identifies the table. Using this technique, I get many of the benefits of inheritance: the common functionality lives in a single class. There are additional benefits as well, because I don't need to put all of the columns of Cause and Nonprofit in a single table.

Some database administrators are less enthusiastic about polymorphic associations because they do not use true foreign keys, but at ChangingThePresent we use them freely. In practice, the data model is not as clean as it would be in theory. You cannot use database features such as referential integrity, and you cannot rely on tools to discover these relationships from the column names. For us, the benefits of a clean object model outweigh the problems with this approach.

create_table "content_bases", :force => true do |t|
  t.column "short_description", :string

  ...

  t.column "displayable_type", :string
  t.column "displayable_id", :integer
end
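To make the two-column foreign key concrete, here is a plain-Ruby sketch of how a displayable_type and displayable_id pair can be resolved in both directions. It is an illustration under simplifying assumptions, not ActiveRecord's implementation: the tables are in-memory hashes with made-up data, and displayable_for and content_base_for are hypothetical helpers.

```ruby
# Hypothetical in-memory tables, keyed by id. Not real ActiveRecord models.
TABLES = {
  "Cause"     => { 1 => { name: "Clean water" } },
  "Nonprofit" => { 1 => { name: "Example Fund" } }
}

CONTENT_BASES = [
  { short_description: "Wells for villages", displayable_type: "Cause",     displayable_id: 1 },
  { short_description: "A sample charity",   displayable_type: "Nonprofit", displayable_id: 1 }
]

# belongs_to side: the type column picks the table, the id column picks the row.
def displayable_for(content_base)
  TABLES.fetch(content_base[:displayable_type]).fetch(content_base[:displayable_id])
end

# has_one side: match on both the type and the id columns.
def content_base_for(type, id)
  CONTENT_BASES.find do |content_base|
    content_base[:displayable_type] == type && content_base[:displayable_id] == id
  end
end

puts displayable_for(CONTENT_BASES[1])[:name]
puts content_base_for("Cause", 1)[:short_description]
```

Both sample rows share the id 1, so the id alone cannot tell them apart. That is also why the database cannot enforce this relationship with an ordinary foreign key constraint.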

Concluding remarks

ActiveRecord is a full-featured persistence framework. You can use it to build reliable and scalable systems, but as with any database framework, you must pay close attention to the SQL that the framework generates. When you do run into problems, you must adjust your approach and strategy. Maintaining indexes, using eager loading with :include, and using polymorphic associations in place of inheritance in some places are three ways to improve your code base. Next month, I will take you through another example of how to write Rails in the real world.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
