Storing polymorphic message/alert data with MongoDB

Original: HTTP://CODECAMPO.COM/TOPICS/66

The day before yesterday I saw that JavaEye plans to use MongoDB to implement its site-wide messaging system. I fully agree with the choice: MongoDB is very well suited to storing message-type data. We previously discussed how to build microblog-style broadcasts; this time the topic is how to store message/reminder data.

The following covers only the data schema, not the problem of storing massive amounts of data.

1. Requirements

There are many examples of message/alert data: tweets, an SNS's friend broadcasts (Douban's "I say", movie/book reading status, URL recommendations, etc.), and Twitter's push messages about friends' statuses.

One characteristic of this kind of data is that its shape varies. Douban's friend broadcast has several forms: an "I say" entry is mainly text published by the user; an action message (such as an upload) has no text but needs to reference other data, for example a book or a picture; a recommendation carries both text and related data. A tweet needs to store a wide variety of information, such as the users it mentions, the tweet it replies to, the URLs it contains, and a geographic location, yet any of these may be empty. A single tweet carries far more data than its visible text.

In short, the key word is "changeable": as the application evolves, status messages will gain more forms and more fields.

2. Using MongoDB to store polymorphic messages

Here Codecampo serves as a direct example of how to store such polymorphic data in MongoDB. Campo's code uses Ruby on Rails and Mongoid, and the full code can be seen in the GitHub repository.

Codecampo defines a message as something that prompts the user to look at it right away, is discarded once read, and is deleted when it expires. Messages are therefore stored inside the user document, with a cap on their number (the oldest are deleted automatically). If you need persistent messages (such as microblog posts), replace the embedded documents with references (DBRef) and store the notifications in a collection of their own.
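The capped, embedded list described above can be sketched in plain Ruby, with a hash standing in for the user document. This is only an illustration of the idea, not Codecampo's code: MAX_NOTIFICATIONS and push_notification are hypothetical names, and the cap of 3 is an arbitrary assumption.

```ruby
# Plain-Ruby sketch of a capped, embedded notification list.
# MAX_NOTIFICATIONS and push_notification are hypothetical names.
MAX_NOTIFICATIONS = 3

def push_notification(user_doc, notification)
  list = (user_doc[:notifications] ||= [])
  list << notification
  # Drop the oldest entries (front of the array) once over the cap.
  list.shift(list.size - MAX_NOTIFICATIONS) if list.size > MAX_NOTIFICATIONS
  user_doc
end

user = { name: 'rei', notifications: [] }
1.upto(5) { |i| push_notification(user, { text: "message #{i}" }) }
puts user[:notifications].map { |n| n[:text] }.inspect
```

After five pushes only the three newest messages remain in the user document.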

2.1 Schema in MongoDB

Ideally, MongoDB would store the notification data like this. (Note: Notification::Follower and Notification::Other are not implemented; they only serve as examples.)

    > db.users.findOne()
    {
        _id: ObjectId(...),
        ...
        notifications: [
            {
                _id: ObjectId(...),
                _type: 'Notification::Mention',
                replyer_id: ObjectId(...),
                topic_id: ObjectId(...),
                reply_id: ObjectId(...),
                text: '@rei some message'
            },
            {
                _id: ObjectId(...),
                _type: 'Notification::Follower',
                follower_id: ObjectId(...)
            },
            {
                _id: ObjectId(...),
                _type: 'Notification::Other',
                other_column: 'Value'
            }
        ]
    }

2.2 Implementation with Mongoid

If you are familiar with MongoDB, you should have a general idea of how to manipulate the document above. Here is how to implement this data structure with Mongoid (if you are unfamiliar with Mongoid, you may want to read its documentation, especially the Inheritance section).

First create a Notification::Base class to associate with User.

    class Notification::Base
      include Mongoid::Document
      include Mongoid::Timestamps

      field :text
      embedded_in :user, :inverse_of => :notifications
    end

When another class inherits from Notification::Base, it inherits all of these association definitions.

Then define the embedding in User.

    class User
      include Mongoid::Document
      include Mongoid::Timestamps
      ...
      embeds_many :notifications, :class_name => 'Notification::Base'
      ...
    end

Now you can call @user.notifications.create(attributes) to create a notification. By default, however, this creates a Notification::Base, which is not the message type we ultimately need, so go on to define Notification::Mention.

    class Notification::Mention < Notification::Base
      referenced_in :topic
      referenced_in :reply
      referenced_in :reply_user, :class_name => 'User'
    end

Note that the Mention class does not define an embedded relationship with User itself. Because it inherits from Notification::Base, it also inherits Base's included modules and its embedded_in association. The Mention class only needs to define the logic specific to itself.

Now, the Ruby code that creates a mention message would be:

    @user.notifications.create({:reply_user_id => user_id,
                                :topic_id      => topic_id,
                                :reply_id      => reply_id,
                                :text          => 'summary text'},
                               Notification::Mention)

The data saved to MongoDB is as follows

    > db.users.findOne()
    {
        _id: ObjectId(...),
        ...
        notifications: [
            {
                _id: ObjectId(...),
                _type: 'Notification::Mention',
                replyer_id: ObjectId(...),
                topic_id: ObjectId(...),
                reply_id: ObjectId(...),
                text: 'Summary text'
            }
            ...
        ]
    }

The saved data matches the ideal schema. To add a new message type, just follow the model of Notification::Mention and create another subclass of Notification::Base.
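To show why the _type field is enough to recover the right class when documents are read back, here is a minimal plain-Ruby sketch. It is not Mongoid's actual implementation: the instantiate helper and the summary methods are hypothetical, and Notification::Follower is the unimplemented example type from section 2.1.

```ruby
# Plain-Ruby sketch of _type-based instantiation; not Mongoid's actual code.
module Notification
  class Base
    attr_reader :attributes

    def initialize(attributes)
      @attributes = attributes
    end

    def summary
      attributes['text'].to_s
    end
  end

  class Mention < Base
    def summary
      "mention: #{attributes['text']}"
    end
  end

  # Hypothetical new type: adding it needs only this subclass.
  class Follower < Base
    def summary
      "new follower: #{attributes['follower_id']}"
    end
  end
end

# Resolve the class named in _type and build an instance of it.
# (A real application should check the name against a list of known types.)
def instantiate(doc)
  klass = Object.const_get(doc.fetch('_type', 'Notification::Base'))
  klass.new(doc)
end

docs = [
  { '_type' => 'Notification::Mention',  'text' => '@rei some message' },
  { '_type' => 'Notification::Follower', 'follower_id' => 42 }
]
notifications = docs.map { |doc| instantiate(doc) }
notifications.each { |n| puts "#{n.class}: #{n.summary}" }
```

Each document carries only the fields its own type needs, and the type-specific behavior lives in the subclass.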

3. How would an SQL database implement this?

Both Douban and Twitter use MySQL to store broadcast and push data, so how do they implement such polymorphic data structures? I do not know their internals, but there are many articles on implementing polymorphism in SQL (for example, the Rails book's coverage of ActiveRecord's support for polymorphism and inheritance). Here are a few solutions for comparison.

3.1 Single-table inheritance

The simplest approach is to map one table to several models. How? By keeping every field used anywhere in the inheritance hierarchy in a single table. For example:

    notifications (id, type, user_id, reply_id, topic_id, replyer_id, text, ...)

The type field distinguishes the message types, and the application layer applies different logic depending on its value. However, even if a given type of message (such as a follower reminder) does not use all of the fields, it still occupies a full row in the table.

Obviously, this leaves many empty fields and makes the table less clean. Even if you try to merge and reuse some fields, maintenance and migration become a problem as the application evolves. Note that the polymorphic association described in section 3.2 can also drift gradually down the wrong path of field reuse when you try to reduce the number of tables.
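The empty-field problem is easy to see with a small plain-Ruby sketch that uses hashes as stand-ins for rows of the single table. The row values are made up for illustration, and follower_id is an assumed extra column for the follower type.

```ruby
# In-memory stand-in for a single "notifications" table (hypothetical rows).
COLUMNS = [:id, :type, :user_id, :reply_id, :topic_id, :replyer_id, :follower_id, :text]

def row(values)
  # Every row carries every column; unused ones stay nil (NULL in SQL).
  COLUMNS.each_with_object({}) { |col, r| r[col] = values[col] }
end

rows = [
  row(id: 1, type: 'Mention',  user_id: 7, reply_id: 3, topic_id: 66, replyer_id: 9, text: '@rei some message'),
  row(id: 2, type: 'Follower', user_id: 7, follower_id: 9)
]

rows.each do |r|
  empty = r.select { |_, v| v.nil? }.keys
  puts "#{r[:type]}: #{empty.size} empty columns #{empty.inspect}"
end
```

Every new message type adds columns that stay empty for all the other types.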

3.2 Polymorphic association

Another way to aggregate heterogeneous objects is a polymorphic association. It works by letting one table reference several other tables through a single polymorphic field. For example:

    notifications (id, user_id, type, entry_id)
    mention_notifications (id, reply_id, topic_id, replyer_id, text)
    follower_notifications (id, follower_id)
    other...

The association relies on the type and entry_id fields of notifications. The value of type can be "mention", "follower", and so on, and determines which xxx_notifications table to read.

Polymorphic association keeps the tables clean, but one drawback is that you cannot fetch everything with a simple JOIN, which can lead to the N + 1 query problem (perhaps an SQL expert can tell me how to fetch the different message types in a single query, but the SQL would predictably be more complex, and joining too many tables hurts efficiency).
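The N + 1 pattern, and one common way to soften it, can be sketched in plain Ruby with an in-memory stand-in for the database. The original does not describe this mitigation; grouping the lookups by type is an assumption on my part, and FakeDb, load_naive, and load_batched are hypothetical names. Each call to where stands for one SQL query.

```ruby
# In-memory stand-in for the three tables; FakeDb counts simulated queries.
TABLES = {
  'notifications' => [
    { id: 1, user_id: 7, type: 'mention',  entry_id: 10 },
    { id: 2, user_id: 7, type: 'follower', entry_id: 20 },
    { id: 3, user_id: 7, type: 'mention',  entry_id: 11 }
  ],
  'mention_notifications' => [
    { id: 10, reply_id: 3, topic_id: 66, replyer_id: 9, text: 'first' },
    { id: 11, reply_id: 4, topic_id: 66, replyer_id: 8, text: 'second' }
  ],
  'follower_notifications' => [{ id: 20, follower_id: 9 }]
}

class FakeDb
  attr_reader :queries

  def initialize(tables)
    @tables = tables
    @queries = 0
  end

  # Stands for one "SELECT * FROM table WHERE column IN (values)".
  def where(table, column, values)
    @queries += 1
    @tables.fetch(table).select { |row| values.include?(row[column]) }
  end
end

# Naive: one extra lookup per notification (N + 1 queries).
def load_naive(db, user_id)
  db.where('notifications', :user_id, [user_id]).map do |n|
    entry = db.where("#{n[:type]}_notifications", :id, [n[:entry_id]]).first
    n.merge(entry: entry)
  end
end

# Batched: one extra lookup per distinct type (1 + number of types).
def load_batched(db, user_id)
  list = db.where('notifications', :user_id, [user_id])
  entries = {}
  list.group_by { |n| n[:type] }.each do |type, group|
    ids = group.map { |n| n[:entry_id] }
    db.where("#{type}_notifications", :id, ids).each { |row| entries[[type, row[:id]]] = row }
  end
  list.map { |n| n.merge(entry: entries[[n[:type], n[:entry_id]]]) }
end

naive_db = FakeDb.new(TABLES)
naive = load_naive(naive_db, 7)
batched_db = FakeDb.new(TABLES)
batched = load_batched(batched_db, 7)
puts "naive: #{naive_db.queries} queries, batched: #{batched_db.queries} queries"
```

Batching bounds the query count by the number of message types rather than the number of messages, but it still needs application code to stitch the results together.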

If you use this method, it is best to put a cache layer in front of the database that caches the fully assembled message data and so reduces database queries. Twitter has a Row Cache layer, which is presumably used for this.

3.3 Save after serialization

Another option is to serialize the various fields before storing them, and to determine the content type after deserializing on each read. This saves many table columns and avoids the N + 1 query problem.

    notifications (id, user_id, serialized_entry)

This scheme is actually quite good. Its one disadvantage is that follow-up processing is hard: if the IDs of the users a tweet mentions are stored only in serialized form, you cannot run the reverse query to find which messages mention a given user. In that case you can pull the information that needs to be queried out into separate columns, which in some cases still avoids the empty-field problem.
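A minimal plain-Ruby sketch of this compromise, using JSON from the standard library as the serialization format (the original does not name one). The mentioned_user_id column and the build_row helper are hypothetical, and an array of hashes stands in for the table.

```ruby
require 'json'

# Hypothetical row layout:
#   notifications (id, user_id, mentioned_user_id, serialized_entry)
def build_row(id, user_id, entry)
  {
    id: id,
    user_id: user_id,
    # Copied out of the payload so it can be queried (and indexed) directly.
    mentioned_user_id: entry['mentioned_user_id'],
    serialized_entry: JSON.generate(entry)
  }
end

rows = [
  build_row(1, 7, 'type' => 'mention', 'mentioned_user_id' => 9, 'text' => '@rei some message'),
  build_row(2, 7, 'type' => 'follower', 'follower_id' => 5)
]

# Query by the extracted column without deserializing every row.
mentions_of_9 = rows.select { |r| r[:mentioned_user_id] == 9 }

# The type is only known after deserialization.
mentions_of_9.each do |r|
  entry = JSON.parse(r[:serialized_entry])
  puts "#{entry['type']}: #{entry['text']}"
end
```

Only fields that must be searchable are promoted to columns; everything else stays in the serialized payload.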

4. Summary

Having compared the approaches above, I find the MongoDB scheme the more elegant. An SQL database typically needs a cache layer on top of it when storing data with a complex structure. MongoDB has built-in support for storing complex structures, which makes development easier (one less layer, one less worry). So developing web applications with MongoDB really can cut a lot of technical cost.

My view is limited, and there may be good approaches I have never seen or thought of. You are welcome to leave a message and tell me about them.
