Storing Logs in MongoDB

Source: Internet
Author: User
Tags: mongodb

I have been thinking about architecture lately, and one problem keeps bothering those of us who build business systems: logs and log statistics. The general situation is as follows. We have many modules, and although their log formats are similar, each module writes to its own server and directory. Much of the information in the logs is key=>value data. After a feature goes online, the PM or whoever requested it usually asks for statistics and reports to track how well the feature performs. PMs usually cannot write programs, so most of this statistical work falls to RD. The value of such statistics and reports declines over time; at some point they are worth nothing and nobody cares about them, yet the statistics programs keep running, and if one ever needs maintenance, everyone has forgotten where it was deployed. The logs themselves cause further trouble: they take up storage and must be cleaned up periodically; there are many mirrored web servers, so the logs exist in multiple copies that must be merged before processing; and the web servers are sometimes reshuffled, in which case the logs of a server taken offline are usually lost.

I am a lazy person who prefers to stick to my own job, so I tend to find these statistics requests annoying. Data mining depends largely on inspiration, so the requirements keep changing: one number today, a different number tomorrow. The ideal situation is that the PMs know SQL and RD simply loads all the data into a database. Back when our group still had a few very capable PMs, we often worked this way, but not any more. Another problem is that a relational database is not schema-free. The format is not that flexible and has to be designed in advance, which cannot keep up with requirements that change all the time.

Log statistics usually have the following characteristics. The data volume is large, possibly on the order of a gigabyte per day (for business data). Writes are frequent and reads are infrequent, since almost every page view produces several log entries. The statistics can run as batch tasks and do not need to be real-time. Absolute data consistency is not required.

Given these characteristics, MongoDB is a very suitable choice, for the following reasons. It is schema-free, so required fields can be added at any time. It scales well, so there is little need to worry about running out of storage space. Writes can be asynchronous, so there is little need to worry about the time a write adds to the request's response. A collection can be given a fixed size (a capped collection), such as 100 GB; MongoDB then reuses the space by overwriting the oldest documents once the collection is full, so nobody has to delete old logs by hand. It supports general query conditions and aggregation, and it provides a JavaScript shell, so PMs who want to analyze the data themselves can learn to write their own statistics scripts, which eventually frees RD from this kind of work.
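
To make the schema-free and capped-collection points concrete, here is a minimal in-memory sketch in Python. It does not talk to MongoDB at all. It assumes log lines made of whitespace-separated `key=value` tokens (the exact log format is not given in this article), and it uses a `deque` with a maximum length as a rough stand-in for a capped collection. The function name `parse_log_line` and the sample fields are hypothetical.

```python
from collections import deque


def parse_log_line(line):
    """Turn 'key=value key=value' text into a dict (one document per line)."""
    doc = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            doc[key] = value
    return doc


# Hypothetical log lines from two modules; note that the fields differ per line.
lines = [
    "module=search uid=1001 query=mongodb",
    "module=login uid=1002 result=ok ip=10.0.0.5",
    "module=search uid=1003 query=logs latency_ms=12",
]

# A deque with maxlen stands in for a capped collection: once it is full,
# appending a new document drops the oldest one. (A real capped collection
# is sized in bytes; a document count is used here only for simplicity.)
capped = deque(maxlen=2)
for line in lines:
    capped.append(parse_log_line(line))

for doc in capped:
    print(doc)
```

Each line becomes a document with whatever fields it happens to carry, and no schema has to be declared for the new `latency_ms` field. After the third append, only the two most recent documents remain in the buffer.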

RD engineers generally have good product awareness, but compiling product usage statistics really does not interest them. My previous department had a product that fetched data from the product lines, recorded it in a database, and displayed reports, but its flexibility was very low. First, the two sides had to agree on an interface. Second, the statistics still had to be done by RD; the product only saved the work of presenting the data.

The idea now is to build a MongoDB cluster that centralizes the business log data, and then build a platform on top of MongoDB to handle general statistics requirements. Tasks can be written and run on the platform, all in a single language, JavaScript. For requirements that involve a relatively small amount of data, this is a good solution (compared with the logs on the retrieval side, our business system's data is small; a gigabyte a day already counts as large). The main purpose is to solve the maintenance and management problems.
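
As an illustration of what such a statistics task might look like, here is a small sketch of two typical report computations over log documents: events per module and distinct users per module. The article envisions these tasks being written in JavaScript against MongoDB; this version is plain Python over an in-memory list, used only to show the shape of the task. The helper names `count_by` and `distinct_by` and the field names are hypothetical.

```python
from collections import defaultdict


def count_by(docs, field):
    """Count documents per value of `field`, skipping documents without it."""
    counts = defaultdict(int)
    for doc in docs:
        if field in doc:
            counts[doc[field]] += 1
    return dict(counts)


def distinct_by(docs, group_field, value_field):
    """Count distinct `value_field` values within each `group_field` group."""
    seen = defaultdict(set)
    for doc in docs:
        if group_field in doc and value_field in doc:
            seen[doc[group_field]].add(doc[value_field])
    return {group: len(values) for group, values in seen.items()}


# Hypothetical log documents; the field names are illustrative only.
docs = [
    {"module": "search", "uid": "1001"},
    {"module": "search", "uid": "1001"},
    {"module": "search", "uid": "1003"},
    {"module": "login", "uid": "1002"},
    {"ts": "2011-05-30"},  # no module field, so both helpers skip it
]

print(count_by(docs, "module"))
print(distinct_by(docs, "module", "uid"))
```

Because the documents are schema-free, both helpers tolerate missing fields instead of failing, which matters when different modules log different keys.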

ref:http://blog.csdn.net/sandysong28/article/details/6455926
