Use MongoDB to store logs

Source: Internet
Author: User
Tags: mongodb

I've been thinking about architecture lately, and one problem in our business system still bothers us: logs and log statistics. Roughly, the issues are as follows:

  • We have many modules. Their log formats are similar, but each module writes to its own servers and directories.
  • Much of the information in the logs is key=>value data.
  • Whenever a feature goes live, the PM (product manager) or whoever requested it asks for statistics and reports to track how the feature is being used.
  • PMs usually cannot program, so most of the statistics work is handed to RD (the R&D engineers).
  • The value of such statistics and reports diminishes over time. After a while nobody cares about them, yet the statistics programs keep running, and when one eventually needs maintenance, everyone has forgotten where it is deployed.
  • Logs take up storage space and have to be deleted periodically.
  • There are many mirrored web servers, so the logs are often spread over multiple copies that must be merged before processing.
  • The web servers are sometimes reshuffled, and the logs on a server that is taken offline are generally lost.

I am a lazy person, and once a feature is finished I usually find these requests for statistics very tiresome. Data mining of this kind depends heavily on inspiration, so the requirements keep changing: today they want one number, tomorrow another. The ideal situation is a PM who knows SQL, so that RD only has to load the data into a database. The excellent PM in my previous group often worked this way while he was still there, but that is no longer the case. Another problem is that a relational database is not schema-free: the format is not flexible and has to be designed well in advance, which cannot keep up with changing requirements either.

Log statistics usually have the following characteristics:

  • The data volume is large: there may be more than a gigabyte of business data every day.
  • Writes are frequent and reads are infrequent (almost every page view produces several log records).
  • The statistics can be run as tasks and do not need to be real-time.
  • Absolute data consistency is not required.

Given these characteristics, MongoDB is a very appropriate choice, because:

  • It is schema-free, so required fields can be added at any time.
  • It scales well, so there is no need to worry about running out of storage space.
  • Writes can be asynchronous, so logging need not take up much of a request's response time.
  • A collection can be given a fixed size (a capped collection), for example 100 GB. Once it is full, MongoDB reclaims space by overwriting the oldest documents in insertion order (first in, first out, rather than LRU), so there is no need to delete old logs by hand.
  • It supports general query conditions and aggregation and provides a JavaScript shell, so a PM who is interested in analyzing the data can learn to write statistics scripts, which finally frees RD from this work.
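As a rough illustration of the schema-free point, the sketch below turns key=>value log lines into dictionaries that could be stored as documents. The article does not give the exact log format, so the space-separated `key=>value` layout, the field names, and the helper `parse_log_line` are all assumptions made for this example.

```python
import re

# Assumed line format (not specified in the article):
# space-separated key=>value pairs.
PAIR = re.compile(r"(\w+)=>(\S+)")


def parse_log_line(line):
    """Turn one log line into a dict that could be stored as a document."""
    doc = {}
    for key, value in PAIR.findall(line):
        # Store plain integers as numbers so later statistics can sum them.
        doc[key] = int(value) if value.isdigit() else value
    return doc


# Hypothetical sample lines; the second carries extra fields,
# which needs no schema change.
lines = [
    "module=>search uid=>1001 action=>click",
    "module=>upload uid=>1002 size=>2048 ext=>png",
]
docs = [parse_log_line(line) for line in lines]
print(docs)
```

With a driver such as PyMongo, a list like `docs` could then be passed to `Collection.insert_many`. That step needs a running server, so it is left out of the sketch.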

Cultivating product awareness in RD is a good thing, but statistics about how a product is used really do not interest RD. My previous department had a product that pulled data from the product lines, recorded it in a database, and presented it as reports, but its overall flexibility was very low. On one hand the interfaces had to be set up; on the other, RD still had to do the work, and all it saved was the effort of displaying the data.

The idea now is to build a MongoDB cluster that centralizes the business log data, and then to build a platform on top of MongoDB that handles general statistics requirements. Tasks would be written in a single language, JavaScript, and run on that platform. For a relatively small amount of data, this is a good solution: compared with the logs on the retrieval side, our business system's logs are small, and a few gigabytes a day already counts as large. The main purpose is to solve the maintenance and management problems.
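To give a feel for what one statistics task on such a platform might look like, here is a minimal sketch that builds a MongoDB aggregation pipeline counting log documents per value of a field. The article proposes JavaScript for the tasks; Python is used here only to show the shape of the pipeline. The function name `count_by_field` and the fields `module` and `action` are hypothetical and do not come from the article.

```python
def count_by_field(module, field, limit=10):
    """Build an aggregation pipeline that counts one module's logs per value of `field`."""
    return [
        # Keep only log documents written by the given module.
        {"$match": {"module": module}},
        # Group by the chosen field and count the documents in each group.
        {"$group": {"_id": "$" + field, "count": {"$sum": 1}}},
        # Largest groups first.
        {"$sort": {"count": -1}},
        # Return only the top `limit` groups.
        {"$limit": limit},
    ]


pipeline = count_by_field("search", "action")
print(pipeline)
```

The stages are plain documents, so the same pipeline can be passed to `db.collection.aggregate()` in the JavaScript shell or to `Collection.aggregate` in PyMongo. Keeping each task as a small, named pipeline is one possible way to make it obvious what is deployed and where.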
