.NET data statistics system design (small and medium scale)

Source: Internet
Author: User

I have not written anything on Blog Park for nearly a year; since joining the company I have been busy learning its framework and business. I have now taken over an e-commerce data statistics project. I searched Blog Park for statistics project solutions and found nothing, so I ended up designing one myself, and I will keep updating it as development continues. I hope to exchange ideas with you.

Requirements

The project team's e-commerce system has been running for more than 3 years and handles about 20,000 orders per day.

1. Provide statistics from the merchant, customer, and product perspectives, queryable by day, by month, or over any range of days.

2. Support real-time queries on some sensitive data, such as the number of orders placed and the order amount.

3. Rank customers and products by transaction volume indicators (top N) and support export.

4. Compute the trading indicators by region, with top-N rankings.

5. Do not integrate into the legacy systems, to avoid affecting the main business.

Design ideas

The servers run Windows Server 2012 with IIS 7.5 and .NET Framework 4.0; the databases are partly SQL Server and partly MySQL. Based on the current state of the system and the requirements, I decided to use a trigger-based pre-statistics approach to implement the important functions.

What does "trigger-based" mean? Currently, communication between our systems goes through a message middleware. Take order placement as an example: once an order is created, a "create order" task is registered directly with the message system, and the task system dispatches it to the corresponding executors, which carry out the follow-up order tasks.

Taking advantage of this property of the message system, business events such as orders and payments can trigger data collection and processing without touching the main system. That is what trigger-based means here.
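As a rough illustration of the idea, the sketch below adds a statistics collector as one more subscriber on an existing topic. It is written in Python for brevity and is not the project's .NET code; `MessageBus`, `StatsCollector`, the topic name `order.created`, and the message fields are all hypothetical stand-ins for the real middleware.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class MessageBus:
    """Toy in-process stand-in for the message middleware (hypothetical)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Every subscriber of the topic receives the same message.
        for handler in self._handlers[topic]:
            handler(message)


class StatsCollector:
    """Collects order data for statistics; the main system never calls it."""

    def __init__(self) -> None:
        self.order_count = 0
        self.order_amount = 0.0

    def on_order_created(self, message: dict) -> None:
        self.order_count += 1
        self.order_amount += message["amount"]


bus = MessageBus()
collector = StatsCollector()
fulfilled = []

# Existing business subscriber (stands in for the main order workflow).
bus.subscribe("order.created", lambda m: fulfilled.append(m["order_id"]))
# The statistics project only adds one more subscriber to the same topic.
bus.subscribe("order.created", collector.on_order_created)

bus.publish("order.created", {"order_id": 1, "merchant_id": 7, "amount": 120.0})
bus.publish("order.created", {"order_id": 2, "merchant_id": 7, "amount": 80.0})
```

The existing subscriber is unchanged, which is the point: statistics rides on events the business already emits.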

Non-real-time part

As for pre-statistics: in practice, any statistics work ends up relying on this method. When the data volume is small, we usually query the whole table with COUNT, GROUP BY, and similar forms to get the results. In most cases, however, this means fetching a large amount of data over and over again, which is not only time-consuming each time but also causes unacceptable problems with database performance, data transfer, and timeouts.

Therefore, for the parts that can be computed incrementally, we accumulate the results bit by bit: each day we aggregate that day's incremental data and store it in a suitable data structure (a "semi-finished product"). The next statistic or query reads the semi-finished data directly, which involves far less data and avoids traversing the original tables. E-commerce systems in particular have large daily increments and split their databases and tables to improve performance and access speed, but that in turn makes statistics harder.
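A minimal sketch of the semi-finished idea, in Python and with made-up sample data: one pass per day folds raw orders into a daily table, and an arbitrary day-range query then sums daily rows instead of scanning raw orders. The names `raw_orders`, `daily_stats`, `aggregate_day`, and `query_range` are assumptions for illustration; in the real project these would be database tables and scheduled jobs.

```python
from collections import defaultdict
from datetime import date, timedelta

# Raw order rows: (order_date, merchant_id, amount). Hypothetical sample data.
raw_orders = [
    (date(2024, 1, 1), "m1", 100),
    (date(2024, 1, 1), "m1", 50),
    (date(2024, 1, 1), "m2", 30),
    (date(2024, 1, 2), "m1", 70),
]

# Pre-statistics ("semi-finished") table keyed by (day, merchant_id).
daily_stats = defaultdict(lambda: {"orders": 0, "amount": 0})


def aggregate_day(day: date) -> None:
    """Fold one day's orders into the daily table.

    Assumes each day is aggregated exactly once; running it twice for
    the same day would double-count.
    """
    for order_date, merchant_id, amount in raw_orders:
        if order_date == day:
            row = daily_stats[(day, merchant_id)]
            row["orders"] += 1
            row["amount"] += amount


def query_range(merchant_id: str, start: date, end: date) -> dict:
    """Answer a day-range query from daily rows only, never from raw orders."""
    total = {"orders": 0, "amount": 0}
    day = start
    while day <= end:
        row = daily_stats.get((day, merchant_id))
        if row:
            total["orders"] += row["orders"]
            total["amount"] += row["amount"]
        day += timedelta(days=1)
    return total


aggregate_day(date(2024, 1, 1))
aggregate_day(date(2024, 1, 2))
result = query_range("m1", date(2024, 1, 1), date(2024, 1, 2))
```

A monthly table could be built the same way on top of the daily rows, so that long ranges touch even fewer records.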

Real-time part

For indicators with strict real-time requirements, we need stream-style computation. As with the trigger-based tasks, the task maintains a global in-memory variable (different businesses can be kept in separate variables). In addition, because of data freshness and memory size considerations, it needs a timed push to Redis and a timed-and-quantitative flush that moves data out of memory and into the database.

"Timed and quantitative" means that the storage rule fires either when memory has accumulated a certain number of items (for example, 100 customer-product transaction records) or when 1 minute has passed since the last store. Pushing works the same way: although we call it real-time, pushing to Redis on every single change would make Redis pointless. My plan is to push each merchant's data at most once per minute: when the next task fires, it decides whether to push based on the time of the last push. If there are no transactions, of course, nothing is pushed.
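The count-or-time rule can be sketched as a small buffer that checks both thresholds whenever a new record arrives, which matches the idea of deciding on the next triggered task. This is an illustrative Python sketch, not the project's code; `FlushBuffer`, its parameters, and the injected `clock` are assumptions, and `sink` stands in for a database write or a Redis push.

```python
import time
from typing import Callable, List


class FlushBuffer:
    """Buffers records in memory; flushes on item count or elapsed time."""

    def __init__(self, sink: Callable[[List[dict]], None],
                 max_items: int = 100, max_age_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._sink = sink
        self._max_items = max_items
        self._max_age = max_age_seconds
        self._clock = clock
        self._items: List[dict] = []
        self._last_flush = clock()

    def add(self, record: dict) -> None:
        self._items.append(record)
        self._flush_if_due()

    def _flush_if_due(self) -> None:
        too_many = len(self._items) >= self._max_items
        too_old = self._clock() - self._last_flush >= self._max_age
        if self._items and (too_many or too_old):
            self._sink(self._items)
            self._items = []
            self._last_flush = self._clock()


# Usage with a fake clock so the example does not have to wait a minute.
now = [0.0]
stored: List[List[dict]] = []
buf = FlushBuffer(stored.append, max_items=3, max_age_seconds=60.0,
                  clock=lambda: now[0])
buf.add({"id": 1})
buf.add({"id": 2})      # below both thresholds, nothing stored yet
buf.add({"id": 3})      # count threshold reached, first batch stored
now[0] = 61.0
buf.add({"id": 4})      # more than 60 s since last flush, second batch stored
```

Because the check only runs inside `add`, an idle buffer never flushes on its own; with no transactions, nothing is pushed.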

Database and table design

For the statistics project, I query from a replica database. The replica has the same table structure as the primary database and is read-only. In principle, the tables are designed as statistics tables and pre-statistics tables, split by month and by data volume so that each table is expected to hold no more than 10 million rows. I split the high-volume tables into 128 shards and the low-volume ones into 16.
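One possible way to route a record to its monthly shard is sketched below in Python. The naming scheme `base_YYYYMM_NNN`, the function `shard_table`, and the choice of CRC32 as the hash are all assumptions for illustration; the source does not say how the shards are keyed or named.

```python
import zlib
from datetime import date


def shard_table(base: str, key: str, day: date, shards: int) -> str:
    """Return a physical table name for a logical table, shard key and date."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    index = zlib.crc32(key.encode("utf-8")) % shards
    return f"{base}_{day:%Y%m}_{index:03d}"


# A high-volume table split into 128 shards, a low-volume one into 16.
customer_table = shard_table("customer_stat", "customer-42", date(2024, 1, 15), 128)
region_table = shard_table("region_stat", "beijing", date(2024, 2, 1), 16)
```

The same key and month always map to the same table, so a per-customer query only has to touch one shard per month.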

In extreme cases, you can also split the data across multiple databases to increase the number of available connections.

Process Design

To improve system performance and responsiveness, a long cache + short cache approach is used: for example, a long cache (2-24 hours) for general data that rarely changes, and a short cache (1-5 minutes) for sensitive data with real-time requirements.
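The two cache lifetimes can be sketched with a single time-to-live cache where the caller picks the TTL per key. This Python sketch is illustrative only; `TtlCache`, `get_or_load`, the loader functions, and the specific TTL constants (chosen from the low end of each range above) are assumptions.

```python
import time
from typing import Any, Callable, Dict, Tuple

LONG_TTL = 2 * 60 * 60   # 2 hours: low end of the long-cache range
SHORT_TTL = 60           # 1 minute: low end of the short-cache range


class TtlCache:
    """In-memory cache whose entries expire after a per-key TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, ttl: float, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh, skip the loader
        value = loader()
        self._entries[key] = (now + ttl, value)
        return value


# Usage with a fake clock and loaders that count how often they run.
now = [0.0]
calls = {"regions": 0, "today": 0}


def load_regions() -> list:
    calls["regions"] += 1
    return ["north", "south"]


def load_today_amount() -> int:
    calls["today"] += 1
    return 1000 * calls["today"]


cache = TtlCache(clock=lambda: now[0])
cache.get_or_load("regions", LONG_TTL, load_regions)
cache.get_or_load("today_amount", SHORT_TTL, load_today_amount)
now[0] = 120.0  # two minutes later
regions = cache.get_or_load("regions", LONG_TTL, load_regions)
today = cache.get_or_load("today_amount", SHORT_TTL, load_today_amount)
```

After two minutes the rarely changing region list is still served from cache, while the sensitive amount has expired and is reloaded.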

For slower pages, you can also cache the home page, cache the results for the default query parameters, and so on, to speed up page loads.

Message System Architecture

Trigger-based statistics is built on the message middleware, so its architecture and design ideas are presented here for reference. Having a message system brings several advantages:

1. It decouples dependencies between systems.

2. It runs time-consuming tasks asynchronously.

3. It provides reliable logging and a retry mechanism.

4. It supports distributed deployment and scaling out.

Statistical Task Center

Trigger-based statistics has two core components: the message system and the statistics task center. The task center is primarily responsible for collecting transaction data, managing the pre-statistics results, and pushing them (to Redis and the database). Taking payment as an example, the task center is a global singleton that is responsible for collecting data and managing it (statistics, storage, push).
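A global singleton that several task workers write into needs some form of locking. The Python sketch below shows one hedged way to do that; `StatsTaskCenter`, `collect_payment`, and `snapshot` are hypothetical names, amounts are kept in integer cents as an assumption, and the real project would implement this in .NET.

```python
import threading
from collections import defaultdict
from typing import Dict


class StatsTaskCenter:
    """Process-wide accumulator for payment statistics (illustrative only)."""

    _instance = None
    _instance_lock = threading.Lock()

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._paid_by_merchant: Dict[str, int] = defaultdict(int)

    @classmethod
    def instance(cls) -> "StatsTaskCenter":
        # Double-checked creation so concurrent callers share one object.
        if cls._instance is None:
            with cls._instance_lock:
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance

    def collect_payment(self, merchant_id: str, amount_cents: int) -> None:
        with self._lock:
            self._paid_by_merchant[merchant_id] += amount_cents

    def snapshot(self) -> Dict[str, int]:
        """Copy of current totals, e.g. for pushing to Redis or the database."""
        with self._lock:
            return dict(self._paid_by_merchant)


def worker() -> None:
    # Each worker stands in for a payment task dispatched by the message system.
    for _ in range(1000):
        StatsTaskCenter.instance().collect_payment("m1", 5)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
totals = StatsTaskCenter.instance().snapshot()
```

Storage and push would read from `snapshot()` under the timed-and-quantitative rule described in the real-time part.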

Summary

In this post I have shared the overall architecture and design ideas of the statistics project. The project has not formally started yet, so the parts of the design that turn out to be unreasonable may be adjusted later. I intend to share the implementation of the statistics project as a series, and I hope that some readers will benefit from it and that others can offer advice.
