C++ Distributed Real-Time Application Framework: System Management Module

Source: Internet
Author: User

A distributed real-time cluster can easily grow to hundreds of machines, and at that scale it has to be treated as a "closed" system. Operating hundreds of machines by hand is no longer practical, so every operation on the cluster, or on a single node within it, must be carried out through interfaces that the system provides. A commercial distributed real-time system has to answer several questions: how to absorb sudden peaks in business traffic; how to detect failed nodes promptly and clean up after them; how to balance load across nodes with different processing capacity; how to apply overload protection before excessive pressure brings the system down; and how to support grayscale releases, in which test containers and production containers run in the same network. These are the problems the System Management module has to solve, and they are key indicators of whether a system is commercially viable and intelligent enough.

The System Management module has two parts: a service part (Smartservice) and a management part (Smartmanger). Smartservice exposes RESTful interfaces for querying and operating the cluster, so it can be connected easily to management terminals of all kinds (PC, iOS, Android) to provide management through a user interface. The framework also offers a simple secondary-development interface for adding system-specific management interfaces, for example adjusting the log level, tracing logs for a single number, managing cluster configuration, and querying real-time cluster topology data. With hundreds of machines in a cluster, manual maintenance is no longer realistic, so automatic detection and autonomous operation become essential; the automatic load management function of Smartmanger covers this part. The System Management module also works together with the state center and the communication platform, and none of the three can be left out.
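As a rough illustration of how a RESTful management interface with a secondary-development hook might be organized, the sketch below keeps a table of handlers keyed by HTTP method and path. The class name `ManagementApi`, the `on` and `dispatch` functions, and the example path are all hypothetical; the article does not describe the actual Smartservice API, and a real service would sit behind an HTTP server rather than being called directly.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical route table: maps (HTTP method, path) to a handler.
// None of these names come from the framework described in the article.
class ManagementApi {
public:
    using Handler = std::function<std::string(const std::string& body)>;

    // Register a handler; system-specific interfaces would be added this way.
    void on(const std::string& method, const std::string& path, Handler handler) {
        routes_[std::make_pair(method, path)] = std::move(handler);
    }

    // Look up the handler and run it; returns "404" when no route matches.
    std::string dispatch(const std::string& method, const std::string& path,
                         const std::string& body) const {
        auto it = routes_.find(std::make_pair(method, path));
        if (it == routes_.end()) return "404";
        return it->second(body);
    }

private:
    std::map<std::pair<std::string, std::string>, Handler> routes_;
};
```

A log-level interface, for instance, could be registered with `api.on("PUT", "/log/level", handler)`, where the handler applies the level carried in the request body.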

Each feature is described in detail below:

I. Automatic Load Management

Automatic load management collects information such as latency, container type, and traffic from each business container node, aggregates it across all nodes in the cluster, assesses the current state of the cluster, and takes the action that matches the situation:

1. A container is faulty and cannot process business normally: the failed node is removed from the cluster.

2. A single container lacks processing capacity and its business processing times out: traffic to that node is throttled.

3. A whole class of containers lacks processing capacity and business processing times out across the class: that class of containers is scaled out.

4. A class of containers has spare capacity and its traffic meets the scale-in condition: that class of containers is scaled in.

5. The processing capacity of the cluster has reached its limit and a system crash is possible: overload protection is applied to the cluster.
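The five cases above can be sketched as two small decision functions, one per node and one per class of containers. This is only an illustration under assumptions: the `NodeStats` fields, the `Action` names, the `can_expand` flag, and the `shrink_below` threshold are hypothetical, and the article does not say how the real module weighs its inputs.

```cpp
#include <cstddef>
#include <vector>

enum class Action { None, RemoveNode, ThrottleNode, ScaleOut, ScaleIn, OverloadProtect };

// Hypothetical per-node report; field names are assumptions for illustration.
struct NodeStats {
    bool healthy;     // false when the node can no longer process business
    bool timing_out;  // true when business processing on this node times out
    double load;      // utilisation in the range [0, 1]
};

// Cases 1 and 2: decisions that concern a single node.
Action decide_node(const NodeStats& n) {
    if (!n.healthy) return Action::RemoveNode;
    if (n.timing_out) return Action::ThrottleNode;
    return Action::None;
}

// Cases 3 to 5: decisions that concern a whole class of containers.
Action decide_class(const std::vector<NodeStats>& nodes, bool can_expand,
                    double shrink_below) {
    std::size_t healthy = 0;
    std::size_t timing_out = 0;
    double total_load = 0.0;
    for (const NodeStats& n : nodes) {
        if (!n.healthy) continue;  // failed nodes are handled by decide_node
        ++healthy;
        total_load += n.load;
        if (n.timing_out) ++timing_out;
    }
    // Every usable node is saturated (or none is left): expand if possible,
    // otherwise fall back to overload protection.
    if (healthy == 0 || timing_out == healthy) {
        return can_expand ? Action::ScaleOut : Action::OverloadProtect;
    }
    // Plenty of headroom and more than one node: give resources back.
    if (healthy > 1 && timing_out == 0 && total_load / healthy < shrink_below) {
        return Action::ScaleIn;
    }
    return Action::None;
}
```

Separating node-level from class-level decisions mirrors the list: a single slow node is throttled, while scaling is considered only when the whole class is affected.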

II. Automatic Removal of Failed Nodes from the Network

When a business node hits an unrecoverable failure and can no longer process business normally, the System Management module detects this automatically and removes the failed node from the business cluster, so that the rest of the cluster keeps working normally.
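The article does not say how failures are detected. One common approach, assumed here purely for illustration, is a heartbeat timeout: a node that has not reported within a fixed interval is treated as failed and dropped from the membership table. The `FailureDetector` class and its methods are hypothetical names.

```cpp
#include <chrono>
#include <string>
#include <unordered_map>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical heartbeat-based detector; not the framework's actual mechanism.
class FailureDetector {
public:
    explicit FailureDetector(std::chrono::milliseconds timeout) : timeout_(timeout) {}

    // Record that a node reported in at time `now` (also adds new nodes).
    void heartbeat(const std::string& node, Clock::time_point now) {
        last_seen_[node] = now;
    }

    // Remove every node whose last heartbeat is older than the timeout
    // and return the names of the removed nodes.
    std::vector<std::string> evict_failed(Clock::time_point now) {
        std::vector<std::string> failed;
        for (auto it = last_seen_.begin(); it != last_seen_.end();) {
            if (now - it->second > timeout_) {
                failed.push_back(it->first);
                it = last_seen_.erase(it);
            } else {
                ++it;
            }
        }
        return failed;
    }

    bool is_member(const std::string& node) const {
        return last_seen_.count(node) != 0;
    }

private:
    std::chrono::milliseconds timeout_;
    std::unordered_map<std::string, Clock::time_point> last_seen_;
};
```

Passing the current time in as a parameter keeps the sketch easy to test; a real detector would also need to distinguish a slow node from a dead one before removing it.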

III. Node Flow Control

When a node's processing capacity drops, for example while it is performing log tracing, the System Management module reduces the number of messages sent to that node according to its processing capacity, which gives real-time load balancing.
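One way to send fewer messages to a weaker node is weighted distribution. The sketch below uses smooth weighted round-robin, which is a substitute technique chosen for illustration and not something the article attributes to the framework; `WeightedBalancer`, `set_weight`, and `next` are hypothetical names.

```cpp
#include <string>
#include <vector>

// Smooth weighted round-robin: each node receives a share of messages
// proportional to its weight, spread evenly over time.
class WeightedBalancer {
private:
    struct Node {
        std::string name;
        int weight;   // relative processing capacity, 0 disables the node
        int current;  // running score used by the algorithm
    };
    std::vector<Node> nodes_;

public:
    // Add a node or update its weight, e.g. lower it during log tracing.
    void set_weight(const std::string& name, int weight) {
        if (weight < 0) weight = 0;
        for (Node& n : nodes_) {
            if (n.name == name) {
                n.weight = weight;
                return;
            }
        }
        nodes_.push_back(Node{name, weight, 0});
    }

    // Pick the node for the next message; returns "" if no node has weight.
    std::string next() {
        int total = 0;
        Node* best = nullptr;
        for (Node& n : nodes_) {
            if (n.weight == 0) continue;  // throttled to zero: skip entirely
            n.current += n.weight;
            total += n.weight;
            if (best == nullptr || n.current > best->current) best = &n;
        }
        if (best == nullptr) return std::string();
        best->current -= total;
        return best->name;
    }
};
```

With weights 3 and 1, three of every four messages go to the first node; lowering a node's weight while it is busy shifts traffic to its peers without removing it from the cluster.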

IV. Dynamic Scaling

When a class of business containers runs short of processing capacity, the system automatically scales it out online, and business is not affected during the expansion. When processing capacity is in surplus, the system automatically scales it in online, releasing resources for the business that needs them.
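A scaling decision of this kind can be sketched as a function from the current instance count and average utilisation to a desired instance count. The thresholds, the `ScalePolicy` struct, and the proportional formula are assumptions made for illustration; the article does not give the real scaling rule.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical policy; all fields are illustrative assumptions.
struct ScalePolicy {
    double scale_out_above;  // expand when average utilisation exceeds this
    double scale_in_below;   // shrink when average utilisation falls below this
    double target;           // utilisation to aim for after scaling
    int min_instances;
    int max_instances;
};

// Return the number of instances the class of containers should run.
int desired_instances(int current, double avg_utilisation, const ScalePolicy& p) {
    int desired = current;
    // Inside the band between the two thresholds nothing changes, which
    // avoids flapping between expansion and shrinkage.
    if (avg_utilisation > p.scale_out_above || avg_utilisation < p.scale_in_below) {
        desired = static_cast<int>(std::ceil(current * avg_utilisation / p.target));
    }
    return std::max(p.min_instances, std::min(p.max_instances, desired));
}
```

Keeping a gap between the scale-out and scale-in thresholds is a deliberate choice: without it, a load that hovers around a single threshold would trigger repeated expansion and shrinkage.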

V. Node Overload Protection

When the processing capacity of the whole cluster has reached its limit and no further expansion is possible, the system applies overload protection according to the business situation to prevent a crash, for example by discarding initial authentication requests.
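The example of discarding initial authentication requests can be sketched as an admission check. The `RequestKind` values, the `ClusterState` fields, and the `limit` parameter are hypothetical; which requests are safe to drop depends on the business, as the article notes.

```cpp
// Hypothetical request classification for illustration only.
enum class RequestKind { InitialAuthentication, InSession };

// Hypothetical cluster summary.
struct ClusterState {
    bool can_expand;     // false once no further expansion is possible
    double utilisation;  // cluster-wide utilisation in the range [0, 1]
};

// Decide whether a request should be processed or discarded.
bool should_accept(RequestKind kind, const ClusterState& state, double limit) {
    const bool overloaded = !state.can_expand && state.utilisation >= limit;
    if (!overloaded) return true;
    // Under overload, shed new arrivals and keep serving requests that
    // belong to work already in progress.
    return kind != RequestKind::InitialAuthentication;
}
```

Dropping work at its entry point is a plausible reading of the example: a rejected initial authentication costs little, while abandoning a request in mid-session wastes the processing already spent on it.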

VI. Grayscale Release

The system supports grayscale releases: test nodes and normal business nodes run in the same network, and test numbers are routed to the test nodes for processing without affecting other, normal numbers.
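Routing by test number can be sketched as a lookup against a set of numbers marked for testing. The `GrayRouter` class, its method names, and the pool labels are hypothetical; the article does not describe how the real routing table is maintained.

```cpp
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical router: numbers on the test list go to the test nodes,
// every other number goes to the normal business nodes.
class GrayRouter {
public:
    GrayRouter(std::string normal_pool, std::string test_pool)
        : normal_pool_(std::move(normal_pool)), test_pool_(std::move(test_pool)) {}

    void add_test_number(const std::string& number) { test_numbers_.insert(number); }
    void remove_test_number(const std::string& number) { test_numbers_.erase(number); }

    // Return the name of the pool that should process this number.
    const std::string& route(const std::string& number) const {
        return test_numbers_.count(number) != 0 ? test_pool_ : normal_pool_;
    }

private:
    std::string normal_pool_;
    std::string test_pool_;
    std::unordered_set<std::string> test_numbers_;
};
```

Because the decision depends only on the number, removing a number from the test list returns it to the normal nodes without touching any other traffic.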
