# [R&D Solution] Summoner: a Distributed Parallel Computing Scheduling and Management System

Author: Zheng Yu. Keywords: commission calculation, scheduled tasks, data extraction, data cleansing, data computing, Java, Redis, MySQL, ZooKeeper, Azkaban2, Oozie, Mesos

Summoner is a distributed parallel computing scheduling and management system based on MySQL + Redis + ZooKeeper, launched by the Guoxi department.

## 0x00. Why does data computing need parallel scheduling?

Many of you have probably done this: large-scale, multi-step, interdependent data computation over MySQL databases, implemented either as a pile of timer tasks that depend on each other or as one big process that runs end to end. Take our O2O business system as an example: it serves a sales force of three to four thousand people across multiple business lines, with an organizational structure of region, city, and sales group, and it must compute both yesterday's commission and the current month's commission. The challenges are:
  • The data involves merchants, stores, transactions, discounts, and written-off materials; the volume is large and must be computed at least once a day.
  • The incentive policies and commission formulas change every month or two as the competition changes.
  • Data extraction must affect normal business services as little as possible.
  • After the computing logic is adjusted, the new version must be quickly deployable and runnable.
So in the past we defined scheduled tasks that extract data from the various business databases (all of which are sharded into multiple databases and tables) every early morning, both full and incremental, which means the data volume is large. They compute the number of contracts, number of shops, and transaction volume, attribute the performance to each BD, and then run the various calculations for BD, BD supervisor, and city manager according to the commission formula of each business line. Although our JobCenter is an excellent scheduled-task scheduling and management platform, it has no concept of steps, that is, of dependencies between scheduled tasks. In the past we had to guess blindly: run Job1 at one early-morning hour, Job2 at the next, and Job3 and Job4 after that. This is obviously helpless: what if Job1 runs past its slot? What if Job1 fails? What are steps, then? They can be understood as the dependencies between the stages of one large computing task (Figure 1). To cope with large-scale data extraction and chained computation, and to make full use of machine resources, we decided to build a user-friendly, step-aware data computing system with cluster scheduling as a separate project.

## 0x01. Stones from other mountains: Azkaban2 / Oozie / Mesos

There is a lot to learn from existing computing-resource schedulers, such as Azkaban2 and Oozie for Hadoop cluster scheduling and management, and Apache Mesos, a distributed resource-management framework with a higher level of abstraction. At the start of the project I hoped to borrow some of the excellent design ideas of Oozie and Azkaban2: what we do is also scheduling and management, but theirs is based on Hadoop while ours is based on MySQL. The characteristics of those scheduling systems left a deep impression on me, so we started designing our own. The final result is not bad, and we will introduce it below. Later we also used Apache Mesos for our container private cloud.
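The "steps" idea above is just a dependency graph over jobs: a job may run only after every job it depends on has completed. A minimal sketch of that scheduling order, using Kahn's topological sort over the Job1–Job4 example, looks like this (the class and method names are illustrative, not Summoner's actual API):

```java
import java.util.*;

// Illustrative sketch of step scheduling: jobs form a DAG, and a valid
// execution order is any topological order of that DAG (Kahn's algorithm).
public class WorkflowDag {
    private final Map<String, List<String>> edges = new HashMap<>(); // job -> downstream jobs
    private final Map<String, Integer> indegree = new HashMap<>();   // job -> unmet dependencies

    public void addTask(String name) {
        edges.putIfAbsent(name, new ArrayList<>());
        indegree.putIfAbsent(name, 0);
    }

    // "to" may run only after "from" completes.
    public void addDependency(String from, String to) {
        addTask(from); addTask(to);
        edges.get(from).add(to);
        indegree.merge(to, 1, Integer::sum);
    }

    // Returns one valid execution order, or throws if the graph has a cycle.
    public List<String> executionOrder() {
        Map<String, Integer> deg = new HashMap<>(indegree);
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : deg.entrySet())
            if (e.getValue() == 0) ready.add(e.getKey());
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String job = ready.poll();
            order.add(job);
            for (String next : edges.get(job))
                if (deg.merge(next, -1, Integer::sum) == 0) ready.add(next);
        }
        if (order.size() != edges.size())
            throw new IllegalStateException("cycle detected in workflow");
        return order;
    }

    public static void main(String[] args) {
        WorkflowDag dag = new WorkflowDag();
        dag.addDependency("Job1", "Job2"); // Job2 waits for Job1
        dag.addDependency("Job1", "Job3");
        dag.addDependency("Job2", "Job4");
        dag.addDependency("Job3", "Job4");
        System.out.println(dag.executionOrder()); // prints one valid order, Job1 first, Job4 last
    }
}
```

With dependencies expressed this way, "what if Job1 runs long?" stops being a problem: Job2 simply does not start until Job1 reports completion, rather than at a fixed wall-clock time.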
I think Mesos, as a highly abstract resource scheduling and management system, fits our parallel data-computing scenario very well, so my assumption was: we write a scheduler that communicates with Mesos and tells it which command to execute, and Mesos takes care of scheduling across the entire cluster. The project and console we would write are a bit like Marathon or Chronos sitting on top of Mesos. We write the code that extracts raw data from the different data sources and calculates commissions, compress it into a jar package, and place it on the Mesos master. Once configured, when a Mesos slave actually receives the scheduling command, it downloads the jar package from the master node and executes it, and so on. In this way Mesos can save us a lot of development work.

## 0x02. Summoner features

Now let's describe our distributed parallel computing scheduling system for data computing: Summoner. We call one large computing task a "workflow". A workflow contains multiple tasks, and you can visually establish the dependencies between tasks. A Quartz cron expression can be set so that a workflow executes on a schedule, and you can intuitively view the progress, execution logs, exception logs, and status of each task's execution. Tasks can also be reused: one task can belong to multiple workflows. So when the commission calculation rules change, we only need to reuse some existing tasks, add a few new ones, and create a new workflow that strings them together, while disabling the original workflow. The client (a jar package) responsible for task execution registers itself automatically through ZooKeeper, so the system knows how many machine nodes are available to execute a task.
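Summoner relies on Quartz's full cron syntax for workflow schedules; as a toy illustration of how a cron field gates execution (real Quartz `CronExpression` handles seconds, ranges, steps, and day-of-week logic that this sketch does not), here is a simplified matcher for minute and hour fields:

```java
// Toy illustration of cron-style triggering: a workflow fires when the
// current (minute, hour) matches its cron fields. Summoner itself uses
// Quartz's full CronExpression; this sketch only handles "*", single
// values, and comma-separated lists.
public class CronField {
    // Does the given value satisfy a field like "*", "30", or "0,15,30,45"?
    static boolean matches(String field, int value) {
        if (field.equals("*")) return true;
        for (String part : field.split(","))
            if (Integer.parseInt(part.trim()) == value) return true;
        return false;
    }

    // Fire when both fields match; e.g. minute "0", hour "3" = 03:00 daily,
    // a plausible slot for the nightly extraction workflow.
    static boolean shouldFire(String minuteField, String hourField, int minute, int hour) {
        return matches(minuteField, minute) && matches(hourField, hour);
    }

    public static void main(String[] args) {
        System.out.println(shouldFire("0", "3", 0, 3));  // true  -> it is 03:00
        System.out.println(shouldFire("0", "3", 30, 3)); // false -> 03:30 does not match
    }
}
```

The point of handing this to Quartz rather than fixed timers is that the schedule only decides when the *workflow* starts; the order of tasks inside it is driven by the dependency graph, not by the clock.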
So if 10 clients have registered for Task B and Task A extracts 10 million transaction records, the system splits the records into 10 shards and sends one to each Task B client; Task B thus computes in parallel across multiple machine nodes, after which the system schedules Task C. The menu functions include:
  • Resource Configuration Management
    • Workflow Management
    • Task Management
    • Dependency Management
    • Registration Management (client registration and server registration)
  • Task Scheduling Management
    • Scheduling Management
  • Real-time Data Management
    • Workflow execution
  • Scheduling log management
    • Scheduling log
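The sharding behavior described above (10 million records split across 10 registered clients) can be sketched as follows; the class and method names are hypothetical, not Summoner's real API, and a contiguous-range split is assumed:

```java
import java.util.*;

// Sketch of Summoner's task-sharding idea: the records produced by Task A
// are split evenly across however many clients are registered for Task B.
public class Sharder {
    // Split [0, total) into one contiguous half-open range per client,
    // handing any remainder out one extra record at a time to the first shards.
    static List<long[]> shard(long total, int clients) {
        List<long[]> ranges = new ArrayList<>();
        long base = total / clients, rem = total % clients, start = 0;
        for (int i = 0; i < clients; i++) {
            long size = base + (i < rem ? 1 : 0);
            ranges.add(new long[]{start, start + size}); // [start, end)
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 10 million transaction records, 10 registered Task B clients:
        // each client receives a 1,000,000-record range.
        for (long[] r : shard(10_000_000L, 10))
            System.out.println(r[0] + " .. " + r[1]);
    }
}
```

Because clients register through ZooKeeper, the number of shards adapts automatically: if only 7 clients are alive tomorrow morning, the same call simply produces 7 larger ranges.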
The following is the homepage workbench, where we can see how many workflows under our account have executed, failed, been paused, or been canceled, along with system alarms and notifications (Figure 2: Summoner homepage workbench). First, we create a workflow (Figure 3: resource configuration management, workflow management). We also need to create tasks; the real executor of a task is a task-processing class implemented in Java (Figure 4: task management; Figure 5: editing a task). Next, we establish the dependencies between tasks (Figure 6: dependency management) and then manage the workflow (Figure 7: workflow graph management). We can make a workflow execute immediately and observe its progress (Figure 8: scheduling log management) as well as the progress of each task (Figure 9: workflow execution details). Different nodes in the cluster may take part in a workflow execution; the logs these nodes generate are aggregated by Flume and then displayed on the platform in real time (Figure 10: workflow execution log; Figure 11: client registration; Figure 12: server registration; Figure 13: system notification). Summoner and JobCenter each have their own application scenarios. We will also keep learning from Mesos's advanced concepts to further improve Summoner's cluster scheduling capability.

-EOF-

20160108 note: at the end of 2015 I also saw Dangdang's "distributed task scheduling framework: 10 features of Dangdang's elastic-job open source project", which has some similar concepts, for example task sharding and distribution, and is worth further study.

Welcome to read my other e-commerce articles, and welcome to subscribe to my WeChat account "Veterans note".
