ElasticJob Source Analysis: Missed-Task Compensation (Misfire) and Idempotency

Source: Internet
Author: User
Tags: event listener

If a scheduled run has not finished, for whatever reason, by the time the next trigger fires, the same job instance would end up with two threads processing the same shard, and both threads could pick up the same data. To keep the same data from being executed more than once, ElasticJob introduces an idempotency mechanism that ensures the same data is not processed by multiple job instances at the same time, and is not processed by multiple threads within the same job instance. As a reminder, what ElasticJob distributes is data: one job runs on multiple job instances, and each instance processes part of the job's data (its data shards).
This article looks at how ElasticJob achieves the following two points.
1) How ElasticJob ensures that multiple threads in the same job instance do not process the same data.
2) How ElasticJob ensures that the same data is not processed by multiple job instances.
To handle these cases, ElasticJob introduces missed-task compensation (misfire) and an idempotency mechanism (monitorExecution).

1. How ElasticJob ensures that multiple threads in the same job instance do not process the same data
Scenario: suppose a job is scheduled every 5s and each run normally takes 2s. During some period, database pressure causes a run that should take 2s to take 16s. While that run is in progress, a new trigger still fires every 5s. If nothing controls this, each trigger in the same instance queries the database with the same shard conditions and may fetch the same (or partly the same) data, so the same task data is processed several times. If the job handles something like money transfers and the business method is not idempotent, the consequences are serious. Can ElasticJob prevent this?
The answer is yes. ElasticJob provides a configuration parameter, monitorExecution=true, that turns on idempotency.
After a job is triggered, the processing logic is executed. The entry point is the jobFacade.misfireIfRunning check in AbstractElasticJobExecutor:

    if (jobFacade.misfireIfRunning(shardingContexts.getShardingItemParameters().keySet())) {  // @1
        if (shardingContexts.isAllowSendJobEvent()) {  // @2
            jobFacade.postJobStatusTraceEvent(shardingContexts.getTaskId(), State.TASK_FINISHED, String.format(
                    "Previous job '%s' - shardingItems '%s' is still running, misfired job will start after previous job completed.", jobName,
                    shardingContexts.getShardingItemParameters().keySet()));
        }
        return;
    }

Code @1: When a scheduled trigger fires and the previous run has not finished, the shards are marked as misfired, which records that an execution was missed.
Code @2: If the shards were marked as misfired and event tracing is enabled, a trace event is saved to the database.
Next, the logic behind jobFacade.misfireIfRunning is analyzed in detail. The method shown below is misfireIfHasRunningItems:

    /**
     * If any of the given sharding items is still running, mark them as having missed an execution.
     *
     * @param items sharding items to mark as misfired
     * @return whether this execution was missed
     */
    public boolean misfireIfHasRunningItems(final Collection<Integer> items) {
        if (!hasRunningItems(items)) {
            return false;
        }
        setMisfire(items);
        return true;
    }

If any shard has not finished, setMisfire(items) is called. When the idempotency mechanism is enabled (monitorExecution is true), ElasticJob creates the node ${namespace}/jobname/sharding/{item}/running when a shard's task starts and deletes it when the task ends. Checking whether a shard is running therefore only requires checking whether that node exists; if it does, setMisfire is called.
PS: ${namespace}/jobname/sharding/{item}/running is created only when idempotency (monitorExecution) is enabled, so the misfire mechanism takes effect only in that case.
ExecutionService#setMisfire

    /**
     * Mark the given sharding items as having missed an execution.
     *
     * @param items sharding items to mark as misfired
     */
    public void setMisfire(final Collection<Integer> items) {
        for (int each : items) {
            jobNodeStorage.createJobNodeIfNeeded(ShardingNode.getMisfireNode(each));
        }
    }

setMisfire creates a persistent node ${namespace}/jobname/sharding/{item}/misfire for every shard assigned to the instance. Note that if any one of the shards assigned to the instance has not finished, all shards of that instance get a misfire node. The current trigger is then skipped, and the missed execution waits until the running task finishes.
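The running and misfire nodes described above can be modeled with a small in-memory sketch. This is only an illustration under stated assumptions: ShardNodeSketch and its methods are hypothetical names, and a plain set of path strings stands in for the registry center. It is not ElasticJob's actual storage code.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory stand-in for the registry-center nodes described above. */
class ShardNodeSketch {
    // Existing node paths, e.g. "sharding/0/running" or "sharding/0/misfire".
    private final Set<String> nodes = new TreeSet<>();

    /** Called when a shard starts: creates its running node. */
    void markRunning(int item) {
        nodes.add("sharding/" + item + "/running");
    }

    /** Called when a shard finishes: deletes its running node. */
    void markCompleted(int item) {
        nodes.remove("sharding/" + item + "/running");
    }

    /** True if at least one of the given shards still has a running node. */
    boolean hasRunningItems(Collection<Integer> items) {
        for (int each : items) {
            if (nodes.contains("sharding/" + each + "/running")) {
                return true;
            }
        }
        return false;
    }

    /** Same shape as misfireIfHasRunningItems: flag every shard if any one is running. */
    boolean misfireIfHasRunningItems(Collection<Integer> items) {
        if (!hasRunningItems(items)) {
            return false;
        }
        for (int each : items) {
            nodes.add("sharding/" + each + "/misfire");
        }
        return true;
    }

    /** True if the given shard currently carries a misfire flag. */
    boolean isMisfired(int item) {
        return nodes.contains("sharding/" + item + "/misfire");
    }
}
```

With shards 0 and 1 assigned to the instance, finishing shard 0 while shard 1 is still running and then calling misfireIfHasRunningItems flags both shards, which matches the "all shards of the instance" behavior noted above.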
AbstractElasticJobExecutor#execute

    execute(shardingContexts, JobExecutionEvent.ExecutionSource.NORMAL_TRIGGER);
    while (jobFacade.isExecuteMisfired(shardingContexts.getShardingItemParameters().keySet())) {
        jobFacade.clearMisfire(shardingContexts.getShardingItemParameters().keySet());
        execute(shardingContexts, JobExecutionEvent.ExecutionSource.MISFIRE);
    }

After the task executes, ElasticJob checks whether a ${namespace}/jobname/sharding/{item}/misfire node exists. If it does, the misfire nodes are cleared first and then the task is executed again.
Summary of ElasticJob's misfire implementation:
When the next scheduling cycle arrives and any shard of the instance is found to be still executing, all of the instance's shards are marked as misfired. Once the running task finishes, the missed execution is carried out once for all of them.
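The flow summarized above can be traced with a minimal single-instance model. This is a sketch under assumptions, not the library's code: MisfireFlowSketch is a hypothetical class, two boolean fields stand in for the running and misfire nodes, and the log strings only borrow the NORMAL_TRIGGER and MISFIRE names from the excerpt.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical single-instance model of the trigger flow summarized above. */
class MisfireFlowSketch {
    private boolean running;   // stands in for the running nodes
    private boolean misfired;  // stands in for the misfire nodes
    final List<String> log = new ArrayList<>();

    /** One scheduler trigger; jobBody is the business logic for this instance's shards. */
    void trigger(Runnable jobBody) {
        if (running) {
            // The previous run is still in progress: set the flag and skip this trigger.
            misfired = true;
            log.add("SKIPPED");
            return;
        }
        run(jobBody, "NORMAL_TRIGGER");
        // After the run, compensate for as long as a misfire flag is present.
        while (misfired) {
            misfired = false;
            run(jobBody, "MISFIRE");
        }
    }

    private void run(Runnable jobBody, String source) {
        running = true;
        try {
            log.add(source);
            jobBody.run();
        } finally {
            running = false;
        }
    }
}
```

If a second trigger arrives while the first run is still inside jobBody, the log reads NORMAL_TRIGGER, SKIPPED, MISFIRE: the overlapping trigger does no work itself, and the compensating run starts only after the first one has finished.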

2. How ElasticJob ensures that data is not processed by multiple job instances
ElasticJob is based on data sharding: each shard uses its sharding parameter (manually configured) to query its own portion of the data from the database. If a node goes down, the shards are reassigned. If resharding happened while a task was still unfinished and the shards were then executed, would the same data be processed by different instances at the same time?
The answer is no. When a node goes down, an event listener detects that the node representing the job instance has been deleted and sets the resharding flag. Resharding must then take place before the next trigger executes its processing logic, and it can only proceed once all shards of the job have finished executing. This also depends on whether idempotency (monitorExecution) is enabled: if it is, ElasticJob can tell which shards are currently executing, and resharding waits until all current tasks are complete, so different nodes never process the same data.
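The "wait until nothing is running, then reshard" rule can be sketched as a simple polling loop. Everything here is an assumption made for illustration: ReshardWaitSketch, reshardWhenIdle, and the poll interval are hypothetical, and the running check is passed in as a supplier rather than read from a registry center.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of "wait for running shards to finish, then reshard". */
class ReshardWaitSketch {
    /**
     * Polls until no shard is running, then runs the resharding action.
     * Returns the number of polls that found shards still running.
     */
    static int reshardWhenIdle(BooleanSupplier hasRunningItems, Runnable reshard, long pollMillis) {
        int waits = 0;
        while (hasRunningItems.getAsBoolean()) {
            waits++;
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up rather than reshard too early.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for running shards", e);
            }
        }
        reshard.run();
        return waits;
    }
}
```

The point of the design is ordering: the resharding action cannot run while any shard still reports itself as running, which is only observable when execution monitoring is switched on.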

Question:
1. Suppose a job is scheduled every 10s and one execution takes 33s (it usually takes 5s). Under normal scheduling, 3 triggers would have fired during that time. After the execution finishes, will the job run 3 more times to compensate?
Answer: No. Once the 33s execution completes, and provided the following executions finish within 10s, only one compensating execution is triggered, not 3. ElasticJob records a missed execution simply by creating the misfire node; it does not record how many executions were missed, because there is no need to.
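The arithmetic behind this answer can be checked with a short sketch. It assumes the run starts at t=0, triggers fire at every multiple of the period, and the compensating run itself finishes within one period; MissedTriggerCountSketch and simulate are hypothetical names, and a set with a single "misfire" entry stands in for the misfire node.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical arithmetic for the question above: a flag cannot count missed triggers. */
class MissedTriggerCountSketch {
    /** Returns {missedTriggers, compensatingRuns} for one overrunning execution starting at t=0. */
    static int[] simulate(int periodSeconds, int runSeconds) {
        Set<String> nodes = new HashSet<>();
        int missed = 0;
        // Triggers at period, 2*period, ... that fire before the run ends.
        for (int t = periodSeconds; t < runSeconds; t += periodSeconds) {
            missed++;
            nodes.add("misfire"); // creating a node that already exists changes nothing
        }
        // After the run, a single compensating execution clears the flag.
        int compensations = 0;
        if (nodes.remove("misfire")) {
            compensations++;
        }
        return new int[] {missed, compensations};
    }
}
```

For a 10s period and a 33s run, the triggers at 10s, 20s, and 30s are all missed, yet simulate(10, 33) reports only one compensating run, because the three misses collapse into the same flag.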
