Full solution of timed tasks in Distributed System (III.)

Source: Internet
Author: User
Tags: failover

Overview

The first two articles covered the basic ways of implementing scheduled tasks, from what the Java language itself provides up to the common implementations that rely on third-party frameworks.

The next section will be long. It covers three main areas: integrating Elastic-Job, the problems encountered while using it, and an analysis of its principles from several angles.

Integrating Elastic-Job

1. First, add the Maven dependencies

<!-- Elastic-Job core module -->
<dependency>
    <groupId>com.dangdang</groupId>
    <artifactId>elastic-job-core</artifactId>
    <version>1.1.0</version>
</dependency>
<!-- Required when using the Spring custom namespace -->
<dependency>
    <groupId>com.dangdang</groupId>
    <artifactId>elastic-job-spring</artifactId>
    <version>1.1.0</version>
</dependency>

2. Implement your own job

@Component
public class MyElasticJob extends AbstractSimpleElasticJob {

    @Override
    public void process(JobExecutionMultipleShardingContext context) {
        // do something with the sharding items, e.g. context.getShardingItems()
    }
}

3. Configure the job

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:reg="http://www.dangdang.com/schema/ddframe/reg"
       xmlns:job="http://www.dangdang.com/schema/ddframe/job"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd
                           http://www.springframework.org/schema/context
                           http://www.springframework.org/schema/context/spring-context.xsd
                           http://www.dangdang.com/schema/ddframe/reg
                           http://www.dangdang.com/schema/ddframe/reg/reg.xsd
                           http://www.dangdang.com/schema/ddframe/job
                           http://www.dangdang.com/schema/ddframe/job/job.xsd">
    <context:component-scan base-package="com.dangdang.example.elasticjob"/>
    <context:property-placeholder location="classpath:conf/*.properties"/>
    <reg:zookeeper id="regCenter" server-lists="${serverLists}" namespace="${namespace}"
                   base-sleep-time-milliseconds="${baseSleepTimeMilliseconds}"
                   max-sleep-time-milliseconds="${maxSleepTimeMilliseconds}"
                   max-retries="${maxRetries}"
                   nested-port="${nestedPort}"
                   nested-data-dir="${nestedDataDir}"/>
    <job:simple id="simpleElasticJob"
                class="com.dangdang.example.elasticjob.spring.job.SimpleJobDemo"
                registry-center-ref="regCenter"
                sharding-total-count="${simpleJob.shardingTotalCount}"
                cron="${simpleJob.cron}"
                sharding-item-parameters="${simpleJob.shardingItemParameters}"
                monitor-execution="${simpleJob.monitorExecution}"
                monitor-port="${simpleJob.monitorPort}"
                failover="${simpleJob.failover}"
                description="${simpleJob.description}"
                disabled="${simpleJob.disabled}"
                overwrite="${simpleJob.overwrite}"/>
    <job:dataflow id="throughputDataflowJob"
                  class="com.dangdang.example.elasticjob.spring.job.ThroughputDataFlowJobDemo"
                  registry-center-ref="regCenter"
                  sharding-total-count="${throughputDataflowJob.shardingTotalCount}"
                  cron="${throughputDataflowJob.cron}"
                  sharding-item-parameters="${throughputDataflowJob.shardingItemParameters}"
                  monitor-execution="${throughputDataflowJob.monitorExecution}"
                  failover="${throughputDataflowJob.failover}"
                  process-count-interval-seconds="${throughputDataflowJob.processCountIntervalSeconds}"
                  concurrent-data-process-thread-count="${throughputDataflowJob.concurrentDataProcessThreadCount}"
                  description="${throughputDataflowJob.description}"
                  disabled="${throughputDataflowJob.disabled}"
                  overwrite="${throughputDataflowJob.overwrite}"
                  streaming-process="${throughputDataflowJob.streamingProcess}"/>
    <job:dataflow id="sequenceDataflowJob"
                  class="com.dangdang.example.elasticjob.spring.job.SequenceDataFlowJobDemo"
                  registry-center-ref="regCenter"
                  sharding-total-count="${sequenceDataflowJob.shardingTotalCount}"
                  cron="${sequenceDataflowJob.cron}"
                  sharding-item-parameters="${sequenceDataflowJob.shardingItemParameters}"
                  monitor-execution="${sequenceDataflowJob.monitorExecution}"
                  failover="${sequenceDataflowJob.failover}"
                  process-count-interval-seconds="${sequenceDataflowJob.processCountIntervalSeconds}"
                  max-time-diff-seconds="${sequenceDataflowJob.maxTimeDiffSeconds}"
                  description="${sequenceDataflowJob.description}"
                  disabled="${sequenceDataflowJob.disabled}"
                  overwrite="${sequenceDataflowJob.overwrite}"/>
</beans>

Property file definitions:

#job.properties
simpleJob.cron=0/5 * * * * ?
simpleJob.shardingTotalCount=10
simpleJob.shardingItemParameters=0=a,1=b,2=c,3=d,4=e,5=f,6=g,7=h,8=i,9=j
simpleJob.monitorExecution=false
simpleJob.failover=true
simpleJob.description=\u53ea\u8fd0\u884c\u4e00\u6b21\u7684\u4f5c\u4e1a\u793a\u4f8b
simpleJob.disabled=false
simpleJob.overwrite=true
simpleJob.monitorPort=9888
throughputDataflowJob.cron=0/5 * * * * ?
throughputDataflowJob.shardingTotalCount=10
throughputDataflowJob.shardingItemParameters=0=a,1=b,2=c,3=d,4=e,5=f,6=g,7=h,8=i,9=j
throughputDataflowJob.monitorExecution=true
throughputDataflowJob.failover=true
throughputDataflowJob.processCountIntervalSeconds=10
throughputDataflowJob.concurrentDataProcessThreadCount=3
throughputDataflowJob.description=\u4e0d\u505c\u6b62\u8fd0\u884c\u7684\u4f5c\u4e1a\u793a\u4f8b
throughputDataflowJob.disabled=false
throughputDataflowJob.overwrite=true
throughputDataflowJob.streamingProcess=true
sequenceDataflowJob.cron=0/5 * * * * ?
sequenceDataflowJob.shardingTotalCount=10
sequenceDataflowJob.shardingItemParameters=0=a,1=b,2=c,3=d,4=e,5=f,6=g,7=h,8=i,9=j
sequenceDataflowJob.maxTimeDiffSeconds=-1
sequenceDataflowJob.monitorExecution=true
sequenceDataflowJob.failover=true
sequenceDataflowJob.processCountIntervalSeconds=10
sequenceDataflowJob.description=\u6309\u987a\u5e8f\u4e0d\u505c\u6b62\u8fd0\u884c\u7684\u4f5c\u4e1a\u793a\u4f8b
sequenceDataflowJob.disabled=false
sequenceDataflowJob.overwrite=true
#reg.properties
serverLists=localhost:4181
namespace=elasticjob-example
baseSleepTimeMilliseconds=1000
maxSleepTimeMilliseconds=3000
maxRetries=3
nestedPort=4181
nestedDataDir=target/test_zk_data/
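The sharding-item-parameters value above maps each shard item index to an arbitrary parameter string (here 0=a,1=b,...). As a minimal sketch of the format, here is how such a string can be parsed; the parseShardingItemParameters helper is illustrative only and is not part of Elastic-Job's API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ShardingItemParametersDemo {

    // Parse a "0=a,1=b,2=c" style string into an item -> parameter map.
    static Map<Integer, String> parseShardingItemParameters(String value) {
        Map<Integer, String> result = new LinkedHashMap<>();
        for (String pair : value.split(",")) {
            String[] kv = pair.split("=", 2);
            result.put(Integer.parseInt(kv[0].trim()), kv[1].trim());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> params =
                parseShardingItemParameters("0=a,1=b,2=c,3=d,4=e,5=f,6=g,7=h,8=i,9=j");
        System.out.println(params.get(0));  // a
        System.out.println(params.size());  // 10
    }
}
```

Inside the job, each server instance receives only its own subset of these item indices and looks up the corresponding parameter.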
Problems encountered in integration

1. The cron expression always stays the same as the configuration of the first run

This happens because neither the examples on the Elastic-Job GitHub site nor most examples found online include the overwrite option in the job configuration. It defaults to false, which means that once a job's configuration has been written to the registry center, it stays unchanged, even after you modify the cron in your local config file.

The workaround is to add the overwrite option to your job configuration:

<job:simple id="yourTaskId" class="yourTaskClass" registry-center-ref="regCenter"
            cron="0 0/30 * * * ?" sharding-total-count="1" sharding-item-parameters="0=A"
            overwrite="true"/>
2. I found an example online that uses serverlists="yourhost:2181" — why does the compiler tell me the serverlists attribute is not supported?

Many examples on the web target versions prior to 1.1.0. Elastic-Job 1.1.0 made many changes, including renaming some attributes.

The solution is to use version 1.1.0 of Elastic-Job and follow the examples on the GitHub site (https://github.com/dangdangdotcom/elastic-job).

3. @Autowired or @Resource fields in my job are not injected

First check whether the field is static; if so, make it non-static, since Spring does not inject static fields. Next, check whether your job class carries the @Component annotation.

4. Mine is a Spring MVC web project that already has a property placeholder in another XML file, but the connection string shown when reg:zookeeper initializes is still the literal "${xxx}"

The original author has explained the principle (it comes down to the loading order of the configuration files). If Elastic-Job is configured in a separate XML file, you need to add a placeholder to that XML as well. But note that Spring loads only one placeholder by default, so add ignore-unresolvable="true" to the placeholder.
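For example, the placeholder declared in the XML file that holds the Elastic-Job configuration would look like this (the location value is illustrative):

```xml
<!-- In the XML file that holds the Elastic-Job configuration.
     ignore-unresolvable="true" lets this placeholder coexist with the
     one already declared elsewhere in the Spring MVC project. -->
<context:property-placeholder location="classpath:conf/*.properties"
                              ignore-unresolvable="true"/>
```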

5. What if I need to reset the next trigger time from inside the job?

At the end of your job, add the following code:

JobRegistry.getInstance().getJobScheduleController(jobName).rescheduleJob(cron);

If you watch the logs, you will find that the job fires immediately after the statement above is called, so it looks as if it executed twice in a row. This comes from how the trigger time is calculated: cron expressions have second granularity while the scheduler works in milliseconds, so the current moment is very likely still a valid trigger time for the new cron expression.

A concrete example:

Original cron: "0/10 * * * * ?"
Trigger time: 9:18:10
Job execution takes: 100 ms
Last call in the job: rescheduleJob
New cron: "0/5 * * * * ?"
At this moment the new cron's trigger condition is still satisfied, so rescheduleJob fires immediately

This is unavoidable, so make sure your job is idempotent.
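Since the double trigger cannot be avoided, one common way to make a job idempotent is to remember which trigger keys have already been processed and skip repeats. A minimal sketch, pure Java and independent of Elastic-Job (the class and method names are made up for illustration):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class IdempotentJobDemo {

    // Trigger keys (e.g. business date or batch id) already handled.
    private final Set<String> processed = ConcurrentHashMap.newKeySet();
    final AtomicInteger executions = new AtomicInteger();

    // Returns true only for the first call with a given key; repeats are skipped.
    boolean processOnce(String triggerKey) {
        if (!processed.add(triggerKey)) {
            return false;  // already handled: the duplicate trigger is a no-op
        }
        executions.incrementAndGet();  // the real work would go here
        return true;
    }

    public static void main(String[] args) {
        IdempotentJobDemo job = new IdempotentJobDemo();
        job.processOnce("batch-2016-01-01");
        job.processOnce("batch-2016-01-01");  // immediate re-trigger after reschedule
        System.out.println(job.executions.get());  // 1
    }
}
```

In a real deployment the processed-key set would have to live in shared storage (a database or ZooKeeper), since the duplicate trigger may land on a different server instance.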

Elastic-Job from several angles

First, here is a blog post by Zhang Liang, one of the main designers of Elastic-Job, which analyzes many of its mechanisms (http://my.oschina.net/u/719192/blog/506062).

Comparing implementation ideas

1. First, the map/reduce idea. It may seem to have nothing to do with scheduled tasks; it is mentioned here precisely because the two are completely different ideas.

The key point is that all compute nodes passively accept tasks: the head node decides which tasks to hand to each node, and the node executes whatever it is given.

2. Distributed scheduled tasks (Quartz/Elastic-Job)
Now look at the clustered scheme for scheduled tasks, which is a complete reversal:

The schedulers on all task-execution nodes are running; whether a node actually executes a task is decided by the data it reads from the coordination center. Quartz checks database records; Elastic-Job checks the sharding information in ZooKeeper.

3. Is there another way?
You may wonder: why can't scheduled tasks run on a dedicated cluster, with a management console for uploading task jars at any time and a head node doing the scheduling? That way all scheduled tasks could be centralized and managed in one place, and the deployment servers would be independent. That would indeed be nice, and services of this type do exist on the market; but since I have not analyzed any of them carefully enough to be sure they match this description, I will not name an example here.

Finally, once you understand the idea behind distributed task scheduling, it is easy to see how Elastic-Job is deployed:

It ships directly with your web service: each server instance is a compute node connected to the coordination center (database/ZooKeeper). When a scheduled task is triggered, the instance queries the coordination center to determine whether it should execute the job; if not, it returns immediately and skips the execution.
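The Quartz-style variant of this check can be sketched with a shared map standing in for the coordination center: every instance's local scheduler fires, but only the instance that wins the claim for a given trigger time runs the job. All names here are illustrative:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class CoordinationCenterDemo {

    // Stands in for the coordination center (a database row or ZooKeeper node).
    static final ConcurrentMap<String, String> claims = new ConcurrentHashMap<>();

    // Each instance calls this when its local scheduler fires.
    // Only the first claimant for a given trigger time executes; the rest skip.
    static boolean tryExecute(String jobName, String triggerTime, String instanceId) {
        String winner = claims.putIfAbsent(jobName + "@" + triggerTime, instanceId);
        return winner == null;  // null means this instance claimed it first
    }

    public static void main(String[] args) {
        System.out.println(tryExecute("pointsJob", "09:18:10", "instance-1"));  // true
        System.out.println(tryExecute("pointsJob", "09:18:10", "instance-2"));  // false
    }
}
```

Elastic-Job replaces this "one winner" claim with sharding information in ZooKeeper, so that several instances may each own a part of the work, as described below.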

Next comes a rough analysis of several parts of the Elastic-Job source code, to help you resolve problems quickly when using it.

Initialization process of a task

The entry point of initialization is:

new JobScheduler(regCenter, simpleJobConfig, new SimpleDistributeOnceElasticJobListener()).init();

Next, look at what the init method does. The two most important steps are:

1. registerStartUpInfo. This step registers the ZooKeeper listeners (what they listen for and what they do afterwards is covered below) and creates the related nodes on ZooKeeper.

One of its steps, the persistJobConfiguration method, relates to the overwrite problem mentioned earlier: if overwrite is false, the cron expression that triggers the schedule is taken directly from ZooKeeper rather than from the local XML configuration.
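The effect of overwrite can be illustrated with a tiny decision function; the names here are hypothetical, the real logic lives inside persistJobConfiguration:

```java
public class OverwriteDemo {

    // If the registry center already holds a cron and overwrite is false,
    // the stored value wins over the local XML value.
    static String effectiveCron(String storedCron, String localCron, boolean overwrite) {
        if (storedCron == null || overwrite) {
            return localCron;  // first registration, or overwrite enabled
        }
        return storedCron;     // keep what is already in ZooKeeper
    }

    public static void main(String[] args) {
        System.out.println(effectiveCron("0/10 * * * * ?", "0/5 * * * * ?", false));  // 0/10 * * * * ?
        System.out.println(effectiveCron("0/10 * * * * ?", "0/5 * * * * ?", true));   // 0/5 * * * * ?
    }
}
```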

setReshardingFlag creates a marker node. The jobs with the same name scheduled on all servers check for this marker (as described in the task-execution section below) and, if it exists, re-shard the task. (What sharding is and what shards are used for is covered below; it is one of the big advantages of Elastic-Job over Quartz clusters.)

2. scheduleJob
This creates the Quartz scheduler and starts the timed task.

Task execution

The sharding of Elastic-Job tasks and the encapsulation of the various task types all live here. First look at the relationship between Elastic-Job and the Quartz layer it is built on:

As you can see, the key to everything in Elastic-Job is the AbstractElasticJob.execute method. Look inside it:

Notice the call to shardingIfNecessary and the early return when getShardingItems() is empty?

This is the key to how sharding works and how tasks are executed only by the server instances that should execute them.

Sharding

After all of the above, you probably still have plenty of questions about what a shard is, how it is used, and when re-sharding is triggered. Let's look at each.

Sharding is Elastic-Job's innovation over Quartz cluster scheduling. In a Quartz cluster, only one server instance can run a given schedule. Under Elastic-Job, you can have several server instances execute the same task: 1, 2, 3, or more.

So doesn't running the same task on several server instances conflict, or at least waste resources? No, no, no...

Take a concrete example: suppose you have a task that periodically calculates user points, and your user table is split across 10 databases. You can run the task on one server instance, or, if you have 5 servers available, on 5 instances, with each instance handling the computation for 2 databases.

Quartz cannot handle this scenario. With Elastic-Job, you only need to set the sharding total count to 10 and give each shard item a parameter; each of the 5 servers then receives 2 shard items to execute.

Of course, if you have 5 servers available but only 2 databases, you can only split the task into 2 shards; then only 2 of the 5 servers get the right to run, while on the others getShardingItems().isEmpty() is true inside execute and they return directly.
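The arithmetic of this distribution can be sketched as follows. This helper is illustrative only: it spreads items round-robin, which only roughly mirrors Elastic-Job's default average-allocation strategy (the real strategy assigns contiguous blocks), but the resulting counts are the same:

```java
import java.util.ArrayList;
import java.util.List;

public class ShardAllocationDemo {

    // Evenly assign shard items 0..totalShards-1 to the server at serverIndex.
    static List<Integer> assignedItems(int totalShards, int serverCount, int serverIndex) {
        List<Integer> items = new ArrayList<>();
        for (int item = 0; item < totalShards; item++) {
            if (item % serverCount == serverIndex) {
                items.add(item);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        // 10 shards over 5 servers: every server gets 2 items.
        System.out.println(assignedItems(10, 5, 0));          // [0, 5]
        // 2 shards over 5 servers: servers 2..4 get nothing and skip execution.
        System.out.println(assignedItems(2, 5, 3).isEmpty()); // true
    }
}
```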

When is re-sharding triggered?

Several scenarios trigger re-sharding (for example, changes to the set of available servers, or operations from the console). Each of them writes a re-shard marker to ZooKeeper, and the shard assignment is recalculated the next time the task executes.

The role of the listeners

As mentioned earlier, Elastic-Job registers a series of ZooKeeper listeners during initialization to watch for node changes. What exactly do they watch?

In summary, two things: responding to the console's control of scheduled tasks, and responding to server crashes. When an executing node crashes, a re-shard is triggered and the execution of the scheduled task is picked up by the other servers.

