Quartz.net Distributed Application


Quartz.net cluster deployment in detail

tags (space delimited): Quartz.net Job

Recently I needed scheduled jobs at work. The company's own job system did not suit my needs, so I decided to build a small job site for practice. After looking around online I found Quartz, so I studied it a bit.

First edition

I am currently using ASP.NET Core, developing on Core 2.0.
For the first version I simply wrote a scheduler of my own.

    using System;
    using System.Collections.Specialized;
    using System.Threading;
    using System.Threading.Tasks;
    using Quartz;
    using Quartz.Impl;

    public static class SchedulerManage
    {
        private static IScheduler _scheduler = null;
        private static readonly object obj = new object();

        public static IScheduler Scheduler
        {
            get
            {
                var scheduler = _scheduler;
                if (scheduler == null)
                {
                    // Before the lock is taken, _scheduler may already have been set by another thread, or may still hold its original value.
                    lock (obj)
                    {
                        // Read the latest value from memory so that we are guaranteed to see the newest _scheduler.
                        scheduler = Volatile.Read(ref _scheduler);
                        if (scheduler == null)
                        {
                            scheduler = GetScheduler().Result;
                            Volatile.Write(ref _scheduler, scheduler);
                        }
                    }
                }
                return scheduler;
            }
        }

        public static async Task<BaseResponse> RunJob(IJobDetail job, ITrigger trigger)
        {
            var response = new BaseResponse();
            try
            {
                var isExist = await Scheduler.CheckExists(job.Key);
                var time = DateTimeOffset.Now;
                if (isExist)
                {
                    // Resume a job that already exists.
                    await Scheduler.ResumeJob(job.Key);
                }
                else
                {
                    time = await Scheduler.ScheduleJob(job, trigger);
                }
                response.IsSuccess = true;
                response.Msg = time.ToString("yyyy-MM-dd HH:mm:ss");
            }
            catch (Exception ex)
            {
                response.Msg = ex.Message;
            }
            return response;
        }

        public static async Task<BaseResponse> StopJob(JobKey jobKey)
        {
            var response = new BaseResponse();
            try
            {
                var isExist = await Scheduler.CheckExists(jobKey);
                if (isExist)
                {
                    await Scheduler.PauseJob(jobKey);
                }
                response.IsSuccess = true;
                response.Msg = "Pause succeeded!!";
            }
            catch (Exception ex)
            {
                response.Msg = ex.Message;
            }
            return response;
        }

        public static async Task<BaseResponse> DelJob(JobKey jobKey)
        {
            var response = new BaseResponse();
            try
            {
                var isExist = await Scheduler.CheckExists(jobKey);
                if (isExist)
                {
                    response.IsSuccess = await Scheduler.DeleteJob(jobKey);
                }
            }
            catch (Exception ex)
            {
                response.IsSuccess = false;
                response.Msg = ex.Message;
            }
            return response;
        }

        private static async Task<IScheduler> GetScheduler()
        {
            NameValueCollection props = new NameValueCollection
            {
                { "quartz.serializer.type", "binary" }
            };
            StdSchedulerFactory factory = new StdSchedulerFactory(props);
            var scheduler = await factory.GetScheduler();
            await scheduler.Start();
            return scheduler;
        }
    }
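
The double-checked locking in the Scheduler property can also be expressed with Lazy<T> from the .NET base class library. The sketch below is an alternative, not the code used in this article: LazyHolder and LazyHolderDemo are hypothetical names, and a plain object stands in for the scheduler so that the snippet has no Quartz dependency.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: creates a value once, on first use, even under concurrent access.
public sealed class LazyHolder<T>
{
    private readonly Lazy<T> _lazy;

    public LazyHolder(Func<T> factory)
    {
        // ExecutionAndPublication: only one thread runs the factory; the others wait for its result.
        _lazy = new Lazy<T>(factory, LazyThreadSafetyMode.ExecutionAndPublication);
    }

    public T Value => _lazy.Value;
}

public static class LazyHolderDemo
{
    // Returns how many times the factory ran when 'readers' parallel readers accessed Value.
    public static int CountFactoryCalls(int readers)
    {
        int calls = 0;
        var holder = new LazyHolder<object>(() =>
        {
            Interlocked.Increment(ref calls);
            return new object(); // stand-in for the real scheduler
        });

        Parallel.For(0, readers, _ => { var unused = holder.Value; });
        return calls;
    }
}
```

For an asynchronous factory such as GetScheduler(), one option is to hold a Lazy<Task<IScheduler>> and await its Value, which avoids blocking on .Result; whether that fits depends on how the property is consumed.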

This is a simple implementation that supports running, pausing, and adding jobs dynamically. After finishing it, everything seemed fine: as long as the information about the running jobs was stored in a table, all appeared to be OK.

When it was time to release, I suddenly realized that there is more than one physical machine, all sitting behind an Nginx reverse proxy. The following questions came up:

1. With more than one machine, the same job is likely to run on several machines at once.
2. Deployment requires stopping the machines. How are the running jobs restored automatically after a redeploy?
3. How can all jobs be spread evenly across the machines?

My thoughts at the time

1. The first problem: because requests go through the Nginx reverse proxy, a given add-job or run-job request lands on only one server, so this is basically not an issue. I only need to guard the RunJob interface: once a job is run, its state in the jobdetail table is set to running, so it will not run on more than one machine at the same time.
2. With the first problem solved, and since our company's Nginx reverse proxy uses a load-balancing strategy, the jobs end up spread fairly evenly.
3. Now for the key point!!!
How do I recover the running jobs at deployment time?

We already have a jobdetail table, so the running jobs can be read from it. Let's fetch them and run them as soon as the program starts.

Here is my implementation:

    // HostedService: a service that runs for as long as the host is running.
    public class HostedService : IHostedService
    {
        public HostedService(ISchedulerJob schedulerCenter)
        {
            _schedulerJob = schedulerCenter;
        }

        private ISchedulerJob _schedulerJob = null;

        public async Task StartAsync(CancellationToken cancellationToken)
        {
            // env is defined outside this excerpt.
            LogHelper.WriteLog("Open hosted+env:" + env);
            var redis = new RedisOperation();
            if (redis.SetNx("RedisJobLock", "1"))
            {
                await _schedulerJob.StartAllRuningJob();
            }
            redis.Expire("RedisJobLock", 300);
        }

        public async Task StopAsync(CancellationToken cancellationToken)
        {
            LogHelper.WriteLog("end hosted");
            var redis = new RedisOperation();
            if (redis.RedisExists("RedisJobLock"))
            {
                var count = redis.DelKey("RedisJobLock");
                LogHelper.WriteLog("Delete redisKey-RedisJobLock result:" + count);
            }
        }
    }

    // Attribute used for dependency injection registration.
    [ServiceDescriptor(typeof(ISchedulerJob), ServiceLifetime.Transient)]
    public class SchedulerCenter : ISchedulerJob
    {
        public SchedulerCenter(ISchedulerJobFacade schedulerJobFacade)
        {
            _schedulerJobFacade = schedulerJobFacade;
        }

        private ISchedulerJobFacade _schedulerJobFacade = null;

        public async Task<BaseResponse> DelJob(SchedulerJobModel jobModel)
        {
            var response = new BaseResponse();
            if (jobModel != null && jobModel.JobId != 0 && jobModel.JobName != null)
            {
                // Mark the job as deleted in the table first.
                response = await _schedulerJobFacade.Modify(new SchedulerJobModifyRequest() { JobId = jobModel.JobId, DataFlag = 0 });
                if (response.IsSuccess)
                {
                    response = await SchedulerManage.DelJob(GetJobKey(jobModel));
                    if (!response.IsSuccess)
                    {
                        // Deleting from the scheduler failed: restore the flag in the table.
                        response = await _schedulerJobFacade.Modify(new SchedulerJobModifyRequest() { JobId = jobModel.JobId, DataFlag = 1 });
                    }
                }
            }
            else
            {
                response.Msg = "Request parameter is wrong";
            }
            return response;
        }

        public async Task<BaseResponse> RunJob(SchedulerJobModel jobModel)
        {
            if (jobModel != null)
            {
                var jobKey = GetJobKey(jobModel);
                var triggleBuilder = TriggerBuilder.Create()
                    .WithIdentity(jobModel.JobName + "Trigger", jobModel.JobGroup)
                    .WithCronSchedule(jobModel.JobCron)
                    .StartAt(jobModel.JobStartTime);
                // Only set an end time when it is a real value rather than a placeholder date.
                // The last comparison was "==" in the original, which meant EndAt was only reached for the default date.
                if (jobModel.JobEndTime != null && jobModel.JobEndTime != new DateTime(1900, 1, 1) && jobModel.JobEndTime != new DateTime(1, 1, 1))
                {
                    triggleBuilder.EndAt(jobModel.JobEndTime);
                }
                triggleBuilder.ForJob(jobKey);
                var triggle = triggleBuilder.Build();

                // Keys and values are redacted in the original.
                var data = new JobDataMap();
                data.Add("***", "***");
                data.Add("***", "***");
                data.Add("***", "***");

                var job = JobBuilder.Create<SchedulerJob>()
                    .WithIdentity(jobKey)
                    .SetJobData(data)
                    .Build();
                var result = await SchedulerManage.RunJob(job, triggle);
                if (result.IsSuccess)
                {
                    // Record the running state (1) in the table; pause the job again if that fails.
                    var response = await _schedulerJobFacade.Modify(new SchedulerJobModifyRequest() { JobId = jobModel.JobId, JobState = 1 });
                    if (!response.IsSuccess)
                    {
                        await SchedulerManage.StopJob(jobKey);
                    }
                    return response;
                }
                else
                {
                    return result;
                }
            }
            else
            {
                return new BaseResponse() { Msg = "job name is empty!!" };
            }
        }

        public async Task<BaseResponse> StopJob(SchedulerJobModel jobModel)
        {
            var response = new BaseResponse();
            if (jobModel != null && jobModel.JobId != 0 && jobModel.JobName != null)
            {
                // Record the stopped state (2) in the table first.
                response = await _schedulerJobFacade.Modify(new SchedulerJobModifyRequest() { JobId = jobModel.JobId, JobState = 2 });
                if (response.IsSuccess)
                {
                    response = await SchedulerManage.StopJob(GetJobKey(jobModel));
                    if (!response.IsSuccess)
                    {
                        // Pausing failed: roll the table back to the running state (1).
                        // The original wrote 2 here again, which would not undo the change.
                        response = await _schedulerJobFacade.Modify(new SchedulerJobModifyRequest() { JobId = jobModel.JobId, JobState = 1 });
                    }
                }
            }
            else
            {
                response.Msg = "Request parameter is wrong";
            }
            return response;
        }

        private JobKey GetJobKey(SchedulerJobModel jobModel)
        {
            return new JobKey($"{jobModel.JobId}_{jobModel.JobName}", jobModel.JobGroup);
        }

        public async Task<BaseResponse> StartAllRuningJob()
        {
            try
            {
                // Load every job that is flagged as valid and running for the current environment.
                var jobListResponse = await _schedulerJobFacade.QueryList(new SchedulerJobListRequest() { DataFlag = 1, JobState = 1, Environment = Kernel.Environment.ToLower() });
                if (!jobListResponse.IsSuccess)
                {
                    return jobListResponse;
                }
                var jobList = jobListResponse.Models;
                foreach (var job in jobList)
                {
                    await RunJob(job);
                }
                return new BaseResponse() { IsSuccess = true, Msg = "Start all running jobs successfully when the program starts!!" };
            }
            catch (Exception ex)
            {
                LogHelper.WriteExceptionLog(ex);
                return new BaseResponse() { IsSuccess = false, Msg = "Startup of all running jobs fails when the program starts!!" };
            }
        }
    }

When the program starts, all jobs that should be running are started again. Because several instances start up, a Redis distributed lock is used: the instance that starts first takes the lock so that the others do not run the jobs, and the lock is released when the program shuts down!! This seemed fine. The main drawback is that load balancing is lost, because all jobs land on a single server, but it achieves the desired effect quickly, if only barely. Of course, high availability is sacrificed for now.

Another pit

As you know, in slightly larger companies operations and development are separate. The company deploys with Docker, and when the program is stopped, HostedService's StopAsync method is not called!!
At that moment I was truly beyond exasperated!!
I could not be bothered to argue this out with ops. The final solution: set an expiration time on the Redis distributed lock, roughly estimated from how long one deployment takes. As long as a deployment finishes within that window the lock does its job, and the interval between deployments must be greater than the lock's expiration time. What a hassle; it brings tears just to talk about it!!
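
To make the expiry idea concrete, here is a minimal in-memory sketch of a startup lock with a time-to-live. It is not the RedisOperation class used above: FakeLockStore and StartupLockDemo are hypothetical names, and the clock is passed in explicitly so the behaviour can be checked without waiting. Unlike the SetNx call followed by a separate Expire call in StartAsync, the sketch acquires the lock and sets the expiry in one step, so a process that dies between the two calls cannot leave a lock that never expires. On a real Redis server the SET command with the NX and EX options gives the same one-step behaviour.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for a "set if absent, with expiry" lock. For illustration only.
public sealed class FakeLockStore
{
    private readonly Dictionary<string, DateTime> _expiry = new Dictionary<string, DateTime>();
    private readonly object _sync = new object();

    // Takes the lock only if the key is absent or already expired, and sets its time-to-live atomically.
    public bool TryAcquire(string key, TimeSpan ttl, DateTime now)
    {
        lock (_sync)
        {
            if (_expiry.TryGetValue(key, out var expiresAt) && expiresAt > now)
            {
                return false; // another instance still holds the lock
            }
            _expiry[key] = now + ttl;
            return true;
        }
    }
}

public static class StartupLockDemo
{
    // Simulates instances starting at the given offsets (in seconds) and
    // returns, for each one, whether it would be the instance that starts the jobs.
    public static List<bool> Simulate(int ttlSeconds, params int[] startOffsets)
    {
        var store = new FakeLockStore();
        var t0 = new DateTime(2020, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        var results = new List<bool>();
        foreach (var offset in startOffsets)
        {
            var now = t0.AddSeconds(offset);
            results.Add(store.TryAcquire("RedisJobLock", TimeSpan.FromSeconds(ttlSeconds), now));
        }
        return results;
    }
}
```

With a 300 second lock, instances starting at 0 s and 10 s compete and only the first one wins, while an instance starting at 400 s (the next deployment) gets the lock again because the old one has expired. This is why the interval between deployments has to exceed the expiration time.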

Quartz.net distributed cluster: scheduler configuration

    public async Task<IScheduler> GetScheduler()
    {
        var properties = new NameValueCollection();
        properties["quartz.serializer.type"] = "binary";
        // Storage type
        properties["quartz.jobStore.type"] = "Quartz.Impl.AdoJobStore.JobStoreTX, Quartz";
        // Table name prefix
        properties["quartz.jobStore.tablePrefix"] = "QRTZ_";
        // Driver delegate type
        properties["quartz.jobStore.driverDelegateType"] = "Quartz.Impl.AdoJobStore.SqlServerDelegate, Quartz";
        // Data source name
        properties["quartz.jobStore.dataSource"] = "SchedulJob";
        // Connection string format: Data Source=myServerAddress;Initial Catalog=myDataBase;User Id=myUsername;Password=myPassword;
        properties["quartz.dataSource.SchedulJob.connectionString"] = "Data Source=.;Initial Catalog=SchedulJob;User ID=sa;Password=Ld309402556;";
        // SQL Server provider (the 20 and 21 versions are not available under Core)
        properties["quartz.dataSource.SchedulJob.provider"] = "SqlServer";
        // Whether to run as a cluster; must be true in cluster mode
        properties["quartz.jobStore.clustered"] = "true";
        properties["quartz.scheduler.instanceName"] = "TestScheduler";
        // In cluster mode set this to AUTO so that the instance ID is generated automatically.
        // Every node in the cluster must have a different ID, otherwise automatic recovery will not work.
        properties["quartz.scheduler.instanceId"] = "AUTO";
        properties["quartz.threadPool.type"] = "Quartz.Simpl.SimpleThreadPool, Quartz";
        properties["quartz.threadPool.threadCount"] = "25";
        properties["quartz.threadPool.threadPriority"] = "Normal";
        properties["quartz.jobStore.misfireThreshold"] = "60000";
        properties["quartz.jobStore.useProperties"] = "false";
        ISchedulerFactory factory = new StdSchedulerFactory(properties);
        return await factory.GetScheduler();
    }

Then comes the test code:

    public async Task TestJob()
    {
        var sched = await GetScheduler();
        Console.WriteLine("***** Deleting existing jobs/triggers *****");
        await sched.Clear();
        Console.WriteLine("------- Initialization Complete -----------");
        Console.WriteLine("------- Scheduling Jobs ------------------");

        string schedId = sched.SchedulerName; // sched.SchedulerInstanceId;
        int count = 1;

        IJobDetail job = JobBuilder.Create<SimpleRecoveryJob>()
            .WithIdentity("job_" + count, schedId) // put triggers in a group named after the cluster node instance just to distinguish (in logging) what was scheduled from where
            .RequestRecovery() // ask the scheduler to re-execute this job if it was in progress when the scheduler went down...
            .Build();

        ISimpleTrigger trigger = (ISimpleTrigger)TriggerBuilder.Create()
            .WithIdentity("triger_" + count, schedId)
            .StartAt(DateBuilder.FutureDate(1, IntervalUnit.Second))
            .WithSimpleSchedule(x => x.WithRepeatCount(1000).WithInterval(TimeSpan.FromSeconds(5)))
            .Build();
        Console.WriteLine("{0} will run at: {1} and repeat: {2} times, every {3} seconds", job.Key, trigger.GetNextFireTimeUtc(), trigger.RepeatCount, trigger.RepeatInterval.TotalSeconds);
        await sched.ScheduleJob(job, trigger);

        count++;

        job = JobBuilder.Create<SimpleRecoveryJob>()
            .WithIdentity("job_" + count, schedId) // same grouping as above
            .RequestRecovery() // same recovery request as above
            .Build();

        trigger = (ISimpleTrigger)TriggerBuilder.Create()
            .WithIdentity("triger_" + count, schedId)
            .StartAt(DateBuilder.FutureDate(2, IntervalUnit.Second))
            .WithSimpleSchedule(x => x.WithRepeatCount(1000).WithInterval(TimeSpan.FromSeconds(5)))
            .Build();
        Console.WriteLine(string.Format("{0} will run at: {1} and repeat: {2} times, every {3} seconds", job.Key, trigger.GetNextFireTimeUtc(), trigger.RepeatCount, trigger.RepeatInterval.TotalSeconds));
        await sched.ScheduleJob(job, trigger);

        // Jobs don't start firing until Start() has been called...
        Console.WriteLine("------- Starting Scheduler ---------------");
        await sched.Start();
        Console.WriteLine("------- Started Scheduler ----------------");

        Console.WriteLine("------- Waiting for one hour... ----------");
        // Task.Delay replaces the original Thread.Sleep so the async method does not block a thread.
        await Task.Delay(TimeSpan.FromHours(1));

        Console.WriteLine("------- Shutting Down --------------------");
        await sched.Shutdown();
        Console.WriteLine("------- Shutdown Complete ----------------");
    }

The test adds two jobs, each executing every 5 seconds.

As the original screenshot showed, job1 and job2 are not executed twice, and when I stop the instance that is running job2, job2 carries on running on the instance that runs job1.

In this way the distributed deployment problem is solved, and the database scripts for Quartz.net are easy to find online and run.

The original post showed screenshots of several of the database tables here. Basically, they store the job details, the trigger data, the scheduler state, and the locks.

Next time:

1. Jobs: stateful jobs and stateless jobs.
2. Misfire.
3. Introduction to triggers and cron.
4. Reworking the first part: implementing my own HostedService-based class that can schedule jobs in a distributed way. In fact, once this is implemented, the rest is no problem. It would drop Quartz's row-level table locks, because they are relatively slow under concurrency!!

Open questions

I still have not managed to test RequestRecovery. How is it used?!
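
One way to observe RequestRecovery is a job that checks the Recovering flag on its execution context. The sketch below is an assumption about what the SimpleRecoveryJob used in the test code might look like; it is not taken from the original post, and the 10 second delay is an arbitrary value chosen to leave time to kill the process mid-execution. It needs the Quartz package and a clustered job store configured as shown earlier.

```csharp
using System;
using System.Threading.Tasks;
using Quartz;

// Sketch of a job that reports whether it is being re-executed after a node failure.
public class SimpleRecoveryJob : IJob
{
    public async Task Execute(IJobExecutionContext context)
    {
        if (context.Recovering)
        {
            // True when the scheduler re-fires a job that was in progress on a node that went down.
            Console.WriteLine("{0} RECOVERING at {1:r}", context.JobDetail.Key, DateTime.Now);
        }
        else
        {
            Console.WriteLine("{0} starting at {1:r}", context.JobDetail.Key, DateTime.Now);
        }

        // Simulated long-running work (arbitrary duration).
        await Task.Delay(TimeSpan.FromSeconds(10));
    }
}
```

Recovery applies to a job that was executing when its node failed hard, so a graceful Shutdown() is unlikely to trigger it. A plausible test is to run two instances, kill one process while the job is in its delay, and watch the surviving instance for the RECOVERING line once it notices the missed check-in. How quickly that happens depends on quartz.jobStore.clusterCheckinInterval.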
