Spring Batch Learning (II): Architecture

Source: Internet
Author: User

Spring Batch Architecture

A batch job is an ordered collection of steps that are executed as part of a predefined process.

A step represents a self-contained unit of work and is the main building block of a job. A chunk-oriented step consists of three parts: an ItemReader, an ItemProcessor, and an ItemWriter. All three are applied to every record being processed: the ItemReader reads each record and passes it to the ItemProcessor for processing, and the result finally goes to the ItemWriter for persistence. The ItemProcessor is optional, so a step can contain only an ItemReader and an ItemWriter. If a step does not need to read or write any data, it can consist of just a Tasklet instead; the Tasklet takes the place of the reader, processor, and writer together and is not a kind of ItemProcessor.
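
The read, process, write cycle described above can be sketched in plain Java. This is only a minimal model under stated assumptions: `SketchReader`, `SketchProcessor`, `SketchWriter`, and `ChunkStepSketch` are hypothetical stand-ins written for this illustration, not the real Spring Batch interfaces, and a "commit" is modelled simply as one call to the writer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified stand-ins for the reader/processor/writer roles.
interface SketchReader<T> { T read(); }                 // returns null when the input is exhausted
interface SketchProcessor<I, O> { O process(I item); }  // applies business logic to one item
interface SketchWriter<T> { void write(List<T> items); } // persists one chunk of items

class ChunkStepSketch {
    // Reads and processes items one at a time and writes them chunk by chunk.
    // Returns the number of chunks written (one "commit" per chunk in this model).
    static <I, O> int runStep(SketchReader<I> reader, SketchProcessor<I, O> processor,
                              SketchWriter<O> writer, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int commits = 0;
        List<O> chunk = new ArrayList<>();
        I item;
        while ((item = reader.read()) != null) {
            chunk.add(processor.process(item));
            if (chunk.size() == chunkSize) {
                writer.write(chunk);
                commits++;
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {          // write the final, partially filled chunk
            writer.write(chunk);
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        Iterator<Integer> source = List.of(1, 2, 3, 4, 5).iterator();
        List<List<String>> written = new ArrayList<>();

        SketchReader<Integer> reader = () -> source.hasNext() ? source.next() : null;
        SketchProcessor<Integer, String> processor = n -> "record-" + n;
        SketchWriter<String> writer = written::add;

        int commits = runStep(reader, processor, writer, 2);
        System.out.println(written + " commits=" + commits);
    }
}
```

A step without a processor can be modelled here by passing a pass-through processor such as `item -> item`.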

Some of the related classes and interfaces that make up Spring Batch:

    1. org.springframework.batch.core.Job: Represents a job and also provides the ability to execute it;
    2. org.springframework.batch.core.Step: Represents a step and also provides the ability to execute it;
    3. org.springframework.batch.item.ItemReader<T>: Provides the ability to read data;
    4. org.springframework.batch.item.ItemProcessor<I, O>: Applies business logic to each item being processed;
    5. org.springframework.batch.item.ItemWriter<T>: Provides the ability to write data.

The advantage of building a job this way is that each step is decoupled into its own independent unit of processing: each step is responsible for obtaining its data, applying business logic to it, and writing the result to the appropriate location.

A tasklet step is a special step type that performs a function without an ItemReader or ItemWriter. It is typically used for a single, self-contained operation, such as performing some initialization, invoking a stored procedure, or sending an email notification that the job is complete.
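
As a rough illustration of a tasklet-style step, the sketch below runs a single function with no reader or writer. `SketchTasklet`, `SketchStatus`, and `TaskletStepSketch` are hypothetical names for this example only. The repeat-until-finished loop is loosely modelled on the way a Spring Batch `Tasklet` reports a `RepeatStatus`, and should be read as a simplification rather than the framework's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical status a tasklet reports after each invocation.
enum SketchStatus { CONTINUABLE, FINISHED }

// Hypothetical single-function unit of work, with no reader or writer.
interface SketchTasklet { SketchStatus execute(); }

class TaskletStepSketch {
    // Invokes the tasklet until it reports FINISHED; returns the number of invocations.
    static int runTaskletStep(SketchTasklet tasklet) {
        int calls = 0;
        SketchStatus status;
        do {
            status = tasklet.execute();
            calls++;
        } while (status == SketchStatus.CONTINUABLE);
        return calls;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        // Stand-in for "send an email notification that the job is complete".
        SketchTasklet notifyDone = () -> {
            log.add("job-complete notification sent");
            return SketchStatus.FINISHED;
        };
        int calls = runTaskletStep(notifyDone);
        System.out.println(calls + " call(s): " + log);
    }
}
```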

Running a Job

First, look at the following diagram, which shows a job's components and their relationships:

Notice that the JobRepository is associated with several of the other components. It represents a data store (in memory or an external database) used to persist information about job and step executions (represented as JobExecution and StepExecution).

A job is started through the JobLauncher. The JobLauncher checks the JobRepository to determine whether the job has been run before, validates the parameters passed to the job, and finally executes the job.

Job execution works much like step execution. The job executes each of the steps it contains in order and, as the data is processed, updates the JobExecution and StepExecution records in the JobRepository. A step reads each item it needs to process through the ItemReader, passes it to the ItemProcessor, writes the result through the ItemWriter, and updates the StepExecution data in the JobRepository. Information such as the commit count and the start and end times is stored in the JobRepository, and when a job or step completes, its execution information in the JobRepository is updated to the final state.
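
The launch sequence just described (check the repository, validate the parameters, run the job) can be sketched as follows. `LauncherSketch` and its in-memory map are hypothetical and only model the idea. Refusing to rerun a job that already completed with the same parameters is an assumption chosen for this sketch; it resembles how Spring Batch treats an already completed job instance, but the real rules are more detailed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class LauncherSketch {
    enum Status { COMPLETED, FAILED }

    // In-memory stand-in for a job repository: last status per job name + parameters.
    private final Map<String, Status> repository = new HashMap<>();

    // Validates the input, consults the repository, runs the job, and records the outcome.
    Status launch(String jobName, Map<String, String> params, Runnable job) {
        if (jobName == null || jobName.isEmpty() || params == null) {
            throw new IllegalArgumentException("job name and parameters are required");
        }
        // TreeMap gives a deterministic string form regardless of the caller's map type.
        String key = jobName + new TreeMap<>(params);
        if (repository.get(key) == Status.COMPLETED) {
            throw new IllegalStateException("already completed: " + key);
        }
        Status status;
        try {
            job.run();
            status = Status.COMPLETED;
        } catch (RuntimeException e) {
            status = Status.FAILED;   // a failed run may be launched again
        }
        repository.put(key, status);
        return status;
    }

    public static void main(String[] args) {
        LauncherSketch launcher = new LauncherSketch();
        Map<String, String> params = Map.of("inputFile", "customers.csv");
        System.out.println(launcher.launch("importJob", params, () -> {}));
        try {
            launcher.launch("importJob", params, () -> {});
        } catch (IllegalStateException e) {
            System.out.println("second launch rejected");
        }
    }
}
```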
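
To make the execution bookkeeping concrete, here is a small sketch of a record that is updated as chunks are committed and then moved to a final state. `StepExecutionSketch` and its fields are hypothetical, loosely inspired by the kind of information the text says is kept (commit count, start and end time, final status); it is not the real StepExecution class, and the actual read, process, and write work is omitted.

```java
import java.time.Instant;
import java.util.List;

class StepExecutionSketch {
    String status = "STARTING";
    int readCount;
    int writeCount;
    int commitCount;
    Instant startTime;
    Instant endTime;

    // Walks the items chunk by chunk and updates the execution record after each chunk.
    static StepExecutionSketch run(List<String> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        StepExecutionSketch exec = new StepExecutionSketch();
        exec.startTime = Instant.now();
        exec.status = "STARTED";
        for (int from = 0; from < items.size(); from += chunkSize) {
            int size = Math.min(chunkSize, items.size() - from);
            exec.readCount += size;
            exec.writeCount += size;   // in this sketch every item read is also written
            exec.commitCount++;        // one commit per chunk
        }
        exec.endTime = Instant.now();
        exec.status = "COMPLETED";     // final state once the step is done
        return exec;
    }

    public static void main(String[] args) {
        StepExecutionSketch exec = run(List.of("a", "b", "c", "d", "e"), 2);
        System.out.println(exec.status + " read=" + exec.readCount
                + " written=" + exec.writeCount + " commits=" + exec.commitCount);
    }
}
```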

Parallel Operations

In Spring Batch, parallelism can be implemented in the following four ways:

  • Multithreaded step: Spring Batch jobs are configured to process work in blocks called chunks, with a commit after each chunk. By default the chunks are executed sequentially: with 10,000 records and a chunk size of 50, the job commits after records 1 to 50 are processed, commits again after records 51 to 100, and so on. If the step runs with 3 threads, chunks are processed concurrently, which in theory raises throughput by up to a factor of 3;
  • Parallel steps: Suppose we have two steps, each responsible for loading the data of one input file into the database, with no dependency between them. These two steps can be executed in parallel;
  • Remote chunking: The first two approaches run within a single JVM, whereas remote chunking scales processing across multiple JVM instances. One instance is the master node, which reads the input data through an ItemReader and sends it over the network to the other JVM instances (the slave nodes) for processing. When processing is complete, the slave nodes send the results back to the master node, which writes them out through the ItemWriter;
  • Partitioning: This approach can run within a single JVM, in which case no network data transfer is needed, but it still uses a master-slave configuration: one step acts as the master step and controls a number of slave steps. The master step divides the input into partitions and hands each slave step a description of its partition; each slave step then reads, processes, and writes its own partition and reports its result back to the master step when it is done.
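
The chunk arithmetic from the multithreaded-step item above (10,000 records, 50 per chunk, 3 threads) can be sketched with a plain thread pool. This is a hedged illustration, not how Spring Batch schedules chunks internally: `MultiThreadedStepSketch` is a hypothetical class, and the work done per chunk is reduced to counting records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class MultiThreadedStepSketch {
    // Splits [0, total) into chunks and processes them on a fixed-size thread pool.
    // Returns {commits, recordsProcessed}.
    static int[] run(int total, int chunkSize, int threads) {
        if (chunkSize <= 0 || threads <= 0) {
            throw new IllegalArgumentException("chunkSize and threads must be positive");
        }
        AtomicInteger commits = new AtomicInteger();
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int from = 0; from < total; from += chunkSize) {
                final int size = Math.min(chunkSize, total - from);
                pending.add(pool.submit(() -> {
                    processed.addAndGet(size);   // stand-in for read/process/write of one chunk
                    commits.incrementAndGet();   // one commit per completed chunk
                }));
            }
            for (Future<?> f : pending) {
                f.get();                         // wait for every chunk and surface failures
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for chunks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("chunk failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return new int[] { commits.get(), processed.get() };
    }

    public static void main(String[] args) {
        int[] result = run(10_000, 50, 3);
        // prints: commits=200, records=10000
        System.out.println("commits=" + result[0] + ", records=" + result[1]);
    }
}
```

The number of commits stays at 10,000 / 50 = 200 regardless of the thread count; the threads only change how many chunks are in flight at once, so the threefold speedup is an upper bound that assumes the chunks do not contend for a shared resource.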

Sample Jobs

Spring Batch provides a number of simple sample jobs that you can refer to when developing your own batch applications:

  • adhocLoopJob: An infinite-loop job that demonstrates exposing elements through JMX;
  • beanWrapperMapperSampleJob: Demonstrates how to validate file-based input data and map file fields to domain objects;
  • compositeItemWriterSampleJob: A step can contain only one ItemReader and one ItemWriter; this job shows how to work around that restriction;
  • customerFilterJob: Demonstrates how to use an ItemProcessor to filter out invalid customers;
  • delegatingJob: Uses ItemReaderAdapter to delegate the reading of input data to a method of a POJO;
  • footballJob: A football statistics job that loads two input files (one with player data, one with game data), generates summary statistics, and writes them to the log file;
  • groovyJob: Demonstrates running a file compression and decompression script written in Groovy;
  • headerFooterSample: Demonstrates how to use callbacks to add a header and footer to the output;
  • hibernateJob: Spring Batch readers and writers do not use Hibernate by default; this job demonstrates how to integrate Hibernate;
  • infiniteLoopJob: An infinite-loop job used to demonstrate stopping and restarting a job;
  • ioSampleJob: Provides examples of many different I/O methods, such as reading delimited files and fixed-length files, XML, JDBC, and iBATIS integration;
  • jobSampleJob: Demonstrates how to execute one job from another job;
  • loopFlowSample: Demonstrates how to programmatically control the execution flow;
  • mailJob: Demonstrates how to use SimpleMailMessageItemWriter to send email.
