Spring Batch sample: CSV file operations (4)

Source: Internet
Author: User

This article walks through a complete Spring Batch example that reads and writes CSV files. The flow of the example is: read a CSV file containing four fields (ID, name, age, score), perform simple processing on the fields that were read, then write the results to another CSV file.

The project structure is as follows:

The JobLaunch class is used to start the job, and the CsvItemProcessor class processes the data obtained by the reader. The Student class is a POJO used to hold the mapped data. inputFile.csv is the input data file, and outputFile.csv is the output data file.

The configuration of the applicationContext.xml file is shown in the previous article.

The job configuration in the batch.xml file is as follows:

<job id="csvJob">
    <step id="csvStep">
        <tasklet transaction-manager="transactionManager">
            <chunk reader="csvItemReader" writer="csvItemWriter"
                processor="csvItemProcessor" commit-interval="1">
            </chunk>
        </tasklet>
    </step>
</job>

This file configures the job to run: csvJob. The job contains a single step that completes the reading and writing of the CSV files. csvItemReader reads the CSV file, csvItemProcessor processes the acquired data, and csvItemWriter writes the output CSV file.
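Conceptually, a chunk-oriented step keeps repeating read, process, write until the reader runs out of items, committing every commit-interval items. The following is a plain-Java sketch of that loop with illustrative names only; it is not Spring Batch's actual API:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Minimal sketch of the chunk-oriented loop: read items until the reader is
// exhausted, pass each through the processor, and "commit" (write) them in
// chunks of commit-interval size.
public class ChunkLoop {
    public static <I, O> List<List<O>> run(Iterator<I> reader,
                                           Function<I, O> processor,
                                           int commitInterval) {
        List<List<O>> committed = new ArrayList<>();
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == commitInterval) { // commit point
                committed.add(chunk);
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {
            committed.add(chunk); // flush the last partial chunk
        }
        return committed;
    }
}
```

With commit-interval="1", as in the configuration above, every record is committed individually.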

The configuration of csvItemReader in the batch.xml file is as follows:

<!-- Read the CSV file -->
<bean:bean id="csvItemReader"
    class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
    <bean:property name="resource" value="classpath:inputFile.csv" />
    <bean:property name="lineMapper">
        <bean:bean
            class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
            <bean:property name="lineTokenizer" ref="lineTokenizer" />
            <bean:property name="fieldSetMapper">
                <bean:bean
                    class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                    <bean:property name="prototypeBeanName" value="student"></bean:property>
                </bean:bean>
            </bean:property>
        </bean:bean>
    </bean:property>
</bean:bean>

<bean:bean id="student" class="com.wanggc.springbatch.sample.csv.Student"></bean:bean>

<!-- lineTokenizer -->
<bean:bean id="lineTokenizer"
    class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
    <bean:property name="delimiter" value="," />
    <bean:property name="names">
        <bean:list>
            <bean:value>id</bean:value>
            <bean:value>name</bean:value>
            <bean:value>age</bean:value>
            <bean:value>score</bean:value>
        </bean:list>
    </bean:property>
</bean:bean>

csvItemReader is configured as an instance of the FlatFileItemReader class provided by Spring Batch, which is used for reading flat files. It has two required properties: resource and lineMapper. The former specifies the location of the file to read; the latter maps each line of the file to a POJO object. lineMapper in turn has two important properties: lineTokenizer and fieldSetMapper. lineTokenizer splits a line of the file into a FieldSet, which fieldSetMapper then maps to a POJO object.

This approach is similar to reading from a database: lineMapper plays a role similar to a ResultSet, a line in the file is analogous to a record in a table, and the record is encapsulated into a FieldSet, which is handled much like a RowMapper would handle a row. How a record is split is the job of DelimitedLineTokenizer, an implementation of LineTokenizer. Its delimiter property determines how a line of the file is split into fields; the default value is ",". The names property assigns a name to each split field and passes it along to fieldSetMapper (this example uses BeanWrapperFieldSetMapper), which can then obtain each value by its name. fieldSetMapper's prototypeBeanName property is the bean name of the target POJO class. Once this property is set, the framework maps the FieldSet produced by lineTokenizer to a POJO object; the mapping is done by name (the names assigned during tokenization correspond to the field names of the POJO).

In short, FlatFileItemReader reads a record in the following four steps: 1. Read a line from the file specified by resource. 2. lineTokenizer splits the line into a FieldSet according to delimiter; each field's name comes from the names property. 3. The FieldSet is passed to fieldSetMapper and mapped to a POJO object by name. 4. FlatFileItemReader returns the mapped POJO object, and the framework passes it to the processor.
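Steps 2 and 3 can be simulated in plain Java. This is only a sketch of the idea, using a simple record as a stand-in for the Student POJO; Spring Batch actually does this via DelimitedLineTokenizer and BeanWrapperFieldSetMapper:

```java
import java.util.HashMap;
import java.util.Map;

public class ReadSteps {
    /** Stand-in for the POJO that the field-set mapper would populate. */
    public record Student(String id, String name, int age, float score) {}

    // Step 2: split the line on the delimiter and name each value,
    // producing the equivalent of a FieldSet (here, a name -> value map).
    public static Map<String, String> tokenize(String line, String delimiter,
                                               String[] names) {
        String[] values = line.split(delimiter, -1);
        Map<String, String> fieldSet = new HashMap<>();
        for (int i = 0; i < names.length; i++) {
            fieldSet.put(names[i], values[i]);
        }
        return fieldSet;
    }

    // Step 3: map the named fields onto the POJO by name.
    public static Student mapToStudent(Map<String, String> fieldSet) {
        return new Student(
            fieldSet.get("id"),
            fieldSet.get("name"),
            Integer.parseInt(fieldSet.get("age")),
            Float.parseFloat(fieldSet.get("score")));
    }
}
```

For a line like "001,Alice,20,85.5" and the names configured above, tokenize produces the four named fields and mapToStudent builds the object from them.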

CsvItemProcessor implements the ItemProcessor interface. This class receives the POJO object mapped by the reader, applies the corresponding business logic to it, and returns the result; the framework then passes the returned result to the writer. The implementation code is as follows:

package com.wanggc.springbatch.sample.csv;

import org.springframework.batch.item.ItemProcessor;
import org.springframework.stereotype.Component;

/**
 * ItemProcessor implementation.
 */
@Component("csvItemProcessor")
public class CsvItemProcessor implements ItemProcessor<Student, Student> {

    /**
     * Perform simple processing on the obtained data.
     *
     * @param student the data before processing.
     * @return the processed data.
     * @throws Exception if processing fails.
     */
    @Override
    public Student process(Student student) throws Exception {
        /* Merge ID and name */
        student.setName(student.getId() + "--" + student.getName());
        /* Add 2 to age */
        student.setAge(student.getAge() + 2);
        /* Add 10 to score */
        student.setScore(student.getScore() + 10);
        /* Pass the processed result to the writer */
        return student;
    }
}

The configuration of csvItemWriter in the batch.xml file is as follows:

<!-- Write the CSV file -->
<bean:bean id="csvItemWriter"
    class="org.springframework.batch.item.file.FlatFileItemWriter" scope="step">
    <bean:property name="resource" value="file:src/outputFile.csv" />
    <bean:property name="lineAggregator">
        <bean:bean
            class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
            <bean:property name="delimiter" value=","></bean:property>
            <bean:property name="fieldExtractor">
                <bean:bean
                    class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
                    <bean:property name="names" value="name,age,score"></bean:property>
                </bean:bean>
            </bean:property>
        </bean:bean>
    </bean:property>
</bean:bean>

csvItemWriter is configured as an instance of the FlatFileItemWriter class. Like FlatFileItemReader, this class has two important properties: resource and lineAggregator. The former is the path of the output file; the latter plays a role similar to lineTokenizer. lineAggregator (this example uses DelimitedLineAggregator) also has two important properties: delimiter and fieldExtractor. delimiter specifies the separator between the output fields, and fieldExtractor extracts the fields of the POJO object so they can be assembled into a string. FlatFileItemWriter writes a record in the following four steps: 1. The processor passes an object to lineAggregator. 2. lineAggregator's fieldExtractor property converts the object into an array of field values. 3. lineAggregator joins the array into a string separated by delimiter. 4. The string is written to the output file.
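The extract-then-join steps can likewise be simulated in plain Java. Again, this is only an illustrative sketch; Spring Batch actually does this via BeanWrapperFieldExtractor and DelimitedLineAggregator:

```java
import java.util.StringJoiner;

public class WriteSteps {
    /** Stand-in for the processed POJO handed to the writer. */
    public record Student(String name, int age, float score) {}

    // Step 2: extract the fields named in the writer's "names" property
    // (name, age, score) into an array, in order.
    public static Object[] extract(Student s) {
        return new Object[] { s.name(), s.age(), s.score() };
    }

    // Step 3: aggregate the field array into one delimited output line.
    public static String aggregate(Object[] fields, String delimiter) {
        StringJoiner line = new StringJoiner(delimiter);
        for (Object f : fields) {
            line.add(String.valueOf(f));
        }
        return line.toString();
    }
}
```

For a processed student with name "001--Alice", age 22, and score 95.5, the two steps produce the output line "001--Alice,22,95.5".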

This completes the reading, processing, and writing of a piece of data. Of course, you can also handle reading and writing with classes of your own; you only need to extend FlatFileItemReader and FlatFileItemWriter.

The Student class used in the example is as follows:

package com.wanggc.springbatch.sample.csv;

/** POJO class: Student */
public class Student {
    /** ID */
    private String id = "";
    /** Name */
    private String name = "";
    /** Age */
    private int age = 0;
    /** Score */
    private float score = 0;
    /* Getters and setters omitted */
}

The input data used in the instance is as follows:

The instance output result is as follows:

Note the following two points about the configuration in this article:

1. Note that the writer's resource must be written in the "file:******" form; the "classpath:" form cannot be used.

2. If the commit-interval attribute in the job configuration is set greater than 1, each commit writes only the last record read, and the records read earlier in the chunk are overwritten. The exact cause is unknown; overriding the reader's fieldSetMapper property works around the problem. (Note: adding scope="prototype" to the student bean solves this problem. 2011/12/16)
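Based on that note, the fix is to declare the student bean with prototype scope, so that the field-set mapper populates a fresh Student instance for each line instead of reusing a single shared instance. A sketch of the adjusted bean declaration (the bean id and class are from the reader configuration above):

```xml
<!-- scope="prototype" gives the mapper a new Student per record,
     instead of overwriting one singleton instance on every read. -->
<bean:bean id="student" scope="prototype"
    class="com.wanggc.springbatch.sample.csv.Student"></bean:bean>
```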

Next time, I will discuss reading and writing XML files.
