Introduction
Batch processing is an important part of business systems; it is used in billing, report generation, and end-of-day settlement. As business systems are used around the world, day and night, batch windows are becoming narrower, making efficient batch systems a real requirement. WebSphere Extended Deployment Compute Grid (hereinafter referred to as Compute Grid) is a complete, out-of-the-box batch platform that delivers an efficient, reliable, scalable, highly available, and secure batch execution environment.
This article is based on WebSphere Compute Grid V8. We build a simple transactional batch application using the batch job development feature of Rational Application Developer V8, and then modify it to use the Parallel Job Manager facility. The article details the step-by-step process of developing a batch application from scratch and shows how to use the Parallel Job Manager (hereinafter referred to as PJM) facility provided by Compute Grid to achieve parallelism in a batch job.
About this tutorial
The sample batch application is named EmployeeBatchV8. It reads employee data from the Employee table, performs some processing, and then inserts the updated information into the Employeeoutput table. The input consists of about 10,000 employee records for people living in different U.S. states, with state abbreviations ranging from AL to WY. Using the Parallel Job Manager facility, the primary job is split into secondary jobs (AL-MO and MT-WY) that are processed independently. We will implement the Parameterizer system programming interface (SPI) to provide an independent set of inputs to each secondary job so that the jobs can run in parallel on different grid endpoints (GEE) in a clustered environment. See Figure 1.
The server JVM that hosts a batch application is called a grid endpoint.
Figure 1. Application overview
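The split of the state range into independent inputs can be illustrated with a minimal plain-Java sketch. It does not use the Compute Grid Parameterizer SPI; the class name StateRangePartitioner, the property keys (state.first, state.last, state.list), and the assumption that the states are ordered by state name (which yields the AL-MO and MT-WY ranges) are hypothetical and serve only to show the idea of giving each secondary job its own non-overlapping slice of the input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Illustrative only: plain Java, not the Compute Grid Parameterizer SPI. */
final class StateRangePartitioner {

    // U.S. state abbreviations ordered by state name (an assumed ordering),
    // so the list starts at AL and ends at WY.
    static final List<String> STATES = Arrays.asList(
        "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA",
        "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD",
        "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ",
        "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC",
        "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY");

    /** Splits the ordered keys into jobCount contiguous, non-overlapping slices. */
    static List<List<String>> partition(List<String> keys, int jobCount) {
        if (jobCount < 1 || jobCount > keys.size()) {
            throw new IllegalArgumentException(
                "jobCount must be between 1 and " + keys.size());
        }
        List<List<String>> slices = new ArrayList<>();
        for (int i = 0; i < jobCount; i++) {
            int start = i * keys.size() / jobCount;     // inclusive
            int end = (i + 1) * keys.size() / jobCount; // exclusive
            slices.add(new ArrayList<>(keys.subList(start, end)));
        }
        return slices;
    }

    /** Builds the input properties for one secondary job (hypothetical key names). */
    static Properties toJobProperties(List<String> slice) {
        Properties props = new Properties();
        props.setProperty("state.first", slice.get(0));
        props.setProperty("state.last", slice.get(slice.size() - 1));
        props.setProperty("state.list", String.join(",", slice));
        return props;
    }

    public static void main(String[] args) {
        // Two secondary jobs, as in the tutorial scenario.
        for (List<String> slice : partition(STATES, 2)) {
            Properties p = toJobProperties(slice);
            System.out.println(p.getProperty("state.first") + "-"
                + p.getProperty("state.last") + " (" + slice.size() + " states)");
        }
    }
}
```

Running main prints AL-MO (25 states) and MT-WY (25 states). Because the slices are contiguous and do not overlap, each secondary job can read and write its own records without coordinating with the other.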
Objectives
In this tutorial, you will learn how to:
Develop batch applications using Rational Application Developer 8 and the Compute Grid APIs
Use the Parallel Job Manager facility of Compute Grid
Deploy applications on a WebSphere Application Server Network Deployment cluster and monitor jobs
Create a batch application from POJO classes using the WSBatchPackager utility
Prerequisites
You should be familiar with developing Java applications in an Eclipse-based IDE and with deploying applications on WebSphere Application Server Network Deployment (hereinafter referred to as Application Server).
System Requirements
To run the examples in this tutorial, you need WebSphere Application Server V7.0.0.17 or later (preferably ND) and WebSphere Compute Grid 8.0.0.1 installed in any supported environment. See Figure 2. The environment used in this tutorial was set up as follows:
Windows XP machine
Installed WebSphere Application Server 7.0.0.19 and Compute Grid 8.0.0.1
Created a Network Deployment manager profile.
Profile name: Dmgr01
Node name: ${shorthostname}CellManager01
Server: dmgr
Created a managed node profile. (Profile name: AppSrv01)
Federated the managed node into the Network Deployment (ND) cell
Created a cluster (cluster name: Cgcluster) with two servers (Server1, server2)
Created another server named Schedulerclone in the same cell to act as the scheduler
Installed DB2 UDB V9.7 and created an Employee database. Run the DDL files provided in the Download section of this tutorial to create the Employee and Employeeoutput tables
Migrated the Compute Grid data source from the default Derby database to DB2
In the Application Server console, configured a data source that points to the Employee database. (JNDI name: JDBC/EMPLOYEEDBXA)
Installed Rational Application Developer 8.0.2 or later and enabled the Compute Grid Tools for Modern Batch feature
Figure 2. Sample infrastructure diagram