Datastage obtains the number of records inserted into the target table by analyzing logs.
This is only a bad method, and there may be better and easier methods. This method requires that the existing log information be deleted before each job is run. Otherwise, the correct number of records cannot be counted. Of course, after the job is run, you can back up the logs of this job to the server disk in shell.
1. Log cleanup settings
Log on to datastage administrator, select the corresponding project, project properties-> record, select "automatically clear job logs", and set to automatically clear previous and previous logs.
Figure 1 administrator log cleanup settings
2. log processing 2.1 log backup (dsjob-logsum)
Use dsjob in shell to call a job. After the job runs, back up the job logs to the disk.
$ Dshome/bin/dsjob-logsum $ projectname $ jobname> $ syslogdir/sharejobname.txt
2.2 record count analysis (grep, awk)
Note that when writing a record to the target table, the keyword "number of rows inserted:" or "number of rows rejected:" is displayed, followed by the number of records. Considering that there may be multiple nodes, you can add up the data of each node.
# Inserted rows
Insertedrows = 'cat $ syslogdir/logs jobname.txt | grep "number of rows inserted:" | awk-F: '{print $3}' | SED's /, /''/'| awk' {sum = sum + $1;} end {print sum }''
# Rejected rows
Rejectedrows = 'cat $ syslogdir/logs jobname.txt | grep "number of rows rejected:" | awk-F: '{print $3}' | SED's /, /''/'| awk' {sum1 = sum1 + $1;} end {print sum1 }''
Considering that the job may be terminated without entering the number, it should be processed later. For example, if it is null, the value is 0.
If [! -N "$ insertedrows"]; then
Insertedrows = 0
Fi
2.3 other monitoring information
Monitoring may also require job start and end times, end states, and so on, which can be added together and then written into a log file.
The Start Time and end time can be recorded before and after the job is run, and the running package of the job can be obtained using dsjob-run-status.
Jobsta = $ ($ dshome/bin/dsjob-run-mode normal $ jobparameters-Warn 0-jobstatus $ projectname $ jobname 2> & 1 | awk-F = '/^ Status Code /{print $2 }')
2.4 job log records
After the job is executed, write some monitoring information of the job to the log.
Echo $ projectname $ jobname $ jobsta 'date + % Y-% m-% d "" % H: % m: % s' $ starttime $ insertedrows $ rejectedrows> $ logdir/job_run _ 'date + % Y % m % d '. log
3 others (Monitoring-related)
After all the jobs are executed, you can create a job and record it in $ logdir/job_run _ 'date + % Y % m % d '. the data in log is extracted to a table for viewing.
3.1 Table Design
// Log table
Create Table dslog
(
Id integer not null generated by default
As identity (start with 1, increment by 1) primary key,
Prjname varchar (20 ),
Jobname varchar (50 ),
State varchar (20 ),
Rdate date,
Starttime time,
Endtime time,
Insertedrows integer,
Rejectedrows integer
)
// Log status table
Create Table dslogstate
(
State varchar (20 ),
Mark varchar (50 ),
Des varchar (500)
)
3.2 job status Code
Http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.ds.cliapi.ref.doc/topics/r_dsvjbref_Error_Codes.html
Https://www-304.ibm.com/support/docview.wss? Uid = swg21469644
3.3 runjob. Sh source code
#! /Bin/bash
########################################
#
# Runjob. Sh 2012-08-19
# Run a job with Parameters
#
#######################################
# If the number of input parameters is less than 2, then output the help document and exit
If [$ #-lt 2]; then
Cat
Runjob -- run a job uasge: runjob projectname jobname jobparameters
Example: runjob dsstage1 dd_test '-Param enddt = 100'
Help
Exit 0
Fi
Projectname = "$1"
Jobname = "$2"
Jobparameters = "$3"
# Echo $ projectname
# Echo $ jobname
Echo $ jobparameters
# Exit 0
Logdir =/DS/dslogs # directory to store logs
Workdate = 'date + % Y % m % d'
Syslogdir =/DS/dslogs/syslogsbk/'date + % Y % m % d' # directory to back everyday datastage log. datastage administrator is setted to delete the logs before run a job.
# Solve the problem of $ dshome is null
Source/mistel/IBM/informationserver/Server/dsengine/dsenv
# Logdir processing. If log folder not exists, create folder.
If [-d $ logdir]; then
Echo "$ logdir is exist, continue ..."
Else
Echo "$ logdir is not exist, creating $ logdir ..."
Mkdir-p $ logdir
Fi
# Create datastage logs backup direcotry
If [! -D $ syslogdir]; then
Mkdir-p $ syslogdir
Fi
# Job state processing. If Job state is not finished OK, then reset the job
Jobsta =$ ($ dshome/bin/dsjob-jobinfo $ projectname $ jobname 2> & 1 | awk-F: '/^ job status/{print $2 }')
Echo 'last status: '$ jobsta
If ["$ jobsta" = "Run failed (3)"-o "$ jobsta" = "stopped (97)"]; then
Echo "Reset before run job $ jobname"
$ Dshome/bin/dsjob-run-mode reset $ projectname $ jobname >>$ {logdir}/job_init _ 'date + % Y % m % D'. Log
Sleep 5
Fi
# Job start run time
Starttime = 'date + % H: % m: % s'
# Run a job
Jobsta = $ ($ dshome/bin/dsjob-run-mode normal $ jobparameters-Warn 0-jobstatus $ projectname $ jobname 2> & 1 | awk-F = '/^ Status Code /{print $2 }')
# Backup datastage logs
$ Dshome/bin/dsjob-logsum $ projectname $ jobname> $ syslogdir/sharejobname.txt
# Calculate the inserted rows and rejected rows from the back up Log File
# Inserted rows
Insertedrows = 'cat $ syslogdir/logs jobname.txt | grep "number of rows inserted:" | awk-F: '{print $3}' | SED's /, /''/'| awk' {sum = sum + $1;} end {print sum }''
# Rejected rows
Rejectedrows = 'cat $ syslogdir/logs jobname.txt | grep "number of rows rejected:" | awk-F: '{print $3}' | SED's /, /''/'| awk' {sum1 = sum1 + $1;} end {print sum1 }''
If [! -N "$ insertedrows"]; then
Insertedrows = 0
Fi
If [! -N "$ rejectedrows"]; then
Rejectedrows = 0
Fi
Echo 'this Run Status Code [1: finished; 2: finished (see log); 3: aborted; 97: stopped]: '$ jobsta
# Log
Echo $ projectname $ jobname $ jobsta 'date + % Y-% m-% d "" % H: % m: % s' $ starttime $ insertedrows $ rejectedrows> $ logdir/job_run _ 'date + % Y % m % d '. log