Datastage obtains the number of records inserted into the target table by analyzing logs.

Source: Internet
Author: User

Datastage obtains the number of records inserted into the target table by analyzing logs.

This is only a bad method, and there may be better and easier methods. This method requires that the existing log information be deleted before each job is run. Otherwise, the correct number of records cannot be counted. Of course, after the job is run, you can back up the logs of this job to the server disk in shell.

1. Log cleanup settings

Log on to datastage administrator, select the corresponding project, project properties-> record, select "automatically clear job logs", and set to automatically clear previous and previous logs.

Figure 1 administrator log cleanup settings

2. log processing 2.1 log backup (dsjob-logsum)

Use dsjob in shell to call a job. After the job runs, back up the job logs to the disk.

$ Dshome/bin/dsjob-logsum $ projectname $ jobname> $ syslogdir/sharejobname.txt

2.2 record count analysis (grep, awk)

Note that when writing a record to the target table, the keyword "number of rows inserted:" or "number of rows rejected:" is displayed, followed by the number of records. Considering that there may be multiple nodes, you can add up the data of each node.

# Inserted rows

Insertedrows = 'cat $ syslogdir/logs jobname.txt | grep "number of rows inserted:" | awk-F: '{print $3}' | SED's /, /''/'| awk' {sum = sum + $1;} end {print sum }''

# Rejected rows

Rejectedrows = 'cat $ syslogdir/logs jobname.txt | grep "number of rows rejected:" | awk-F: '{print $3}' | SED's /, /''/'| awk' {sum1 = sum1 + $1;} end {print sum1 }''

Considering that the job may be terminated without entering the number, it should be processed later. For example, if it is null, the value is 0.

If [! -N "$ insertedrows"]; then

Insertedrows = 0

Fi

2.3 other monitoring information

Monitoring may also require job start and end times, end states, and so on, which can be added together and then written into a log file.

The Start Time and end time can be recorded before and after the job is run, and the running package of the job can be obtained using dsjob-run-status.

Jobsta = $ ($ dshome/bin/dsjob-run-mode normal $ jobparameters-Warn 0-jobstatus $ projectname $ jobname 2> & 1 | awk-F = '/^ Status Code /{print $2 }')

2.4 job log records

After the job is executed, write some monitoring information of the job to the log.

Echo $ projectname $ jobname $ jobsta 'date + % Y-% m-% d "" % H: % m: % s' $ starttime $ insertedrows $ rejectedrows> $ logdir/job_run _ 'date + % Y % m % d '. log

3 others (Monitoring-related)

After all the jobs are executed, you can create a job and record it in $ logdir/job_run _ 'date + % Y % m % d '. the data in log is extracted to a table for viewing.

3.1 Table Design

// Log table

Create Table dslog

(

Id integer not null generated by default

As identity (start with 1, increment by 1) primary key,

Prjname varchar (20 ),

Jobname varchar (50 ),

State varchar (20 ),

Rdate date,

Starttime time,

Endtime time,

Insertedrows integer,

Rejectedrows integer

)

// Log status table

Create Table dslogstate

(

State varchar (20 ),

Mark varchar (50 ),

Des varchar (500)

)

3.2 job status Code

Http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.ds.cliapi.ref.doc/topics/r_dsvjbref_Error_Codes.html

Https://www-304.ibm.com/support/docview.wss? Uid = swg21469644

3.3 runjob. Sh source code

#! /Bin/bash

########################################

#

# Runjob. Sh 2012-08-19

# Run a job with Parameters

#

#######################################

 

# If the number of input parameters is less than 2, then output the help document and exit

If [$ #-lt 2]; then

Cat

Runjob -- run a job uasge: runjob projectname jobname jobparameters

Example: runjob dsstage1 dd_test '-Param enddt = 100'

 

Help

 

Exit 0

Fi

 

Projectname = "$1"

Jobname = "$2"

Jobparameters = "$3"

 

# Echo $ projectname

# Echo $ jobname

Echo $ jobparameters

 

# Exit 0

 

Logdir =/DS/dslogs # directory to store logs

Workdate = 'date + % Y % m % d'

 

Syslogdir =/DS/dslogs/syslogsbk/'date + % Y % m % d' # directory to back everyday datastage log. datastage administrator is setted to delete the logs before run a job.

 

# Solve the problem of $ dshome is null

Source/mistel/IBM/informationserver/Server/dsengine/dsenv

 

# Logdir processing. If log folder not exists, create folder.

If [-d $ logdir]; then

Echo "$ logdir is exist, continue ..."

Else

Echo "$ logdir is not exist, creating $ logdir ..."

Mkdir-p $ logdir

Fi

 

# Create datastage logs backup direcotry

If [! -D $ syslogdir]; then

Mkdir-p $ syslogdir

Fi

 

 

 

# Job state processing. If Job state is not finished OK, then reset the job

Jobsta =$ ($ dshome/bin/dsjob-jobinfo $ projectname $ jobname 2> & 1 | awk-F: '/^ job status/{print $2 }')

Echo 'last status: '$ jobsta

If ["$ jobsta" = "Run failed (3)"-o "$ jobsta" = "stopped (97)"]; then

Echo "Reset before run job $ jobname"

$ Dshome/bin/dsjob-run-mode reset $ projectname $ jobname >>$ {logdir}/job_init _ 'date + % Y % m % D'. Log

Sleep 5

Fi

 

# Job start run time

Starttime = 'date + % H: % m: % s'

 

# Run a job

Jobsta = $ ($ dshome/bin/dsjob-run-mode normal $ jobparameters-Warn 0-jobstatus $ projectname $ jobname 2> & 1 | awk-F = '/^ Status Code /{print $2 }')

 

 

# Backup datastage logs

$ Dshome/bin/dsjob-logsum $ projectname $ jobname> $ syslogdir/sharejobname.txt

 

# Calculate the inserted rows and rejected rows from the back up Log File

 

# Inserted rows

Insertedrows = 'cat $ syslogdir/logs jobname.txt | grep "number of rows inserted:" | awk-F: '{print $3}' | SED's /, /''/'| awk' {sum = sum + $1;} end {print sum }''

 

# Rejected rows

Rejectedrows = 'cat $ syslogdir/logs jobname.txt | grep "number of rows rejected:" | awk-F: '{print $3}' | SED's /, /''/'| awk' {sum1 = sum1 + $1;} end {print sum1 }''

 

If [! -N "$ insertedrows"]; then

Insertedrows = 0

Fi

 

If [! -N "$ rejectedrows"]; then

Rejectedrows = 0

Fi

 

Echo 'this Run Status Code [1: finished; 2: finished (see log); 3: aborted; 97: stopped]: '$ jobsta

 

# Log

Echo $ projectname $ jobname $ jobsta 'date + % Y-% m-% d "" % H: % m: % s' $ starttime $ insertedrows $ rejectedrows> $ logdir/job_run _ 'date + % Y % m % d '. log

 

 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.