Set a Watch for Jobs That Hang Others
December 5, 2019
Jobstreams deliver on the HP 3000's other promise. When the server was introduced in the early 1970s, it promised interactive computing, well beyond the powers of batch processing. Excellent, said the market, but we want the batch power, too. Running jobs delivered on the promise that a 3000 could replace lots of mainframes.
Decades later, job management is still crucial to a 3000's success. Some jobs hang for one reason or another, and the processing queued behind them stalls until someone discovers the problem job and aborts it. It's worse when it happens over a weekend: you can come in Monday and find the rest of the work still waiting in the queue for that hung job to finish.
Is there a utility that monitors job run time, so that it can auto-abort such jobs after X number of hours? Nobix sells JobRescue, a commercial product for "automatically detecting errors and exception messages; JobRescue eliminates the need for manual review of $STDLISTs, making batch processing operations more productive."
Then there's Design 3000 Plus. The vendor still has a working webpage that touts JMS/3000, a job management system that was at one time deployed at hundreds of sites. Its powers include "automatic job restart and recovery. Whenever a job fails, a recovery job can be initiated immediately."
Home-grown solutions are out there as well, which matters considering how few 3000 sites have a budget for such superior software. Mark Ranft of Pro3K shared his job to check on jobs. The system does a self-exam and reports a problem.
Ranft said, "Here is an example where I am concerned that the weekly SLTBACK job doesn't complete. I stream a second job for 6 hours out to check and complain if SLTBACK is still running."
!job sltback,manager.sys,job;outclass=lp,1,1
!
!continue
!stream sltchk.job.sys;in=,6
!
!if jobcnt('sltback,manager.sys') > 1 then
! continue
! mail.exe "-t [email protected] &
! -f ![email protected] &
! -h !my_mailhost &
! -s '!osnode sltback already running!!' &
! sltback job already running !hpdatef at !hptimef"
! eoj
!endif
!
!loadtap7
!pause 60
!
!run autorep.exe;parm=7
!file sysgtape;dev=7
!showdev 7
!
!sysgen
tape verbose store=^sltbk1
exit
!
!tellop ---- job sltback is done
!tell manager.sys ---- job sltback is done
!stream sltback.job;day=SUNDAY;at=02:30
!eoj
The indirect file SLTBK1 contains:
@.@.@ <-- or whatever filesets you desire
;show;onvs=mpexl_system_volume_set,client_vs;
progress;maxtapebuf;compress=high;online=start;
partialdb;directory;statistics
The check job, SLTCHK.JOB.SYS, which SLTBACK streams to run six hours later:
!JOB SLTCHK,manager.sys;OUTCLASS=LP,1
!# *-----------------------------------------------*
!# * THIS JOB WILL VERIFY THAT SLTBACK HAS *
!# * COMPLETED SUCCESSFULLY. THIS JOB SHOULD BE *
!# * STREAMED TO RUN 6 HOURS AFTER THE SLTBACK *
!# * JOB HAS STARTED. *
!# *-----------------------------------------------*
!
!# *-----------------------------------------------*
!# * Validate SLTBACK.JOB.SYS has completed *
!# *-----------------------------------------------*
!
!SETVAR CIERROR 0
!RUN MAIN.PUB.VESOFT;PARM=1;INFO= &
! "SHOWOUT @[email protected](SPOOL.JSNAME MATCHES 'SLTBACK' AND &
! SPOOL.ISOPENED)"
!
!IF CIERROR = 0
!
! mail.exe "-t [email protected] &
! -f ![email protected] &
! -h !my_mailhost &
! -s ' **** !osnode SLTBACK job problem ****' &
! SLTBACK is still running!"
!
!abortio 7
!pause 2
!abortio 7
!PAUSE 2
!ABORTJOB SLTBACK,manager.sys
!
!endif
!
!eoj
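The two checks in Ranft's jobs (is another copy already running, and is the job still running after a time limit) can be illustrated outside MPE as well. What follows is a minimal Python sketch of that decision logic only, not a port of the jobs above. The `Job` record, its field names, and the functions `count_running` and `overdue` are hypothetical names chosen for illustration, and the sketch assumes the job listing has already been gathered by some other means.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Job:
    """One entry from a job listing. Field names are hypothetical."""
    name: str          # e.g. "SLTBACK,MANAGER.SYS"
    started: datetime  # when the job began executing


def count_running(jobs, name):
    """Count listed jobs whose name matches, ignoring case."""
    return sum(1 for j in jobs if j.name.upper() == name.upper())


def overdue(jobs, name, now, limit_hours):
    """Return matching jobs that have run longer than limit_hours."""
    limit = timedelta(hours=limit_hours)
    return [j for j in jobs
            if j.name.upper() == name.upper() and now - j.started > limit]


if __name__ == "__main__":
    # Illustrative data only: a Sunday 02:30 start, checked at 09:00.
    now = datetime(2019, 12, 8, 9, 0)
    jobs = [Job("SLTBACK,MANAGER.SYS", datetime(2019, 12, 8, 2, 30)),
            Job("NIGHTLY,MANAGER.SYS", datetime(2019, 12, 8, 8, 0))]
    # Same question the check job asks: six hours on, is SLTBACK still there?
    for job in overdue(jobs, "sltback,manager.sys", now, 6):
        print(f"{job.name} still running after {now - job.started}; alert and abort")
```

A job that counts itself in the listing, as SLTBACK does, would compare `count_running` against 1 rather than 0 before deciding that a second copy is active.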
Photo by Akın Cakiner on Unsplash