The schedMain package is a general scheduler designed for tasks running on a Linux or similar operating system.
schedMain maintains a work list of tasks to be executed. Each item in the work list represents a specific task to be run in a specific directory. When schedMain starts, it reads an initial work list from the file initWork. Each line of initWork contains the name of a task and the directory to run it in. For example, initWork might contain:
alpha.py aaDir
This represents the task alpha.py to be run in directory aaDir.
The following walk-through uses the above example of a work list containing alpha.py with directory aaDir.
Before task alpha.py can run in aaDir, schedMain checks for a file aaDir/alpha.preWork. If the preWork file exists it contains a list of tasks that must complete successfully before alpha.py can start in aaDir. The preWork file has the same format as the initWork file.
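The prerequisite check described above can be sketched as follows. This is a minimal illustration, not schedMain's actual code: the function names read_work_list and prereqs_satisfied are hypothetical, and the sketch assumes a prerequisite counts as complete once its status.ok file exists in its own directory.

```python
import os

def read_work_list(path):
    """Return (scriptName, dirPath) pairs from a work-list file."""
    tasks = []
    with open(path) as fin:
        for line in fin:
            fields = line.split()
            if len(fields) >= 2:          # skip blank or malformed lines
                tasks.append((fields[0], fields[1]))
    return tasks

def prereqs_satisfied(script, dir_path):
    """True if every task listed in <sname>.preWork has written status.ok."""
    sname = os.path.splitext(script)[0]   # alpha.py -> alpha
    pre_path = os.path.join(dir_path, sname + '.preWork')
    if not os.path.exists(pre_path):
        return True                       # no preWork file: nothing to wait for
    for pre_script, pre_dir in read_work_list(pre_path):
        pre_sname = os.path.splitext(pre_script)[0]
        ok_path = os.path.join(pre_dir, pre_sname + '.status.ok')
        if not os.path.exists(ok_path):
            return False                  # prerequisite not yet finished OK
    return True
```

For example, prereqs_satisfied('alpha.py', 'aaDir') returns True when aaDir/alpha.preWork is absent.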
After the prerequisites complete, schedMain starts alpha.py. Eventually alpha.py signals its completion by writing a tiny file aaDir/alpha.status.ok, indicating OK completion. Every task must write either an x.status.ok or an x.status.error file on completion, to let schedMain know it’s done. If a task completes without writing either file, schedMain assumes it ended badly and will write an x.status.error for it.
After task alpha.py completes, schedMain removes it from the work list. If the task was successful (wrote aaDir/alpha.status.ok), schedMain checks for a file aaDir/alpha.postOkWork. If the postOkWork file exists, it contains a list of new tasks to be added to schedMain’s work list. The postOkWork file has the same format as initWork.
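The status-file convention might be checked along these lines. This is a sketch under stated assumptions: task_outcome and its process_finished argument are hypothetical names, and the rule that status.ok wins when both files exist is a guess rather than documented behavior.

```python
import datetime
import os

def task_outcome(dir_path, sname, process_finished):
    """Classify a task as 'ok', 'error', or 'running' from its status files.

    process_finished tells us whether the task's process has ended
    (how schedMain learns this is not covered here).
    """
    ok_path = os.path.join(dir_path, sname + '.status.ok')
    err_path = os.path.join(dir_path, sname + '.status.error')
    if os.path.exists(ok_path):           # assumption: ok takes precedence
        return 'ok'
    if os.path.exists(err_path):
        return 'error'
    if process_finished:
        # Task ended without writing either file: record an error for it.
        stamp = datetime.datetime.now().strftime('%Y-%m-%d_%H:%M:%S')
        with open(err_path, 'w') as fout:
            fout.write(stamp + '\n')
            fout.write('Task ended without writing a status file\n')
        return 'error'
    return 'running'
```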
Now would be a good time to check out the static example, schedMain Example A: static files.
See schedMain.aaOverview().
This is the main scheduling system. It reads the file initWork to create an initial work list. Then it repeatedly executes tasks that have satisfied their prerequisites, as those tasks add future tasks to the work list.
For example, suppose the initWork file contains the line:
alpha.sh some/directory
Then schedMain will look in some/directory for the file alpha.preWork. If present it contains prerequisites for alpha.sh.
After all the prerequisites have completed with status OK, schedMain will look for the script alpha.sh, first in some/directory, then in the global script area, globalDir/cmd.
Then schedMain will run the script alpha.sh in some/directory. If the work list specifies a script ending in ".pbs", like alpha.pbs, then instead of running the script directly schedMain will issue qsub alpha.pbs.
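The script lookup order and the qsub rule can be illustrated with a short sketch. The names find_script and build_command are hypothetical, and launching .sh and .py scripts by executing the file directly is an assumption; the text above only specifies the .pbs case.

```python
import os

def find_script(script, dir_path, global_dir):
    """Look for the script in the task directory, then in globalDir/cmd."""
    candidates = (
        os.path.join(dir_path, script),
        os.path.join(global_dir, 'cmd', script),
    )
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None                           # not found in either location

def build_command(script_path):
    """Return the argument list used to start the script."""
    if script_path.endswith('.pbs'):
        return ['qsub', script_path]      # batch scripts are submitted
    return [script_path]                  # assumption: run directly
```

The resulting list could be passed to subprocess.Popen with cwd set to the task directory, so that the script runs in some/directory.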
When a task alpha.sh completes, it typically writes two files:
alpha.postOkWork # list of tasks to start next
alpha.status.ok # indicates OK completion
After detecting that the script wrote file alpha.status.ok, schedMain adds any tasks listed in alpha.postOkWork to the work list.
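A rough sketch of this follow-up step is shown below. The function name follow_up_tasks is hypothetical. Alongside postOkWork, the sketch also reads postErrorWork when status.error is present, since that work list exists for tasks that finish with an error.

```python
import os

def follow_up_tasks(script, dir_path):
    """Return the (scriptName, dirPath) pairs to add after a task finishes."""
    sname = os.path.splitext(script)[0]
    if os.path.exists(os.path.join(dir_path, sname + '.status.ok')):
        work_path = os.path.join(dir_path, sname + '.postOkWork')
    elif os.path.exists(os.path.join(dir_path, sname + '.status.error')):
        work_path = os.path.join(dir_path, sname + '.postErrorWork')
    else:
        return []                         # task has not finished yet
    if not os.path.exists(work_path):
        return []                         # no follow-up work list was written
    tasks = []
    with open(work_path) as fin:
        for line in fin:
            fields = line.split()
            if len(fields) >= 2:
                tasks.append((fields[0], fields[1]))
    return tasks
```

A scheduler loop could then extend its work list with the returned pairs.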
The lines in file initWork, like all work lists (alpha.preWork, alpha.postOkWork, alpha.postErrorWork), have the format:
scriptName dirPath # scriptName may be *.sh, *.py, or *.pbs
For example:
alpha.sh some/directory
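As an illustration of this format, one work-list line might be parsed like this. The names Task and parse_work_line are hypothetical, and treating "#" as the start of a comment inside a work-list file is an assumption.

```python
import os
from collections import namedtuple

Task = namedtuple('Task', ['script', 'dir_path'])
VALID_SUFFIXES = ('.sh', '.py', '.pbs')

def parse_work_line(line):
    """Parse 'scriptName dirPath'; return None for blank or comment lines."""
    text = line.split('#', 1)[0].strip()  # assumption: '#' starts a comment
    if not text:
        return None
    fields = text.split()
    if len(fields) != 2:
        raise ValueError('expected "scriptName dirPath", got: %r' % line)
    script, dir_path = fields
    if os.path.splitext(script)[1] not in VALID_SUFFIXES:
        raise ValueError('unsupported script suffix: %r' % script)
    return Task(script, dir_path)
```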
The sname is the base name of the script. For example, if we strip the suffix (".sh", ".py", or ".pbs") off alpha.sh, we get the base sname, in this case alpha.
The sname is the root of a handful of files related to that script in that dirPath. Using the alpha example:
alpha.preWork # Work list: prerequisites for this task
alpha.postOkWork # Work list: tasks to start after this task
# writes file status.ok
alpha.postErrorWork # (rarely used) Work list: tasks to start after
# this task writes file status.error
alpha.status.init # time this task was initialized
alpha.status.submit # time this task was submitted (like qsub)
alpha.status.run # time this task was started running
alpha.status.ok # time this task finished successfully
alpha.status.error # time and error message if finished unsuccessfully
Sometimes some of the status files are omitted, but either status.ok or status.error MUST be written.
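The naming scheme can be summarized in a small helper. This is only a sketch: related_files is a hypothetical name, and the helper just builds the paths listed above without creating any files.

```python
import os

WORK_SUFFIXES = ('preWork', 'postOkWork', 'postErrorWork')
STATUS_SUFFIXES = ('status.init', 'status.submit', 'status.run',
                   'status.ok', 'status.error')

def related_files(script, dir_path):
    """Map each suffix to the path of the matching file for this task."""
    sname = os.path.splitext(script)[0]   # alpha.sh -> alpha
    return {
        suffix: os.path.join(dir_path, sname + '.' + suffix)
        for suffix in WORK_SUFFIXES + STATUS_SUFFIXES
    }
```

For example, related_files('alpha.sh', 'some/directory')['status.ok'] gives the path of alpha.status.ok inside some/directory.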
To write a status file from a shell script alpha.sh use:
date '+%Y-%m-%d_%H:%M:%S' > alpha.status.ok
# Or:
date '+%Y-%m-%d_%H:%M:%S' > alpha.status.error
echo 'Some error message' >> alpha.status.error
echo 'More error message' >> alpha.status.error
To write a status file from a Python script alpha.py use:
import os
import schedMisc
# Record successful completion of task alpha in the current directory.
schedMisc.setStatusOk( 'alpha', os.getcwd())
# Or, to record a failure with an error message:
import os
import schedMisc
schedMisc.setStatusError( 'alpha', os.getcwd(),
    'Some error message.\nMore error message.')
# Or, to write the status file directly without schedMisc:
import datetime
import os
dateFmt = '%Y-%m-%d_%H:%M:%S'
dateStg = datetime.datetime.now().strftime( dateFmt)
fpath = os.path.join( os.getcwd(), 'alpha.status.ok')
with open( fpath, 'w') as fout:
    fout.write( dateStg + '\n')
# Or:
fpath = os.path.join( os.getcwd(), 'alpha.status.error')
with open( fpath, 'w') as fout:
    fout.write( dateStg + '\n')
    fout.write( 'Some error message\n')
    fout.write( 'More error message\n')
Prints an error message and usage info, and exits with rc 1.
For an overview, see aaOverview().
Command line parameters are:
-bugLev <string>       Debug level. Typically 0, 1, or 5.
-hostType <string>     System type: hostLocal or peregrine or ...
-globalDir <string>    Dir containing global info, including subdir cmd
-ancDir <string>       An ancestor dir of all dirs to be processed
-initWork <string>     File containing the initial work list
-delaySec <string>     Schedule loop delay, seconds
-redoAll <bool>        n/y: on restart, redo all even if prior run was ok
-useReadOnly <bool>    n/y: only print status; do not start tasks
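For readers who want to experiment with these options, the sketch below parses the same parameter names with Python's standard argparse module. It is not schedMain's own parser. The helper names yes_no and build_parser, the defaults, and the conversion of bugLev to int and delaySec to float are all assumptions.

```python
import argparse

def yes_no(value):
    """Convert the n/y convention used above into a bool."""
    if value not in ('n', 'y'):
        raise argparse.ArgumentTypeError('expected n or y, got %r' % value)
    return value == 'y'

def build_parser():
    parser = argparse.ArgumentParser(prog='schedMain')
    parser.add_argument('-bugLev', type=int, default=0,
                        help='Debug level. Typically 0, 1, or 5.')
    parser.add_argument('-hostType', help='System type')
    parser.add_argument('-globalDir',
                        help='Dir containing global info, including subdir cmd')
    parser.add_argument('-ancDir',
                        help='An ancestor dir of all dirs to be processed')
    parser.add_argument('-initWork',
                        help='File containing the initial work list')
    parser.add_argument('-delaySec', type=float,
                        help='Schedule loop delay, seconds')
    parser.add_argument('-redoAll', type=yes_no, default=False,
                        help='n/y: on restart, redo all even if prior run was ok')
    parser.add_argument('-useReadOnly', type=yes_no, default=False,
                        help='n/y: only print status; do not start tasks')
    return parser
```

With this sketch, build_parser().parse_args(['-initWork', 'initWork', '-redoAll', 'y']) yields a namespace whose redoAll attribute is True.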