The interactions between the Scheduler and Autoserv
This page is a timeline of the events that occur over the life of a job. Together with the SchedulerSpecification, it should give a clear picture of how the scheduler (monitor_db) and autoserv interact.
monitor_db | creates the results directory
monitor_db | if multi-machine, async job: creates the .machines file in the top-level results dir and adds machines to it when the job starts
monitor_db | writes .parse.cmd into each results dir
monitor_db | writes a timestamp recording that the job has started into the keyval file in each results dir
monitor_db | starts autoserv: 1 instance for sync jobs, n instances for async jobs
autoserv | if multi-machine, sync job: creates the .machines file as part of the control file preamble
autoserv | runs the job
autoserv | whenever the status log is updated, looks for a .parse.cmd; if it exists, runs the command
autoserv | finishes; non-0 exit code on abort
monitor_db | writes a job_finish timestamp into the keyval file in each results directory
monitor_db | runs parse one last time; output is deposited in the .parse.log file

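
The file handoff in the timeline above can be sketched roughly as follows. This is only an illustration, not the actual monitor_db or autoserv code: the function names are hypothetical, the `key=value` keyval line format and the `job_started` key name are assumptions, and the parse command is passed in as an opaque shell string because its real contents are not described here.

```python
import os
import subprocess
import time


def append_keyval(results_dir, key, value):
    """Append one key=value line to the keyval file (line format is assumed)."""
    with open(os.path.join(results_dir, "keyval"), "a") as f:
        f.write(f"{key}={value}\n")


def scheduler_prepare(results_dir, parse_cmd):
    """Scheduler side, before autoserv is started."""
    os.makedirs(results_dir, exist_ok=True)
    # Leave the parse command where autoserv can find it later.
    with open(os.path.join(results_dir, ".parse.cmd"), "w") as f:
        f.write(parse_cmd + "\n")
    append_keyval(results_dir, "job_started", int(time.time()))


def on_status_log_update(results_dir):
    """Autoserv side: run the parse command if the scheduler left one.

    Returns the command's exit code, or None when there is no .parse.cmd.
    """
    cmd_path = os.path.join(results_dir, ".parse.cmd")
    if not os.path.exists(cmd_path):
        return None
    with open(cmd_path) as f:
        cmd = f.read().strip()
    return subprocess.call(cmd, shell=True)


def scheduler_finish(results_dir, final_parse_cmd):
    """Scheduler side, after autoserv has exited."""
    append_keyval(results_dir, "job_finish", int(time.time()))
    # The last parse run sends its output to .parse.log.
    with open(os.path.join(results_dir, ".parse.log"), "w") as log:
        return subprocess.call(
            final_parse_cmd, shell=True, stdout=log, stderr=subprocess.STDOUT
        )
```

The point of the sketch is the division of labour: the scheduler only drops files into the results directory, and autoserv reacts to whatever it finds there, so a job started without a scheduler simply never sees a .parse.cmd and skips the parse step.
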
Notes
- The .machines file is only generated for multi-machine jobs. For async jobs the scheduler generates it, because the scheduler needs to append machines to the file as they become ready and/or as we decide they fulfill the requirements for the job (if the job had meta-hosts). The synchronous multi-machine case is the only one where autoserv is given a list of more than one machine, and the standard autoserv preamble writes out a .machines file for such tests. We can't remember exactly why it's better and/or necessary for autoserv to generate the file in this case.
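
The difference between the two cases can be shown with a small sketch. The function names are hypothetical and the one-hostname-per-line layout of the .machines file is an assumption; the only thing taken from the note above is who writes the file and whether it is appended to or written in one go.

```python
import os


def append_machine(results_dir, hostname):
    """Async case: the scheduler adds one host at a time as hosts become ready."""
    with open(os.path.join(results_dir, ".machines"), "a") as f:
        f.write(hostname + "\n")


def write_machines(results_dir, hostnames):
    """Sync case: autoserv already has the full host list, so it writes it once."""
    with open(os.path.join(results_dir, ".machines"), "w") as f:
        f.writelines(h + "\n" for h in hostnames)


def read_machines(results_dir):
    """Return the hostnames currently recorded in the .machines file."""
    with open(os.path.join(results_dir, ".machines")) as f:
        return [line.strip() for line in f if line.strip()]
```

In the async case the file grows over the life of the job, so a reader may see a partial list at any moment; in the sync case the list is complete as soon as the file exists.
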
