Stuck cylc tasks
Testing of the RAS revealed an intermittent problem: tasks in u-bu503 sometimes remain stuck in the submitted state in the Cylc GUI.
Running qstat showed that these tasks had failed, but the GUI did not reflect this.
To check whether a task is affected by this error, run:
cat ~/cylc-run/u-bu503/log/job/1/<task_name>/01/job.err
and you should see output similar to the following:
/local/spool/pbs/mom_priv/jobs/140074859.gadi-pbs.SC: line 104: /g/data/hr22/apps/cylc7/cylc_7.9.9/lib/cylc/job.sh: No such file or directory
/local/spool/pbs/mom_priv/jobs/140074859.gadi-pbs.SC: line 105: cylc__job__main: command not found
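When several tasks may be affected, checking each job.err by hand is tedious. The following is a minimal sketch that searches every job.err under the suite's job log directory for the missing job.sh message shown above. The function name find_stuck_job_logs is hypothetical, and the default path assumes the same ~/cylc-run/u-bu503/log/job layout as the command above.

```shell
# Hypothetical helper: print the path of every job.err file that contains
# the "job.sh: No such file or directory" message shown above.
# Argument 1 (optional): job log root; defaults to the u-bu503 suite's logs.
find_stuck_job_logs() {
    log_root="${1:-$HOME/cylc-run/u-bu503/log/job}"
    # find collects the job.err files; grep -l prints only the names of
    # the files that contain the fixed string (-F disables regex matching).
    find "$log_root" -type f -name job.err \
        -exec grep -lF -- 'lib/cylc/job.sh: No such file or directory' {} +
}
```

Calling find_stuck_job_logs with no argument searches the u-bu503 logs. Each path printed identifies the cycle point, task name and submit number of an affected job.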
The workaround for this is to use the Cylc GUI to:
- Set the task state to failed.
- Set the task state to waiting.
- Check that the task then automatically goes into submitted, then running, then succeeded.
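The same state changes can probably be made from the command line. The sketch below assumes that the Cylc 7 "cylc reset --state=STATE SUITE TASK_ID" command has the same effect as the GUI actions above, which has not been verified for this problem. The function name retry_stuck_task and the CYLC_CMD variable are hypothetical and exist only for illustration.

```shell
# Hypothetical helper mirroring the GUI steps above (assumes Cylc 7).
# Argument 1: suite name, e.g. u-bu503
# Argument 2: task ID in NAME.CYCLE_POINT form, e.g. <task_name>.1
# Set CYLC_CMD="echo cylc" for a dry run that only prints the commands.
retry_stuck_task() {
    suite="$1"
    task_id="$2"
    # Step 1: mark the stuck task as failed.
    # Step 2: set it back to waiting so that it can be submitted again.
    ${CYLC_CMD:-cylc} reset --state=failed "$suite" "$task_id" &&
    ${CYLC_CMD:-cylc} reset --state=waiting "$suite" "$task_id"
}
```

A dry run such as CYLC_CMD="echo cylc" retry_stuck_task u-bu503 <task_name>.1 prints the two commands without contacting the suite. The final check, that the task goes into submitted, then running, then succeeded, still needs to be made in the GUI.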
This error is intermittent and often cannot be reproduced, so the task should succeed when resubmitted. The issue has been reported to NCI.