Friday, January 3, 2014

Deleting PBS and MAUI Jobs which cannot be purged

If the pbs_mom on a compute node is lost and cannot be recovered (due to a hardware or network failure), a running job can remain stuck in the queue. To purge it from the qstat output or the scheduler's queue listing, do the following:

1. Shut down the pbs_server daemon on the PBS server
# service pbs_server stop
2. Remove the job spool files that belong to the hung job ID (for example, 4444)
# rm /var/spool/torque/server_priv/jobs/4444.headnode.SC
# rm /var/spool/torque/server_priv/jobs/4444.headnode.JB
3. Start the pbs_server daemon
# service pbs_server start
4. Restart the MAUI daemon
# service maui restart
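
The four steps above can be wrapped in a small shell function. This is only a sketch. It assumes the spool directory and service names shown in the steps. The names purge_stuck_job, run, JOBS_DIR and DRY_RUN are hypothetical helpers introduced here for illustration, not part of Torque or MAUI. It defaults to a dry run that only prints the commands, and it uses rm -f so that a missing .SC or .JB file does not abort the sequence.

```shell
#!/bin/sh
# Sketch only: purge a stuck job by removing its spool files while
# pbs_server is stopped. Paths and service names are taken from the
# steps above; adjust them if your site differs.

JOBS_DIR="${JOBS_DIR:-/var/spool/torque/server_priv/jobs}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: purge_stuck_job <numeric-jobid> <server-name>
# Example: purge_stuck_job 4444 headnode
purge_stuck_job() {
    jobid="$1"    # numeric job ID, e.g. 4444
    server="$2"   # server part of the spool file name, e.g. headnode

    # Accept plain numeric job IDs only, and require a server name.
    case "$jobid" in
        ''|*[!0-9]*)
            echo "usage: purge_stuck_job <numeric-jobid> <server-name>" >&2
            return 1
            ;;
    esac
    if [ -z "$server" ]; then
        echo "usage: purge_stuck_job <numeric-jobid> <server-name>" >&2
        return 1
    fi

    run service pbs_server stop || return 1
    run rm -f "$JOBS_DIR/$jobid.$server.SC" "$JOBS_DIR/$jobid.$server.JB"
    run service pbs_server start || return 1
    run service maui restart
}
```

Calling purge_stuck_job 4444 headnode with the default DRY_RUN=1 prints the four commands so they can be reviewed first. Setting DRY_RUN=0 (as root) executes them.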
References:
  1. Deleting PBS/Maui Jobs
