Last modified: 2014-08-27 17:26:50 UTC
When the client script is killed with -9 (SIGKILL), the wrapper script terminates as well. While this is probably a prudent choice, the user isn't informed about it at all. As a minimal courtesy, we should add "-m ae" to the qsub call, so that the user at least gets a mail, even if he probably won't understand it :-). Of course, even better would be to use an SGE hook that fires after a job terminates.
Users can already pass -m to jsub/jstart; it is handed on to qsub and behaves as expected. I'd rather not increase the default amount of cron/gridengine spam.
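For illustration only, here is a rough Python sketch of how a wrapper could add "-m ae" by default while still honouring a -m given by the user. It is not how jsub/jstart are actually implemented; the function name build_qsub_args, the job name mytool and the script name wrapper.sh are made up for the example. The only thing taken from qsub itself is the -m option with its a (abort), e (end) and n (no mail) values.

```python
def build_qsub_args(user_args, default_mail_events="ae"):
    """Return a qsub argument list; add -m only if the user gave none.

    Hypothetical helper for illustration, not part of jsub/jstart.
    """
    args = ["qsub"]
    if "-m" not in user_args:
        # "-m ae": mail when the job is aborted (a) or ends (e)
        args += ["-m", default_mail_events]
    args += list(user_args)
    return args


# No -m given by the user: the default is inserted.
print(build_qsub_args(["-N", "mytool", "wrapper.sh"]))

# User explicitly asked for no mail: their choice is left untouched.
print(build_qsub_args(["-m", "n", "-N", "mytool", "wrapper.sh"]))
```

With something along these lines, a user who does not want the extra mail could still opt out with "-m n".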
My suggestion would send one (1) message if a job terminates that the user expects to be running continuously.
Wouldn't the default "Hey, I had to restart your job" from bigbrother fill that function?
That requires the job to be managed by bigbrother. I just wanted to point out that IMHO one message per interaction is not spam, especially if it conveys important information to the recipient.