I have a Linux shell script that calls
poly-image < script
and which is dying unexpectedly in a cronjob. The end of the log looks like:
Poly/ML 5.3 Release
Killed
/home/michaeln/stdhol/bin/Holmake: Failed script build for arm_improved_gcScript - exited with code 89
Build failed in directory /home/michaeln/stdhol/examples/machine-code/lisp (exited with code 1)
Holmake is reporting the error returned by a call to OS.Process.system, using
  fun posix_diagnostic stat = let
    open Posix.Process
  in
    case fromStatus stat of
      W_EXITSTATUS w8 => "exited with code "^Word8.toString w8
    | W_EXITED => "exited normally"
    | W_SIGNALED sg => "with signal " ^ SysWord.toString (Posix.Signal.toWord sg)
    | W_STOPPED sg => "stopped with signal " ^ SysWord.toString (Posix.Signal.toWord sg)
  end
The presence of "Killed" makes me think that the script is being killed by an external signal, perhaps a SIGSEGV caused by hitting a limit on stack-size. But is this plausible? Would Poly/ML catch that signal and then exit with 89?
Annoyingly, the same script doesn't fail when not in a cronjob, so it's hard to test.
Michael
On 13/05/10 11:01, Michael Norrish wrote:
> The presence of "Killed" makes me think that the script is being killed by an external signal, perhaps a SIGSEGV caused by hitting a limit on stack-size. But is this plausible? Would Poly/ML catch that signal and then exit with 89?
Experiments reveal that Poly/ML seems to be exiting with 0x80 + <signalnumber> when it's sent a signal. The documentation for getrlimit says that SIGKILL (signal 9, which gives the code 0x89 seen above) might happen if the hard CPU-time limit on the process is reached. So my current guess is that my cronjob is being time-limited.
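To spell out the arithmetic: Word8.toString formats in hexadecimal, so the "89" in the Holmake log is 0x89. A minimal shell sketch of the decoding follows; the variable names are only illustrative, and it assumes a shell whose `kill -l` accepts a signal number, as bash and dash do.

```shell
# Status as printed in the Holmake log (hexadecimal, via Word8.toString).
hex_status=89

# Convert to decimal, then subtract the 0x80 (128) offset to get the signal number.
status=$((0x$hex_status))
signal=$((status - 128))

# kill -l <number> prints the corresponding signal name.
echo "status=$status signal=$signal name=$(kill -l "$signal")"
```

On the machines I would expect this to be run on, signal 9 is SIGKILL, which is consistent with the "Killed" line in the log.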
Michael
Michael Norrish wrote:
> On 13/05/10 11:01, Michael Norrish wrote:
>> The presence of "Killed" makes me think that the script is being killed by an external signal, perhaps a SIGSEGV caused by hitting a limit on stack-size. But is this plausible? Would Poly/ML catch that signal and then exit with 89?
> Experiments reveal that Poly/ML seems to be exiting with 0x80 + <signalnumber> when it's sent a signal. The documentation for getrlimit says that SIGKILL (signal 9, which gives the code 0x89 seen above) might happen if the hard CPU-time limit on the process is reached. So my current guess is that my cronjob is being time-limited.
Returning 0x80 + signal seems to be the standard result from a shell; Poly/ML isn't trapping the signal. If I send "cat" a SIGKILL and then run "echo $?", it prints 137, i.e. 0x89. Have you tried using ulimit -t to print/adjust the time-out on the cron job?
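Something along these lines should reproduce the status without involving Poly/ML at all. It is only a sketch: it uses a throwaway child shell that kills itself instead of "cat", and the `status` variable is just for illustration.

```shell
# Run a child shell that sends itself SIGKILL, and capture the status the
# parent shell reports for it (the "||" keeps this working under "set -e").
status=0
sh -c 'kill -KILL $$' || status=$?
printf 'decimal: %d  hex: %x\n' "$status" "$status"

# Print the CPU-time limit in seconds (or "unlimited") for this environment.
ulimit -t
```

Running the last line from inside the cron job itself, rather than from an interactive login, would show whether the two environments have different limits.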
David