Both this (with Posix.Process open):
case fork() of NONE => (exece(proc,proc::args,env') handle _ => die ("Exece of "^proc^" failed")) | SOME cpid => let val (_, result) = waitpid(W_CHILD cpid, []) in result end
and this:
case fork() of NONE => let val () = dup2 {old = outoutfd, new = Posix.FileSys.stdout} val () = dup2 {old = erroutfd, new = Posix.FileSys.stderr} val () = dup2 {old = ininfd, new = Posix.FileSys.stdin} val () = List.app close [errinfd, erroutfd, outinfd, outoutfd, ininfd, inoutfd] in exece(executable,nm_args,env) end | SOME pid => let val out = mkTIO_instream outinfd val err = mkTIO_instream errinfd val () = List.app close [outoutfd, erroutfd, ininfd, inoutfd] in ? end
seem to be causing us grief, particularly on a recent Linux, PolyML 5.7 and under conditions of some memory load.
The Linux is:
Linux gemma 4.9.1-gnu-1 #1 SMP PREEMPT Sun Jan 8 16:17:57 UYT 2017 x86_64 GNU/Linux
When strace-d, we get a lot of futex calls and interruptions from SIGCHLD.
This is with --gcthreads=1, but pages like
http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-usin...
make me a bit nervous about how we might be falling prey to horrible problems.
Is there a better idiom we might use?
Thanks, Michael
On 05/06/17 10:35, Michael.Norrish at data61.csiro.au wrote:
Both this (with Posix.Process open):
case fork() of NONE => (exece(proc,proc::args,env')
and this:
case fork() of exece(executable,nm_args,env)
seem to be causing us grief, particularly on a recent Linux, PolyML 5.7 and under conditions of some memory load.
Is there a better idiom we might use?
For Isabelle, I have made many variations and refinements in the past 10 years -- without fork/exec due to portability problems. The present state in this quest for perfection is here (both for ML and Scala):
http://isabelle.in.tum.de/repos/isabelle/file/Isabelle2016-1/src/Pure/System... http://isabelle.in.tum.de/repos/isabelle/file/Isabelle2016-1/src/Pure/System... http://isabelle.in.tum.de/repos/isabelle/file/Isabelle2016-1/Admin/bash_proc...
The main idea is to use plain OS.Process.system (Posix) or Windows.simpleExecute with a Thread.fork around it. Thus a synchronous process execution is management asynchronously by a separate thread.
In the implementation, the basic idea of Process.system + Thread.fork is further complicated due to requirements for:
* uniform use of bash on all platforms (including Debian/Ubuntu and Windows) * proper handling of signals (interrupts) * management of the external process as a group * timing information for the external process
Makarius
On 05/06/2017 11:27, Makarius wrote:
On 05/06/17 10:35, Michael.Norrish at data61.csiro.au wrote:
Both this (with Posix.Process open):
case fork() of NONE => (exece(proc,proc::args,env')
and this:
case fork() of exece(executable,nm_args,env)
seem to be causing us grief, particularly on a recent Linux, PolyML 5.7 and under conditions of some memory load.
Is there a better idiom we might use?
For Isabelle, I have made many variations and refinements in the past 10 years -- without fork/exec due to portability problems. The present state in this quest for perfection is here (both for ML and Scala):
http://isabelle.in.tum.de/repos/isabelle/file/Isabelle2016-1/src/Pure/System... http://isabelle.in.tum.de/repos/isabelle/file/Isabelle2016-1/src/Pure/System... http://isabelle.in.tum.de/repos/isabelle/file/Isabelle2016-1/Admin/bash_proc...
The main idea is to use plain OS.Process.system (Posix) or Windows.simpleExecute with a Thread.fork around it. Thus a synchronous process execution is management asynchronously by a separate thread.
I would certainly recommend the use of OS.Process.system rather than fork and exec. I wonder if there is a problem if there is a need for a GC or some other action before the child process can actually perform the exece ? Looking at the code, Poly/ML sets a "singleThreaded" flag in the child process to indicate that it should not expect any other threads to be running but there is a lot of state in the run-time system associated with threads. It may be that this state would have to be explicitly reset.
OS.Process.system does use fork/exec underneath on Unix but the exec occurs immediately after the fork within the same C function. There is also special-case code for OS.Process.system on Cygwin since fork/exec is very inefficient on Windows.
David
I like the idea of Thread.fork coupled with OS.Process.system except I do want to be able to capture the called processes? output streams dynamically, and to be able to kill the called processes dynamically. Neither of these are possible with system.
I should have perhaps said that the problems actually seem to arise with the end of the child process: waitpid and similar in the parent seem to give rise to odd errors (processes reporting as failed when there?s nothing to indicate that that?s really the case).
Having said that, it would be nice if there were some way to test a more atomic version of the fork/dup2/exec dance. (Could it be done with an FFI call to C code that made all the calls in one function?)
Michael
On 5/6/17, 22:01, "polyml-bounces at inf.ed.ac.uk on behalf of David Matthews" <polyml-bounces at inf.ed.ac.uk on behalf of David.Matthews at prolingua.co.uk> wrote:
On 05/06/2017 11:27, Makarius wrote: > On 05/06/17 10:35, Michael.Norrish at data61.csiro.au wrote: >> Both this (with Posix.Process open): >> >> case fork() of >> NONE => (exece(proc,proc::args,env') >> >> and this: >> >> case fork() of >> exece(executable,nm_args,env) >> >> seem to be causing us grief, particularly on a recent Linux, PolyML 5.7 and under conditions of some memory load. >> >> Is there a better idiom we might use? > > For Isabelle, I have made many variations and refinements in the past 10 > years -- without fork/exec due to portability problems. The present > state in this quest for perfection is here (both for ML and Scala): > > http://isabelle.in.tum.de/repos/isabelle/file/Isabelle2016-1/src/Pure/System... > http://isabelle.in.tum.de/repos/isabelle/file/Isabelle2016-1/src/Pure/System... > http://isabelle.in.tum.de/repos/isabelle/file/Isabelle2016-1/Admin/bash_proc... > > > The main idea is to use plain OS.Process.system (Posix) or > Windows.simpleExecute with a Thread.fork around it. Thus a synchronous > process execution is management asynchronously by a separate thread.
I would certainly recommend the use of OS.Process.system rather than fork and exec. I wonder if there is a problem if there is a need for a GC or some other action before the child process can actually perform the exece ? Looking at the code, Poly/ML sets a "singleThreaded" flag in the child process to indicate that it should not expect any other threads to be running but there is a lot of state in the run-time system associated with threads. It may be that this state would have to be explicitly reset.
OS.Process.system does use fork/exec underneath on Unix but the exec occurs immediately after the fork within the same C function. There is also special-case code for OS.Process.system on Cygwin since fork/exec is very inefficient on Windows.
David _______________________________________________ polyml mailing list polyml at inf.ed.ac.uk http://lists.inf.ed.ac.uk/mailman/listinfo/polyml