Hi, I've just pushed a collection of changes to master that have been in the pipeline for quite a long time. Some of these are internal changes to the run-time system and some are extensions, such as the addition of IPv6 networking with INet6Sock and Net6HostDB structures.
The major change, though, is with the foreign function interface. On X86 platforms libffi is no longer used and the version of it included in the libpolyml directory has been removed. Libffi is still used in the interpreted version but only if the library is installed on the system.
Instead the foreign function interface is handled essentially as part of the compiler. The high-level interface in the Foreign structure remains unchanged but the buildCallN functions now actually compile interface functions. This results in foreign function calls being substantially faster than with libffi; at least 10 times faster for trivial calls on the X86/64. The cost is, of course, some extra work when buildCallN is called, meaning that it is essential that these functions are only used at the top level.
The reason for the speed-up is that the interface has to place the arguments in the correct registers for the ABI and the rules for placing arguments can be quite complicated, particularly on the X86/64 on Unix. Libffi computes the placement on every call whereas the compiler can do this once and build code that moves the arguments into the right registers and returns the result.
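The advice to call buildCallN only at the top level can be sketched as follows. This is a minimal illustration rather than code from the Poly/ML sources. It assumes a Unix-like system where the C library function abs can be found through the handle returned by Foreign.loadExecutable, and the names cAbs, sumAbs and sumAbsSlow are made up for the example.

```sml
(* Sketch only: assumes a Unix-like system where libc's "int abs(int)" is
   visible through the executable's own symbol table. *)

(* Good: buildCall1 runs once, when this top-level declaration is evaluated.
   The interface code is built here and reused by every later call. *)
val cAbs: int -> int =
    Foreign.buildCall1
        (Foreign.getSymbol (Foreign.loadExecutable ()) "abs",
         Foreign.cInt, Foreign.cInt)

(* Each call of cAbs only runs the interface code that already exists. *)
fun sumAbs (xs: int list): int =
    List.foldl (fn (x, acc) => cAbs x + acc) 0 xs

(* Bad: buildCall1 sits inside the function body, so the interface code is
   rebuilt every time sumAbsSlow is called. *)
fun sumAbsSlow (xs: int list): int =
    let
        val slowAbs =
            Foreign.buildCall1
                (Foreign.getSymbol (Foreign.loadExecutable ()) "abs",
                 Foreign.cInt, Foreign.cInt)
    in
        List.foldl (fn (x, acc) => slowAbs x + acc) 0 xs
    end
```

Both functions return the same result; the difference is only in how often the interface code is built.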
For backwards compatibility buildClosureN functions have been retained but these are wrappers around new buildCallback functions. The buildCallback functions differ in two respects. Closures created with buildCallback are garbage-collected which means that if they are used to register callbacks with a C library it may be necessary to keep a reference in ML. There is a touchClosure function that should be called when the callback is no longer needed. For compatibility closures created with buildClosure are retained in a global list to avoid garbage collection.
The other difference is that the buildCallback functions have a slightly different type from buildClosure and that reflects the underlying implementation. For example,
    val buildCallback1: 'a conversion * 'b conversion -> ('a -> 'b) -> ('a -> 'b) closure
The first application builds the interface code that handles the conversion between the ML and C ABIs. The second application applies this to an ML function to build a closure value that can be passed to C. This second application builds a small additional piece of code that simply loads the address of the ML function into a register and jumps to the interface code. What this means is that while the first application should always be done at the top level it is possible to embed an application of this to a particular ML function inside ML code. There is still an overhead compared with creating a closure in ML and it is better if possible to do the application at the top level but the cost is significantly less than if the whole buildCallback function were called within a function.
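A rough sketch of the two applications, under stated assumptions: the library libexample.so and its C function apply_twice are hypothetical and stand in for any C API that takes a callback, and intCallback, applyTwice and addTwice are names invented here. Only buildCallback1 and touchClosure are taken from the description above.

```sml
(* Sketch only. "libexample.so" and its C function
       int apply_twice(int (*f)(int), int x);
   are hypothetical. *)

(* First application, at top level: builds the ML/C conversion code once. *)
val intCallback: (int -> int) -> (int -> int) Foreign.closure =
    Foreign.buildCallback1 (Foreign.cInt, Foreign.cInt)

(* Lazy: the library is not loaded until a function from it is first called. *)
val exampleLib = Foreign.loadLibrary "libexample.so"

(* The type constraint fixes the otherwise polymorphic cFunction conversion. *)
val applyTwice: (int -> int) Foreign.closure * int -> int =
    Foreign.buildCall2
        (Foreign.getSymbol exampleLib "apply_twice",
         (Foreign.cFunction, Foreign.cInt), Foreign.cInt)

(* Second application, inside ordinary ML code: only the small piece of code
   that loads the ML function's address and jumps to the shared interface
   code is built here. *)
fun addTwice (n: int) (x: int): int =
    let
        val closure = intCallback (fn y => y + n)
        val result = applyTwice (closure, x)
    in
        (* The closure is garbage-collected, so keep it reachable until the
           C call has finished with it. *)
        Foreign.touchClosure closure;
        result
    end
```

If the C library stored the function pointer for later use, the closure would instead have to be kept in an ML data structure for as long as C might call it.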
As always please give this a try and let me know if there are problems.
Regards,
David
Hello. I cannot build the latest git version on 32-bit FreeBSD:
libtool: link: c++ -fPIC -DPIC -shared -nostdlib /usr/lib/crti.o /usr/lib/crtbeginS.o .libs/arb.o .libs/bitmap.o .libs/check_objects.o .libs/diagnostics.o .libs/errors.o .libs/exporter.o .libs/gc.o .libs/gc_check_weak_ref.o .libs/gc_copy_phase.o .libs/gc_mark_phase.o .libs/gc_progress.o .libs/gc_share_phase.o .libs/gc_update_phase.o .libs/gctaskfarm.o .libs/heapsizing.o .libs/locking.o .libs/memmgr.o .libs/mpoly.o .libs/network.o .libs/objsize.o .libs/pexport.o .libs/poly_specific.o .libs/polyffi.o .libs/polystring.o .libs/process_env.o .libs/processes.o .libs/profiling.o .libs/quick_gc.o .libs/realconv.o .libs/reals.o .libs/rts_module.o .libs/rtsentry.o .libs/run_time.o .libs/save_vec.o .libs/savestate.o .libs/scanaddrs.o .libs/sharedata.o .libs/sighandler.o .libs/statistics.o .libs/timing.o .libs/xwindows.o .libs/x86_dep.o .libs/x86assembly_gas32.o .libs/elfexport.o .libs/basicio.o .libs/unix_specific.o .libs/osmemunix.o -L/usr/local/lib -lpthread -lffi -lgmp -L/usr/lib -lc++ -lm -lc -lgcc -lgcc_s /usr/lib/crtendS.o /usr/lib/crtn.o -O3 -Wl,-soname -Wl,libpolyml.so.11 -o .libs/libpolyml.so.11.0.0
ld: error: relocation R_386_PC32 cannot be used against symbol X86TrapHandler; recompile with -fPIC
defined in .libs/x86_dep.o referenced by .libs/x86assembly_gas32.o:(.text+0x4E)
c++: error: linker command failed with exit code 1 (use -v to see invocation)
P.S. polyml-5.8.1 compiles and works well.
On Mon, 19 Oct 2020 at 14:14, David Matthews <David.Matthews at prolingua.co.uk> wrote:
Hi, I've just pushed a collection of changes to master that have been in the pipeline for quite a long time. [...]
_______________________________________________
polyml mailing list
polyml at inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml
On 20/10/2020 13:33, Kostirya wrote:
Hello. I cannot build last git version on FreeBSD 32bit: [...]
I've managed to get FreeBSD 32-bit up and running in VirtualBox. It seems that LDFLAGS="-Wl,-z -Wl,notext" gets past that step but there's a problem later on. I was getting ld: error: unable to find library -lstdc++ and managed to get past it manually but that's not satisfactory.
David
Hello.
LDFLAGS="-Wl,-z -Wl,notext" helped. Thanks.
ld automatically links stdc++ on FreeBSD, so I am doing:
sed -i.bak -e 's|-lstdc++ ||' configure
sed -i.bak -e 's| modules||' Makefile.in
And I build like this:
env CFLAGS=-I/usr/local/include LDFLAGS="-L/usr/local/lib -Wl,-z -Wl,notext" ./configure --with-gmp --with-system-libffi
make && make compiler && make compiler && make tests
But now 3 GB of memory is not enough to build the git version of polyml:
cp ./imports/polymli386.txt polytemp.txt
./polyimport polytemp.txt -I . < ./exportPoly.sml
Unable to create the initial thread - insufficient memory
On Tue, 20 Oct 2020 at 20:29, David Matthews <David.Matthews at prolingua.co.uk> wrote:
I've managed to get FreeBSD 32-bit up and running in VirtualBox. [...]
On 21/10/2020 07:20, Kostirya wrote:
But now the 3G memory is not enough to build the git version of polyml:
cp ./imports/polymli386.txt polytemp.txt
./polyimport polytemp.txt -I . < ./exportPoly.sml
Unable to create the initial thread - insufficient memory
This appears to be a problem with allocating memory for the stack. The call to mmap is failing with EINVAL but I can't see why. It's line 407 in libpolyml/osmemunix.cpp which adds MAP_STACK to the arguments. This is necessary for OpenBSD which segfaults if the stack is not allocated with MAP_STACK but commenting it out in FreeBSD seems to solve the problem.
David
On Wed, 21 Oct 2020 at 15:23, David Matthews <David.Matthews at prolingua.co.uk> wrote:
This appears to be a problem with allocating memory for the stack. [...]
I asked on the FreeBSD mailing list and got an answer:
kdump with MAP_STACK.
87183 polyimport CALL mmap(0,0x1000,0x3<PROT_READ|PROT_WRITE>,0x1402<MAP_PRIVATE|MAP_STACK|MAP_ANON>,0xffffffff,0,0)
87183 polyimport RET mmap -1 errno 22 Invalid argument
So it is anything but 'insufficient memory' (I suspected ENOMEM). EINVAL there is because sysctl security.bsd.stack_guard_page default value is 1, which means that at least one page of the stack is reserved as guard. Kernel does not allow to map stack that would have no data pages (all pages are guard).
Your mapping request is for one page, and one page is due to guard, so you get EINVAL. Generally MAP_STACK is magic and requires caller to know what it does.
I've changed this so that MAP_STACK is only used on OpenBSD where it is necessary and appears to be happy with a single page. ./configure --disable-shared && make && make compiler now works on FreeBSD 32 without needing any other options.
David
On Thu, 22 Oct 2020 at 13:33, David Matthews <David.Matthews at prolingua.co.uk> wrote:
./configure --disable-shared && make && make compiler now works on FreeBSD 32 without needing any other options.
Thanks.
But if I want to use gmp, then I still have to specify env CFLAGS=-I/usr/local/include LDFLAGS=-L/usr/local/lib
Without them, I get:
./configure --with-gmp --disable-shared
...
configure: error: --with-gmp was given, but gmp library (version 4 or later) is not installed
On 19/10/2020 13:12, David Matthews wrote:
Instead the foreign function interface is handled essentially as part of the compiler. The high-level interface in the Foreign structure remains unchanged but the buildCallN functions now actually compile interface functions. This results in foreign function calls being substantially faster than with libffi; at least 10 times faster for trivial calls on the X86/64. The cost is, of course, some extra work when buildCallN is called, meaning that it is essential that these functions are only used at the top level.
I have adopted a current version from the Poly/ML repository, see https://isabelle-dev.sketis.net/rISABELLE63ec86626ec3
This did not require any changes, but I also started to experiment with a clear division between compile time and run time for Foreign calls. The results were mixed and I ended up abandoning the attempt for now (Isabelle/18eed4f718e0).
Some problems encountered so far:
* Interpreted arm64-linux does not quite work. A statically compiled Foreign.buildCall within the heap image causes the dynamic invocation to "hang"; e.g. see https://isabelle-dev.sketis.net/rISABELLE7cb68b5b103d
The problem (before the above change) can be reproduced on Raspberry Pi 4 / PI OS 64bit like this:
$ isabelle build Pure
$ isabelle console -l Pure
ML> SHA1.digest "" (* hangs *)
* Native x86_64_32-windows: building an image on one Windows server installation and running it on another one (Windows 10) caused an error in accessing the sha1.dll (different file location, potentially different load addresses).
* I did not test linux and macos in that respect yet, but wonder if loading symbols from a shared library, and storing the result in the ML heap image can be portable over processes and OS installations.
That is just my feedback for now. I guess we won't need the division of compiletime/runtime in Isabelle, because the only Foreign call is SHA1.digest, used on a few big blobs, and not invoked too often.
Makarius
On 02/11/2020 17:05, Makarius wrote:
Some problems encountered so far:
- Interpreted arm64-linux does not quite work. A statically compiled Foreign.buildCall within the heap image causes the dynamic invocation to "hang"; e.g. see https://isabelle-dev.sketis.net/rISABELLE7cb68b5b103d
The problem (before the above change) can be reproduced on Raspberry Pi 4 / PI OS 64bit like this:
$ isabelle build Pure
$ isabelle console -l Pure
ML> SHA1.digest "" (* hangs *)
The interpreted version still uses libffi and that hadn't been tested as well as the compiled X86 version. I've fixed a couple of problems and it now seems fine.
- Native x86_64_32-windows: building an image on one Windows server installation and running it on another one (Windows 10) caused an error in accessing the sha1.dll (different file location, potentially different load addresses).
This is odd. I've been running some tests with various X86 platforms and not seen anything like this.
- I did not test linux and macos in that respect yet, but wonder if loading symbols from a shared library, and storing the result in the ML heap image can be portable over processes and OS installations.
Only the conversion function is stored in the heap. The entry point to the function is still found using lazy loading just as it was in the old version. When a foreign function is first used in a session the library is loaded and the symbol is looked up. From then on the cached value is used but only in the same session.
That is just my feedback for now. I guess we won't need the division of compiletime/runtime in Isabelle, because the only Foreign call is SHA1.digest, used on a few big blobs, and not invoked too often.
It's not just efficiency, the interpreted version will leak C memory if the build functions are repeatedly called. This was the case before the recent changes on the X86 as well. It isn't any longer because the conversion functions on the X86 can be garbage collected.
David
On 04/11/2020 18:10, David Matthews wrote:
* Native x86_64_32-windows: building an image on one Windows server installation and running it on another one (Windows 10) caused an error in accessing the sha1.dll (different file location, potentially different load addresses).
This is odd. I've been running some tests with various X86 platforms and not seen anything like this.
Isabelle/81518b38b316 is back to the static invocation: https://isabelle-dev.sketis.net/rISABELLE81518b38b316 --- it also uses an updated polyml-test-7e49fce62e3d.
The above problem is rather profane: I am using Foreign.loadLibrary with the symbolic path "$ML_HOME/sha1.dll" (or .so), but that gets normalized at compile-time. Later at run-time, the Isabelle directory hierarchy might have been moved elsewhere (e.g. a user downloading our pre-built distribution and unpacking it locally).
You can try it with current https://isabelle.sketis.net/devel/release_snapshot (Isabelle/653ac845b466) e.g. on Linux:
$ Isabelle_07-Nov-2020/bin/isabelle console -l Pure
Poly/ML> SHA1.digest "";
### Loading </tmp/tmp.9wQavOmIEp/contrib/polyml-test-7e49fce62e3d/x86_64_32-linux/libsha1.so> failed: /tmp/tmp.9wQavOmIEp/contrib/polyml-test-7e49fce62e3d/x86_64_32-linux/libsha1.so: cannot open shared object file: No such file or directory
### Using slow ML implementation of SHA1.digest
val it = "da39a3ee5e6b4b0d3255bfef95601890afd80709": SHA1.digest
(The tmp-directory is from the automatic build process.)
That is just my feedback for now. I guess we won't need the division of compiletime/runtime in Isabelle, because the only Foreign call is SHA1.digest, used on a few big blobs, and not invoked too often.
It's not just efficiency, the interpreted version will leak C memory if the build functions are repeatedly called. This was the case before the recent changes on the X86 as well. It isn't any longer because the conversion functions on the X86 can be garbage collected.
This means we need to get this conceptually right, and cannot just sweep it under the carpet.
Makarius
On 07/11/2020 11:56, Makarius wrote:
The above problem is rather profane: I am using Foreign.loadLibrary with the symbolic path "$ML_HOME/sha1.dll" (or .so), but that gets normalized at compile-time. Later at run-time, the Isabelle directory hierarchy might have been moved elsewhere (e.g. a user downloading our pre-built distribution and unpacking it locally).
I can see the problem. The path is captured when the code is compiled.
This isn't a big problem but fixing it needs a small change to the Foreign structure.
The key to understanding this is that there is a difference between Foreign.System.loadLibrary/getSymbol and Foreign.loadLibrary/getSymbol. The former call the underlying system calls immediately to get the address, the latter don't. Instead they create functions that only call the system when those functions are themselves called. The result is then cached. Internally in the high-level Foreign structure the "library" and "symbol" types are both defined as "unit->voidStar". This means that although buildCall1, say, takes a "symbol" as an argument and compiles an interface function containing the "symbol" it is actually compiling a function that needs to call a function to get the address.
Underneath all this is the Foreign.Memory.volatileRef which is a special kind of ref that can contain a C address but is always cleared to zero at the start of a session. This can be used to cache the address of the entry point to a function that could be different in different runs or on different machines. If a volatileRef is written out with PolyML.SaveState.saveState it will always be reset to zero when it is loaded in with PolyML.SaveState.loadState.
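The lazy lookup and per-session cache described above can be sketched with the public Foreign.Memory functions. This illustrates the idea and is not the code inside the Foreign structure; lazyAddress and sym are names invented for the example, and the lookup of abs assumes a Unix-like system.

```sml
(* Sketch only: mimics the "unit -> voidStar" representation described above.
   lazyAddress is a made-up helper, not part of the Foreign structure. *)
local
    open Foreign.Memory
    val unset = SysWord.fromInt 0
in
    fun lazyAddress (lookup: unit -> voidStar): unit -> voidStar =
        let
            (* A volatileRef is cleared to zero at the start of each session,
               so an address saved in a heap image is never reused. *)
            val cache = volatileRef unset
        in
            fn () =>
                let
                    val cached = getVolatileRef cache
                in
                    if cached <> unset
                    then sysWord2VoidStar cached
                    else
                        let
                            (* First use in this session: do the real lookup. *)
                            val addr = lookup ()
                        in
                            setVolatileRef (cache, voidStar2Sysword addr);
                            addr
                        end
                end
        end
end

(* Nothing is looked up here; the system calls happen on the first sym (). *)
val sym: unit -> Foreign.Memory.voidStar =
    lazyAddress (fn () =>
        Foreign.System.getSymbol (Foreign.System.loadExecutable (), "abs"))
```

The immediate Foreign.System calls are wrapped in a function, so compiling code that mentions sym captures the function and not an address.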
What is needed is a hook so that when Foreign.library actually calls Foreign.System.loadLibrary it first calls your function to get the path. It would do this once immediately before calling "sha1_buffer" for the first time during any run.
I'll think how best to do this.
David
On 09/11/2020 17:01, David Matthews wrote:
What is needed is a hook so that when Foreign.library actually calls Foreign.System.loadLibrary it first calls your function to get the path. It would do this once immediately before calling "sha1_buffer" for the first time during any run.
I've added Foreign.loadLibraryIndirect which takes a unit->string function to supply the path name. This function is called just before the library is actually loaded when the foreign function is first run and should do what you want. Give it a try and see how it is.
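A possible use, as a sketch under stated assumptions: it assumes ML_HOME is visible as an environment variable in the running process and that the library file is called libsha1.so, as in the messages above. The helper name sha1Path is made up, and no call is built because the C signature of sha1_buffer is not given in this thread.

```sml
(* Sketch only: ML_HOME and libsha1.so follow the messages in this thread;
   sha1Path is a made-up helper. *)
fun sha1Path (): string =
    case OS.Process.getEnv "ML_HOME" of
        SOME home => OS.Path.concat (home, "libsha1.so")
      | NONE => "libsha1.so"   (* leave it to the system's library search *)

(* The function is stored, not its result, so the path is computed when the
   library is actually loaded and not when this declaration is compiled. *)
val sha1Lib = Foreign.loadLibraryIndirect sha1Path

(* Still lazy: the symbol is looked up on first use in each session. *)
val sha1Buffer = Foreign.getSymbol sha1Lib "sha1_buffer"
```

A heap image built with these declarations should then pick up the directory that is current at run time, even if the installation has been moved since the image was built.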
David
On 14/11/2020 12:10, David Matthews wrote:
I've added Foreign.loadLibraryIndirect which takes a unit->string function to supply the path name. This function is called just before the library is actually loaded when the foreign function is first run and should do what you want. Give it a try and see how it is.
I am using that in https://isabelle-dev.sketis.net/rISABELLEfca4d6abebda and it looks fine.
Makarius
I have finally tried out the new FFI with Giraffe Library with partial success. For some examples, calls to C and callbacks from C are working but other examples result in a seg. fault. From the debug output, I noticed that the seg. faults occur when a callback occurs during a callback. I've attached a small example that demonstrates the issue (call_c_test_16.tar.gz). (Although this example uses dynamic loading, the same happens if dynamic linking is used.) The backtrace from gdb provided no useful information so I didn't investigate further.
Also, I have some minor observations about the interface. I note that the signature FOREIGN specifies:
    val touchClosure: 'a -> unit
I wondered whether that should be
    val touchClosure: 'a closure -> unit
(RunCall.touch is visible in the Poly/ML top-level which has the type of the former, so there is no loss of capability with the latter, which would catch cases where touchClosure is applied to the wrong value.)
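Until the signature changes, the narrower type can be had with a one-line wrapper. This is only a sketch, and the name touchClosureOnly is invented for the example.

```sml
(* Sketch only: a wrapper with the narrower type suggested above. *)
val touchClosureOnly: 'a Foreign.closure -> unit = Foreign.touchClosure

(* touchClosureOnly 42 is rejected by the type checker, whereas
   Foreign.touchClosure 42 is accepted because its argument type is 'a. *)
```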
In both the old and new Foreign modules, the type `'a Foreign.closure` is abstract. Giraffe Library uses `Foreign.LowLevel.cFunctionWithAbi` to define its own function for creating a closure but there is no way to create a `'a Foreign.closure` value from a `Memory.voidStar` value. This is easily worked around by copying the type declaration and definition of `Foreign.cFunction` but I wondered if there could be a way to avoid this copying.
In the past, I found it useful to have
    val nullClosure : 'a closure
This is easily declared with one's own closure type but if using Foreign.closure, it may be useful to have in Foreign. Giraffe Library no longer needs this since I changed the callback mechanism to avoid the need to free closures, not realizing this would become available a month or two later! https://github.com/giraffelibrary/giraffe/commit/2dc239946c77bdf8cb8b55223f9...
Regards, Phil
On 19/10/20 12:12, David Matthews wrote:
Hi, I've just pushed a collection of changes to master that have been in the pipeline for quite a long time. [...]
In testing callbacks during callbacks, I have also tried a rather contrived example (attached) where the SML function closures to call back are passed down as arguments. It's not something that I have needed to do. I simply cannot get this to work with either 5.8.1 or the latest version in master. With both versions I see the following output:
An ML function called from foreign code raised an exception. Unable to continue.
call_c_test_15: diagnostics.cpp:128: void Crash(const char*, ...): Assertion `0' failed.
I see this even if I wrap exception handlers around the called-back functions to ensure that they cannot raise an exception.
I may well be doing something wrong in the example but I can't see what it is. I've mentioned it in case it highlights an issue.
Regards, Phil
On 18/02/21 23:57, Phil Clayton wrote:
I have finally tried out the new FFI with Giraffe Library with partial success. For some examples, calls to C and callbacks from C are working but other examples result in a seg. fault. From the debug output, I noticed that the seg. faults occur when a callback occurs during a callback. I've attached a small example that demonstrates the issue (call_c_test_16.tar.gz). (Although this example uses dynamic loading, the same happens if dynamic linking is used.) The backtrace from gdb provided no useful information so I didn't investigate further.
Also, I have some minor observations about the interface. I note that the signature FOREIGN specifies:
    val touchClosure: 'a -> unit
I wondered whether that should be
    val touchClosure: 'a closure -> unit
(RunCall.touch is visible in the Poly/ML top-level which has the type of the former, so there is no loss of capability with the latter, which would catch cases where touchClosure is applied to the wrong value.)
In both the old and new Foreign modules, the type `'a Foreign.closure` is abstract. Giraffe Library uses `Foreign.LowLevel.cFunctionWithAbi` to define its own function for creating a closure but there is no way to create a `'a Foreign.closure` value from a `Memory.voidStar` value. This is easily worked around by copying the type declaration and definition of `Foreign.cFunction` but I wondered if there could be a way to avoid this copying.
In the past, I found it useful to have
    val nullClosure : 'a closure
This is easily declared with one's own closure type but if using Foreign.closure, it may be useful to have in Foreign. Giraffe Library no longer needs this since I changed the callback mechanism to avoid the need to free closures, not realizing this would become available a month or two later! https://github.com/giraffelibrary/giraffe/commit/2dc239946c77bdf8cb8b55223f9...
Regards, Phil
On 19/10/20 12:12, David Matthews wrote:
Hi, I've just pushed a collection of changes to master that have been in the pipeline for quite a long time. Some of these are internal changes to the run-time system and some are extensions, such as the addition of IPv6 networking with INet6Sock and Net6HostDB structures.
The major change, though, is with the foreign function interface. On X86 platforms libffi is no longer used and the version of it included in the libpolyml directory has been removed. Libffi is still used in the interpreted version but only if the library is installed on the system.
Instead the foreign function interface is handled essentially as part of the compiler. The high-level interface in the Foreign structure remains unchanged but the buildCallN functions now actually compile interface functions. This results in foreign function calls being substantially faster than with libffi; at least 10 times faster for trivial calls on the X86/64. The cost is, of course, some extra work when buildCallN is called, meaning that it is essential that these functions are only used at the top level.
The reason for the speed-up is that the interface has to place the arguments in the correct registers for the ABI and the rules for placing arguments can be quite complicated, particularly on the X86/64 on Unix. Libffi computes the placement on every call whereas the compiler can do this once and build code that moves the arguments into the right registers and returns the result.
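To illustrate the advice about using buildCallN only at the top level, here is a minimal sketch. It assumes the C library's abs can be found through Foreign.loadExecutable, which is usual on Linux but is an assumption; the names exe, cAbs and slowAbs are invented for this example.

```sml
(* Sketch only: exe, cAbs and slowAbs are hypothetical names.
   Assumes "abs" from the C library is visible via the running executable. *)
val exe = Foreign.loadExecutable ()

(* Preferred: buildCall1 is applied once, when this declaration is
   evaluated, so the interface code is compiled a single time. *)
val cAbs : int -> int =
    Foreign.buildCall1 (Foreign.getSymbol exe "abs", Foreign.cInt, Foreign.cInt)

(* To be avoided: buildCall1 is applied inside the function body, so the
   interface code is rebuilt on every call to slowAbs. *)
fun slowAbs (n: int) : int =
    Foreign.buildCall1 (Foreign.getSymbol exe "abs", Foreign.cInt, Foreign.cInt) n
```

Both functions should give the same results; the difference is only in where the cost of building the interface code is paid.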
For backwards compatibility buildClosureN functions have been retained but these are wrappers around new buildCallback functions. The buildCallback functions differ in two respects. Closures created with buildCallback are garbage-collected which means that if they are used to register callbacks with a C library it may be necessary to keep a reference in ML. There is a touchClosure function that should be called when the callback is no longer needed. For compatibility closures created with buildClosure are retained in a global list to avoid garbage collection.
The other difference is that the buildCallback functions have a slightly different type from buildClosure and that reflects the underlying implementation. For example,
    val buildCallback1: 'a conversion * 'b conversion ->
        ('a -> 'b) ->
            ('a -> 'b) closure
The first application builds the interface code that handles the conversion between the ML and C ABIs. The second application applies this to an ML function to build a closure value that can be passed to C. This second application builds a small additional piece of code that simply loads the address of the ML function into a register and jumps to the interface code. What this means is that while the first application should always be done at the top level it is possible to embed an application of this to a particular ML function inside ML code. There is still an overhead compared with creating a closure in ML and it is better if possible to do the application at the top level but the cost is significantly less than if the whole buildCallback function were called within a function.
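The two-stage application described above might be used along the following lines. This is only a sketch based on the buildCallback1 type quoted in this message; intCallback and mkScaler are hypothetical names, and the C side that would receive the closure is not shown.

```sml
(* Sketch only: intCallback and mkScaler are hypothetical names. *)

(* First application, done once at the top level: builds the interface
   code for callbacks with the C type int (*)(int). *)
val intCallback : (int -> int) -> (int -> int) Foreign.closure =
    Foreign.buildCallback1 (Foreign.cInt, Foreign.cInt)

(* Second application, which may be embedded inside ML code: wraps one
   particular ML function as a closure that can be passed to C. *)
fun mkScaler (k: int) : (int -> int) Foreign.closure =
    intCallback (fn x => k * x)

(* The closure is garbage-collected, so keep a reference while C may still
   call it and touch it once the callback is no longer needed. *)
val scaleBy3 = mkScaler 3
val () = Foreign.touchClosure scaleBy3
```

Splitting the work this way keeps the more expensive step out of mkScaler, although each call to mkScaler still builds the small piece of code that jumps to the interface code.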
As always please give this a try and let me know if there are problems. Regards, David
I hacked PolyFFICallbackException to print the exception (see attached) and found that it was
    Foreign "Cannot return a closure"
It's now obvious what's going wrong in call_c_test_15. The load function for a closure conversion raises the exception to prevent a call to C returning a closure (hence the wording in the exception) but it also prevents a callback function taking a closure as an argument.
It would be useful if PolyFFICallbackException printed the exception before aborting.
On 19/02/21 16:19, Phil Clayton wrote:
In testing callbacks during callbacks, I have also tried a rather contrived example (attached) where the SML function closures to call back are passed down as arguments. It's not something that I have needed to do. I simply cannot get this to work with either 5.8.1 or the latest version in master. With both versions I see the following output:
An ML function called from foreign code raised an exception. Unable to continue. call_c_test_15: diagnostics.cpp:128: void Crash(const char*, ...): Assertion `0' failed.
I see this even if I wrap exception handlers around the called-back functions to ensure that they cannot raise an exception.
I may well be doing something wrong in the example but I can't see what it is. I've mentioned it in case it highlights an issue.
Regards, Phil
I've pushed some changes which should have fixed most of these issues. Thanks for reporting them.
On 22/02/2021 09:40, Phil Clayton wrote:
I hacked PolyFFICallbackException to print the exception (see attached) and found that it was
    Foreign "Cannot return a closure"
It's now obvious what's going wrong in call_c_test_15. The load function for a closure conversion raises the exception to prevent a call to C returning a closure (hence the wording in the exception) but it also prevents a callback function taking a closure as an argument.
It would be useful if PolyFFICallbackException printed the exception before aborting.
I actually found this earlier this morning just before your message arrived. I've added the exception message string to the abort message.
Also, I have some minor observations about the interface. I note that the signature FOREIGN specifies:
    val touchClosure: 'a -> unit
I wondered whether that should be
    val touchClosure: 'a closure -> unit
That seems like a good idea so I've changed it.
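A small sketch of what the narrower type buys, assuming the buildCallback1 and touchClosure types quoted in this thread; cb is a hypothetical closure for a C function of type int (*)(int).

```sml
(* Sketch only: cb is a hypothetical name. *)
val cb : (int -> int) Foreign.closure =
    Foreign.buildCallback1 (Foreign.cInt, Foreign.cInt) (fn x => x + 1)

(* Well-typed under both the old and the new signature. *)
val () = Foreign.touchClosure cb

(* Well-typed only under the old 'a -> unit signature. With
   'a closure -> unit the compiler rejects it, because the argument is the
   ML function rather than the closure built from it:

     val () = Foreign.touchClosure (fn x => x + 1)
*)
```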
In both the old and new Foreign modules, the type `'a Foreign.closure` is abstract. Giraffe Library uses `Foreign.LowLevel.cFunctionWithAbi` to define its own function for creating a closure but there is no way to create a `'a Foreign.closure` value from a `Memory.voidStar` value. This is easily worked around by copying the type declaration and definition of `Foreign.cFunction` but I wondered if there could be a way to avoid this copying.
Can't you just use the cPointer conversion instead of cFunction?
Regards, David
On 22/02/21 15:34, David Matthews wrote:
I've pushed some changes which should have fixed most of these issues. Thanks for reporting them.
Thanks for the updates. I have been testing the Poly/ML variants (see below) and that fixes callbacks within callbacks on x86_64 provided compact32bit is disabled. When compact32bit is enabled, I find that any use of a callback seg. faults, not just nested use. The previous examples call_c_test_15 and call_c_test_16 still demonstrate this. (Observed on both Linux and macOS.)
The following cause callbacks to seg. fault:
    --enable-shared=yes --enable-compact32bit=yes --enable-intinf-as-int=no
    --enable-shared=yes --enable-compact32bit=yes --enable-intinf-as-int=yes
The following work fine:
    --enable-shared=yes --enable-compact32bit=no --enable-intinf-as-int=no
    --enable-shared=yes --enable-compact32bit=no --enable-intinf-as-int=yes
In both the old and new Foreign modules, the type `'a Foreign.closure` is abstract. Giraffe Library uses `Foreign.LowLevel.cFunctionWithAbi` to define its own function for creating a closure but there is no way to create a `'a Foreign.closure` value from a `Memory.voidStar` value. This is easily worked around by copying the type declaration and definition of `Foreign.cFunction` but I wondered if there could be a way to avoid this copying.
Can't you just use the cPointer conversion instead of cFunction?
The store of cPointer doesn't return a function that touches the pointer. Although that wouldn't matter for my current uses, where there is always a persistent reference to a closure, I would like the interface to work if there isn't a persistent reference. It's probably not worth changing anything though as we're talking about a few lines of code.
I note that the low-level interface provided by Foreign has changed in a way that is not compatible with the previous version [1]. Clearly some break in interface is unavoidable. Given this, could you confirm that the next version will be 5.8.2? (It seems reasonable not to consider the low-level interface part of the stable API.)
Regards, Phil
1. The low-level interface in Foreign has the following changes:
   - type LowLevel.ctype has become LowLevel.cType
   - type LibFFI.abi has become LowLevel.abi
   - the voidStar arguments of cFunction[WithAbi] and call[WithAbi] have a different representation
On 23/02/2021 13:00, Phil Clayton wrote:
Thanks for the updates. I have been testing the Poly/ML variants (see below) and that fixes callbacks within callbacks on x86_64 provided compact32bit is disabled. When compact32bit is enabled, I find that any use of a callback seg. faults, not just nested use. The previous examples call_c_test_15 and call_c_test_16 still demonstrate this. (Observed on both Linux and macOS.)
The following cause callbacks to seg. fault:
    --enable-shared=yes --enable-compact32bit=yes --enable-intinf-as-int=no
    --enable-shared=yes --enable-compact32bit=yes --enable-intinf-as-int=yes
Thanks for reporting that. I've pushed a fix and it looks like it works now.
I note that the low-level interface provided by Foreign has changed in a way that is not compatible with the previous version [1]. Clearly some break in interface is unavoidable. Given this, could you confirm that the next version will be 5.8.2? (It seems reasonable not to consider the low-level interface part of the stable API.)
I wanted to keep the high-level interface as stable as possible. The low-level, though, needed to change. I expect the next version will be 5.8.2.
Regards, David
On 05/03/21 13:41, David Matthews wrote:
Thanks for reporting that. I've pushed a fix and it looks like it works now.
Thanks - I have tested the fix with all variants and have not found any issues on Linux/macOS x86_64.
I note that the binaries using the compiled FFI still have a dependency on libffi. Presumably configure.ac is yet to be updated to make the checks on libffi conditional. (I rebuilt with libffi configuration removed from 'configure' as per the attached diff and there was no dependence on libffi, and the tests, at least on Linux, were still fine.)
I wanted to keep the high-level interface as stable as possible. The low-level, though, needed to change. I expect the next version will be 5.8.2.
Good to know, thanks.
Regards, Phil
On 08/03/2021 17:29, Phil Clayton wrote:
I note that the binaries using the compiled FFI still have a dependency on libffi. Presumably configure.ac is yet to be updated to make the checks on libffi conditional. (I rebuilt with libffi configuration removed from 'configure' as per the attached diff and there was no dependence on libffi, and the tests, at least on Linux, were still fine.)
Thanks for that. libffi is still needed in the interpreted version so I've pushed a change that only checks for libffi in that case.
Regards, David
On 08/03/21 19:37, David Matthews wrote:
Thanks for that. libffi is still needed in the interpreted version so I've pushed a change that only checks for libffi in that case.
That works for me as expected.
Regards, Phil