Hello Oskar, I'm copying this to the mailing list since I think it may be of wider interest.
Thank you for investigating the problem. I've looked into this and found that there is a problem with mprotect sometimes failing, and this seems to be a bug in Mac OS. I've pushed a fix, but I wasn't able to test your specific example, so could you try it and let me know whether the fix has worked?
Mac OS has introduced a restriction that prevents a thread from simultaneously having write and execute access to a page. This seems to affect ARM code but it doesn't look as though it's enabled with X86 code under Rosetta. Inevitably, being Apple, they have taken a different approach from SELinux and OpenBSD so I've had to add special code when allocating areas of memory for code in Mac OS on ARM. It looks as though mprotect on Mac OS can return an error if it is called a second time with the same arguments. That looks like a kernel bug to me. The problem is still there in the latest kernel, 20.3.0. I've worked around it by ignoring any error from mprotect since in this particular case the previous protection state is fine.
David
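The workaround described above (tolerating an mprotect failure when the requested protection is already in place) can be sketched roughly as follows. This is a minimal illustration, not the actual Poly/ML change: the names protect_tolerant and demo_repeated_mprotect are hypothetical, and the sketch assumes a POSIX system with anonymous mmap.

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANON on glibc; harmless elsewhere */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: request a protection but treat failure as non-fatal.
   Returns 0 if mprotect succeeded, otherwise the errno value it reported. */
int protect_tolerant(void *addr, size_t len, int prot)
{
    assert(addr != NULL && len > 0);
    if (mprotect(addr, len, prot) == 0)
        return 0;
    int err = errno;
    fprintf(stderr, "mprotect(%p, %zu, %d) failed: %s (ignored)\n",
            addr, len, prot, strerror(err));
    return err;
}

/* Maps one read-only page, makes it writable, then repeats the identical
   mprotect call. Only the first call has to succeed; an error from the
   repeat is ignored because the protection is already what we asked for.
   Returns 1 if the page is writable afterwards, 0 otherwise. */
int demo_repeated_mprotect(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, len, PROT_READ,
                            MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return 0;

    if (protect_tolerant(p, len, PROT_READ | PROT_WRITE) != 0) {
        munmap(p, len);
        return 0;
    }
    /* Same arguments again: this is the call reported to fail on Mac OS. */
    (void)protect_tolerant(p, len, PROT_READ | PROT_WRITE);

    p[0] = 42;                 /* the page should still be writable */
    int ok = (p[0] == 42);
    munmap(p, len);
    return ok;
}
```

On most systems the second call simply succeeds. The point of the sketch is only that its result is not treated as an allocation failure.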
On 31/10/2021 13:45, Oskar Abrahamsson wrote:
Hello David,
Here is a bug that manifests itself with a Poly/ML compiler built on/for macOS-arm64, but not when the same compiler is built using Rosetta on the same machine. Here is the output from 'poly -v' for these compilers:
    Poly/ML 5.9 Release    RTS version: Arm64-5.9 (Git version v5.8.2-324-g960de0cd)
    Poly/ML 5.9 Release    RTS version: X86_64-5.9 (Git version v5.8.2-324-g960de0cd)

Both compilers are built in the same way (./configure, then make, then make compiler) but with all calls prefixed with 'arch -x86_64' for the Rosetta compiler.
Here is the bug: polyc dies when attempting to build src/poly-mlyacc.ML from this code: https://github.com/HOL-Theorem-Prover/HOL/tree/develop/tools/mlyacc. To reproduce, go to src/ and run 'polyc poly-mlyacc.ML'. Here is the failure:

    Exception- Fail "Insufficient Memory" raised
I managed to find a failing call to mprotect which leads to the message above being shown during export, by looking at output generated from polyc when called with the --debug saving and --debug memmgr flags. Here are the last few lines of that output:
    SAVE: Allocated graveyard for permanent space, 0x138008000 size: 4340064.
    SAVE: Allocated graveyard for permanent space, 0x138430000 size: 7542328.
    SAVE: Allocated graveyard for permanent space, 0x138b68000 size: 510648.
    SAVE: Allocated graveyard for permanent space, 0x138be8000 size: 344120.
    SAVE: Copyscan default sizes: Immutable: 1048576, Mutable: 1048576, Code: 1048576, No-overwrite 4096.
    MMGR: New export immutable space 0x600001004000, size=1024k words, bottom=0x106aec000, top=0x1072ec000
    MMGR: New export immutable space: insufficient space
    SAVE: Unable to allocate export space, size: 1048576.
    Exception- Fail "Insufficient Memory" raised
    MMGR: Deleted stack space 0x600001800dc0 at 0x1092ec000 size 262144
    MMGR: Deleted stack space 0x600001804000 at 0x10559c000 size 2048
The "MMGR: New export immutable space: insufficient space" message seems to appear because space->bottom == 0 at line 441 in libpolyml/memmgr.cpp, and space->bottom == 0 because the call to mprotect at line 330 in libpolyml/osmemunix.cpp fails with EACCES.
If there is any other information that I can provide that would be more helpful, please let me know.
Oskar
On 30 Oct 2021, at 09:50, David Matthews <David.Matthews at prolingua.co.uk> wrote:
I'm intending to release the current master on github as version 5.9 in the near future. Could I ask everyone to give it a try and let me know if there are any serious bugs that need to be fixed. The main differences are the ARM64 code-generator, the new bootstrap process and position-independent code. This was described in greater detail back in May: http://lists.inf.ed.ac.uk/pipermail/polyml/2021-May/002451.html. There have also been other smaller changes and fixes.
David
_______________________________________________
polyml mailing list
polyml at inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml
Hello David,
Unfortunately the fix has caused the arm64 bootstrap to die during its 6th stage on my machine. Here is an example of this error:
    Making Lex
    Making LEX_
    Making SymbolsSig
    Created signature SymbolsSig
    Created functor LEX_
    Making Pretty
    Created structure Pretty
    Making Symbols
    Created structure Symbols
    Making Debug
    /bin/sh: line 1: 47319 Bus error: 10 ./polyimport ./bootstrap/bootstrap64.txt -I . < ./bootstrap/Stage1.sml
    make[2]: *** [polyexport.o] Error 138
    make[1]: *** [all-recursive] Error 1
    make: *** [all] Error 2
This happens after running ./configure and then make in a freshly checked out repository. It also seems to fail at different places in the 6th bootstrap stage (but always during that stage) if I run make again.
I also attempted building the compiler with arch -x86_64, and that works fine.
Oskar
On 1 Nov 2021, at 14:54, David Matthews <David.Matthews at prolingua.co.uk> wrote:
[...]
Hello Oskar, I've had another look and it seems that sometimes mprotect fails on previously unused areas leaving the memory unwritable. There doesn't seem to be any logic to it so instead the whole of the code region is allocated at the start. That could result in the poly process requiring a large swap space at the start but there doesn't seem to be any alternative. This is in commit c92c335.
David
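The idea of allocating the whole code region at the start, and then handing out pieces of it without further mprotect calls, can be sketched as a simple bump allocator. This is only an illustration under stated assumptions: CodeRegion, region_init, region_alloc and region_destroy are hypothetical names, not Poly/ML functions, and the 16-byte alignment is an arbitrary choice.

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANON on glibc; harmless elsewhere */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical descriptor for a code region reserved once at start-up. */
typedef struct {
    unsigned char *base;  /* start of the mapping */
    size_t size;          /* total bytes reserved */
    size_t used;          /* bytes handed out so far */
} CodeRegion;

/* Map the whole region read/write in one call, so that no later
   allocation depends on changing page protections.
   Returns 0 on success, -1 if the mapping could not be created. */
int region_init(CodeRegion *r, size_t size)
{
    assert(r != NULL && size > 0);
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    r->base = p;
    r->size = size;
    r->used = 0;
    return 0;
}

/* Hand out the next n bytes, rounded up to a multiple of 16.
   Returns NULL once the region is exhausted. */
void *region_alloc(CodeRegion *r, size_t n)
{
    assert(r != NULL && r->base != NULL);
    size_t need = (n + 15u) & ~(size_t)15u;
    if (need < n || need > r->size - r->used)
        return NULL;
    void *result = r->base + r->used;
    r->used += need;
    return result;
}

/* Release the whole region. */
void region_destroy(CodeRegion *r)
{
    assert(r != NULL);
    if (r->base != NULL)
        munmap(r->base, r->size);
    r->base = NULL;
    r->size = r->used = 0;
}
```

The cost is the one mentioned in the message: the full region is requested from the OS up front, whether or not it is ever filled.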
On 01/11/2021 14:22, Oskar Abrahamsson wrote:
[...]
Hello David,
With this fix, the bootstrap now completes on arm64 and I get a compiler executable. I was able to build the previously failing example (poly-mlyacc.ML from HOL4). However: the bootstrap no longer works under Rosetta:
    Making all in .
    ./polyimport ./bootstrap/bootstrap64.txt -I . < ./bootstrap/Stage1.sml
    Use: basis/build.sml
    Use: basis/InitialBasis.ML
    /bin/sh: line 1: 2010 Bus error: 10 ./polyimport ./bootstrap/bootstrap64.txt -I . < ./bootstrap/Stage1.sml

I'm also experiencing a new failure when building the HOL4 base theories, where HOL4 (or rather, its own variant of make called "Holmake") fails with a SIGSEGV. I don't know what Holmake is doing when this failure occurs; I can't get the --debug flags to show anything. Given that it's a segfault, it seems possible that this error is related, even though this is the first time I have managed to run the arm64 compiler on this code.
Oskar
On 1 Nov 2021, at 17:07, David Matthews <David.Matthews at prolingua.co.uk> wrote:
[...]
Hello Oskar,
On 01/11/2021 18:01, Oskar Abrahamsson wrote:
With this fix, the bootstrap now completes on arm64 and I get a compiler executable. I was able to build the previously failing example (poly-mlyacc.ML from HOL4). However: the bootstrap no longer works under Rosetta:
I've had a look at this and pushed a fix so that the bootstrap now builds with both native ARM64 code and X86+Rosetta. The MAP_JIT option to mmap is needed in ARM code as part of the way of getting round the write+execute problem but seems to have a different effect in X86+Rosetta. It isn't actually needed in X86 code so it's now no longer used.
I'm also experiencing a new failure when building the HOL4 base theories, where HOL4 (or rather, its own variant of make called "Holmake") fails with a SIGSEGV. I don't know what Holmake is doing when this failure occurs; I can't get the --debug flags to show anything. Given that it's a segfault, it seems possible that this error is related, even though this is the first time I have managed to run the arm64 compiler on this code.
It's quite possible there's a bug in the ARM code generator.
David
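The platform split for MAP_JIT described above might look roughly like the sketch below: the flag is used only for native ARM64 builds on Mac OS and left out everywhere else, including X86 under Rosetta. The names alloc_code_area, code_write_begin, code_write_end and demo_code_area are hypothetical. Using pthread_jit_write_protect_np to switch a thread between write and execute access is Apple's documented companion to MAP_JIT, but whether Poly/ML uses it in exactly this way is an assumption here. On other platforms the sketch maps plain read/write memory so that it stays runnable.

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANON on glibc; harmless elsewhere */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#if defined(__APPLE__) && defined(__aarch64__)
#include <pthread.h>
/* Native ARM64 on Mac OS: MAP_JIT is required for write+execute pages. */
#define CODE_MAP_FLAGS (MAP_PRIVATE | MAP_ANON | MAP_JIT)
#define CODE_PROT      (PROT_READ | PROT_WRITE | PROT_EXEC)
#else
/* X86 (including Rosetta) and other systems: no MAP_JIT. */
#define CODE_MAP_FLAGS (MAP_PRIVATE | MAP_ANON)
#define CODE_PROT      (PROT_READ | PROT_WRITE)
#endif

/* Map an area intended to hold generated code. Returns NULL on failure. */
void *alloc_code_area(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, CODE_PROT, CODE_MAP_FLAGS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Give the calling thread write access to MAP_JIT pages. */
void code_write_begin(void)
{
#if defined(__APPLE__) && defined(__aarch64__)
    pthread_jit_write_protect_np(0);
#endif
}

/* Return the calling thread to execute (non-writable) access. */
void code_write_end(void)
{
#if defined(__APPLE__) && defined(__aarch64__)
    pthread_jit_write_protect_np(1);
#endif
}

/* Writes a few bytes into a fresh code area and reads them back.
   Returns 1 on success, 0 if the area could not be mapped or verified. */
int demo_code_area(void)
{
    size_t len = (size_t)1 << 16;
    unsigned char *area = alloc_code_area(len);
    if (area == NULL)
        return 0;
    code_write_begin();
    memset(area, 0xD4, 64);
    code_write_end();
    int ok = (area[0] == 0xD4 && area[63] == 0xD4);
    munmap(area, len);
    return ok;
}
```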