On 27/02/2017 22:55, Makarius wrote:
On 24/02/17 13:26, Makarius wrote:
We have strange memory management problems with Isabelle.
(1) Poly/ML pre-5.7 requires a bit more heap space (tested for HOL, HOL-Library, HOL-Analysis), especially when running in parallel. This might explain the failure building main HOL on the small macOS machine: it probably runs into a situation where heap compaction no longer works and the ML process interrupts all threads.
To address that, I have fine-tuned the defaults for parallel proofs, leading up to Isabelle/05f1b5342298, with some hope that it might improve the situation in general: less waste of CPU and memory on small machines.
(2) The big problem is AFP/Iptables_Semantics_Examples. Here is the timing with Poly/ML 5.6 on lxbroy10, an old AMD machine that is relatively slow:
Finished Iptables_Semantics_Examples (4:35:13 elapsed time, 23:35:08 cpu time, factor 5.14)
With Poly/ML 5.7 it quickly runs into memory problems. See the included PNG images for comparison.
That session is definitely quite ambitious, not to say insane. It contains lots of invocations of the "eval" proof method, which means dynamic compilation of ML from HOL specifications, and throwing away the result.
My current guess is that it is a problem of compiled ML code that is no longer garbage-collected.
Is there a way to measure the size of this persistent ML code?
Is there a way to invoke the Poly/ML compiler in interpreted mode via some flag?
The current version does garbage collect code; it's just that it only happens at a major GC. Previously when code was part of the general heap a short-lived code cell would be garbage-collected by a minor GC. It could be that this is problem or it could be that there is a bug which is resulting in code not being properly collected.
It isn't possible to use the current byte-code interpreter with the machine-code version. The idea I had for interpretation was to interpret at the level of the code-tree, essentially one step on from the parse-tree. Each instruction of top-level code without a loop is executed only once and so it's a complete waste of resources to optimise it and generate machine code, but that's what happens at the moment. It should be fairly easy to add an interpretation step since something like that already happens for expressions involving constants.
I'm inclined to release the current version of git master as 5.7 and then look into this and other issues.
David