Apologies for my radio silence on this; got preempted by some other pressing things.
I'm still in the dark about the Data61 experiments and have no details to add there. My own pressing need for this has evaporated so I'm probably unlikely to implement this unless someone else urgently needs it. Please speak up if you do. Some further replies to Makarius inline below.
On 17 March 2016 at 09:58, Makarius <makarius at sketis.net> wrote:
Overall I don't quite see a benefit to spend a lot of run-time and complexity of the implementation to compress a few GB. Even a small SSD has already 64-128 GB.
My machine in question at the time had a 128GB SSD which was always on the brink of being full. Anything that gained me >500MB was most welcome. Whether this is a niche scenario, I do not know.
What I often see is that a 32bit poly process is unable to save a heap due to memory shortage. Adding compression in the same process address space would probably make the situation worse.
Quite possible. I do not know as I exclusively run 64-bit Poly.
Note that for the present Isabelle setup, it is trivial to add heap compression outside the poly process, operating on the already saved heap file. What remains is the problem of loading it again without decompressing into a separate file.
Yes, you're right. I would be open to this solution as well, but as I'm more proficient in C++ than Scala or ML I looked to the PolyML source first.
Anyway, what is the state of more detailed measurements of the application?
It should be also worthwhile to step back it bit, and ask why the heaps are so large. There might be some Isabelle/ML programming mishaps in the application with persistent values that refer to Context.theory instead of Context.theory_id. Standard containers like the Simplifier or Classical context take care of that, but add-ons might not.
You are almost certainly correct. L4.verified is large repository of research artefacts. Many things in there have been written in haste by single authors without code review. Some of the authors are juniors or people like myself with a comparatively shallow exposure to Isabelle, and we don't always make the best implementation choices. I expect these details will not be surprising to anyone from an academic background. It would be nice to do a bit more tuning, but there are only so many hours in the day.