[polyml] Reducing the size of on-disk saved state

31 Mar 2016


      Apologies for my radio silence on this; got preempted by some other
pressing things.
I'm still in the dark about the Data61 experiments and have no details
to add there. My own pressing need for this has evaporated so I'm
probably unlikely to implement this unless someone else urgently needs
it. Please speak up if you do. Some further replies to Makarius inline
below.
On 17 March 2016 at 09:58, Makarius <makarius at sketis.net> wrote:
...
Overall I don't quite see a benefit to spend a lot of run-time and
complexity of the implementation to compress a few GB. Even a small SSD has
already 64-128 GB.
My machine in question at the time had a 128GB SSD which was always on
the brink of being full. Anything that gained me >500MB was most
welcome. Whether this is a niche scenario, I do not know.
...
What I often see is that a 32bit poly process is unable to save a heap due
to memory shortage. Adding compression in the same process address space
would probably make the situation worse.
Quite possible. I do not know as I exclusively run 64-bit Poly.
...
Note that for the present Isabelle setup, it is trivial to add heap
compression outside the poly process, operating on the already saved heap
file. What remains is the problem of loading it again without decompressing
into a separate file.
Yes, you're right. I would be open to this solution as well, but as
I'm more proficient in C++ than Scala or ML I looked to the PolyML
source first.
...
Anyway, what is the state of more detailed measurements of the application?
It should be also worthwhile to step back it bit, and ask why the heaps are
so large. There might be some Isabelle/ML programming mishaps in the
application with persistent values that refer to Context.theory instead of
Context.theory_id. Standard containers like the Simplifier or Classical
context take care of that, but add-ons might not.
You are almost certainly correct. L4.verified is large repository of
research artefacts. Many things in there have been written in haste by
single authors without code review. Some of the authors are juniors or
people like myself with a comparatively shallow exposure to Isabelle,
and we don't always make the best implementation choices. I expect
these details will not be surprising to anyone from an academic
background. It would be nice to do a bit more tuning, but there are
only so many hours in the day.

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

[polyml] Reducing the size of on-disk saved state