Attempting to run the CakeML CI test sequence (a 2-3 day process) on any Poly/ML
version newer than 5.7 frequently results in out-of-memory errors. The probability
of any given test failing is low and seems very sensitive to environmental factors.
I have not yet been able to reproduce the failure reliably in any setting, but I have
managed to capture --debug gc --debug heapsize logs from failures (attached).
The log file is from v5.8.1 but I have seen the issue on several different HEAD
revisions over the past month.
The "Run out of store - interrupting threads" message in the middle of a block of
GC output makes me suspect a race condition, but otherwise I have little to go on
here. Any advice would be appreciated. I'll update if I find anything.
The machine has 256GB of RAM installed and I generally run tests with --maxheap 75000,
so a failure with a heap size of only 2GB is quite odd.
-s