[polyml] extremely bad performance on some mllex generated code

3 Mar 2015

      Thanks for the detailed explanation!  It's great to know that this has been
resolved.
Best wishes,
Michael
On 04/03/15 00:45, David Matthews wrote:
...
An update on this.  I've committed some changes that seem to have solved the
problem.  It's probably of general interest so I'll provide a bit of background.
It has to do with the optimisation of closures.  In general an ML function needs
to be represented by a closure, an item on the heap consisting of a pointer to
the code and copies of the free variables needed by the function.  A naive
implementation would build a closure for each function as it was declared but
it's possible to do much better than that.  If we can be sure that a function is
only ever called and not returned or assigned to a reference we don't need to
put the closure on the heap.  Poly/ML has used a couple of different techniques
but more recently it has handled these cases by building the closure on the
stack.  A stack closure looks just like a heap closure so the calling
conventions are the same.  That means it can be used both when we can identify
all the call sites of the function but also if it is passed into a function such
as map or fold.  (Actually, List.map and List.foldl/r are small enough that
they're inlined but the principle is the same).  The disadvantage is that a call
to a function with a stack-closure cannot be tail-recursive because the closure
needs to stay on the stack until the function returns.  This is what went wrong
with this example.
Another solution, which works when all the call sites of a function can be
identified but not when a function is passed as an argument, is lambda-lifting. 
This involves adding the free variables as extra arguments to the function.  The
resulting function does not need a closure.  I've implemented this for local
functions where all the call sites can be found and left the stack closures for
the other cases.  It is this change that has fixed the problem since
lambda-lifted functions can be tail-recursive.
It may be that lambda-lifted functions are more efficient for other reasons. 
Functions without closures can be code-generated using calls or jumps directly
to the function rather than requiring an indirection through the closure.  This
may fit better with the pre-fetching of modern hardware.
David
On 05/02/2015 05:00, Michael Norrish wrote:
...
(I'll attempt to attach the tgz file to this message, but if 817KB is too big
and this bounces or the attachment is dropped, I'll make it available
separately.)
If compiled with Poly/ML, the attached program performs abysmally on the
provided testcase.sml input file.  Within seconds of beginning, it attempts to
allocate more memory than my machine has (16GB), and basically brings it to its
knees.  If compiled with mlton, the testcase doesn't seem to cause any obvious
memory allocation, and finishes in less than a second.
You can also compile with Moscow ML, though I haven?t included the build
instructions in the Makefile.  The execution of the mosml executable doesn't
seem to allocate any memory either, and it terminates cleanly (after 17s).
Though slow this is still preferable to Poly/ML's behaviour.
This is with Poly/ML 5.5.2 on
...
Linux telemachus 3.2.0-75-generic #110-Ubuntu SMP Tue Dec 16 19:11:55 UTC 2014
x86_64 x86_64 x86_64 GNU/Linux
The Makefile generates two executables (if you have mlton and poly in your
PATH): pholdeptool and mholdeptool.  You can run them on the testcase, e.g.:
 pholdeptool testcase.sml

The lexer code looks to me to be tail-recursive, and the program only
accumulates a binary tree containing 15 strings.  So it really does seem as if
Poly/ML is doing something buggy.
(The program is lexing the test-case (which uncompresses to a 30MB file) in an
extremely simple way, but performing a useful analysis.)
Michael

polyml mailing list
polyml at inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml

polyml mailing list
polyml at inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

[polyml] extremely bad performance on some mllex generated code