On 31/08/2025 13:54, vqn wrote:
I have found SML/NJ's CM to be effective at avoiding recompilation. This is not surprising as it is the result of considerable research - see the references in the CM User Manual (https://smlnj.org/doc/CM/new.pdf). This makes a difference in practice: for example, adding a new function to a utilities module on which every other module depends does not cause everything to be rebuilt. Although I have not used MLKit, it appears that its incremental compilation of MLB files would behave similarly, according to the documentation (https://elsman.com/mlkit/mlbasisfiles.html#managing-compilation-and-recompil...). Out of interest, did you consider implementing cut-off incremental recompilation for MLB files, in particular as described in Elsman's paper (https://elsman.com/pdf/sepcomp_tr.pdf)?
At a glance, both CM's and MLKit's incremental recompilation seem to rely on the concept of an "exported interface", and only recompile a module when at least one of the free identifiers it depends on belongs to such an interface that has changed.
As far as I can understand, this requires being able to
1. extract free identifiers from a module;
2. compare the content of two interfaces (not just their exported identifiers);
3. link old compiled code to new code it depends on.
While (1) could probably be implemented through namespaces, I'm not sure how to go about (2) and (3), especially since I am only wrapping the compiler API, i.e. I am limited to a single '(source code * compiled env) -> fully compiled and linked code' operation.
I think that is a nice summary of what would be needed. I strongly suspect that the interface provided by the PolyML structure does not allow this to be implemented, which would be a perfectly good reason for not doing it!
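For what it is worth, the cut-off rule itself (as opposed to the linking problem) can be sketched in a few lines. This is only an illustration in Python, not anything taken from CM, MLKit or Poly/ML: the unit names, the string representation of types, and the `compile_unit` callback are all hypothetical, and it assumes the free identifiers and exported interfaces of each unit are already available, which is exactly what (1) and (2) would have to provide. It says nothing about (3).

```python
def changed_identifiers(old_exports, new_exports):
    """Names that were added, removed, or given a different type."""
    names = old_exports.keys() | new_exports.keys()
    return {n for n in names if old_exports.get(n) != new_exports.get(n)}


def rebuild(order, free_ids, old_ifaces, edited, compile_unit):
    """Walk units in dependency order, recompiling only where needed.

    order        -- unit names, providers before their users
    free_ids     -- unit -> set of (provider unit, identifier) pairs it uses
    old_ifaces   -- unit -> {identifier: type} from the previous build
    edited       -- units whose source text changed
    compile_unit -- hypothetical callback returning a unit's new interface
    """
    changed = {}                    # unit -> identifiers whose interface entry changed
    new_ifaces = dict(old_ifaces)
    recompiled = []
    for unit in order:
        uses_changed = any(ident in changed.get(provider, ())
                           for provider, ident in free_ids[unit])
        if unit in edited or uses_changed:
            new_ifaces[unit] = compile_unit(unit)
            changed[unit] = changed_identifiers(old_ifaces.get(unit, {}),
                                                new_ifaces[unit])
            recompiled.append(unit)
        # Otherwise: cut-off, the old compiled code for this unit is kept.
    return recompiled, new_ifaces


# Hypothetical project: A uses Utils.map2, B uses A.run.
ORDER = ["Utils", "A", "B"]
FREE_IDS = {"Utils": set(), "A": {("Utils", "map2")}, "B": {("A", "run")}}
OLD = {
    "Utils": {"map2": "('a * 'b -> 'c) -> 'a list * 'b list -> 'c list"},
    "A": {"run": "unit -> int"},
    "B": {"main": "unit -> unit"},
}


def fake_compiler(new_ifaces):
    """Stand-in compiler: looks up the interface a unit would export."""
    return lambda unit: new_ifaces.get(unit, OLD[unit])


# Case 1: a new function is added to Utils; nothing else is rebuilt.
added = {"Utils": {**OLD["Utils"], "fold2": "('a * 'b * 'c -> 'c) -> 'c -> 'a list * 'b list -> 'c"}}
print(rebuild(ORDER, FREE_IDS, OLD, {"Utils"}, fake_compiler(added))[0])

# Case 2: the type of map2 changes; A is rebuilt, but A's interface is
# unchanged, so the rebuild is cut off before B.
retyped = {"Utils": {"map2": "('a -> 'b -> 'c) -> 'a list -> 'b list -> 'c list"}}
print(rebuild(ORDER, FREE_IDS, OLD, {"Utils"}, fake_compiler(retyped))[0])
```

The first case is the "new function in a utilities module" scenario: only Utils is recompiled. The second shows the cut-off proper: a change propagates one step to A and then stops because A's exported interface is identical to the old one.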
Though for now the problem is more how to properly (de)serialize compiled code so that it can be reused for subsequent compilations. :)
Yes. This issue seems (sort of) related to linking names in old and new code: not for the object code itself (where types have, presumably, long since been eliminated) but for the SML types associated with certain entities in the object code. I'm not familiar with the internals of Poly/ML compilation, but I may take a closer look.
Phil