I have found SML/NJ's CM to be effective at avoiding recompilation. This is not surprising as it is the result of considerable research - see the references in the CM User Manual (https://smlnj.org/doc/CM/new.pdf). This makes a difference in practice: for example, adding a new function to a utilities module on which every other module depends does not cause everything to be rebuilt. Although I have not used MLKit, it appears that its incremental compilation of MLB files would behave similarly, according to the documentation (https://elsman.com/mlkit/mlbasisfiles.html#managing-compilation-and-recompil...). Out of interest, did you consider implementing cut-off incremental recompilation for MLB files, in particular as described in Elsman's paper (https://elsman.com/pdf/sepcomp_tr.pdf)?
At a glance, both CM's and MLKit's incremental recompilation seem to rely on the concept of an "exported interface": a module is recompiled only when at least one of its free identifiers belongs to an exported interface that has changed.
As far as I can understand, this requires being able to:
1. extract free identifiers from a module;
2. compare the content of two interfaces (not just their exported identifiers);
3. link old compiled code to new code it depends on.
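To make the rule concrete, here is a minimal Python sketch of the recompilation decision that (1) and (2) would feed into. It is only my reading of the idea, not CM's or MLKit's actual algorithm. The names `Module`, `changed_ids` and `plan` are hypothetical, an interface is modelled as a plain mapping from identifier to a type string, and step (3), linking, is not modelled at all.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    source_changed: bool        # was the module's own source edited?
    free_ids: set[str]          # identifiers it uses but does not define (step 1)
    deps: list[str]             # names of the modules it depends on
    old_iface: dict[str, str]   # exported identifier -> type, from the previous build
    new_iface: dict[str, str]   # interface a rebuild would produce (assumed known here)


def changed_ids(old: dict[str, str], new: dict[str, str]) -> set[str]:
    """Step 2: identifiers that were added, removed, or whose type differs."""
    return {k for k in old.keys() | new.keys() if old.get(k) != new.get(k)}


def plan(modules: list[Module]) -> list[str]:
    """Return the names of modules to recompile; input must be in dependency order."""
    rebuilt: list[str] = []
    changed: dict[str, set[str]] = {}  # module name -> exported ids that changed
    for m in modules:
        # Rebuild only if the source changed or a free identifier hits a changed export.
        hit = any(m.free_ids & changed.get(d, set()) for d in m.deps)
        if m.source_changed or hit:
            rebuilt.append(m.name)
            changed[m.name] = changed_ids(m.old_iface, m.new_iface)
        else:
            changed[m.name] = set()  # cut-off: nothing propagates to dependents
    return rebuilt


# Adding a function to a utilities module that every other module depends on.
split_ty = "string -> string list"
utils = Module("utils", True, set(), [],
               {"split": split_ty},
               {"split": split_ty, "trim": "string -> string"})
app = Module("app", False, {"split"}, ["utils"],
             {"run": "unit -> unit"}, {"run": "unit -> unit"})
print(plan([utils, app]))
```

In this model, adding `trim` to `utils` leaves `app` alone because `app` never mentions `trim`. Changing the type of `split` would force `app` to be rebuilt, but if `app`'s own interface came out identical, anything depending only on `app` would still be skipped, which is where the "cut-off" comes from.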
While (1) could probably be implemented through namespaces, I'm not sure how to go about (2) and (3), especially since I am only wrapping the compiler API, i.e. I am limited to a single '(source code * compiled env) -> fully compiled and linked code' operation.
Though for now the problem is more how to properly (de)serialize compiled code so that it can be reused for subsequent compilations. :)