[polyml] How to build a new backend?

24 Nov 2023


The build system uses PolyML.make rather than the usual make system.
There is a short description of it here:
https://polyml.org/documentation/Reference/PolyMLMake.html .  The
backends are built as a result of this line in
mlsource/MLCompiler/CodeTree/ml_bind:

    structure GCode         = GCode

This looks for a file called GCode.xxxx where xxxx is an extension such
as .ML or .sml .  An undocumented feature is that it first looks for
files named after the current architecture, as returned from the RTS by
PolyML.architecture() and converted to lower case.  So on the X86_64 it
finds GCode.x86_64.ML and uses that.  There are various GCode.xxxx
files for the different architectures.  To add a new architecture you
need to add a new string to the poly_dispatch_c and add the appropriate
GCode.foo.ML file.
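
A rough sketch of that lookup order, written in Python rather than ML
and not taken from the Poly/ML sources.  The function name
candidate_files, the suffix list and the exact order of the generic
candidates are assumptions made for illustration; only the idea of
trying the lower-cased architecture name first comes from the
description above.

```python
def candidate_files(structure_name, architecture, suffixes=(".ML", ".sml")):
    """Return the file names to try, architecture-specific ones first.

    `suffixes` is an assumed list; the real set of extensions that
    PolyML.make tries may differ.
    """
    arch = architecture.lower()  # e.g. "X86_64" -> "x86_64"
    # Architecture-specific candidates, e.g. GCode.x86_64.ML
    specific = [f"{structure_name}.{arch}{suffix}" for suffix in suffixes]
    # Generic candidates, e.g. GCode.ML
    generic = [f"{structure_name}{suffix}" for suffix in suffixes]
    return specific + generic


if __name__ == "__main__":
    for name in candidate_files("GCode", "X86_64"):
        print(name)
```

Under these assumptions the first name tried for GCode on the X86_64 is
GCode.x86_64.ML, matching the behaviour described above.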

A further complication is that Poly/ML does not work in the same way as
a more conventional compiler, which compiles a file and writes the
object code for that file to the file system.  Instead the compilation
process builds a data structure, much of which is executable code but
which also includes other values.  At the end of the build process the
structure is "exported" and that writes the object file.  As PolyML.make
runs, each expression is compiled and immediately evaluated, meaning
that some of the code produced by the compiler is run immediately.
Obviously if the evaluation produces a function, that function is only
evaluated when it is actually called.  This makes conventional
cross-compilation difficult or impossible.

The bootstrap process, which starts with interpreted code and ends up
with machine code, has to work around this.  It does so by first
building a version of the interpreted code that has additional
instructions in the code.  These instructions are machine instructions
on the target architecture that switch to the interpreter but are
treated as no-ops by the interpreter itself.  In this way, during the
next stage of bootstrap, machine code functions can call interpreted
code functions and vice versa.  When the bootstrap is complete all the
interpreted code is discarded.
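
The point about each expression being compiled and immediately
evaluated can be illustrated with a toy model.  This is only an analogy
in Python, with Python's built-in compile and exec standing in for the
Poly/ML compiler; the names make, trace and later are invented for the
example and nothing here reflects how PolyML.make is implemented.

```python
def make(declarations):
    """Compile and run each declaration in turn, sharing one environment."""
    env = {"trace": []}  # stands in for the data structure built up during the build
    for source in declarations:
        code = compile(source, "<declaration>", "exec")  # compile one declaration
        exec(code, env)                                   # and evaluate it straight away
    return env


env = make([
    # A top-level expression: its effect happens during the build.
    "trace.append('top-level expression evaluated')",
    # A function declaration: defining it runs now, its body does not.
    "def later():\n    trace.append('function body evaluated')",
])

if __name__ == "__main__":
    print(env["trace"])  # only the top-level expression has run so far
    env["later"]()       # the function body runs only when it is called
    print(env["trace"])
```

The environment returned by make plays the role of the structure that
would be exported at the end of a build: it holds both code (later) and
ordinary values (trace), and part of it only exists because compiled
code was run on the machine doing the build.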
David
On 23/11/2023 15:07, Andrei Formiga wrote:
...
Hi David,
Thank you for your answer. You're right - I have to understand a lot more
in order to be able to create a new backend. I may have many more
questions.
I guess the first one is: for a rebuild of the compiler (from the last
bootstrap stage, not from scratch), how does the build system find out
which files to compile, and in what order?
On Thu, Nov 23, 2023 at 4:12 AM David Matthews <
David.Matthews at prolingua.co.uk> wrote:
...
Hi Andrei,
It would be interesting to have another back-end but I really don't
think what you are suggesting is feasible.  There are currently three
back-ends: native code for the X86(32/64), native code for the ARM64 and
byte code.  The byte code is interpreted by part of the run-time system
and is used on architectures other than the X86 and ARM64 but it is also
used during the initial bootstrap on the X86 and ARM64.
Apart from a small amount of architecture-specific code, and of course
the interpreter in C++ for the byte code, all these back-ends make use
of the same run-time system support.  The run-time system is intimately
bound up with the ML part of the system.  They share a common view of
how values are represented: short integers are tagged, addresses are not
tagged, strings have a length word followed by byte data etc.  Any new
back-end has to maintain these representations.  Before you even think
about writing a new back-end you need to understand how all this works.
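
To make the shared representation concrete, here is a small Python
model of the kind of conventions listed above.  It is a sketch under
stated assumptions, not the run-time system's actual definitions: it
assumes the tag is the low-order bit, that heap addresses are
word-aligned so that bit is clear, and that a word is 8 bytes.  All
function names are invented for the example.

```python
WORD_BYTES = 8  # assumption: 64-bit words
TAG_BIT = 1     # assumption: a set low-order bit marks a short integer


def tag_int(n):
    """Encode a short integer as a tagged word."""
    return (n << 1) | TAG_BIT


def untag_int(word):
    """Recover the integer from a tagged word (arithmetic shift keeps the sign)."""
    return word >> 1


def is_short_int(word):
    """Tagged words have the low bit set; aligned addresses do not."""
    return (word & TAG_BIT) == 1


def make_string(text):
    """Lay out a string as a length word followed by the byte data."""
    data = text.encode("utf-8")
    return len(data).to_bytes(WORD_BYTES, "little") + data


if __name__ == "__main__":
    print(tag_int(5), untag_int(tag_int(5)), is_short_int(0x1000))
    print(make_string("abc"))
```

Every back-end and the run-time system have to agree on conventions of
this kind, since the run-time system inspects the same words that the
generated code creates.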
David

polyml mailing list
polyml at inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml
