Yue,
Yue Li wrote:
Right now I'm studying the Poly/ML interpreter, trying to understand the semantics of the opcodes implemented in interpret.cpp. However, I have difficulty understanding some of the opcodes:
- INSTR_non_local, in its implementation
Here what does this instruction do, and what does t[-1] hold? And what are the semantics of INSTR_non_local_l_i (where i can be 1, 2 or 3)?
In general, ML functions that refer to free (non-local) variables need closures created to hold those variables. Closures are tuples on the heap whose first word is the address of the code of the function. Because they are on the heap it is better to avoid creating them where possible, so the optimiser looks for local functions that are only ever called within the scope of their declaration and uses a static-link calling convention for them instead of the closure convention. INSTR_non_local and its short-form versions are used to access non-local variables in static-link functions by following the static chain to the appropriate level and then extracting the variable from the stack frame.
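As a rough illustration of following a static chain, here is a minimal sketch. It is not the code from interpret.cpp: the Frame layout and the names loadNonLocal and demoNonLocal are hypothetical, and real frames live on the interpreter stack rather than in separate objects. The t[-1] in the question presumably plays the role of the static link slot, but treat that as an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical frame layout (not Poly/ML's): each frame records a pointer to
// the frame of the lexically enclosing function (the static link) and its
// own local slots.
struct Frame {
    const Frame *staticLink;     // frame of the enclosing function, or nullptr
    std::vector<long> locals;    // this function's local variables
};

// Follow the static chain `levels` times, then read slot `index` from the
// frame that was reached.
long loadNonLocal(const Frame *current, unsigned levels, std::size_t index)
{
    const Frame *f = current;
    for (unsigned i = 0; i < levels; i++) {
        assert(f->staticLink != nullptr);  // chain must be long enough
        f = f->staticLink;
    }
    return f->locals.at(index);
}

// Three nested functions: inner is declared inside middle, middle inside outer.
long demoNonLocal()
{
    Frame outer{nullptr, {10, 20}};
    Frame middle{&outer, {30}};
    Frame inner{&middle, {}};
    // From inner, go up two levels and read outer's second local.
    return loadNonLocal(&inner, 2, 1);
}
```

No heap closure is built here: the inner function reaches the variable through the frames that are already on the stack, which is only safe because the function is never called outside the scope of its declaration.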
- INSTR_const_addr_Xb and INSTR_const_addr_Xw: in their
implementations they have long expressions for computing the addresses, such as:
I'm confused about what the address computed by the long expression refers to and why it is computed in this way. What are the meanings of Xb and Xw in their instruction names?
A Poly/ML code segment is an item on the heap and consists of the code as a sequence of bytes followed by a sequence of words. The words are the constants used in the code and, because they may be addresses, they have to be placed somewhere that the garbage collector can find them. Actually, on the X86, constants can now be placed within the code itself but that's not relevant in the interpreted version. The original way of accessing a constant was INSTR_const_addr, which simply has a byte offset to the particular constant required. INSTR_const_addr_Xb and INSTR_const_addr_Xw, which differ only in the way their arguments are encoded, were added during the process of porting Poly/ML from 32 bits to 64 bits. They have two arguments: a byte count which identifies the start of the constant area and a word count which identifies the particular constant within the constant area. INSTR_const_addr_Xb and INSTR_const_addr_Xw could probably be removed now.
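The two-argument addressing can be sketched as follows. This is only an illustration under stated assumptions, not the expression used in interpret.cpp: the names Word, loadConstant and demoConstant are hypothetical, and I have assumed the byte count is measured from a base pointer passed in by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical machine word; the constants after the code are this size.
using Word = std::uintptr_t;

// byteCount locates the start of the constant area relative to `base`
// (an assumption of this sketch); wordIndex selects a constant within it.
Word loadConstant(const unsigned char *base, std::size_t byteCount,
                  std::size_t wordIndex)
{
    Word w;
    // memcpy avoids any alignment assumptions about the byte buffer.
    std::memcpy(&w, base + byteCount + wordIndex * sizeof(Word), sizeof(Word));
    return w;
}

// Build a toy segment: some code bytes followed by three constant words.
Word demoConstant()
{
    const std::size_t codeBytes = 2 * sizeof(Word);   // code padded to whole words
    const Word constants[] = {111, 222, 333};
    std::vector<unsigned char> segment(codeBytes + sizeof(constants), 0);
    std::memcpy(segment.data() + codeBytes, constants, sizeof(constants));
    return loadConstant(segment.data(), codeBytes, 1); // second constant
}
```

One property of this split, as far as I can infer, is that the word index does not change when the word size does, whereas a single byte offset to the constant would.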
- INSTR_ldexc. This instruction pushes the contents of p_reg[0] onto
the stack. I read the definition of the stack object in globals.h, but still couldn't understand what the stack registers are used for.
INSTR_ldexc is only ever used at the start of an exception handler. It pushes the exception packet (exception argument) that was used in the "raise". See INSTR_raise_ex. The p_reg[n] entries are used in the other code-generators to save the values of register contents. The interpreter is really a stack machine so this is the only place that a "register" is used in the interpreted version.
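The relationship between the raise and the handler can be sketched with a toy stack machine. Everything here is hypothetical (Machine, raiseEx, ldexc, demoLdexc); in particular, that the raise takes the packet from the top of the stack is an assumption of the sketch, and the unwinding to the handler is left out entirely.

```cpp
#include <cassert>
#include <vector>

// Toy stack machine with a single "register" holding the exception packet.
// excPacket is a stand-in for p_reg[0]; the layout is not Poly/ML's.
struct Machine {
    std::vector<long> stack;
    long excPacket = 0;
};

// Raise: take the packet from the top of the stack and park it in the
// register. A real interpreter would then unwind to the handler.
void raiseEx(Machine &m)
{
    assert(!m.stack.empty());
    m.excPacket = m.stack.back();
    m.stack.pop_back();
}

// First instruction of a handler: push the saved packet back onto the stack
// so that the handler code can inspect it like any other value.
void ldexc(Machine &m)
{
    m.stack.push_back(m.excPacket);
}

long demoLdexc()
{
    Machine m;
    m.stack.push_back(42);   // the exception packet
    raiseEx(m);              // stack is now empty, packet is in the register
    ldexc(m);                // handler starts: packet is back on the stack
    return m.stack.back();
}
```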
Regards, David.