-
Notifications
You must be signed in to change notification settings - Fork 17
Object Format
This article describes the layout in memory of all ML data objects in MLWorks.
Every boxed value is 32 bits. The C type mlval is a boxed value that might refer to a heap object, depending on its tag.
Every object is 8-byte aligned, giving us 3 "primary tag" bits at the
bottom of every pointer. Boxed integers are 30 bits, with 2 zero tag
bits (accounting for primary tag values 0 and 4). Here are the tag
values (see rts\src\values.h):
| Bits | Tag name | Description |
|---|---|---|
| 0 0 0 | INTEGER0 | Even integer. |
| 0 0 1 | PAIRPTR | Pair. |
| 0 1 0 | HEADER | Header word. |
| 0 1 1 | REFPTR | Reference pointer. |
| 1 0 0 | INTEGER1 | Odd integer. |
| 1 0 1 | POINTER | Any other pointer. |
| 1 1 0 | PRIMARY6 | Unused (but even so not a pointer). |
| 1 1 1 | PRIMARY7 | Unused (but odd, so a pointer). |
Pairs (2-tuples) do not have a header word. Every other object has a header word. Note that POINTER = PAIRPTR+4, so loading from tuples is the same for header words as for other values. The ML type system means that the compiler always knows what the tag bits are when generating code to dereference a pointer, so can apply a fixed offset (which is free on most architectures).
There are a lot of useful macros for runtime C code to produce and manipulate these boxed objects:
| Macro | Meaning |
|---|---|
PRIMARY(x) |
the primary tag value of x. |
MLINT(cint) |
the boxed ML integer with the value cint. |
CINT(mlint) |
the value of the ML integer mlint. |
MLPTR(t,p) |
the pointer p with primary tag t. |
OBJECT(x) |
a C pointer to the header of the boxed object x. |
MLVALISPTR(x) |
true iff x is a pointer. |
ISORDPTR(x) |
true iff x is a POINTER or PAIRPTR. |
ISREFPTR(x) |
true iff x is a REFPTR (or a PRIMARY7 value). |
The bottom 6 bits of every header word is a "secondary tag", which
identifies the runtime type of the object, and the rest of the header
word is the length field. Here are the secondary tag values:
| Tag bits | Tag name | Description |
|---|---|---|
| 0 0 0 0 1 0 | RECORD | Tuple with length slots: may contain pointers; simple equality. |
| 0 0 1 0 1 0 | STRING | String with length bytes: contains no pointers; simple equality. |
| 0 1 0 0 1 0 | ARRAY | Updatable tuple with length slots;, pointers; ref equality. |
| 0 1 1 0 1 0 | BYTEARRAY | Updatable tuple with length bytes; no pointers, ref equality. |
| 1 0 0 0 1 0 | BACKPTR | "Back pointer" (see below). |
| 1 0 1 0 1 0 | CODE | Code object of length words; no pointers, ref equality. |
| 1 1 0 0 1 0 | HEADER50 | Unused. |
| 1 1 1 0 1 0 | WEAKARRAY | Weak array with length slots: special pointers, ref equality. |
Note that particular secondary tag bits are used to convey some information. Bit 3 indicates to the garbage collector that the object contains no fixable pointers. Bit 4 indicates that polymorphic equality on the object is determined by its address (rather than its contents). Bit 5 indicates that the object is "unusual".
The only boxed pointers permitted into the interior of an object are
"shared closures" and "back pointers". A "shared closure" object is a
tuple with code pointers in even-numbered slots (0, 2, 4, ...) and
zero words in odd-numbered slots (1, 3, 5, ...). A pointer may point
to any even-numbered slot, with primary tag POINTER. Thus it will
have a zero word where it would expect a header word.
A "back pointer" has primary tag POINTER and a header word with
secondary tag BACKPTR. The length field of the header word is
the offset (in bytes) of the back pointer header word from the
object's header word. Back pointers are used, specifically, for
pointers to code objects which form part of a "code vector" (of which
more below).
Again the runtime has a number of useful macros to produce and manipulate secondary tags:
| Macro | Meaning |
|---|---|
SECONDARY(x) |
the secondary tag of a header word x. |
LENGTH(x) |
the length field of a header wqord x. |
EQUALITY(x) |
true iff the object with header x has simple equality. |
POINTERS(x) |
true iff the object with header x contains pointers. |
FIXABLE(x) |
true if x is a header word and contains pointers. |
ISNORMAL(x) |
true iff the object with header x is ordinary. |
MAKEHEAD(t,l) |
a header word with secondary tag t and length l. |
DEREF(x) |
the value of the first word in an array. |
FOLLOWBACK(x) |
a boxed pointer to the object into which x is an internal pointer. |
TRUE_HEAD(x) |
a pointer to the first word of the object to which x points. |
OBJECT_SIZE(s,l) |
the size in bytes of an object with secondary tag s and length l. |
DATA_SIZE(s,l) |
the size in bytes of user data in such an object. |
The only floating-point numbers supported by MLWorks, of the SML type real, are 64-bit (double-precision) IEEE-754 floats. A single such value is implemented as a POINTER to a BYTEARRAY containing 4 bytes of padding followed by the 8 byte float value itself (making 16 bytes including the header). The padding is required to align the float on an 8-byte boundary, which is necessary on some processor architectures.
MLWorks also supports arrays of real (as used, for instance, in the SML Basis Library structure RealArray). These are implemented as larger BYTEARRAY values, with 4 bytes of padding and a total size of 8N+4+4 bytes.
Every ML function is represented by a closure, which is a tuple. The zeroth slot in that tuple is a code item pointer, which actually points to the machine code generated for the function. When a function is called, its boxed closure is passed in a register, and the function code accesses all values referred to in the body of the function (aside from arguments) via slots in the closure.
For shared closures (see above), there may be other code item pointers in subsequent even-numbered slots (separated by zeroes in the odd-numbered slots). Thus a pointer into the interior of this same tuple (to one of the code item pointer slots) can be used as the boxed closure for each function.
All the code generated by the compiler is allocated into "code
vectors" (secondary tag CODE). A code vector has a single boxed
slot at the beginning (immediately after the header word) pointing to
the "ancillary record", and the a number of "code items". In this
way, the code for several distinct functions which share other closure
values can be generated into a single code vector, with a shared
closure.
Each code item has a BACKPTR header, and a single word of metadata
(the "ancillary word", which is a terrible name because it's easy to
confuse with the "ancillary record"), followed by the machine code
itself. A pointer to code (for example, in a closure) points to one
of these code items, so the first instruction of the machine code is
at an offset of 3 from the code pointer (-5 for the POINTER tag,
+4 for the header word, +4 for the ancillary word).
The ancillary word has several fields. From the least-significant bits up, this table shows field sizes on each architecture:
| Name | SPARC | MIPS | x86 | Description |
|---|---|---|---|---|
| number | 10 | 10 | 10 | The index of this item in the code vector. |
| intercept | 5 | 6 | 8 | An offset to code to replace to intercept this function. |
| leaf | 1 | 1 | 1 | A single bit flag denoting a leaf function. |
| saves | 0 | 5 | 2 | The number of callee-save registers. |
| args | 0 | 0 | 3 | The number of arguments passed on the stack. |
| non_gcs | 16 | 10 | 8 | The number of non-GCable spill slots in the stack frame. |
There is a lot of bit pressure in this word, especially on x86 (where shortage of register pressure means that the values in the 'saves', 'args', and 'non_gcs' slots are often all non-zero, and variable-length instructions means that the 'intercept' slot needs to be larger). The sizes of the different fields have been carefully tweaked.
The intercept slot is an offset into a block of NOP instructions in the code, when compiled for interception. These instructions can be replaced to intercept function calls. This is used for tracing, debugging, space profiling, call-counting, etc. The units in which the intercept offset is counted are bytes on x86 and 32-bit words on SPARC and MIPS.
The ancillary vector has between one and three slots, depending on the compilation switches. Each slot is a boxed table indexed by code item number. The tables are as follows, in order:
| Table | Description |
|---|---|
| names | An ML tuple of ML strings. |
| profile records | An ML tuple of pointers to C structures used when profiling: these are not allocated on the ML heap so the GC can treat them as non-pointers. |
| intercept fns | An ML array of ML functions. Note that this gets updated, so it does actually have to be an array. |
There are C runtime macros for extracting and manipulating all this:
| Macro | Meaning |
|---|---|
CCODEANCILL(p) |
the ancillary field of the code item. |
CCODENUMBER(p) |
the code index of the code item. |
CCODENONGC(p) |
the number of non-gc values. |
CCODEARGS(p) |
the number of stack parameters. |
CCODESAVES(p) |
the number of callee saves. |
CCODELEAF(p) |
the leaf flag. |
CCODEINTERCEPT(p) |
the intercept offset |
CCODE_CAN_INTERCEPT(p) |
is the function interceptable? |
CCODENAME(p) |
the code name (an ML string) |
CCODEPROFILE(p) |
the profile record |
CCODE_SET_PROFILE(p,v) |
set the profile record |
CCODEINTERFN(p) |
the intercept function |
CCODE_SET_INTERFN(p,f) |
set the intercept function |
CCODESTART(p) |
the address of the first instruction. |
An array (primary tag REFPTR, secondary tag ARRAY) has two
words after the header word which are used by ML code and the garbage
collector to manage "remembered sets". The words are slots
(forward and back) for a doubly-linked list.
Whenever an array is created or updated, it's put on the "modified
list", a singly-linked list linked through the back slots. Each
generation has an "entry list": a doubly-linked list of arrays which
should be scanned for pointers to this or any older generation.
Note that ML references are single-element arrays, and therefore occupy 4 words: header, forward and back slots, and reference.