Cat Higher Order Instructions to Assembly
Here are my rough ideas on how assembly instructions might be generated for higher order instructions in Cat:Â
Concepts:
- A function_struct is a structure that contains executable code (whatever size is appropriate for easy stack allocation and manipulation)
- A return_register is a register reserved to hold return addresses
- RETURN is a macro for “push return_register, ret”Â
 Functions
- quote - pop a value x from the stack, push a function_struct containing the instructions “push x; return”.
- eval - pops top of stack into return_register and jumps to stack_top - sizeof(function_struct)
- [f] - creates a function called _quote_f that pushes a function_struct onto the stack that contains the following instructions “jmp f” where f is the address of the function f.
- compose -Â pops two function structs from the stack and creates a new one according to the following rules
- all “return” instructions are removed
- the remaining instructions are concatenated and placed in a new function struct which ends with a “return”
- if the total size is too small a heap allocated version is created and the function struct simply contains a “jmp” to the address of the heap allocated version.
What is left up to the imagination here is how to deal with heap allocated functions when they are no longer needed. A fair amount can be dealt with during static analysis, but in the end some kind of garbage collection scheme is going to be needed.
April 11th, 2007 at 10:04 pm
A problem I have with this scheme is that efficient compilation wants to do things like register allocation, strength reduction, etc., all that boring stuff that makes things fast and also requires a nice set of basic blocks for analysis. Basically, delaying compilation until as much information as possible is available.
CLR & JVM instructions can be viewed as encoding a series expression DAGs that can be passed whole to the compiler and be treated very similarly to a traditional back end, but these new instructions can’t be as easily, as far as I can see. They’d need two different representations:
1) preserving the old, DAG-based structure
2) creating code based on this branch of the DAG
… and either one or the other would be used whether a compose or an eval operation was invoked. Also, efficient JIT compilation, while fast, is not so fast that you’d want to do it every time you call. You’d want to cache what you can.
I don’t think it’s impossible, I know a lot of work has been done on e.g. Lisp implementations along these ideas to make them faster, but I wonder how much of a design shock that kind of approach would have on the typical JIT implementation.