Cavalry takes a very simple (perhaps naive) approach to compiling java bytecode. It analyses the bytecode for a method and then does a fixed substitution for each bytecode in that method. Whereever a java bytecode refers to the java method stack, the translated code will use the native C stack.
Cavalry is itself written in Java, and relies on an x86 assembler also written in Java.
case op.astore_3: qc("mov eax,[ebp + 0x8]"); qc("add eax,0xc"); qc("pop edx"); qc("mov [eax],edx"); break;
For instructions that cause branching, cavalry supports the creation of labels, when the code is assembled it uses back-patching to fill in the return jumps. When an instruction has to do something complex, like invoke another method, cavalry calls a special handler written in C which uses the VM to perform the necessary function.
When methods have exception handlers, this code is also translated into native code. This involves checking the type of exceptions that the handlers handle.
Currently cavarly has no knowledge of registers and doesn't try to use them optimally at all. In fact, a sequence of instructions that pushes two numbers and then adds them, will result in many changes to the stack that aren't at all necessary.
I would like to develop a compiler that will create a flow structure for the bytecode, potentially peform some optimisation on that, and then emit the result in some register-transfer-language. It would be useful if it were easy to target multiple architectures (MIPS, Alpha etc).
I would also like to allow the compiler to incorporate C code in it's workings (ie load and compile C code), so that it can do a more global optimisation of methods that need help from the VM.
Currently method invocation and backwards branches are garbage collection points. Instructions which can cause new object allocation (and hence trigger garbage collection) are also garbage collection points.
I never finished this project, one of the reasons was that I didn't have garbage collection working with a separate GC thread. That is done now (March 2001) and I will continue developing this.
The new garbage collector and new scheme works (March 2001) currently it just implements a single heap. My plan is to have another smaller heap that is used exclusively for small objects (less than 50 words or 200 bytes). When these objects age they are promoted to the other heap.
Collection would be done more often on the smaller heap, ideally small short-lived objects would be allocated there. I also hope to implement a TLH (thread local heap) system, where each thread has exclusive access to a part of that smaller heap. This will allow object allocation without locking and unlocking a mutex, which should help for the allocation of small objects.
Currently I plan to make each TLH 375 words, that may be a bit small. It all depends on how many threads are running in the system. A thread that allocates many objects should have preference over threads that sleep all the time.
The store used to store objects on disk is not optimal, it would be nice to write a high-performance store for the persistence mechanism.
As of March 2001, we are running teaseme as a normal linux user process, a linux kernel module and with a custom kernel called RJK.
Our aim is to eventually write a general purpose operating system based on Java.