An exotic flavor of coffee? A misspelling? Well, neither, actually. There's no need to keep you guessing: Kaffe is a complete Java implementation. And to top it off, its source code is freely available under the GNU General Public License (GPL). But that's just part of the story.
What does "complete" mean? Many people still think of Java as a language for Web-centric applications. That falls far short of its real meaning. Java is actually a whole, portable programming environment, providing services like file/network I/O, graphics, exception handling, memory management (with garbage collection), threading, persistency (by means of serialization), and much more. All these functions are wrapped up in portable abstractions, effectively moving the boundary of non-portable code out of the application and down into the Java implementation. As a consequence, each Java implementation resembles a kind of middleware operating system, consisting of at least three layers.
The central part of this picture is the JVM (Java Virtual Machine), but apart from performance enhancements, the focus has been shifted to the huge standard class libraries sitting on top of it.
How does Kaffe fit in? Of course, it includes a JVM, which has always had a JIT (Just in Time) compiler. It also has its own class libraries, including a Java-1.1-compatible AWT (Abstract Windowing Toolkit). The only part that is missing is a freely available Java bytecode compiler, but there are several javac alternatives to choose from (e.g., Jikes, Pizza or GJ).
Kaffe does not contain any third-party copyrights. It is a "cleanroom" implementation and was built from scratch.
Is it free or commercial?
There is also a commercial Kaffe version. Transvirtual Technologies Inc. provides the Custom Edition, based on the sources for which it holds the copyright. The two versions don't differ much with respect to functionality, with the notable exception of the Custom Edition's AWT, which does not require a native window system (like X or Win32). The Custom Edition has been ported to many non-Unix-like operating systems (e.g., DOS, VxWorks, ThreadX, SMX, QNX, LynxOS), most of which find use in embedded systems such as set-top boxes or telecommunications equipment.
How do these versions coexist? The answer is: by means of separation and controlled flow.
Both versions have their own, separate CVS repositories. Most new components (like the new JIT) are still propagated from the Custom Edition into the Desktop, where they are exposed to a much larger user community. This unveils bugs and design flaws more rapidly. The flow back into the Custom Edition is not limited to patches; it also consists of useful input from mailing list discussions. No major contribution to the Desktop is incorporated into the Custom Edition without the permission of its author. There is no "pseudo-Open Source outsourcing" -- it is a classical win-win situation.
But why do we need another "Java"?
Well, that's a bit of a misdirected question. Kaffe has been around since early 1996 (long before other JVMs had JITs, or tackled portability), and it still is the only Java implementation for a number of platforms. But there is more than just history. Besides the fact that the people involved in developing Kaffe started this project because they simply like to work on it (the fuel that drives Open Source), there are good technical reasons, as well.
Portability -- Everyone talks about the portability of Java applications, but nobody cares about the portability of Java implementations. Nobody? Well, actually, Kaffe is very aware of this. All platform-specific parts are encapsulated by means of interfaces, in order to keep them from proliferating through the whole system. The sources are organized so that platform-specific things (headers) are kept strictly apart from the portable kernel (e.g., class management), to be linked in by means of an automake-based build process (using automated tests to deduce mandatory platform characteristics). The JVM and native libraries are written in portable C. The public Desktop Edition has recently been adapted to use the libtool package (in order to sort out linkage differences). And last but not least, the amount of source code is kept as small as possible. The complete source tree of the Custom Edition (including all supported platforms) is still less than 4 MB (with about 2 MB of C code and 1.5 MB of Java code).
As a consequence, Kaffe has been ported to more operating systems than any other form of Java (from Linux to DOS) and runs on a number of different architectures (including i386, Alpha, m68k, MIPS, StrongARM, and PowerPC). Not all ports have all features (e.g., JIT, AWT libraries) or are at the same level of maturity, or have GPLed sources. But a typical port usually takes less than three months to get fully functional, with the JIT and the AWT being the most challenging pieces. Of course, one can always start with the simple interpreter and a non-graphical system, and build from there.
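The encapsulation idea described above can be sketched in a few lines. Kaffe's own platform layer is written in C, so the following Java code is only an illustration of the pattern, not Kaffe's actual code; the names PlatformPort, UnixLikePort, DosLikePort, and PortableKernel are hypothetical. The portable part is written once against an interface, and each port supplies one small implementation.

```java
// Hypothetical interface: the only place where platform differences are visible.
interface PlatformPort {
    String name();
    char pathSeparator();
}

// One small class per port; nothing outside these classes knows platform details.
final class UnixLikePort implements PlatformPort {
    public String name() { return "unix-like"; }
    public char pathSeparator() { return '/'; }
}

final class DosLikePort implements PlatformPort {
    public String name() { return "dos-like"; }
    public char pathSeparator() { return '\\'; }
}

// Portable "kernel" code: written once, against the interface only.
final class PortableKernel {
    private final PlatformPort port;

    PortableKernel(PlatformPort port) { this.port = port; }

    // Builds a class-file path without knowing which platform it runs on.
    String classFilePath(String root, String className) {
        char sep = port.pathSeparator();
        return root + sep + className.replace('.', sep) + ".class";
    }
}

final class PortabilitySketch {
    public static void main(String[] args) {
        PortableKernel kernel = new PortableKernel(new UnixLikePort());
        System.out.println(kernel.classFilePath("lib", "java.lang.Object"));
    }
}
```

Under this arrangement, adding a port means writing one more PlatformPort implementation; PortableKernel stays untouched.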
Modularity -- We have all learned that modular design is good "per se." But there aren't many systems that both qualify for and benefit from such a design the same way that a Java implementation does. Its OS-like functionality makes it natural to separate things like threading and memory management. Again, well-defined interfaces are the best way to approximate the ideal of "plug-in" JITs, garbage collectors, and other components. Of course, for efficiency reasons, some interfaces don't qualify for runtime or loadtime configuration; they need to be resolved at compile time (e.g., the thread/locking subsystem).
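As a rough illustration of the "plug-in" idea, here is a minimal Java sketch. It assumes a hypothetical Collector interface and a hypothetical ModuleRegistry that picks an implementation by name at startup; none of these names come from Kaffe. The ThreadingConfig constant stands in for a choice that is fixed when the system is built rather than when it runs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical plug-in interface for a memory-management module.
interface Collector {
    String describe();
}

final class MarkSweepCollector implements Collector {
    public String describe() { return "mark-sweep"; }
}

final class NoOpCollector implements Collector {
    public String describe() { return "no-op"; }
}

// Runtime configuration: implementations are looked up by name at startup.
final class ModuleRegistry {
    private final Map<String, Supplier<Collector>> collectors = new HashMap<>();

    void register(String key, Supplier<Collector> factory) {
        collectors.put(key, factory);
    }

    Collector select(String key) {
        Supplier<Collector> factory = collectors.get(key);
        if (factory == null) {
            throw new IllegalArgumentException("unknown collector: " + key);
        }
        return factory.get();
    }

    static ModuleRegistry withDefaults() {
        ModuleRegistry registry = new ModuleRegistry();
        registry.register("mark-sweep", MarkSweepCollector::new);
        registry.register("no-op", NoOpCollector::new);
        return registry;
    }
}

// Build-time configuration (illustrative): a constant fixed when the code is compiled.
final class ThreadingConfig {
    static final boolean KERNEL_THREADS = false;
}

final class ModularitySketch {
    public static void main(String[] args) {
        Collector gc = ModuleRegistry.withDefaults().select("mark-sweep");
        System.out.println("collector: " + gc.describe()
                + ", kernel threads: " + ThreadingConfig.KERNEL_THREADS);
    }
}
```

The registry lookup costs a map access and an indirect call, which is acceptable for a one-time choice at startup but not for something as hot as locking; that is the trade-off behind resolving some interfaces at compile time.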
The effects of suitable modularization can't be overestimated. If Java is going to be ubiquitous, there will be a huge variety of different platforms (set-top boxes, PDAs, desktops, servers) and applications (graphical, non-graphical, embedded, standalone). It does not make sense to serve all of them with the same components. A server might not need graphics, a set-top box application might not require filesystem I/O, and a multiprocessor system cannot benefit from non-kernel threads. "Scalability" is the magic word, and configurable modules are the road to success.
But there is more to it than that. An Open Source project like Kaffe has contributors all over the world. If this were a "monolithic" system, the effort to synchronize work would be difficult, tremendously boring, and would probably scare away most programmers who do this for fun. It seems to be a mandatory prerequisite for large Open Source projects to provide a sufficient level of modularization so that people can work independently of each other. In addition, being able to replace existing modules without touching the rest of the system is the best way to keep a complex system like Kaffe healthy (and up-to-date).
Besides these technical things, there are also a few significant legal reasons. Because Kaffe is not leashed by the Java trademark or license, people can publicly tweak their versions of Kaffe in directions that do not necessarily correspond to Sun's Java specifications (e.g., by supporting other languages). Even if it is not a good idea to pollute the Java standard (after all, Java is a success because it provides these standards), it is useful for many research projects (like Guaraná), new languages (like Kiev), or system integration (Java didn't start the fire -- there are standards outside of Java, too).
The main modules
We can't delve too deeply into implementation details here. (That would turn this article into a book, which would probably be outdated by the time it got published.) But we certainly can identify the main modules and their future directions. If you aren't interested in this techie snapshot, you might want to skip ahead to the next section.
Execution engines
Currently, there is a somewhat outdated JIT, a simple interpreter, and an AOT (Ahead of Time) compiler interface to the Cygnus-GCJ system. The JIT suffers from a simple translation scheme and superfluous register loads/spills. But it has a portable design (which is a great achievement for multiple architectures) and already offers speed improvements of a factor of 2-3 compared to fast interpreters.
Chasing every possible optimization (peephole, common subexpressions, etc.) is certainly the wrong idea for a JIT, because it usually doesn't make sense to spend a lot of time compiling a method that is executed only once. Global register allocation and inlining can yield good improvements and qualify as solutions to the underlying optimization problem. The new JIT, soon to be released in both Kaffe versions, will feature both of the aforementioned techniques.
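The trade-off between compilation time and execution time can be made concrete with a little arithmetic. The sketch below is not taken from Kaffe; the class JitBreakEven and the cost figures are assumptions chosen for illustration. It computes the smallest number of calls n for which n times the per-call saving exceeds the one-time compilation cost.

```java
// Illustrative cost model (all figures are abstract "time units", not measurements).
final class JitBreakEven {
    // Smallest call count n such that n * (interpreted - compiled) > compileCost.
    static long breakEvenCalls(long compileCost, long interpretedPerCall, long compiledPerCall) {
        long savedPerCall = interpretedPerCall - compiledPerCall;
        if (compileCost < 0 || savedPerCall <= 0) {
            throw new IllegalArgumentException("need compileCost >= 0 and a positive saving per call");
        }
        return compileCost / savedPerCall + 1;
    }

    // True if compiling is cheaper overall than interpreting for the given call count.
    static boolean worthCompiling(long calls, long compileCost,
                                  long interpretedPerCall, long compiledPerCall) {
        return calls >= breakEvenCalls(compileCost, interpretedPerCall, compiledPerCall);
    }

    public static void main(String[] args) {
        // Assumed figures: compiled code runs 2x faster (10 vs. 5 units per call),
        // and compiling the method costs 1000 units.
        System.out.println(breakEvenCalls(1000, 10, 5));    // prints 201
        System.out.println(worthCompiling(1, 1000, 10, 5)); // prints false
    }
}
```

With these assumed numbers, a method that runs once never recovers its compilation cost, while doubling the compilation effort for a deeper optimization pass roughly doubles the number of calls needed to break even unless the per-call saving grows as well.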
The interpreter can be accelerated, too (by means of jump optimizations). But with the advent of high-performance JITs, it is increasingly being ignored. Given the role of Java in small, embedded systems, this is not really justified, because the difference in memory consumption and disk footprint between JITs and interpreters is getting even bigger.
Using native, precompiled, heavily optimized class libraries is another very interesting approach. While the exclusive use of completely AOT-compiled libraries (native applications) seems to be too restrictive, coexisting AOT libraries (e.g., for standard classes) and JITed/interpreted (application) classes are not. In fact, this might be the way to speed up load time and relieve bottlenecks.
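One way to picture this coexistence is a dispatcher that prefers a precompiled entry and falls back to the dynamic path otherwise. The following Java sketch is purely illustrative: MixedModeRuntime and its method names are hypothetical, and the lambdas merely stand in for native code and for JITed or interpreted code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

// Hypothetical sketch: precompiled (AOT) library methods and dynamically
// handled (JITed or interpreted) application methods live side by side.
final class MixedModeRuntime {
    private final Map<String, IntUnaryOperator> aotTable = new HashMap<>();
    private final Map<String, IntUnaryOperator> dynamicTable = new HashMap<>();

    void registerAot(String method, IntUnaryOperator code) {
        aotTable.put(method, code);
    }

    void registerDynamic(String method, IntUnaryOperator code) {
        dynamicTable.put(method, code);
    }

    // Reports which path a call to the given method would take.
    String modeOf(String method) {
        if (aotTable.containsKey(method)) return "aot";
        if (dynamicTable.containsKey(method)) return "dynamic";
        return "unknown";
    }

    // Prefers the precompiled entry; falls back to the dynamic one.
    int invoke(String method, int arg) {
        IntUnaryOperator code = aotTable.get(method);
        if (code == null) code = dynamicTable.get(method);
        if (code == null) throw new IllegalArgumentException("no such method: " + method);
        return code.applyAsInt(arg);
    }
}

final class MixedModeSketch {
    public static void main(String[] args) {
        MixedModeRuntime rt = new MixedModeRuntime();
        rt.registerAot("lib.abs", x -> x < 0 ? -x : x);   // stands in for a standard class
        rt.registerDynamic("app.twice", x -> 2 * x);      // stands in for application code
        System.out.println(rt.modeOf("lib.abs") + " " + rt.invoke("lib.abs", -7));
        System.out.println(rt.modeOf("app.twice") + " " + rt.invoke("app.twice", 21));
    }
}
```

In such a scheme the standard classes pay no compilation cost at load time, while application classes keep the flexibility of being loaded and translated on demand.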