Wednesday 26 November 2008

Trace of Sun's JVM in V8



Short version:

I suspect a few files within V8 source code have Sun's JVM as their origin.


Lil' longer version:

Let me introduce our guests...

...V8 is the high-performance JavaScript engine Google developed for their web browser, Chrome. One of the reasons V8 seems to outperform other engines is that it compiles JavaScript to native machine code (currently just for x86 and ARM) instead of interpreting it. V8 is written in C++ and is distributed under the BSD license.

...Sun's Java Virtual Machine, codenamed HotSpot, which was released some time ago as part of OpenJDK. Among other things, it provides a Just-in-Time (JIT) compiler that "translates" Java bytecode fragments into machine code on the fly. HotSpot is written in C++ and is distributed under the GPL license.

A couple of days ago I was tinkering with V8's code, specifically with the x86 machine code generation, and I came across a surprising copyright notice in the file assembler-ia32.h:

// Copyright (c) 1994-2006 Sun Microsystems Inc.
// All Rights Reserved.

Yes, it is Sun's copyright notice! And it can also be found in other files: assembler-ia32.cc, assembler.cc, assembler.h, assembler-arm.h, etc.
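For anyone who wants to repeat the search, here is a minimal sketch in Python that walks a source tree and lists the files whose first few lines mention Sun. The function name `files_with_marker`, the `head_lines` cutoff and the idea of looking only at the top of each file are my own assumptions for illustration; they are not anything V8 ships.

```python
import os

# Text to look for; taken from the copyright notice quoted above.
MARKER = "Sun Microsystems"


def files_with_marker(root, marker=MARKER, head_lines=10):
    """Return sorted paths (relative to root) of files whose first
    head_lines lines contain marker.

    Hypothetical helper: only the top of each file is read, on the
    assumption that copyright headers sit at the very beginning.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on non-UTF-8 files.
            with open(path, encoding="utf-8", errors="replace") as f:
                head = [f.readline() for _ in range(head_lines)]
            if any(marker in line for line in head):
                hits.append(os.path.relpath(path, root))
    return sorted(hits)
```

Pointing `files_with_marker` at the directory of a V8 checkout that holds the assembler files should list the ones carrying the notice; the exact directory depends on where you checked the code out.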

After that copyright notice comes the BSD license, and after it, this comment:

// The original source code covered by the above license above has been
// modified significantly by Google Inc.
// Copyright 2006-2008 the V8 project authors. All rights reserved.

I immediately googled around to find out more about it, and I found someone who had already noticed it, although he didn't go any further than the discovery itself.

Intrigued, I searched for the piece of Sun software Google took the original files from, and I arrived at OpenJDK: if you carefully compare the files assembler-ia32.h from V8 and assembler_x86_32.hpp from OpenJDK, you'll notice how similar they are.
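"Similar" is an eyeball judgment here, but a rough number can be put on it. The sketch below uses Python's standard `difflib` to compare two source texts line by line; the function name `similarity` and the choice to ignore blank lines and indentation are assumptions of mine, and the score says nothing about where the code came from, only how many lines match.

```python
import difflib


def similarity(text_a, text_b):
    """Return a ratio in [0, 1] of matching lines between two texts.

    Hypothetical helper: blank lines are dropped and leading/trailing
    whitespace is stripped, so pure reformatting does not lower the score.
    """
    lines_a = [ln.strip() for ln in text_a.splitlines() if ln.strip()]
    lines_b = [ln.strip() for ln in text_b.splitlines() if ln.strip()]
    # autojunk=False disables difflib's "popular line" heuristic, which
    # can distort results on long files with many repeated lines.
    matcher = difflib.SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    return matcher.ratio()
```

To use it, read both files into strings and pass them in. Two files that share a common ancestor but were renamed and reworked may still score low, so this is a hint at best.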

But I found it strange that V8's supposedly derived code, released under the BSD license, would have been taken from Sun's JVM, which is distributed under the GPL, since Google would then be violating the license. Also, HotSpot lacks ARM support, while V8's ARM assembler files do carry Sun's copyright notice.

After some history lessons I found the actual link between V8 and Sun: Lars Bak, V8's tech lead, is a former Sun employee, where he worked on StrongTalk, Self, HotSpot and...

...Monty VM, which powers CLDC (Sun's high-performance Java ME implementation) and supports the x86 and ARM architectures. It was open sourced as part of phoneME under the GPL license.

BINGO!

Although I still have no answer to the licensing question (perhaps Sun licensed some parts of Monty VM to Google under BSD?), I think the origin of V8's Sun-copyrighted files is the Monty VM JIT compiler.

Maybe I am right, maybe not, who cares, but this "investigation" has given me interesting background on where the concepts implemented in V8 come from, and hopefully it will help me understand V8's design decisions as I dig deeper and deeper into its code :-)


UPDATE:
I found a more plausible origin for the Sun-copyrighted files: the StrongTalk VM (open sourced by Sun, now hosted at Google Code). I think this is the actual origin because:
    it contains the suspect file.
    Lars Bak was part of its development team.
    it is distributed under the BSD license.

3 comments:

  1. When I say V8 outperforms other engines, I am deliberately not mentioning TraceMonkey to avoid flamewars and the like, but I know there are benchmarks out there (check out the blogosphere for performance comparisons among various JavaScript engines) showing that TraceMonkey is considerably faster than V8 (though not yet in recursion-heavy code).

  2. It's interesting to see this cross-licensing between Sun and Google. I guess Sun's business model is such that they need more control over Java, which is why GPL is the right choice for them. Liberating this small part of the code base under BSD probably cost Google hard cash, I'd say. But Google don't care what you do with their client-side code, since in the end you will use their services anyway. They fully control these proprietary services anyway. So BSD or Apache works for Google.

    Another point you may find intriguing is that Google Code hosting has decided not to support the AGPL licence, despite this licence being approved by OSI and recently by Debian's licencing team too. http://code.google.com/p/support/issues/detail?id=1131 - if you recall AGPL is like GPL, but for services. If you run AGPL code on your server, you must also provide the code for that service. Normal GPL only requires the code if you distribute the program, but you are under no obligation if you just use the program. Not so here, if you use the program on a public-facing server, then you have to distribute the code to the users too.

    I suppose this would be very irritating for Google. Imagine if Subversion or MySQL were licenced AGPL, they just couldn't use either since they would be obliged to liberate their changes (the Subversion back end on Google Code has changes to make use of the Big Table stuff, likewise MySQL). Now without Google's proprietary Big Table these changes are worthless anyway. It's still worth noting that, for Google at least, the client is anyone's but you better not mess with their Cloud. Meh.

  3. I completely agree. Google still holds to its position of avoiding the proliferation of (in its opinion) unworthy open source licenses, despite the incident with the Mozilla Public License, which they first removed from the licenses available for Google Code projects and later re-admitted under community pressure (even though the Debian guys do not think it is free: http://lists.debian.org/debian-legal/2004/06/msg00221.html). The re-admission of the MPL also came with the inclusion of the Eclipse Public License (EPL), which is barely used by Eclipse and its ecosystem...
    Anyway, it should be no surprise that Google is a business and so will protect its interests, which could indeed be hurt if the AGPL becomes popular among open source initiatives.
