Wednesday, 1 October 2008

JVM languages summit

Last week I attended the JVM languages summit. The 70 or so conference attendees were a mixture of VM experts and language implementers. I felt myself to be amongst a smaller subgroup of the language implementers: those who, as Charlie Nutter put it, are implementing "off platform" languages on the JVM. In this group the JRuby and Jython communities were very well represented, with multiple members of both core development teams in attendance. I was there representing PHP on behalf of Project Zero.

For me the conference was a great success. It was great to meet some kindred spirits and even better to learn that many of the problems that have vexed us during this year are common to all the off-platform languages. There seem to be many areas where the various languages can help each other and I plan to participate actively on the JVM languages group.

I struggled to take notes of everything I wanted to remember, so I've already been back through the slides linked from the agenda and I'm waiting eagerly for the videos of the talks to be posted.

All of the sessions were enjoyable and interesting. Below I pick out the sessions which were particularly relevant to the work I do.

Day 1.

John Rose spoke about the Da Vinci Machine. John explained how the existing JVM features make it a great platform for building new language implementations, but that a few features needed to achieve excellent performance for some dynamic languages are missing. Foremost amongst these are Dynamic Invocation and Method Handles, which are covered by JSR292, but he is also looking at further improvements. Whilst some of these, such as tail calls and tail recursion, are mostly important for functional programming languages, others, such as lightweight bytecode loading, would be generally useful. I can certainly see how we would use Anonymous Classes for PHP. I have my fingers crossed that more of these ideas make it into JSRs so that languages can exploit them when running on standard JVM implementations.

It was good to hear during this talk that Sun regard JSR292 as being as certain to get into Java 7 as any JSR is.

Bernd Mathiske's Maxine VM research project is fascinating. He has built a JVM using Java as the implementation language. This project has the potential to allow language implementers to prototype new JVM features without knowing the internals of one of the JVM implementations which by various accounts can be a frustrating experience.

Day 2.

The talk I had been waiting for was Charlie Nutter on JRuby and he didn't disappoint. Charlie ran through, at a high level, each of the important design decisions that JRuby had made and why they were made that way. In some cases he described multiple implementations that had been tried and explained which had worked out best. PHP does not have exactly the same set of problems as JRuby, but it was interesting that JRuby had settled on a monomorphic call-site cache after trying a polymorphic inline cache (PIC), since we had been thinking about switching from monomorphic to PIC.
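To make the monomorphic versus PIC distinction concrete, here is a minimal Java sketch of a monomorphic call-site cache. It is not JRuby's code or ours: `DynClass`, `DynObject` and `MonomorphicCallSite` are hypothetical names invented for illustration, and the sketch assumes that methods are never redefined (a real runtime would have to invalidate the cache when they are).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical runtime class: a name plus a table of methods.
final class DynClass {
    final String name;
    final Map<String, Function<DynObject, Object>> methods = new HashMap<>();
    DynClass(String name) { this.name = name; }
}

// Hypothetical runtime object: it only knows its class.
final class DynObject {
    final DynClass type;
    DynObject(DynClass type) { this.type = type; }
}

// One instance per call site in the compiled program.
final class MonomorphicCallSite {
    private final String methodName;
    private DynClass cachedType;                        // the single remembered receiver type
    private Function<DynObject, Object> cachedTarget;   // the method found for that type
    int hits;
    int misses;

    MonomorphicCallSite(String methodName) { this.methodName = methodName; }

    Object call(DynObject receiver) {
        // Fast path: same receiver type as last time, so skip the lookup.
        if (receiver.type == cachedType) {
            hits++;
            return cachedTarget.apply(receiver);
        }
        // Slow path: look the method up by name and overwrite the cache.
        misses++;
        Function<DynObject, Object> target = receiver.type.methods.get(methodName);
        if (target == null) {
            throw new IllegalStateException(
                "undefined method " + methodName + " on " + receiver.type.name);
        }
        cachedType = receiver.type;
        cachedTarget = target;
        return target.apply(receiver);
    }
}

class CallSiteDemo {
    public static void main(String[] args) {
        DynClass dog = new DynClass("Dog");
        dog.methods.put("speak", self -> "woof");
        DynClass cat = new DynClass("Cat");
        cat.methods.put("speak", self -> "meow");

        MonomorphicCallSite site = new MonomorphicCallSite("speak");
        System.out.println(site.call(new DynObject(dog)));
        System.out.println(site.call(new DynObject(dog)));
        System.out.println(site.call(new DynObject(cat)));
        System.out.println("hits=" + site.hits + " misses=" + site.misses);
    }
}
```

A polymorphic inline cache would keep several (type, target) pairs per site instead of one, trading a slightly longer fast path for fewer misses at sites that see more than one receiver type.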

It was interesting too to hear how JRuby deals with the issue that seems to arise in all of these off-platform languages: strings. It turns out that the native C implementations of PHP, Ruby and Python all use an array of bytes to represent both binary objects such as a JPEG and character data such as "hello". The JVM of course has a Unicode String object and would naturally use a byte[] to represent raw binary data. Ruby, PHP and Python all have plans to move to Unicode implementations but none has completed this transition. Charlie told us that JRuby chose to take the route of representing everything as a byte[]. This means a bespoke implementation of all string handling; none of the Java standard string handling library can be used. I later found out that Jython has taken the opposite approach of using Java Unicode strings for everything, which means that a JPEG would be stored mangled into Unicode, a strategy which works so long as the encoding being used round-trips. We took a middle approach for PHP. We have a composite object which stores either a Unicode String or a byte[] depending on how the string entered the runtime. This is a design that Charlie said he was considering.
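As a rough illustration of the middle approach, here is a small Java sketch of a composite string that holds either a String or a byte[]. This is not our actual runtime class: `HybridString` and its methods are hypothetical names, and the real object has to support far more operations. The demo also shows the round-trip condition that the all-Unicode approach depends on, using ISO-8859-1 as an example of an encoding that maps every byte to a character and back.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Objects;

// Hypothetical composite string: exactly one of the two fields is non-null.
final class HybridString {
    private final String text;   // set when the value entered the runtime as characters
    private final byte[] bytes;  // set when the value entered the runtime as raw bytes

    private HybridString(String text, byte[] bytes) {
        this.text = text;
        this.bytes = bytes;
    }

    static HybridString ofText(String s) {
        return new HybridString(Objects.requireNonNull(s), null);
    }

    static HybridString ofBytes(byte[] b) {
        // Defensive copy so later changes to the caller's array do not leak in.
        return new HybridString(null, b.clone());
    }

    boolean isBinary() { return bytes != null; }

    // Convert only when a caller really needs the other representation.
    byte[] toBytes(Charset cs) {
        return bytes != null ? bytes.clone() : text.getBytes(cs);
    }

    String toText(Charset cs) {
        return text != null ? text : new String(bytes, cs);
    }
}

class HybridStringDemo {
    public static void main(String[] args) {
        byte[] jpegHeader = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};

        // Binary data stays as bytes, so it comes back out untouched.
        HybridString binary = HybridString.ofBytes(jpegHeader);
        System.out.println(Arrays.equals(binary.toBytes(StandardCharsets.UTF_8), jpegHeader));

        // Storing the same bytes in a String only works if the encoding round-trips.
        String latin1 = new String(jpegHeader, StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(latin1.getBytes(StandardCharsets.ISO_8859_1), jpegHeader));

        String utf8 = new String(jpegHeader, StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(utf8.getBytes(StandardCharsets.UTF_8), jpegHeader));
    }
}
```

The first two checks preserve the bytes; the third does not, because these bytes are not valid UTF-8 and are replaced during decoding.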

Eugene Kuleshov gave a great talk on ASM. More than half of the attendees, myself included, were using ASM so it was a popular session. Eugene mentioned that he had received multiple complaints about the absence of error checking in ASM, and that he had to point out that we should be using the error-checking CheckClassAdapter. With a red face I realised that we had fallen into the same trap. I feel some refactoring coming on....

I gave a short talk on our PHP work which seemed to go down well. I couldn't help commenting on the fact that the conference was organised using a Wiki implemented in PHP.

The other highlight for me was Frank Wierzbicki's short Jython talk. Frank could not go into as much detail as Charlie since he only had half the time. Frank explained that the Python community are very supportive of Jython. Every Python extension is implemented both in C and in Python script. This allows Jython to very easily support all the extensions. I wish the same were true of PHP. It could be. This is certainly something I'd like to discuss with the PHP community next time I'm at a PHP conference.

Day 3.

For me the highlight of day 3 was a small subgroup formed at lunchtime to discuss and better understand JSR292. I wanted to understand how Charlie's excellent blog post related to what will actually be in JSR292, as it seemed that Charlie was making use of facilities in the MLVM which are not described in JSR292. I think I now understand that JSR292 will contain InvokeDynamic and MethodHandles but does not need to contain the anonymous class loader java.dyn.AnonymousClassLoader that Charlie described. I guess that means that AnonymousClassLoader will need to move to another package.
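For readers who have not met method handles, here is a minimal sketch of looking one up and invoking it. It assumes the API as it later shipped in Java 7 under java.lang.invoke, not the java.dyn prototype package in the MLVM discussed above, and `MethodHandleSketch` is a hypothetical class name used only for this example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class MethodHandleSketch {
    // A handle to the virtual method String.length(), typed (String)int.
    private static final MethodHandle LENGTH;

    static {
        try {
            LENGTH = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static int lengthOf(String s) {
        try {
            // invokeExact requires the call to match the handle's type exactly:
            // one String argument and an int result.
            return (int) LENGTH.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthOf("hello"));
    }
}
```

A dynamic language runtime would typically hold handles like this at its call sites and swap them as receiver types change, rather than hard-coding one target as this sketch does.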


Overall this was a very enjoyable and interesting conference both for the formal presentations and for the informal open-spaces style conversations in the breaks.

Many thanks to Brian Goetz and John Rose for pulling this together. Well done.
