Thursday, October 8, 2009

ReLooper: Refactoring for Loop Parallelism

ReLooper is another refactoring tool, this one converting normal Java arrays into the new Java ParallelArray construct (JSR166y). Using this tool one can select a Java array and ask to have it converted to a ParallelArray. The tool will then perform the necessary conversion (changing types and replacing calls to new with calls to the ParallelArray factory methods). It will also analyze the loops that the array is used in and try to convert them to equivalent ParallelArray methods that perform the same operations, but in parallel.
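To make the shape of that transformation concrete, here is a sketch of my own (not the tool's output): the original indexed loop over a plain array, and a parallel version. ReLooper itself would emit ParallelArray calls (roughly `ParallelArray.createUsingHandoff(masses, pool).sum()`); since I can't assume jsr166y is on everyone's classpath, the parallel version below hand-rolls the same split-then-combine structure with a plain ExecutorService, so the method names and chunking strategy are mine, not the library's.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConvertLoop {
    // Before: a plain Java array and an ordinary indexed loop.
    static double sumMassesSequential(double[] masses) {
        double total = 0.0;
        for (int i = 0; i < masses.length; i++) {
            total += masses[i];
        }
        return total;
    }

    // After (in spirit): the work is split into chunks that run on worker
    // threads and the partial sums are combined, which is what a
    // ParallelArray sum() does for you behind the scenes.
    static double sumMassesParallel(final double[] masses, int chunks)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        int len = (masses.length + chunks - 1) / chunks;
        List<Future<Double>> parts = new ArrayList<Future<Double>>();
        for (int c = 0; c < chunks; c++) {
            final int from = c * len;
            final int to = Math.min(masses.length, from + len);
            parts.add(pool.submit(new Callable<Double>() {
                public Double call() {
                    double sub = 0.0;
                    for (int i = from; i < to; i++) sub += masses[i];
                    return sub;
                }
            }));
        }
        double total = 0.0;
        for (Future<Double> f : parts) total += f.get();
        pool.shutdown();
        return total;
    }

    public static void main(String[] args) throws Exception {
        double[] masses = {1.0, 2.0, 3.0, 4.0};
        System.out.println(sumMassesSequential(masses)); // 10.0
        System.out.println(sumMassesParallel(masses, 2)); // 10.0
    }
}
```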

In order to maintain semantic equivalence, ReLooper performs a pretty impressive analysis of the loop content to detect potential race conditions. This includes checking that no loop iteration reads or writes a variable that another loop iteration writes to (Bernstein's conditions), but it also verifies that there are no IO operations, as these can block one of the underlying worker threads, leading to poor performance or even causing the loops to become sequential again.
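Here is a minimal illustration (toy loops of my own, not taken from the paper) of the kind of dependence this analysis has to catch:

```java
import java.util.Arrays;

public class BernsteinExample {
    // Independent iterations: iteration i reads and writes only a[i], so
    // Bernstein's conditions hold and the loop is safe to parallelize.
    static double[] scaleAll(double[] a) {
        double[] out = a.clone();
        for (int i = 0; i < out.length; i++) {
            out[i] = out[i] * 2.0;
        }
        return out;
    }

    // Loop-carried dependence: iteration i reads b[i - 1], which iteration
    // i - 1 wrote. Bernstein's conditions are violated, so a tool like
    // ReLooper must refuse to parallelize this loop (or warn about it).
    static double[] prefixStep(double[] b) {
        double[] out = b.clone();
        for (int i = 1; i < out.length; i++) {
            out[i] = out[i - 1] + 1.0;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(scaleAll(new double[]{1, 2, 3, 4})));
        System.out.println(Arrays.toString(prefixStep(new double[]{1, 0, 0, 0})));
    }
}
```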

What I like best about this tool is the analysis part. In their examples only a handful of lines change (12-52), and I could fairly easily make these small changes myself, but the fact that it can guarantee that I did not introduce a bug and that the other tens or hundreds of thousands of lines still work is a big time saver. If I were to do this myself for a big and complex application, I would probably spend 30 minutes converting the original array and hours or days convincing myself that everything still works. Moreover, the fact that they get so few false warnings, despite being conservative and treating each object as one block, is impressive.

One thing that has previously come up is that an industrial-strength tool should also do splitting and merging of loops. Say, for instance, that the creation, movement, and reduction of the particles in their example were done in the same loop; this tool would then have to implement that whole loop as a reduction loop (since cm is still mutated) and would miss the opportunity to use the more general apply and replaceWithGeneratedValue methods on the first two parts of the input. However, research, like development, is incremental and as Professor Johnson nicely puts it: "When you see a flying elephant you don't complain about its style".
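To show what I mean, here is a small sequential sketch (my own example, loosely modeled on their particle loop, with the jsr166y method names only in comments) of the fused loop versus the split form whose phases each line up with a separate ParallelArray method:

```java
public class SplitLoops {
    static class Particle {
        double x;
        Particle(double x) { this.x = x; }
    }

    // Fused loop: creation, movement, and reduction share one body, so the
    // mutated accumulator cm forces the whole loop to be treated as a
    // reduction.
    static double fused(int n) {
        Particle[] ps = new Particle[n];
        double cm = 0.0;
        for (int i = 0; i < n; i++) {
            ps[i] = new Particle(i);  // creation
            ps[i].x += 1.0;           // movement
            cm += ps[i].x;            // reduction
        }
        return cm;
    }

    // Split version: each phase now maps onto its own ParallelArray
    // operation (sequential stand-ins here).
    static double split(int n) {
        Particle[] ps = new Particle[n];
        for (int i = 0; i < n; i++) ps[i] = new Particle(i); // ~ replaceWithGeneratedValue
        for (int i = 0; i < n; i++) ps[i].x += 1.0;          // ~ apply
        double cm = 0.0;
        for (int i = 0; i < n; i++) cm += ps[i].x;           // ~ reduce / sum
        return cm;
    }

    public static void main(String[] args) {
        System.out.println(fused(4) + " " + split(4)); // 10.0 10.0
    }
}
```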

Another thing that again strikes me is the fairly big amount of boilerplate that is required to use something like parallel arrays in Java. One has to find the right class to implement amongst 132 classes and extend it, thereby creating a new class representing the content of the loop body. As one guy mentioned in an SE seminar where we discussed this paper: "The number 132 is probably the best argument for lambda expressions in Java". At least Java has anonymous classes; I have seen C++ TBB code and it is far uglier than this (and I consider myself a C++ guy). However, if C++ gets lambda expressions in the next version, or if one uses Apple's closure extensions, these kinds of libraries become much less intrusive.
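To illustrate the ceremony, here is a toy map helper with a made-up operation interface written in the style of those 132 classes (the real jsr166y Ops interfaces differ; this one exists only to show the anonymous-class overhead):

```java
import java.util.Arrays;

public class BoilerplateDemo {
    // Stand-in for one of the many jsr166y operation interfaces; a real
    // caller would have to pick the matching one out of the library.
    interface DoubleOp { double op(double value); }

    // Applies the operation to every element, like a ParallelArray
    // replaceWithMapping (run sequentially here).
    static double[] map(double[] xs, DoubleOp f) {
        double[] out = new double[xs.length];
        for (int i = 0; i < xs.length; i++) out[i] = f.op(xs[i]);
        return out;
    }

    public static void main(String[] args) {
        double[] xs = {1, 2, 3};
        // The actual loop body is just "x * x"; the rest of the argument is
        // the anonymous-class ceremony that lambda expressions would remove.
        double[] squares = map(xs, new DoubleOp() {
            public double op(double x) { return x * x; }
        });
        System.out.println(Arrays.toString(squares)); // [1.0, 4.0, 9.0]
    }
}
```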

Finally, I would like to address Danny Dig's third question on this paper, which asks us to compare and contrast putting the programmer in the driver's seat as opposed to fully automating the transition to a parallel program (compilers and similar). I myself like to be in control of my code and the transformations that are applied to it. I do not trust automatic transformations that I cannot oversee and that I cannot be sure are still there after I perform a refactoring or fix a bug. Although the former is a somewhat psychological point, I believe many programmers feel this way, which makes it an important one. Furthermore, I find compiler parallelization brittle: if I were working on a program that depended on it, I would feel inclined to keep fairly extensive performance regression tests attempting to verify that the parallel loop did not suddenly become sequential because a minor change confused the compiler. I therefore much prefer the approach of this paper, as it puts me in charge, lets me see and approve the changes to my code, and updates my code to reflect these changes. The latter is important to me because it means I can read the code to see how it will be run, and do not need to read compiler manuals or logs to know whether it will be parallel or not. I also really liked JLSjr's point about the educational value of such a tool. I'm sure that after performing a few of these interactive transformations the developer would be much more comfortable with having "evil" parallelism in his code.
