Monday, September 28, 2009

Our Pattern Language

Our Pattern Language (OPL) is a pattern language developed by several parties, including Professor Johnson's group at UIUC, and hosted by Berkeley. The name is borrowed from Alexander's pattern language and is supposedly a placeholder until a permanent name is chosen. This new paper is much clearer than earlier papers and earlier versions of the OPL wiki, and the pattern language definitely seems to be maturing.

OPL is a hierarchical set of patterns divided into five groups. One of the underpinnings of OPL is the arguable point that our inability to architect parallel software stems from our inability to architect software in general. The first group therefore contains general architecture patterns from which the parallel implementation should flow. The next two groups contain the computational problems in parallel computing and algorithms to implement parallelism. The lower groups contain programming models and primitives that are necessary to implement these algorithms.

Another driving idea behind OPL is that research in parallel computing should be driven by the applications that will actually be written. The language therefore includes the seven dwarfs of supercomputing and the rest of the 13 more general parallel motifs, recast as Computational patterns. These patterns break a little with the common way of framing a pattern, namely as a solution to a recurring problem in a context. Instead they are broad categories of solutions to a large number of possible problems. One way around this that I think works quite well is to treat each of them as a pattern language of its own, consisting of a number of more concrete solutions to problems encountered in that domain. One can then discuss the forces that are relevant in the domain at large before presenting the common patterns and the tradeoffs in selecting one or the other. This is the approach taken in the paper "N-Body Pattern Language" by Danny Dig, Professor Johnson and Professor Snir.

One point about the computational patterns that I found enlightening is that they do not cover all problems solved with parallel programming. Many projects use threads or multiple processes for other reasons, such as keeping long-running work off the GUI thread or waiting for and collecting hardware interrupts. Other examples mentioned in this course are Professor King's web browser, where parallelism is mainly for security, and Erlang/Guardian, where parallelism was originally mostly for reliability. However, the computational patterns are the problems that need an almost inexhaustible supply of processors and where we can always efficiently use more processors by increasing the problem size. The other problems mentioned above are inherently constrained in how much parallelism one can expose; the computational patterns are not. As such, if we can find just a few killer desktop applications that really need to solve one of these problems, then we have a reason to continue making chips with more processors. Otherwise, as Professor Patterson puts it, we are on the road to commoditization and consumers will (rightly) demand cheaper instead of more powerful hardware. After all, we don't really want to spend 97 out of 100 cores detecting viruses, and there is only so much embarrassingly parallel concurrency available through the user running different programs.
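The claim that these problems can always use more processors if the problem size grows can be made concrete with two standard speedup models. This is only an illustrative sketch: the paper and this post do not name either model, and the function names and the 5% serial fraction are assumptions chosen for the example. Amdahl's law holds the problem size fixed, while Gustafson's law lets the parallel part of the work grow with the processor count.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Fixed problem size: the serial part caps the speedup.

    serial_fraction is the share of the single-processor run time
    that cannot be parallelized (an assumed input, not a measurement).
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def gustafson_speedup(serial_fraction: float, processors: int) -> float:
    """Scaled problem size: the parallel work grows with the machine.

    Here serial_fraction is the share of the run time on the parallel
    machine that is spent in serial code.
    """
    return processors - serial_fraction * (processors - 1)


if __name__ == "__main__":
    # Illustrative numbers only: 5% serial work, growing core counts.
    for cores in (4, 16, 100):
        print(cores, amdahl_speedup(0.05, cores), gustafson_speedup(0.05, cores))
```

With a fixed problem size the speedup levels off well below the core count, whereas in the scaled model it keeps growing almost linearly, which is one way to read the argument that only size-scalable problems justify ever more cores.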

The paper identifies four classes of developers and presents a vision in which software is architected so that each kind of developer only needs to concern themselves with the problems at their own layer. One example of this separation of concerns is that an application developer (the largest group of programmers by far) should only need to be "weakly aware of parallel programming". This is a nice vision, and it is the way things currently work in domains such as enterprise systems, where parallelism is ubiquitous but most programmers do not spend time thinking about it. If we could achieve this for all thirteen computational patterns and then find killer applications that need these types of computations, then the problem would be largely solved and most developers would not even need much re-education!

However, I do think this will be harder than it was with web servers. The difference this time around is that the problems have much more complex communication patterns. With web servers, each client can be served in parallel in its own session. The sessions are for the most part independent of all other sessions and only communicate with them through a database management system (which a great deal of effort has gone into making parallel). Most of the computational patterns, on the other hand, have more complex and intrusive communication patterns, making it harder to write frameworks that are applicable to many applications. But then again, frameworks don't really need to be that general.
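The contrast can be sketched in a few lines. Everything here is hypothetical and chosen only for illustration: `handle_session` stands in for an independent web request, and `smooth_step` is a toy one-dimensional averaging step standing in for a computational pattern where each piece of work needs data from its neighbours. Threads are used only to show the structure; in CPython they will not give a real speedup for this kind of work.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_session(request: str) -> str:
    # Hypothetical handler: the result depends only on this request.
    return request.upper()


def serve_all(requests, workers=4):
    # Independent sessions: a plain parallel map, no coordination needed.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_session, requests))


def smooth_chunk(chunk, left_halo, right_halo):
    # Each output cell averages itself and its two neighbours, so a chunk
    # cannot be processed without one value from each adjacent chunk.
    padded = [left_halo] + chunk + [right_halo]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
            for i in range(1, len(padded) - 1)]


def smooth_step(values, workers=4):
    n = len(values)
    size = max(1, -(-n // workers))  # ceiling division
    bounds = [(start, min(start + size, n)) for start in range(0, n, size)]

    def task(bound):
        start, end = bound
        # The "communication": fetch boundary values owned by neighbours.
        # At the ends of the array the edge value is simply repeated.
        left = values[start - 1] if start > 0 else values[start]
        right = values[end] if end < n else values[end - 1]
        return smooth_chunk(values[start:end], left, right)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(task, bounds))
    return [x for part in parts for x in part]
```

In `serve_all` the framework never has to know what a request contains. In `smooth_step` the framework has to know the shape of the data dependencies in order to hand each worker its boundary values, and that knowledge differs from one computational pattern to the next.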
