Thursday, October 22, 2009

Some thoughts on Erlang for Parallel Computing

I like the Erlang model and appreciate the impressive results it has achieved in its domain, so I have been thinking about how it would work for other problems, like the computational patterns. Applications in which several code paths are executed at the same time can be split into two groups.

First, you have those applications in which the concurrency is part of the problem specification. Two common examples are web servers and switching systems, where there are multiple concurrent users. A third example is GUI event loops, where the events are naturally concurrent, and a fourth is web browsers, where the many tabs naturally operate concurrently. In these systems the simultaneous execution is often referred to as concurrency. Furthermore, introducing simultaneous computation (actual concurrency) into these systems often does not make them much more difficult to understand, and can sometimes even make them easier to understand. Just look at web development, where developers routinely write code that executes concurrently without giving it a second thought. In these systems the introduction of concurrency can bring many benefits, such as performance, scalability, reliability, stability, security, modularity, or even maintainability and simplicity. The key point is that the concurrency is a natural part of the problem specification and does not need to be uncovered.
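The web-server case above can be sketched as one lightweight task per concurrent user, mapping the concurrency of the code directly onto the concurrency of the problem. A minimal sketch in Go (the post itself contains no code, so the language choice and the handleRequest name are mine, not from any particular framework):

```go
package main

import (
	"fmt"
	"sync"
)

// handleRequest is a hypothetical per-user handler; in a real web
// server it would parse a request and produce a response.
func handleRequest(user int) string {
	return fmt.Sprintf("response for user %d", user)
}

func main() {
	const users = 5
	responses := make([]string, users)
	var wg sync.WaitGroup
	// One unit of concurrency per concurrent user: the structure of
	// the code mirrors the problem specification directly.
	for i := 0; i < users; i++ {
		wg.Add(1)
		go func(user int) {
			defer wg.Done()
			responses[user] = handleRequest(user)
		}(i)
	}
	wg.Wait()
	for _, r := range responses {
		fmt.Println(r)
	}
}
```

Each goroutine writes only its own slot of the slice, so no locking is needed beyond the WaitGroup.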

Then there are those problems where concurrency is not part of the problem specification, but where there is still concurrency in the domain that can be exposed and used to enhance some quality attribute (as I explained in an earlier blog post, I prefer this term to "non-functional requirement"). The quality attributes in question are most often performance and scalability, but can also be reliability, stability and security. In these systems the simultaneous execution is usually referred to as parallelism. Examples are solutions to the thirteen dwarfs/motifs/computational patterns, where parallelism is usually exploited for performance and scalability, or Google Chrome, which spawns off separate processes, independent of the tab processes, to render certain risky parts of HTML in order to provide increased security. Since there is no natural concurrency in the problem specification of these problems, we have to find ways to restructure the problem to expose it. This is hard and usually leads to units of parallel execution that are closely coupled and that need to communicate often and in non-trivial ways.

Erlang and "Concurrency Oriented Programming" focus on the first set of problems, where the problems are inherently concurrent and programmers are therefore encouraged to map the concurrency of the solution 1:1 with the concurrency of the problem (people in the world, or transactions in a switch). The challenges are then to provide a system where processes are cheap and to provide separation between the different entities, so that problems in one do not spread to the others. Erlang solves these problems nicely through structured, protocol-driven message passing and lightweight (green) processes.
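The flavor of that style can be approximated outside Erlang as well. Below is a sketch in Go, where a goroutine stands in for a green process and a channel for its mailbox; the msg protocol and counter process are my own invented example, not anything from Erlang's libraries. In real Erlang the counter would be a receive loop and the reply channel would be the sender's pid included in the message:

```go
package main

import "fmt"

// msg is a protocol message: a payload plus a reply channel, much as
// an Erlang message would carry the sender's pid for the answer.
type msg struct {
	n     int
	reply chan int
}

// counter is an isolated "process": it owns its state (total), and
// other goroutines can only affect that state by sending messages.
func counter(mailbox <-chan msg) {
	total := 0
	for m := range mailbox {
		total += m.n
		m.reply <- total
	}
}

func main() {
	mailbox := make(chan msg)
	go counter(mailbox)

	reply := make(chan int)
	for i := 1; i <= 3; i++ {
		mailbox <- msg{n: i, reply: reply}
		fmt.Println(<-reply) // running totals: 1, 3, 6
	}
	close(mailbox)
}
```

Because the state is confined to one goroutine, there is nothing to lock; the message protocol is the entire interface.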

The second problem area is harder to solve, and it is this area that the OPL effort focuses on the most. Erlang could probably be useful for some of these problems, but for problems like the computational patterns I have more reservations. Setting aside, for a moment, the fact that these are problems in which performance is usually the goal, which raises questions about the efficiency of a high-level language like Erlang, especially for supercomputing, I do not think the Erlang model is the ideal fit. The programming model of Erlang is one where independent processes work independently and cooperate by explicit message passing. Admittedly, many of the models we use in supercomputing today, such as MPI and Charm++, share these characteristics, but they are often seen as complicated for programmers to use.

As I mentioned above, Armstrong argues for mapping the concurrency of our problem 1:1 with the concurrency of the world. I agree with this, which leads me to the conclusion that it would be beneficial if we could use a sequential programming model for the problems that are, or at least that we think of as, sequential. However, running them sequentially is not an option. We want to simulate tomorrow's weather in hours, not days, so in the absence of increased sequential speed we must parallelize the computation. A "mostly" sequential programming model can still be parallelized if the programmer gives the compiler enough information, through parallel loops and similar constructs. These preserve most of the sequential reasoning but give us parallel execution. Just as we can map concurrent problems onto one or a few processors through time sharing, we can map sequential programs onto several processors in parallel. One example of such a programming model is OpenMP, which is popular on shared-memory systems but not very suitable for distributed systems, largely because it assumes shared memory. However, this is an active area of research, and one can envision sequential programming models that expose primitives letting the programmer specify data locality, giving the compiler enough information to distribute the computations satisfactorily.
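The parallel-loop idea can be made concrete with a sketch. OpenMP itself would just be a pragma on a C loop; here I imitate the effect in Go by splitting the iteration space into contiguous chunks, one per worker. The parallelDouble helper is my own illustrative name, not an OpenMP or Go API:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parallelDouble doubles every element of data. The loop body still
// reads sequentially; only the iteration space is divided among
// workers -- roughly what "#pragma omp parallel for" arranges in C.
func parallelDouble(data []float64, workers int) {
	n := len(data)
	chunk := (n + workers - 1) / workers
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		lo, hi := w*chunk, (w+1)*chunk
		if hi > n {
			hi = n
		}
		if lo >= hi {
			continue
		}
		wg.Add(1)
		go func(lo, hi int) {
			defer wg.Done()
			for i := lo; i < hi; i++ {
				data[i] *= 2 // the sequential loop body
			}
		}(lo, hi)
	}
	wg.Wait()
}

func main() {
	data := make([]float64, 1000)
	for i := range data {
		data[i] = float64(i)
	}
	parallelDouble(data, runtime.NumCPU())
	fmt.Println(data[0], data[1], data[999]) // 0 2 1998
}
```

The chunks are disjoint, so the iterations are independent and no synchronization is needed beyond the final wait, which is exactly the property a compiler must establish before it can parallelize a loop like this automatically.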
