Thursday, October 15, 2009

Pipes & Filters, Layered Systems and Iterative Refinement

One of the groups in OPL consists of the structural patterns. These are not intended to document parallelism; rather, they serve to give a good architectural base from which the parallelism can be exploited through the use of patterns at the lower layers. In the OPL authors' own words, our inability to architect parallel programs stems from our inability to architect programs in general. Following this train of thought, the first problem cannot be solved without also solving the second.

This week we have read three of the structural patterns, namely Layered Systems, Pipe-and-Filter and Iterative Refinement. The first of these, the Layered Systems pattern, looked a bit like a draft, and I thought it was lacking in explanation. I liked the POSA version better, as it did a better job of explaining the concept and the problems faced when trying to apply it. This one read more like a laundry list of things that are important when designing layered systems; I do not disagree that these are important points, but I do not think I would have gotten most of it if I did not already know the pattern.

The second pattern, on Pipe-and-Filter systems, was better, but I am wondering why they don't provide a diagram of a pipeline or similar for illustration. I also missed a discussion of problems like error handling in pipelines. Should we handle errors on the side, thus introducing extra communication, or should we flush the pipeline using poison pills when we detect an error? Or perhaps we should ignore them, as one does in most media application pipelines?
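To make the poison-pill option concrete, here is a minimal sketch of my own (the stage functions and queue wiring are made up for illustration, not taken from the pattern text): a two-filter pipeline connected by queues, flushed by passing a sentinel downstream.

```python
import queue
import threading

# A sentinel used to flush the pipeline: each filter forwards it
# downstream and then terminates.
POISON = object()

def filter_stage(inbox, outbox, fn):
    """Read items from inbox, apply fn, and forward the result."""
    while True:
        item = inbox.get()
        if item is POISON:
            outbox.put(POISON)  # propagate the pill, then shut down
            return
        outbox.put(fn(item))

# Two filters: add one, then double.
q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
threading.Thread(target=filter_stage, args=(q1, q2, lambda x: x + 1)).start()
threading.Thread(target=filter_stage, args=(q2, q3, lambda x: x * 2)).start()

for item in [1, 2, 3]:
    q1.put(item)
q1.put(POISON)  # on end of input (or a detected error) flush the pipeline

results = []
while (item := q3.get()) is not POISON:
    results.append(item)
# results == [4, 6, 8]
```

The nice property of the pill is that it needs no side channel: it rides through the same queues as the data, so every filter shuts down only after draining everything queued ahead of it.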

They also mention in the forces section that the number of filters should be small to reduce pipeline fill/drain time. Unless they mean pipeline latency (the time it takes for one package to reach the other end, which is higher if it has to be moved between more stages), I fail to see the reason. Pipeline fill time is usually seen as a problem because for an N-stage pipeline you have to process N packages before reaching full parallelism (N), in contrast to task parallelism, where one can exploit all the parallelism immediately. However, a pipeline with N+M stages also needs only N steps to reach a parallelism of N, and the fact that it can reach a greater degree of parallelism given more time does not really change that. Also, say you have a pipeline of 6 stages with a processing time of 1 each, and you merge these into 3 stages with a processing time of 2 each. Admittedly you now need only 5 time units, as opposed to 6, to fill the whole pipeline, but the first case needs only 3 time units to occupy 3 processors while the merged case needs 5. The problem with pipeline fill time is that you have to wait for the pipeline to fill up before you get parallelism, and I cannot see how that problem is solved by reducing the number of stages. Is there a fault in my argument here?
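To check the numbers, here is a small sketch (my own formulation, not from the pattern text) that brute-forces a synchronous pipeline: packet i occupies stage s during the time interval ((i+s)·stage_time, (i+s+1)·stage_time], and we look for the first time unit during which k stages are busy at once.

```python
def first_time_with_k_busy(num_stages, stage_time, k, num_packets=32):
    """First time unit during which at least k stages of a synchronous
    pipeline are simultaneously busy.  Packet i occupies stage s during
    the interval ((i + s) * stage_time, (i + s + 1) * stage_time]."""
    for t in range(1, (num_packets + num_stages) * stage_time + 1):
        busy = sum(
            any((i + s) * stage_time < t <= (i + s + 1) * stage_time
                for i in range(num_packets))
            for s in range(num_stages)
        )
        if busy >= k:
            return t
    return None

# 6 stages of cost 1: 3 processors busy after 3 time units, all 6 after 6.
print(first_time_with_k_busy(6, 1, 3))  # 3
print(first_time_with_k_busy(6, 1, 6))  # 6
# Merged into 3 stages of cost 2: the whole pipeline fills sooner (5 < 6),
# but reaching a parallelism of 3 now takes 5 time units instead of 3.
print(first_time_with_k_busy(3, 2, 3))  # 5
```

This matches the argument above: merging stages shortens the time to *full* occupancy only because there is less pipeline to fill, while the time to reach any given level of parallelism gets worse.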

The last pattern discussed in this blog post is Iterative Refinement. If I understood it correctly, this is a fairly common pattern that goes under a few other names, such as BSP, Loosely Synchronous Computations and Compute/Communicate. It was presented as a general programming model by Leslie Valiant (their reference), and it has been argued that it is to message passing what structured programming is to goto statements. In "Go To Statement Considered Harmful", Dijkstra argues in favor of structured loops instead of goto statements because they allow you to pinpoint and reason about a point in a program's execution based on its position in the source code, the call stack and a set of loop indices; with goto statements, in contrast, you have to consider the whole previous execution trace of the function. With BSP programs you can likewise pinpoint a point in the program's execution using the same information as in structured programming plus the iteration indices, giving you the same benefits. In addition, the pattern separates communication from computation, which greatly reduces program complexity.

One common use of this pattern is in grid computations (image filtering, weather calculations), where the computation is set up as a sequence of steps in which different parts of the grid are computed on different processors independently of each other. Each step is then followed by a communication phase involving border exchanges before a new step is started.
