Parallel Programming Concepts

7. Costs

By this point, I hope you will have gotten the joint message that:

  1. Parallel processing can be extremely useful, but...
  2. ... TANSTAAFL (There Ain't No Such Thing As A Free Lunch)

That is, you can get great rewards from parallelizing, but you'll likely sweat blood getting there. That's not always the case, but it's better to assume it will be, and be pleasantly surprised when things go quickly and smoothly, than to expect smooth sailing and end up mired to your neck in problems.

Here are some of the more significant ways that you can expect to spend time and encounter problems:

  • Programmer's time

    As the programmer, your time is largely going to be spent doing the following:

    • Analyzing code for parallelism

      The more significant parallelism you can find, not simply in the existing code, but even more importantly in the overall task that the code is intended to address, the more speedup you can expect to obtain for your efforts.

    • Recoding

      Having discovered the places where you think parallelism will give results, you now have to put it in. This can be a very time-consuming process.
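
    To put rough numbers on why the amount of parallelism you find matters so much, here is a minimal sketch based on Amdahl's law. The function name and the sample fractions are illustrative assumptions, and the model is idealized: it ignores every overhead discussed later in this section.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Ideal speedup when only `parallel_fraction` of the serial run time
    can be spread across `processors`; the rest stays serial."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# Hypothetical cases: how much of the run time is parallelizable?
for fraction in (0.50, 0.90, 0.99):
    speedup = amdahl_speedup(fraction, 16)
    print(f"{fraction:.0%} parallel on 16 processors: {speedup:.2f}x")
```

    Under these assumptions, a code that is only half parallel cannot even double its speed on 16 processors, which is why time spent analyzing the overall task tends to pay off.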

  • Complicated debugging

    One of the nice things about serial code is that, in the end, only one instruction is being executed at a time. If you had to, you could get an instruction-level dump of the whole thing and stand a good chance of finding that last, elusive bug. Debugging a parallel application is at least an order of magnitude more infuriating: not only do you have multiple instruction streams running around doing things at the same time, you've also got information flowing among them, again all at the same time. Who knows what's causing the errors you're seeing?

    It really is that bad. Trust me.

    Do whatever you can to avoid having to debug parallel code:

    • consider a career change;
    • hire someone else to do it;
    • or write the best, self-debugging, modular and error-correcting code you possibly can, the first time.

    If you decide to stick with it, and follow the advice in that last point, you'll find that the time you put into writing good, well-designed code has a tremendous impact on how quickly you get it running correctly. Pay the price up front.

  • Loss of portability of code

    Serial code is serial code; sure, different serial machines have different dialects of your favorite serial language, but standards committees and software developers' need for portability are forcing a very welcome commonality in most of them.

    But, when you convert your code to parallel, there ain't no goin' back -- what you end up with (assuming you are not using parallel constructs hidden inside comments) will never again run on a serial machine. Well, that's not entirely true in all cases, but it's better to expect that you will have to support completely different packages: one for serial environments, and one for each kind of parallel environment you want your application to run on.

    Things are getting a little brighter on this score, however: there are standards efforts underway (at least for Fortran) to ensure portability of parallel programs. For example, High Performance Fortran (HPF) runs on a variety of platforms, and the entire MPI effort was aimed at providing message-passing portability on top of a wide range of underlying transport environments.

  • Total CPU time greater with parallel

    A good job of parallelization will end up reducing the wall-clock time you spend waiting for your application to finish; however, the bill you get back from your service center is not likely to be based on time as measured by your trusty Timex ... if it were, they wouldn't stay in business very long. No, they're going to add up the CPU time you racked up over all of the processors you used (after all, if you used them, no one else could, right?), and bill you for that.

    Even on a per-CPU basis, you're going to see that a parallel task runs up a higher bill than the equivalent serial one; as previously explained, this increase is due to the additional instructions and time required to:

    • Initialize and terminate tasks
    • Communicate among tasks
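
    The gap between wall-clock time and billed CPU time can be sketched with a toy model. Everything here is an assumption made for illustration: the function name, the sample numbers, the perfectly even division of work, and the fixed per-task overheads for startup/shutdown and communication.

```python
def parallel_cost(serial_seconds, tasks, startup_seconds, comm_seconds):
    """Return (wall_clock, total_cpu) in seconds for an idealized run.

    Assumes the serial work divides evenly across the tasks and that each
    task also pays a fixed cost to start/stop and to communicate.
    """
    per_task = serial_seconds / tasks + startup_seconds + comm_seconds
    wall_clock = per_task          # all tasks run side by side
    total_cpu = per_task * tasks   # summed over every processor used
    return wall_clock, total_cpu


# Hypothetical job: one hour of serial work spread over 8 tasks.
wall, cpu = parallel_cost(serial_seconds=3600, tasks=8,
                          startup_seconds=5, comm_seconds=40)
print(f"wall-clock: {wall:.0f} s, billed CPU: {cpu:.0f} s")
# -> wall-clock: 495 s, billed CPU: 3960 s
```

    In this made-up case you wait 495 seconds instead of 3600, but the CPU time summed over all processors rises from 3600 to 3960 seconds.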

  • Replication of code and data requires more memory

    Your service center bill may take into consideration things other than CPU cycles, such as how much disk and main memory you use. A serial task uses a fixed amount of memory. A skillfully written parallel one will distribute most of it across the processors, but there will always be some values that are replicated in all tasks and some buffers used for communication that will make the total memory used by a set of parallel tasks greater than what the serial task had used.
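
    The same kind of back-of-the-envelope arithmetic applies to memory. In this sketch the function name, the parameter names, and the sample sizes are hypothetical, and it assumes the non-replicated data splits evenly across the tasks.

```python
def total_parallel_memory(serial_mb, replicated_mb, buffer_mb, tasks):
    """Estimate total memory (MB) used by a set of parallel tasks.

    serial_mb     -- memory used by the serial version
    replicated_mb -- part of serial_mb that every task must hold a copy of
    buffer_mb     -- communication buffers each task allocates
    """
    distributed_mb = serial_mb - replicated_mb
    per_task_mb = distributed_mb / tasks + replicated_mb + buffer_mb
    return per_task_mb * tasks


# Hypothetical job: 1000 MB serial, 50 MB replicated, 10 MB buffers per task.
print(total_parallel_memory(serial_mb=1000, replicated_mb=50,
                            buffer_mb=10, tasks=8))
# -> 1430.0
```

    Each task needs far less than the serial 1000 MB, yet the total across all eight tasks comes to 1430 MB under these assumptions.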

  • Other users might wait longer for their work

    The extra CPU time, disk space, and memory that your parallel application requires will not be available to other users of the system while you are using them. Show consideration for your fellow parallelizers -- use only the resources you actually need, and only for as long as you actually need them.