Programming and Webmasters forum
HomeSearchRecent PostsLoginRegister Contact Us
 

Username  
Password
Pages: [1]   Go Down
  Email this topic  |  Print
0 Members and 2 Guests are viewing this topic.

Parallel Programming

 
webmaster forum
polas  Offline
*
 
Hacker
Gender: Male
Posts: 1172
Topics: 77
WWW
January 16, 2009, 07:49:58 AM


Parallel computing is the use of multiple processors in order to solve a specific task. For quite a few decades now parallelism has been used in the domain of High Performance Computing (HPC) where large, difficult problems are split up into pieces which are solved and then recombined to form the answer. Popular issues such as weather prediction and cosmological simulation have been greatly helped by this form of computing. However, nowadays there is an extra dimension to the whole issue - with multiple cores per processor common place (and only expected to increase) it is important that programmers stop writing sequential code and start writing parallel code - otherwise we will have processors with 64 or 128 cores which are very much underused. Many programmers consider parallel programming "hard"..... and quite rightly so.

In this article I will explain some of the elementary concepts of parallel computing and point the reader to further points of information.

Problem Decomposition

When we wish to solve a problem via parallel computing we need to decide how to decompose (or divide) it up most effectively. There are two broad ways this can be done.

Task Parallelism - Here a problem is divided into distinct tasks, with each processor running the task and then recombining the results at the end. Lets imagine you have a fence to paint - there are two jobs (firstly preparing the fence and secondly painting it) and there are two of you to do it. In task parallelism one person would prepare the fence whilst the other person would paint it. A bit more computer orientated, lets imagine a typical desktop computer with a dual core processor (two processors.) The user might wish to have a word processor open and also be playing on a game at the same time.... in task parallelism one CPU might run the word processor whilst the other would run the game.

Data Parallelism - Here each processor does exactly the same task, its just that the data they work on is different. Lets go back to the fence example - here both would prepare the fence and then paint it, however one would work on one side of the fence and the other would work on the other side of the fence and aim to meet in the middle. Dividing a problem via data parallelism tends to allow for more gain than using task parallelism.... lets go back to our computer example. Here the game might take loads of CPU resources whilst the word processor takes very few. Using task parallelism CPU 1 (with the word processor on) is only 10% used, whilst CPU 2 (with the game on) is 100%+ used. Using data parallelism the game could be running on both CPUs, with CPU 1 working on half the game data and CPU 2 working on the other half thus resulting in a better use of the CPU resources. Data parallelism also scales much better than task parallelism as the problem size increases.

Problem Communication

Unfortunately, for us as programmers, during parallel computing often one process needs to communicate with another which adds a whole new level of complexity. For instance, during weather prediction if one processor is trying to decide if it will be warm or hot where you live it will need to take into account lots of different things - some of these might be on other processes and as such it will need communicate with them. We can categorize problems, depending on their communication. in a number of ways.

Embarrassingly Parallel - Here it requires very little or no effort to divide the problem up into parallel subtasks. There is no communication or dependency required between processes - quite rare in practice, but they do exist, an example is the two (data parallel) fence painters above and a bit more technical the Mandelbrot calculation.

Coarse Grained - Here processes do communicate but not all that frequently and are not particularly dependent on each other. Something known as a master-slave design pattern is a common example of this - the slaves do computation and communicate their results to the master process which "glues" them all together.


Fine Grained - Here processes are very tightly connected, with a high level of communication and dependency on each other. These problems are often difficult to parallelise and, due to the high cost of communication, sometimes not worth parallelising either.

In reality many problems sit somewhere between these classifications.

Communication Models

It is important that communication between processors is efficient, easy to program and scalable. There are two main models of communication, Shared memory and Message Passing, which are realized both in hardware and as an abstract programming model irrespective of the hardware.

Shared Memory - Here all the processors are connected to the same (shared) memory. They can all access any part of the memory and as such communicating with each other (and sharing results) is easy to do. In hardware the Symmetric Multi Processors (SMPs) architecture that modern dual core (and core duo) follow is shared memory. Programming model wise, programmers can abstract many machines (such as a cluster) into the shared memory architecture using a library such as Bulk Synchronous Parallelism (BSP.) Although shared memory is relatively simple to program, it is not efficient and really does not scale well. Hardware wise, the more processors you add, the further away from memory they are and this will be a real challenge for SMPs to overcome in future years. Because of the drawbacks, few serious HPC programmers use shared memory. The diagram below shows the typical architecture of a shared memory (SMP) system.


Message Passing - Here all processors are completely separate from each other, each has its own memory and they are connected only via a link (such as a network link in a cluster) and in order to communicate with each other they have to send messages. Typical messages include point to point (synchronous and asynchronous), broadcast and scattering data. Whilst this approach is highly efficient and very scalable there are a number of difficulties for the programmer. For instance, the programmer must ensure that there is no deadlock (where two processes are both waiting for the other one to do something before continuing) and must ensure that messages are received in the correct order and interpreted correctly. Although there is much extra burden placed onto the programmer, this model is by far the most popular with HPC programmers and has meant that up to this point parallel programming has been in the domain of the few experts. The Message Passing Interface (MPI) is one of the most popular message passing standards and is most commonly used in conjunction with C, C++ and Fortran. Even with SMPs which follow the shared memory architecture, one can still use the message passing paradigm for communication, which is still often more efficient than programming shared memory style.

Parallel Programming Languages

Whilst there are quite a few parallel programming languages out there, up to this point much parallel programming (certainly over the large scale) has been done in C linked with a parallel communication library for efficiency. Due to the low level, sequential, nature of C this really is not ideal. Common libraries which can be used from sequential languages (such as C) include for message passing MPICH and Open MPI (both implement MPI) and for shared memory BSPLib, POSIX Threads and OpenMP.  

The advantage of using a parallel language is that parallelism is "built in", often resulting in easier to write code. Some which you may be interested learning more about include Mesham (http://www.mesham.com), High Performance Fortran (HPF), Co-Array Fortran (CAF), Unified Parallel C (UPC), Cilk, ZPL and NESL.... and many more, a google search should highlight quite a few which I have not listed.

Watch This Space


Up until the explosion of multi core chips we were all happy to let parallel programming lie within the domain of experts, and little was done to address this. However, now we realize that soon all programmers will need to start writing parallel code there has been much work done in making this easier and simpler to do. I myself am researching parallel programming languages and paradigms and generally this field has started to move very rapidly. In addition to multi core chips, people are also starting to use GPUs for general purpose computing - these often comprise of many cores and in the future we could be also using this resource to provide a huge amount of extra CPU power, the Nvidia's Tesla to the left is really just the start. For more information about GPGPU, have a look at CUDA.

Images courtesy of Wikipedia





Article Source: http://www.TechnicalTalk.net

You can find more details here


*** Please do not forget to leave your comments about this article ***

« Last Edit: February 04, 2010, 12:01:31 PM by Admin »

 
webmaster forum
rednael  Offline
 
New Poster
Posts: 1
Topics: 0
February 05, 2009, 06:46:18 AM

Please, also read the following article:
http://blog.rednael.com/2009/02/05/ParallelProgrammingUsingTheParallelFramework.aspx

It's an article about basic parallel programming. Examples in C# .Net included. Also, it describes a lightweight parallel framework to work with tasks. Opposed to some other frameworks, this one is very light and very easy to use.
After reading this article, you should be able to write code using parallelism.

Regards,
Martijn

 
webmaster forum
plagosus  Offline
 
New Coder
Location: Serbia
Gender: Male
Age: 22
Posts: 44
Topics: 3
WWW
February 05, 2009, 02:05:39 PM

I have never wrote some paralel computing code, but I would like to learn to do it... Also, I would like to try some GPU computing stuff...

Is there any cool tutorial with simple examples in C or JAVA?
 
webmaster forum
polas  Offline
*
 
Hacker
Gender: Male
Posts: 1172
Topics: 77
WWW
February 05, 2009, 03:45:21 PM

There are many options available to you Plagosus. For parallel programming, personally I would suggest taking a look at MPI for a start. The Message Passing Interface (MPI) is a very popular message passing standard. Implementations such as MPICH and OpenMPI have bindings for C (really the language they are designed for), C++ and Fortran. Although Java is rarely used for HPC, an implementation such as MPJ does the job. Certainly with MPICH and OpenMPI you will find then much easier to use on *nix rather than Windows (I don't think there is even a Windows version of OpenMPI.) From here there are other options to persue, including a multitude of languages designed purely for parallel programming.

For GPU programming have a look at CUDA.

 
  Email this topic  |  Print
Pages: [1]   Go Up
 
Jump to:  



Powered by SMF 1.1.11 | SMF © 2006-2009, Simple Machines LLC


Google visited last this page February 04, 2010, 02:11:54 AM

Valid XHTML 1.0 Transitional     Valid XHTML 1.0 Transitional