[Eeglablist] distributed processing

Tue Mar 22 00:59:03 PST 2005

Dear Joseph (and other heavy EEGLAB users)

We have been performing specific (non-EEGLAB) Matlab computations in 
parallel on our Linux "cluster", i.e. a loosely connected network of 
about 30 similar Linux computers. This parallelization is part of our 
FieldTrip toolbox, which in the near future will become the back-end 
for dipole fitting and source analysis in EEGLAB. My objectives were 
twofold:

* I wanted to evaluate Matlab code (not low-level C code), and the 
Matlab code should be as "unaware" as possible of being evaluated in 
parallel
* I wanted to work with relatively large chunks of data, i.e. each 
chunk (subject/trial/whatever) that is evaluated in a separate job 
should be computationally large enough to justify the overhead of 
sending the data over the network

For this purpose I have evaluated various open source parallel 
computing toolboxes, but found that none of them was suitable for my 
needs. I have also seen the recently released commercial DCT/DCE Matlab 
toolbox, but have no experience with that one yet. The most important 
problem that I faced (which I think also applies to the DCT/DCE 
toolbox) is that the parallel computations are performed in separate 
Matlab sessions. That means that each node in the cluster has to be 
running its own Matlab session, and that one master node is running 
the Matlab session that controls all of them. Although we are 
connected to the university-wide license server with ~300 concurrent 
Matlab licenses, that does not necessarily mean that the number of 
licenses that I can use is unlimited. In particular, the licenses for 
the specialized toolboxes that I use (signal processing, image 
processing, optimization, statistics) put a limit on the number of 
concurrent jobs that I can evaluate on our cluster (our university only 
has ~10 licenses for each of those toolboxes).

Since I want my computations to simply scale with the number of nodes 
that are available to me, without having to buy additional licenses, 
the solution that I implemented is based on the Matlab compiler 
toolbox. Let me give an example: on the master node (the only one that 
has to be running Matlab) I can type something like
   a = rand(1000,1000,30);
   pfor(1:30, 'b(:,:,%d) = fft(a(:,:,%d))');
which is equivalent to executing
   for i=1:30
     b(:,:,i) = fft(a(:,:,i));
   end
What happens is that the fft function (or any other function in its 
place, including a custom one) is wrapped into an m-function that is 
compiled into a standalone executable. Subsequently, the data for each 
job is written to an NFS shared disk and all jobs are executed remotely 
on the available nodes of the cluster. The only requirements are that 
the compiler toolbox is present on the master node, that login 
(ssh/rsh) connections are possible between the nodes, and that there is 
a common filespace. I also experimented with sending the data over the 
network (i.e. using TCP/IP network sockets), but found that this made 
things too complex.
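
As a rough illustration of the steps described above, and explicitly 
not the actual pfor code, the following sketch expands the command 
template for each job, writes the data of each job to a shared 
directory, and builds a remote call for each job without executing it. 
The host names, the job file naming and the executable name 
"pfor_worker" are assumptions made only for this example, and a 
temporary directory stands in for the NFS share.

```matlab
% Hypothetical sketch of the job preparation; not the real pfor code.
a        = rand(16, 16, 6);                 % small stand-in for rand(1000,1000,30)
template = 'b(:,:,%d) = fft(a(:,:,%d))';    % same command template as in the pfor call
nodes    = {'node01', 'node02', 'node03'};  % hypothetical host names
shared   = tempname;                        % stand-in for a directory on the NFS share
mkdir(shared);

njobs    = size(a, 3);
commands = cell(1, njobs);
remote   = cell(1, njobs);
for i = 1:njobs
    % Fill the job index into both %d placeholders of the template.
    commands{i} = sprintf(template, i, i);

    % Write the input data for this job to the shared directory.
    jobfile = fullfile(shared, sprintf('job%03d.mat', i));
    slice   = a(:, :, i);
    save(jobfile, 'slice');

    % Assign nodes round-robin and build (but do not run) the remote call.
    % 'pfor_worker' is a hypothetical name for the compiled executable.
    node      = nodes{mod(i - 1, numel(nodes)) + 1};
    remote{i} = sprintf('ssh %s pfor_worker %s', node, jobfile);
end

disp(commands{2});   % the Matlab command that job 2 would evaluate
disp(remote{2});     % the remote call that would start job 2
```

In a real setup each remote call would be passed to system(), and the 
master would afterwards collect the result files from the shared 
directory. The simple round-robin assignment used here is only a 
placeholder for proper load balancing.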

I have been planning to make my parallelization toolbox available on 
the net, but so far have not had time for it. The functions themselves 
are quite straightforward and include documentation. I still need to 
write some background documentation for them, and more testing is 
required in a different (clean) environment. There are some environment 
variables and shared libraries that have to be set up correctly for the 
standalone executable to work on the client nodes. Furthermore, I still 
have to improve support for general-purpose cluster management software 
(such as MPI, GridEngine, gexec), with which I expect to obtain 
smoother load balancing of my jobs over the available nodes.

If you are interested in trying out my toolbox, please contact me and I 
will send you the code.

best regards,
Robert

----------------------------------------------------------------------
Robert Oostenveld, PhD
F.C. Donders Centre for Cognitive Neuroimaging
Radboud University Nijmegen
phone: +31-24-3619695
http://www.ru.nl/fcdonders/
----------------------------------------------------------------------