[Eeglablist] MPICH and AMICA

Mon Nov 26 14:19:40 PST 2012

Hi Tom,

At SCCN, we have 11 nodes, with 3 set up for interactive use, and 8 nodes
(each 32 core) for jobs with qsub.

For speed, it is best to run different jobs on separate nodes whenever
possible.

To minimize overlap between jobs, we have defined a set of nested queues
that do not overlap within any level. Specifically, we have the following
queues:

                q3   -   containing 32 slots on node 3 (one for each of the 32 cores)

                q4   -   containing 32 slots on node 4

                q5   -   containing 32 slots on node 5

                .

                q10  - containing 32 slots on node 10

                qa1 - containing 64 slots on nodes 3 and 4

                qa2 - containing 64 slots on nodes 5 and 6

                qa3 - containing 64 slots on nodes 7 and 8

                qa4 - containing 64 slots on nodes 9 and 10

                qb1 - containing 128 slots on nodes 3 - 6

                qb2 - containing 128 slots on nodes 7 - 10

                qc1 - containing 256 slots on nodes 3 - 10

Nodes 0-2 are the interactive nodes.
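
To make the nesting concrete, here is a minimal Python sketch of the layout
listed above. It is only an illustration, not our actual SGE configuration:
it assumes q6 through q9 follow the same one-node pattern as q3 to q5 and
q10, and that qa3 covers nodes 7 and 8. The helper names (level, slots,
overlapping_pairs) are made up for this example.

```python
# Queue layout as described in this message: nodes 3-10, 32 slots per node.
SLOTS_PER_NODE = 32

# Level "q": one queue per node (q6-q9 assumed to follow the same pattern).
QUEUES = {f"q{n}": [n] for n in range(3, 11)}
QUEUES.update({
    # Level "qa": pairs of nodes (qa3 assumed to be nodes 7 and 8).
    "qa1": [3, 4], "qa2": [5, 6], "qa3": [7, 8], "qa4": [9, 10],
    # Level "qb": groups of four nodes.
    "qb1": [3, 4, 5, 6], "qb2": [7, 8, 9, 10],
    # Level "qc": all eight batch nodes.
    "qc1": list(range(3, 11)),
})


def level(name):
    """Return the queue's level prefix, e.g. 'q', 'qa', 'qb' or 'qc'."""
    return name.rstrip("0123456789")


def slots(name):
    """Total slots in a queue: 32 per node it spans."""
    return SLOTS_PER_NODE * len(QUEUES[name])


def overlapping_pairs():
    """Return same-level queue pairs that share a node (expected: none)."""
    names = sorted(QUEUES)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if level(a) == level(b) and set(QUEUES[a]) & set(QUEUES[b])
    ]
```

Calling overlapping_pairs() on this table should return an empty list, which
is the "non-overlapping at each level" property; queues at different levels
(for example q3 and qa1) share nodes by design.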

Defining queues this way (under SGE) allows a user to gain exclusive use of
the node or nodes in a queue to run a single job, for the duration of the
job, assuming the node is available/empty when the job is submitted.
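
As a rough illustration of how a user might pick a queue, the Python sketch
below finds the smallest queue level whose slot count covers a request and
builds a qsub argument list with the standard "-q" option. The slot counts
come from the list above (q6 to q9 are assumed to match the other single-node
queues); the function names and the script name run_amica.sh are hypothetical.
A real multi-node job would also need a parallel environment request, which
is site-specific and not shown here.

```python
SLOTS_PER_NODE = 32

# Number of nodes per queue, following the layout in this message.
NODES_PER_QUEUE = {
    **{f"q{n}": 1 for n in range(3, 11)},  # q3 .. q10 (q6-q9 assumed)
    "qa1": 2, "qa2": 2, "qa3": 2, "qa4": 2,
    "qb1": 4, "qb2": 4,
    "qc1": 8,
}
QUEUE_SLOTS = {q: n * SLOTS_PER_NODE for q, n in NODES_PER_QUEUE.items()}


def candidate_queues(requested_slots):
    """Queues at the smallest level that can hold the requested slots."""
    sizes = sorted({s for s in QUEUE_SLOTS.values() if s >= requested_slots})
    if not sizes:
        raise ValueError(f"no queue has {requested_slots} slots")
    return sorted(q for q, s in QUEUE_SLOTS.items() if s == sizes[0])


def qsub_command(queue, script):
    """Argument list for submitting `script` to `queue` with qsub -q."""
    return ["qsub", "-q", queue, script]


if __name__ == "__main__":
    # Example: a job wanting 100 slots fits in one of the 128-slot queues.
    for q in candidate_queues(100):
        print(" ".join(qsub_command(q, "run_amica.sh")))
```

Whether the job actually gets the nodes to itself still depends on the queue
being empty at submission time, as noted above.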

When running AMICA on a single node, you might want to make sure the
"maxthreads" variable in runamica12.m is set to the number of processors you
want to use (assuming the machine has multiple cores with available slots).

Our sysadmin, Robert Buffington, might be able to provide some sample config
files for SGE.

Best,

Jason

From: eeglablist-bounces at sccn.ucsd.edu
[mailto:eeglablist-bounces at sccn.ucsd.edu] On Behalf Of Tom Campbell
Sent: Thursday, November 22, 2012 4:09 AM
To: eeglab list; Hendrik Kayser
Subject: [Eeglablist] MPICH and AMICA

Hi,

I am setting up AMICA on a Rocks cluster of 142 CPUs to work on ABR data. I
edited AMICA12 so it runs locally on one compute node but this takes over 8
hours for the Memorize.fdt and we have bigger plans. It seems we do not have
MPICH configured on the cluster yet. There are about 60 people using this
cluster and I don't want to rock the boat. People would rather just run
separate jobs on separate nodes at the same time. Do any of you have
experience with, or suggestions for, configuring MPICH and minimizing
disruption to the queues of other people's jobs on such a cluster? I'm
interested in what worked or didn't work for you.

Best regards,

Tom.
