[Eeglablist] Installing CUDAICA on Windows 10 (2021 update)
Bruzadin Nunes, Ugo
ugob at siu.edu
Sun Nov 14 12:43:29 PST 2021
Thank you for your kind words, it means a lot to me.
I'm not sure if I'm eligible for a full hacking lecture, but I do hack eeglab functions quite a lot, so maybe!
Since you mentioned that DipFit is one of the slowest functions in a pipeline, I thought I could share at least one of my functions, which may be useful if you are running dipfit.
https://github.com/UgoBruzadin/Par_DipFit
GitHub - UgoBruzadin/Par_DipFit: Contains the function for parallel dipfit
If you have the Parallel Computing Toolbox, you can run this function instead of the normal multifit, and it will perform the dipole fitting using as many cores/workers as you have available, considerably speeding up the process. In my experience the speed-up roughly tracks the number of cores: my main environment has 12 cores, so par_multifit runs ~12x faster; my other environment has 4 cores, so par_multifit runs ~4x faster than the normal multifit.
The code isn't perfect, but it should be useful for anyone who streamlines dipfitting or uses it on a daily basis. It may use a bit more RAM than normal, since the EEG needs to be copied and preallocated for parallel processing. You may need to edit other functions to include it in your pipeline. If you have any suggestions or find any bugs, please let me know!
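To illustrate the idea only: the sketch below is Python, not the MATLAB code in the repository, and it assumes that each component can be fitted independently of the others. `fit_one_component` and `par_fit` are hypothetical names, and the toy "fit" is a constant fit to a list of numbers, not a dipole model. The point is just that independent fits can be handed to a pool of workers and collected in the original order.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import os


def fit_one_component(scalp_map):
    """Hypothetical stand-in for one component's fit (NOT a dipole model).

    Fits a constant to the values and returns it with the residual variance.
    """
    mean = sum(scalp_map) / len(scalp_map)
    residual = sum((v - mean) ** 2 for v in scalp_map) / len(scalp_map)
    return {"estimate": mean, "residual_variance": residual}


def par_fit(scalp_maps, workers=None, executor_cls=ProcessPoolExecutor):
    """Fit every component as its own task; results keep the input order.

    executor_cls is an assumption of this sketch: a process pool gives real
    CPU parallelism, while ThreadPoolExecutor is handy for quick checks.
    """
    workers = workers or os.cpu_count() or 1
    with executor_cls(max_workers=workers) as pool:
        # pool.map sends one component to each free worker and returns the
        # results in the same order as the inputs.
        return list(pool.map(fit_one_component, scalp_maps))
```

With the default process pool, call `par_fit(maps)` from a script under an `if __name__ == "__main__":` guard. Each worker process receives its own copy of its input, which is the same kind of overhead as the extra RAM use mentioned above.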
I have the whole sequence of dipfit functions running with my defaults behind one button in my larger plugin, QuickLab, which should come out soon. It uses defaults that may not suit everyone, so the plugin still needs a lot of adjustment before it is independent of my own dataset.
In any case, I hope it's useful!
Ugo Bruzadin Nunes, Ph.D. Candidate (He/Him/His)
Visiting Assistant Professor, Psychology
Office Location: ISB room 316
Office Number: (314) 968-7677
Ugo at webster.edu<mailto:UgoBruzadinNunes at webster.edu>
From: Makoto Miyakoshi <mmiyakoshi at ucsd.edu>
Sent: Friday, November 12, 2021 3:40 PM
To: eeglablist at sccn.ucsd.edu <eeglablist at sccn.ucsd.edu>
Subject: Re: [Eeglablist] Installing CUDAICA on Windows 10 (2021 update)
Thank you for your time and detailed comment Ugo. I'm excited to hear back
from the original author of the report.
So in your environment, CUDAICA on RTX3070 is about 12 times faster than
BINICA on Ryzen 9 5900X. My comparison of CUDAICA vs. runica shows only about
a 4-5 times difference; this is probably because my GTX1660 is not as fast as
There are other parts in my pipeline that take longer time than others,
including CleanLine (slowest only next to ICA), clean_rawdata (RANSAC in
the electrode rejection stage, ASR), and Dipfit (gradient descent). It's
exciting that you have a solution to speed up the dipole fitting process.
Your 'library of hacks' sounds so interesting. One day you should give us an
EEGLAB hacking lecture!
Please keep us posted Ugo, your input is tremendously useful for all of us
in the community!
On Thu, Nov 11, 2021 at 3:48 PM Bruzadin Nunes, Ugo <ugob at siu.edu> wrote:
> Dear Makoto and everyone,
> I am glad these instructions were useful, and I’m sorry it was so hard to
> reproduce my results! I had a really rough time installing CUDAICA on
> Windows in the beginning, which is why I wrote these instructions in the
> first place. I have since learned how to do it a bit faster and more easily,
> but I haven’t taken the time to edit my instructions (although, to be fair, I
> still use those instructions when I make new installs). It was painful because I
> didn’t know anything about libraries or APIs at the time, and CUDAICA still
> gets a bit finicky, especially with the icadefs and the .sc defaults.
> In terms of speed, I’ve been using CUDAICA on an RTX3070 and comparing it with
> BINICA on my 12-core AMD Ryzen 9 5900X. I run BINICA in parallel, 12 files
> at a time, in which case it’s a tight race. My files are big
> in continuous mode, which is why I thought CUDAICA was so important, and
> I still do. For smaller PCAs, binica and cudaica run equally fast, at least
> in my setup, so 12x binica is much faster than 1 cudaica at a time. I
> am working on a script that would run 11 binicas and 1 cudaica at a time,
> which would save me lots of time.
> My takeaway is that CUDAICA is extremely useful for running big files in
> continuous mode, and for ICA or large PCAs (50+). In my setup, I can run an
> ICA for a continuous file (5 to 10 min long) using CUDAICA in 20 to 30
> seconds. For the same files, it takes binica about 250 to 300 seconds, and
> the old runica several minutes. I do not have the exact numbers,
> but I use them almost every day and that’s about what I remember. I was
> not aware there was a hack for speeding up runica; I’ll look into it!
> If I run 12 files at a time in parallel, 250/12 seconds is almost exactly
> the same speed as 20 seconds per file using CUDAICA, which is why I
> generally run binica instead of cudaica.
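The comparison above can be worked through as a small calculation. This is a sketch under stated assumptions: it uses the approximate figures quoted in the message (about 250 s per file for binica, about 20 s for CUDAICA), assumes parallel runs do not slow each other down, and the function names are made up for illustration. It also estimates the planned mix of 11 binica runs plus 1 CUDAICA run.

```python
def seconds_per_file(seconds_per_run, files_in_parallel=1):
    """Effective wall-clock seconds per file when identical runs overlap."""
    return seconds_per_run / files_in_parallel


def combined_seconds_per_file(worker_run_times):
    """Effective seconds per file for a mix of simultaneous workers.

    worker_run_times holds one seconds-per-run entry per worker. Each worker
    finishes 1/s files per second, so the rates add up.
    """
    files_per_second = sum(1.0 / s for s in worker_run_times)
    return 1.0 / files_per_second


BINICA_S = 250.0   # approximate figure from the message above
CUDAICA_S = 20.0   # approximate figure from the message above

binica_x12 = seconds_per_file(BINICA_S, files_in_parallel=12)
cudaica_x1 = seconds_per_file(CUDAICA_S)
hybrid = combined_seconds_per_file([BINICA_S] * 11 + [CUDAICA_S])

print(f"12 binica in parallel:  {binica_x12:.1f} s per file")
print(f"1 cudaica at a time:    {cudaica_x1:.1f} s per file")
print(f"11 binica + 1 cudaica:  {hybrid:.1f} s per file")
```

Under these assumptions, 12 parallel binica runs come to about 20.8 s per file, essentially the same as one CUDAICA run, while the mixed setup would roughly halve the time per file.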
> For the sake of science, I’ve tried running CUDAICA in parallel, but it
> does not help; it only slows things down (I didn’t expect it
> to work, but I tried it anyway; it runs, but it’s way slower).
> In summary, CUDAICA is absolutely useful if you have large continuous
> files in need of large PCAs or ICAs and a good GPU, allowing one to run ICA
> in huge files in as little as 20 seconds. On my old Razer Blade, with a
> GTX970m, cudaica was as fast as binica, but because one can run 4 files at
> a time on the CPU, that gives the advantage to binica. My guess is that this
> is the case for the majority of computers, IF you are running a
> pipeline and can run files in parallel. On individual files, with a good
> enough GPU, it’s absolutely worth it for the time saved, especially in manual
> file processing. I just wish it was a bit easier to install.
> I have a library of hacks for speeding up EEGLAB; I run a modified version
> of the 2020 eeglab for our laboratory, which we standardize in all
> computers, and which contains several modifications that significantly
> speed up data processing. I intend to release these changes as a plugin
> sometime soon. This includes a dipfit that runs in parallel, cutting time
> significantly, a partial component and channel interpolator based on
> eegplot_w – which allows for removal of parts of components without the
> removal of the full component - a viewprops+, which allows for quicker
> component selection and removal, and an autopipeliner, which I use to
> test my pipelines in parallel and which stores all the backups in organized
> folders, using simple commands instead of having to code full scripts every
> Thanks for all the feedback, and I’m pleased that this was a useful tool.
> Again, I am sorry for the extreme level of detail and often unnecessary
> instructions. I didn’t know what worked, so I put all that worked in text
> for future reference. CUDAICA and GPU arrays have a lot of potential in
> saving time, for long processes such as ICA, PCA, BSS, and DIPFIT, and I
> personally enjoy to a fault making scripts run faster!
> Best wishes,
> Ugo Bruzadin Nunes, Ph.D. Candidate
> Visiting Assistant Professor, Psychology
> Webster University
> Office Location: ISB room 316
> Office Number: (314) 968-7677
> Ugo at webster.edu <UgoBruzadinNunes at webster.edu>
> From: Makoto Miyakoshi <mmiyakoshi at ucsd.edu>
> Sent: Thursday, November 11, 2021 11:56 AM
> To: eeglablist at sccn.ucsd.edu
> Cc: Bruzadin Nunes, Ugo <ugob at siu.edu>
> Subject: Re: [Eeglablist] Installing CUDAICA on Windows 10 (2021 update)
> Dear John,
> Thank you for your comment.
> It was difficult for me partly because I'm not very experienced in
> building an environment, and also because of changes in the
> software dependencies between May 2019 and Nov 2021.
> 1. Intel replaced Parallel Studio XE with oneAPI, which made a
> critical part of Ugo's suggestions no longer valid.
> 2. I did not find 'C:\Users\Ugo\AppData\Local\Programs\Microsoft VS
> Code\bin' just by following Ugo's suggestions. I found that installing Microsoft
> Visual Studio Code is necessary in my case, which may be due to Microsoft's
> update on Windows Visual Studio (but probably this is not a part of
> Ugo posted his solution only 2.5 years ago. Yesterday, I spent 10 hours
> making it work, which shows how fast technology gets left behind. And it
> is not because the technology becomes obsolete but because it becomes a
> lost technology due to software updates.
> > We use the 48core computers for the runica, but it does not appear to
> profit from the multiple CPUs.
> You'll definitely benefit from running AMICA using all the cores! It may
> not be as fast as CUDAICA, but AMICA has some nice extra features including
> auto data rejection, time-series data of model's log likelihood, etc.
> On Wed, Nov 10, 2021 at 10:14 PM Richards, John <RICHARDS at mailbox.sc.edu>
> Re CUDAICA. I was able to install it; I don't remember it being that
> difficult. I had to mess around with the CUDA version.
> I have found it "blazing" fast compared to runica. I have not timed it.
> We have 10-15 min sessions with EGI 128, 250 Hz, do the Prep pipeline to
> get avg ref, and then CUDAICA. It takes < 5 min to do the Prep, and < 5
> min to do the CUDAICA; cf. 45 min to 60 min with runica. I may not be using
> the most recent runica. BTW, we have fairly powerful computers; we use 48
> cores for the Prep pipeline, which is a vast speedup, and V100s with 16 GB
> or 32 GB. Definitely not bargain chips. We use the 48core computers for
> the runica, but it does not appear to profit from the multiple CPUs. The
> Prep pipeline also is very slow on single CPUs, but very fast on the 48 CPU
> I would be glad to share more details if anyone is interested.
> John E. Richards
> Carolina Distinguished Professor
> Department of Psychology
> University of South Carolina
> Columbia, SC 29208
> Dept Phone: 803 777 2079
> Fax: 803 777 9558
> Email: richards-john at sc.edu
> -----Original Message-----
> From: eeglablist <eeglablist-bounces at sccn.ucsd.edu> On Behalf Of Makoto
> Miyakoshi via eeglablist
> Sent: Thursday, November 11, 2021 1:02 AM
> To: EEGLAB List <eeglablist at sccn.ucsd.edu>; ugob at siu.edu
> Subject: [Eeglablist] Installing CUDAICA on Windows 10 (2021 update)
> Dear list members,
> I summarized the steps to install cudaica() which uses GPU computation to
> calculate infomax ICA (Raimondo et al., 2012). The result from the speed
> comparison between runica() and cudaica() was not as dramatic as x25
> reported by the original paper, probably because Tjerk's smart hack alone
> already gave x4-5 speed up to runica(). Still, using a relatively cheap
> GTX1660 (the pre-COVID price range is $250), I confirmed x4-5 speed up
> compared with runica(). The detailed instruction can be found in the
> following link.
> WARNING: The installation was difficult.