Notes on bioinformatics and data mining by G. Corey Shan
Running R in parallel under Linux
R is a free, powerful programming language and software environment for statistical computing and graphics supported by the R Foundation.
Recently, I’ve been working with parallel processing in R and have found the foreach and doMC packages to be a useful way to speed up loops.
The doMC package is a “parallel backend” for the foreach package: it provides the mechanism needed to execute foreach loops in parallel. It is a good idea to first check how many logical cores (hardware threads) are available by calling detectCores (it reports 96 on my machine).
Then register a smaller number of workers with registerDoMC for the subsequent parallel processing (I set 20, which means up to 20 cores will be used for my task).
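The setup described above can be sketched roughly as follows. This is a minimal illustration rather than my exact script: the variable names n_available and n_workers are made up for the example, and the min() cap is an extra safeguard so the snippet also runs on machines with fewer than 20 cores.

```r
library(parallel)  # provides detectCores()
library(foreach)   # provides getDoParWorkers()
library(doMC)      # parallel backend for foreach (Unix-like systems only)

# Number of logical cores R can see (96 on the machine described in this post).
n_available <- detectCores()

# Use a smaller number of workers than the maximum; 20 is the value from the post.
# min() is an added safeguard, not part of the original setup.
n_workers <- min(20, n_available)

# Tell foreach to run %dopar% loops on this many workers.
registerDoMC(cores = n_workers)

# Confirm how many workers foreach will actually use.
getDoParWorkers()
```

Calling getDoParWorkers() after registering is a quick way to confirm that the backend picked up the setting before starting a long job.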
In this task, I select and record good models that satisfy certain conditions (specificity Sp > 0.88, sensitivity Se > 0.7).
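A loop of this kind might look something like the sketch below. The function evaluate_model is a hypothetical stand-in that returns random Sp and Se values, since the real model fitting code is not shown here; the candidate count of 100 and the 2 workers are also just placeholders to keep the example small.

```r
library(foreach)
library(doMC)

# Small worker count for illustration only; the post uses 20.
registerDoMC(cores = 2)

# Hypothetical stand-in for the real model fitting and evaluation step.
# It returns made-up specificity (Sp) and sensitivity (Se) for candidate i.
evaluate_model <- function(i) {
  set.seed(i)  # makes each candidate's fake scores reproducible
  data.frame(model = i, Sp = runif(1, 0.7, 1), Se = runif(1, 0.5, 1))
}

# Evaluate all candidates in parallel; .combine = rbind stacks the
# one-row data frames returned by each iteration into a single table.
results <- foreach(i = 1:100, .combine = rbind) %dopar% evaluate_model(i)

# Keep only the models that meet both thresholds from the post.
good_models <- results[results$Sp > 0.88 & results$Se > 0.7, ]
```

Replacing %dopar% with %do% runs the same loop sequentially, which is handy for debugging before going parallel.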
This small change cut my run time from several hours to less than five minutes. What a lovely change!