Monday, 4 November 2013

How to operate in parallel

Running calculations in parallel can save a lot of time. Three points deserve attention when using foreach: %dopar% runs the loop body in parallel; the .combine argument sets the function that assembles the results once the work is done (c, rbind, or any other function); and, last but not least, the packages used inside the loop must be listed in the .packages argument so that they are loaded on every worker.

library(doParallel)
cl <- makeCluster(3)   # the number of cores to be used
registerDoParallel(cl)
getDoParWorkers()      # are they ready?


# remember to list the packages that are used within the loop
res <- foreach(i=1:5, .combine=c, .packages="DescTools") %dopar% {
  Primes(i)
}
res

stopCluster(cl)       # shut the workers down again
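
The .combine argument is easiest to understand by swapping c for rbind. The following is only a minimal sketch, assuming two workers are enough for a demonstration; the column names i and square are made up for illustration. Each iteration returns a one-row data frame and rbind stacks them into a single result.

```r
library(doParallel)    # also attaches foreach and parallel

cl <- makeCluster(2)   # assumption: two workers, adjust to your machine
registerDoParallel(cl)

# every iteration yields one row; .combine=rbind stacks the rows
res <- foreach(i=1:4, .combine=rbind) %dopar% {
  data.frame(i=i, square=i^2)
}

stopCluster(cl)        # shut the workers down again

res
```

No .packages argument is needed here because the loop body only uses base R.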


The boot package already has built-in support for parallel processing:

library(boot)
slopeFun <- function(df, i) {
  # df must be a data frame
  # i is the vector of row indices that boot will pass
  xResamp <- df[i, ]
  # return the coefficient of cyl (element 1 is the intercept)
  coef(lm(hp ~ cyl + disp, data=xResamp))[2]
}


ptime <- system.time({
  b <- boot(mtcars, slopeFun, R=50000, ncpus=6, parallel="snow")   # set ncpus to the cores you have
})[3]

ptime
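
To see whether the parallel run actually pays off, it can be timed against a serial one. This is a rough sketch under assumptions: R is reduced to 2000 replicates and ncpus to 2 so that it finishes quickly, and the names stime, bSerial and bPar are made up for the comparison. With so few replicates, the cost of starting the cluster may well outweigh the gain.

```r
library(boot)

slopeFun <- function(df, i) {
  # i is the vector of row indices that boot will pass
  coef(lm(hp ~ cyl + disp, data=df[i, ]))[2]
}

# serial run (boot's default is parallel="no")
stime <- system.time({
  bSerial <- boot(mtcars, slopeFun, R=2000)
})[3]

# same job spread over 2 snow workers
ptime <- system.time({
  bPar <- boot(mtcars, slopeFun, R=2000, ncpus=2, parallel="snow")
})[3]

c(serial=unname(stime), parallel=unname(ptime))
```

The advantage of the parallel version should grow as R increases, since the startup cost is paid only once.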