parallel processing - rfsrc() command in the randomForestSRC package for R not using multi-core functionality


I am using R (on Windows 7, 32-bit) to do text classification with random forests. Because the dataset is large, I looked on the internet for ways of speeding up the model building and came across the randomForestSRC package.

I have followed all the steps in the installation manual for the package, yet during execution of the rfsrc() command only one of the four logical cores is used by R (the same as with randomForest()), with maximum CPU utilization being 25%. I have used the following command, as per the manual.

options(mc.cores = detectCores() - 1, rf.cores = detectCores() - 1)

I am using Windows 7 Professional 32-bit, Service Pack 1, on an Intel i3 2120 CPU with 4 logical cores. Can anyone throw some light on what I am missing? Any other efficient way to use random forests with multicore utilization would also be helpful!

The problem is that randomForestSRC uses the mclapply function for parallel execution, but mclapply doesn't support parallel execution on Windows. randomForestSRC can also use OpenMP for multithreaded parallel execution, but that isn't built into the binary distribution on CRAN, so you have to build the package from source with OpenMP support enabled.
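To see the platform difference for yourself, here is a minimal sketch that uses only the parallel package shipped with R. It is an illustration rather than anything from the randomForestSRC manual: the variable names are my own, and the cap of two PSOCK workers is an arbitrary assumption to keep the demo light. mclapply depends on forking, so on Windows it only accepts mc.cores = 1, whereas a PSOCK cluster starts separate R processes and can run in parallel on any platform.

```r
library(parallel)

# detectCores() can return NA on some systems, so fall back to 1.
n_cores <- detectCores()
if (is.na(n_cores)) n_cores <- 1L
workers <- max(1L, n_cores - 1L)

# mclapply relies on forking, which Windows lacks; there mc.cores must be 1,
# and the call simply runs serially.
on_windows <- .Platform$OS.type == "windows"
fork_cores <- if (on_windows) 1L else workers
squares_fork <- mclapply(1:4, function(i) i^2, mc.cores = fork_cores)

# A PSOCK cluster launches separate R processes, so it works on Windows too.
# (Two workers at most is an assumption made only for this demo.)
cl <- makePSOCKcluster(min(workers, 2L))
squares_psock <- parLapply(cl, 1:4, function(i) i^2)
stopCluster(cl)

# Both approaches compute the same values; only the execution model differs.
identical(squares_fork, squares_psock)
```
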

I think your two options are:

  • build randomForestSRC with OpenMP support on your Windows machine;
  • call a random forest function in parallel yourself.

Here's a simple parallel example using the randomForest package with foreach and doParallel, derived from an example in the foreach vignette:

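library(randomForest)
library(doParallel)  # also attaches foreach and parallel

# Start one PSOCK worker per logical core and register it with foreach
workers <- detectCores()
cl <- makePSOCKcluster(workers)
registerDoParallel(cl)

# Toy data: 100 observations, 5 predictors, 2 classes
x <- matrix(runif(500), 100)
y <- gl(2, 50)
ntree <- 1000

# Grow an equal share of the trees on each worker, then merge the forests
rf <- foreach(n = rep(ceiling(ntree / workers), workers),
              .combine = combine, .multicombine = TRUE,
              .packages = 'randomForest') %dopar% {
  randomForest(x, y, ntree = n)
}

# Shut the workers down when finished
stopCluster(cl)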

This example should work on Windows, Mac OS X, and Linux. See the foreach vignette for more information.
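One detail of the example worth knowing: rep(ceiling(ntree/workers), workers) rounds each worker's share up, so the combined forest can hold slightly more than ntree trees when ntree is not divisible by the number of workers. That is usually harmless. If you want exactly ntree trees, a small helper along these lines would do it. Note that split_trees is a hypothetical name of my own, not a function from randomForest or foreach, and the worker count of 3 is only for illustration.

```r
# Hypothetical helper: split `ntree` trees across `workers` tasks so that
# the per-task counts sum to exactly `ntree`.
split_trees <- function(ntree, workers) {
  base  <- ntree %/% workers   # trees every task gets
  extra <- ntree %% workers    # leftover trees, one each for the first tasks
  rep(base, workers) + c(rep(1, extra), rep(0, workers - extra))
}

# Compare the two schemes for 1000 trees on 3 workers
even  <- rep(ceiling(1000 / 3), 3)  # scheme used in the example: 334 334 334
exact <- split_trees(1000, 3)       # 334 333 333

sum(even)   # 1002
sum(exact)  # 1000
```

To use it, pass n = split_trees(ntree, workers) to foreach in place of the rep(...) expression.
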

