Quantcast
Channel: Search Results for “pca”– R-bloggers
Viewing all articles
Browse latest Browse all 209

read raster data in parallel

$
0
0

(This article was first published on metvurst, and kindly contributed to R-bloggers)
Use library(parallel) to read raster data in parallel fashion

Use library(parallel) to read raster data in parallel fashion

Recently, I have been doing some analysis for a project I am involved in. In particular, I was interested what role pacific sea surface temperatures play with regard to rainfall in East Africa. I spare you the details as I am currently writing all this up into a paper which you can have a look at once published.

For this analysis, however, I am processing quite an amount of raster files. This led me to investigate the possibilities of the parallel package to speed up the process.

Here's a quick example on how to read in raster data (in this case 200 global sea surface temperature files of 1° x 1° degree resolution) using parallel

First, lets do it the conventional way and see how long that takes

system.time({

library(raster)
library(rgdal)

### Input preparation
### ########################################################
inputpath <- "E:/sst_kili_analysis/"
ptrn <- "*sst_anom_pcadenoise_*_R2.rst"

### list files in direcotry
### ##################################################
fnames_sst_r2 <- list.files(inputpath, pattern = glob2rx(ptrn), recursive = T)

### read into raster format
### ##################################################
sst.global <- lapply(seq(fnames_sst_r2), function(i) {
raster(paste(inputpath, fnames_sst_r2[i], sep = "/"))
})
})
##    user  system elapsed 
## 31.37 4.43 36.50

Now using library(parallel)

library(parallel)
system.time({

### Input preparation
### ########################################################
inputpath.p <- "E:/sst_kili_analysis/"
ptrn.p <- "*sst_anom_pcadenoise_*_R2.rst"

### list files in direcotry
### ##################################################
fnames_sst_r2.p <- list.files(inputpath.p, pattern = glob2rx(ptrn.p), recursive = T)

### set up cluster call
### ######################################################
cl <- makePSOCKcluster(4)

clusterExport(cl, c("inputpath.p", "fnames_sst_r2.p"))
junk <- clusterEvalQ(cl, c(library(raster), library(rgdal)))

### read into raster format using parallel version of lapply
### #################
sst.global.p <- parLapply(cl, seq(fnames_sst_r2.p), function(i) {
raster(paste(inputpath.p, fnames_sst_r2.p[i], sep = "/"))
})

### stop the cluster
### #########################################################
stopCluster(cl)
})
##    user  system elapsed 
## 1.40 3.03 13.34

Not that much of a speed enhancement, but we need to keep in mind that the raster command does not read into memory. Hence, the speed improvements should be a lot higher once we start the calculations or plotting.

Finally, let's test whether the two methods produce identical results.

identical(sst.global.p, sst.global)
## [1] TRUE

to be continued…

sessionInfo()
## R version 2.15.1 (2012-06-22)
## Platform: x86_64-pc-mingw32/x64 (64-bit)
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] parallel stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] rgdal_0.7-12 raster_1.9-92 sp_0.9-99 knitr_0.6.3
##
## loaded via a namespace (and not attached):
## [1] digest_0.5.2 evaluate_0.4.2 formatR_0.5 grid_2.15.1
## [5] lattice_0.20-6 parser_0.0-16 plyr_1.7.1 Rcpp_0.9.13
## [9] stringr_0.6 tools_2.15.1

To leave a comment for the author, please follow the link and comment on his blog: metvurst.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series,ecdf, trading) and more...

Viewing all articles
Browse latest Browse all 209

Trending Articles