R - A different approach to speed up 3-dimensional array/matrix creation
My question is one of approach: using iterative methods to create a 3-dimensional array in R (this is the first question; R is a constraint). The use case is that the final array needs to be updated as two input arrays are updated at different periods. The goal is to minimize the final array's creation time, and intermediary steps if possible.
I know I can reach out to Rcpp, and assign more than I need, but for readability I am wondering: is there a better approach to completing this operation?
if (!require("geosphere")) install.packages("geosphere")

# Simulate real data
dimlength <- 418
latlong <- cbind(rep(40, dimlength), rep(2, dimlength))
potentialchurn <- as.matrix(rep(500, dimlength))

# Create 2D matrix
valuemat <- matrix(0, dimlength, dimlength)
value <- potentialchurn
valuetranspose <- t(value)
for (s in 1:dimlength) { valuemat[s, ] <- value + valuetranspose[s] }
diag(valuemat) <- 0

# Create 3D matrix by copying the 2D matrix
bigvalmat <- array(0, dim = c(dimlength, dimlength, dimlength))
for (d in 1:dimlength) { bigvalmat[, d, ] <- valuemat }

# Get crow-fly distance between locations; create 2D matrix
distmat <- as.matrix(outer(seq(dimlength), seq(dimlength),
                           Vectorize(function(i, j) distCosine(latlong[i, ], latlong[j, ]))))

### Create 3D matrix calculating the distance between 2 locations:
# create a 2D matrix for each column in the original 2D matrix,
# then add the column-replicated 2D matrix to the original
bigdistmat <- array(0, dim = c(dimlength, dimlength, dimlength))
for (p in 1:dimlength) {
  addcol <- distmat[, p]
  addmatrix <- as.matrix(addcol)
  for (y in 2:dimlength) { addmatrix <- cbind(addmatrix, addcol) }
  bigdistmat[, p, ] <- data.matrix(distmat) + data.matrix(addmatrix)
}

# Final matrix calculation
bigvaldistmat <- bigvalmat / bigdistmat
...As context, this is part of a 2-step-ahead forecast policy developed for a class using Barcelona bike-sharing (Bicing) data. The project is over, and I am interested in how I could have done better.
In general, if you want to speed up your code, you want to identify the bottlenecks and fix them, as explained here. Putting your code in a function beforehand is a good idea.
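As a sketch of that workflow (the function `slow_fn` and its body are invented here purely for illustration, not taken from the question), wrapping the code in a function lets base R's sampling profiler show where the time actually goes:

```r
# Hypothetical function standing in for the array-building code
slow_fn <- function(n) {
  m <- matrix(0, n, n)
  for (i in 1:n) m[i, ] <- rnorm(n)  # deliberate row-by-row loop to profile
  m
}

Rprof(tmp <- tempfile())         # start the sampling profiler
invisible(slow_fn(2000))
Rprof(NULL)                      # stop profiling
head(summaryRprof(tmp)$by.self)  # top functions ranked by self time
```

Once the hot spots are visible, it is clear which loops are worth vectorizing first.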
In your specific case, you use for loops in your R code; you need to vectorize your code more.
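To make the contrast concrete, here is a minimal sketch (with a made-up 3x1 column of values, not the question's data) of replacing a row-by-row loop with outer, which is the same idea applied in the long answer below:

```r
# Toy data: a 3x1 column of values (hypothetical, for illustration only)
value <- matrix(c(10, 20, 30), ncol = 1)
n <- nrow(value)

# Loop version, as in the question: valuemat[s, j] = value[j] + value[s]
valuemat_loop <- matrix(0, n, n)
for (s in 1:n) {
  valuemat_loop[s, ] <- value + t(value)[s]
}

# Vectorized version: outer builds the same sum table in one call
valuemat_vec <- outer(as.vector(value), as.vector(value), FUN = "+")

identical(valuemat_loop, valuemat_vec)  # TRUE
```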
Edit: long answer:
# Simulate real data; we want it random
dimlength <- 418
latlong <- cbind(rnorm(dimlength, 40, 0.5), rnorm(dimlength, 2, 0.5))
potentialchurn <- as.matrix(rnorm(dimlength, 500, 10))
value <- potentialchurn

# Create 2D matrix: outer is designed for this operation
valuemat <- outer(value, t(value), FUN = "+")[, 1, 1, ]
diag(valuemat) <- 0

# Create 3D matrix by copying the 2D matrix; again, avoid the loop
bigvalmat <- array(rep(valuemat, dimlength), dim = c(dimlength, dimlength, dimlength))
# ...and use aperm to permute the dimensions
bigvalmat <- aperm(bigvalmat, c(1, 3, 2))

# Get crow-fly distance between locations; create 2D matrix.
# Other packages are available to compute this kind of distance matrix,
# but let's stay in plain R. This is wordy but faster (and easier to read):
longs1 <- rep(latlong[, 1], dimlength)
lats1 <- rep(latlong[, 2], dimlength)
latlong1 <- cbind(longs1, lats1)
longs2 <- rep(latlong[, 1], each = dimlength)
lats2 <- rep(latlong[, 2], each = dimlength)
latlong2 <- cbind(longs2, lats2)
distmat <- matrix(distCosine(latlong1, latlong2), ncol = dimlength)

### Create 3D matrix calculating the distance between 2 locations;
# same logic as for bigvalmat
addmatrix <- array(rep(distmat, dimlength), dim = rep(dimlength, 3))
distmat3d <- aperm(addmatrix, c(1, 3, 2))
bigdistmat <- addmatrix + distmat3d

# Final matrix calculation
bigvaldistmat <- bigvalmat / bigdistmat
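The rep + aperm trick above can be checked on a small case (a toy 2x2 matrix, invented here): replicating a matrix along the third dimension and then swapping dimensions 2 and 3 produces exactly what the question's loop builds by assigning the matrix into every [, d, ] slice:

```r
m <- matrix(as.numeric(1:4), 2, 2)  # toy 2x2 matrix
n <- 2

# Loop version from the question: copy m into every [, d, ] slice
big_loop <- array(0, dim = c(n, n, n))
for (d in 1:n) big_loop[, d, ] <- m

# Vectorized version from the answer: replicate, then permute dims 2 and 3
big_vec <- aperm(array(rep(m, n), dim = c(n, n, n)), c(1, 3, 2))

identical(big_loop, big_vec)  # TRUE
```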
This is about 25x faster than the initial code (76s -> 3s). It could still be improved, but you get the idea: avoid for, cbind and co at all costs.
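The cbind point deserves a small illustration (toy 5-element column, invented here): growing a matrix with cbind inside a loop, as in the question's addmatrix construction, reallocates and copies the whole matrix on every iteration, while a single matrix(rep(...)) call allocates once:

```r
n <- 5
addcol <- as.numeric(1:n)

# Growing with cbind (as in the question): reallocates each iteration
grown <- as.matrix(addcol)
for (y in 2:n) grown <- cbind(grown, addcol)

# Single allocation: replicate the column n times in one call
replicated <- matrix(rep(addcol, n), nrow = n)

identical(unname(grown), replicated)  # TRUE
```

The cost of the cbind version grows quadratically with the number of columns, which is why it dominates the runtime inside the question's double loop.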