I need an outer join of ffdf data frames stored in a list. I have checked this, but it refers to a different problem. Example code for in-RAM objects:

x1 = data.frame(name='a1', Ai=2, Ac=1, Bi=1)
x2 = data.frame(name='a2', Ai=1, Bi=3, Bc=1, Ci=1)
x3 = data.frame(name='a3', Ai=3, Ac=2, Bi=2, Ci=3, Cc=1, Di=2, Dc=2)
x4 = data.frame(name='a4', Ai=3, Bi=2, Ci=1, Fi=2)
dl = list(x1,x2,x3,x4)
mergedDF = Reduce(function(...) merge(..., all=TRUE), dl)
mergedDF[is.na(mergedDF)] = 0

The desired result looks like:

mergedDF
  name Ai Bi Ci Ac Bc Cc Di Dc Fi
1   a1  2  1  0  1  0  0  0  0  0
2   a2  1  3  1  0  1  0  0  0  0
3   a3  3  2  3  2  0  1  2  2  0
4   a4  3  2  1  0  0  0  0  0  2

As soon as I convert the data frames to ffdf, though, I get the error

Error in merge.ffdf(..., all = T) : merge.ffdf only allows inner joins

Any known workarounds? Many thanks in advance.

If I understand your question correctly, the development version of ffbase contains a function called ffdfrbind.fill (similar to rbind.fill). library(devtools); install_github("edwindj/ffbase", subdir="pkg") will install that development version. Normally ffdfrbind.fill(x1, x2, x3, x4) will get you there. - user1600826
rbind.fill functionality is indeed what is needed. Unfortunately, I get this error when I try install_github("edwindj/ffbase", subdir="pkg"): ERROR: compilation failed for package 'ffbase' - Audrey
I believe you are working on Windows. If you want to install the package from source, as install_github does, you need to have Rtools installed. Do you have Rtools installed? cran.r-project.org/bin/windows/Rtools - user1600826
Warning message: package ‘Rtools’ is not available (for R version 3.0.2) - Audrey
Is Rtools in your path? Maybe you need to restart your computer before it shows up in your path. - user1600826

1 Answer

This post helped me: "Combine two data frames by rows (rbind) when they have different sets of columns". To do a similar thing with your data:

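   install.packages('plyr')              # only needed once
   library(plyr)
   answer <- Reduce(rbind.fill, dl)      # stack rows, filling missing columns with NA
   answer[is.na(answer)] <- 0            # replace the NA fill values with 0
   answer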

  name Ai Ac Bi Bc Ci Cc Di Dc Fi
1   a1  2  1  1  0  0  0  0  0  0
2   a2  1  0  3  1  1  0  0  0  0
3   a3  3  2  2  0  3  1  2  2  0
4   a4  3  0  2  0  1  0  0  0  2
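If installing plyr is not an option, the same fill-then-stack idea can be sketched in base R. This is only a rough illustration for ordinary in-RAM data frames, not a tested ffdf solution; pad_columns is a hypothetical helper name made up for this example and is not part of plyr, ff, or ffbase.

```r
# Hypothetical helper (not from plyr/ff/ffbase): add every column in
# all_cols that df lacks, filled with `fill`, then put the columns in
# the common order so the frames can be stacked with rbind.
pad_columns <- function(df, all_cols, fill = 0) {
  for (col in setdiff(all_cols, names(df))) {
    df[[col]] <- fill
  }
  df[, all_cols, drop = FALSE]
}

# Same example data as in the question
x1 <- data.frame(name = "a1", Ai = 2, Ac = 1, Bi = 1)
x2 <- data.frame(name = "a2", Ai = 1, Bi = 3, Bc = 1, Ci = 1)
x3 <- data.frame(name = "a3", Ai = 3, Ac = 2, Bi = 2, Ci = 3, Cc = 1, Di = 2, Dc = 2)
x4 <- data.frame(name = "a4", Ai = 3, Bi = 2, Ci = 1, Fi = 2)
dl <- list(x1, x2, x3, x4)

# Union of all column names, in order of first appearance
all_cols <- unique(unlist(lapply(dl, names)))

# Pad each data frame to the common columns, then stack the rows
answer <- do.call(rbind, lapply(dl, pad_columns, all_cols = all_cols))
rownames(answer) <- NULL
answer
```

Because each piece is padded on its own before stacking, the same pattern might carry over to chunk-wise processing, but I have not tried that with ffdf objects.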

BTW, nice thought with Reduce; it's a nifty little function that rarely (at least for me) gets used.