I have two lists of dataframes (df1 and df2). The dataframes in both lists have the same four columns with the same classes, but they differ in the number of rows and in the data they contain:

df1.1=

Col1 Col2 Col3 Col4

chr  chr  num  num

df2.1 =

Col1 Col2 Col3 Col4

chr  chr  num  num 

and so on, for 100 dataframes per list.

Each set is stored in a master list of dataframes (e.g. df1) that I run lapply functions on.

I want to reduce all the dataframes in each list to one dataframe. To do this I used:

reducedf1 <- df1 %>% reduce(full_join)
reducedf2 <- df2 %>% reduce(full_join)

For df1 it worked. For df2 it did not. The error given was:

Error in full_join_impl(x,y, by_x, by_y, aux_x, aux_y, na_matches) : Can't join on 'Col2' x 'Col2' because of incompatible types (integer/character). 

To fix this I tried to check if there were dataframes inside the list that did not have the same class and correct them:

testingfunction<-function(x){
col_to_change=x[,2];
class(col_to_change)<-"character";
x}

mutatedf2<-lapply(df2, testingfunction)

I checked: Col2 still shows class character in every dataframe, but the join fails with the same error.

If my dataframes all have the same classes, were created in the same way, and were put in the same kind of list, why would the join work for one list and not the other? How can I solve this error so that I can merge the dataframes in each list into one large dataframe using reduce and full_join?

I'm not sure what you think col_to_change=[x,2]; is doing. Why not just have class(x[[2]])<-"character"; x in your function? Better yet, why not run sapply(df2, function(x) class(x[[2]])) to find out which dataframes have a different class, so you can check what is actually causing the discrepancy? - Sarah
class(x[[2]])<-"character"; x will only tell me about the 2nd dataframe in a list, not the second column in the list. - MGru
According to the error- it's Col2 that has the issue... - MGru
I do see that I have an error in syntax there- should be col_to_change = x[,2] - MGru
class(x[[2]]) <- "character" would only touch the second data.frame if you called it on the list itself, but you are calling it from lapply, which acts on each data.frame in the list in turn, so x[[2]] is the second column. But using a built-in function such as transform, as you have below, is a better idea anyway. - Sarah
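
To make the point in the comments concrete, here is a minimal sketch of why the helper from the question leaves the data unchanged. The data frame `df` and the name `fix_col2` are made up for illustration. The helper changes a local copy of the column and then returns the original `x`, so the conversion is lost.

```r
# Hypothetical example: Col2 was stored as integer.
df <- data.frame(Col1 = c("a", "b"), Col2 = c(1L, 2L), stringsAsFactors = FALSE)

# Version from the question: only the local copy `col_to_change` is modified.
testingfunction <- function(x) {
  col_to_change <- x[, 2]
  class(col_to_change) <- "character"
  x  # returned unchanged
}

# Hypothetical corrected version: assign the converted column back into x.
fix_col2 <- function(x) {
  x[[2]] <- as.character(x[[2]])
  x
}

class(testingfunction(df)$Col2)  # "integer"
class(fix_col2(df)$Col2)         # "character"
```

Used as lapply(df2, fix_col2), the second version should give the same result as the transform call in the answer.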

1 Answer

I have no idea why it happened for one list of dataframes and not the other, but I solved it with:

Editdf2 <- lapply(df2, transform, Col2 = as.character(Col2))

After that, the reduce ran with no issue.
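
As a small sketch of that fix on toy data (the two data frames below are invented stand-ins for df2, not the real data), the coercion can be checked in base R before running the join:

```r
# Hypothetical stand-in for df2: the second data frame has an integer Col2.
df2 <- list(
  data.frame(Col1 = "a", Col2 = "x", Col3 = 1.5, Col4 = 2.5, stringsAsFactors = FALSE),
  data.frame(Col1 = "b", Col2 = 7L,  Col3 = 3.5, Col4 = 4.5, stringsAsFactors = FALSE)
)

# Same call as above: coerce Col2 to character in every data frame.
Editdf2 <- lapply(df2, transform, Col2 = as.character(Col2))

# Confirm that Col2 now has a single class across the whole list.
col2_classes <- vapply(Editdf2, function(x) class(x$Col2), character(1))
col2_classes

# With the types aligned, the join from the question can be run:
# reducedf2 <- Editdf2 %>% reduce(full_join)
```

The join itself is left as a comment so that the sketch runs without dplyr and purrr loaded.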

I assume that in one dataframe in the list, Col2 was stored as integer instead of character, but I had no way of knowing that in advance.
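
For what it's worth, a check along the lines of the sapply suggestion in the comments can show which dataframes differ before joining. This is only a sketch on invented data: the three data frames below are hypothetical, and the thread does not confirm which element of the real df2 was affected.

```r
# Hypothetical stand-in for df2: the third data frame has an integer Col2.
df2 <- list(
  data.frame(Col1 = "a", Col2 = "x", stringsAsFactors = FALSE),
  data.frame(Col1 = "b", Col2 = "y", stringsAsFactors = FALSE),
  data.frame(Col1 = "c", Col2 = 3L,  stringsAsFactors = FALSE)
)

# Class of the second column in each data frame.
col2_classes <- vapply(df2, function(x) class(x[[2]])[1], character(1))

# Tally of classes across the list, then the positions that are not character.
table(col2_classes)
offenders <- which(col2_classes != "character")
offenders  # 3
```

Looking at df2[offenders] afterwards should show what the mismatched column actually holds.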