I previously used dcast() on a data frame in which one column was ID and another held the multiple codes assigned to each ID. My df1 looked like this:
ID codes gfreq
123 FGV34 0.988
123 FGV34 0.988
123 FGV34 0.988
566 WER45 na
566 FGV34 0.988
566 FGV34 0.988
I wanted to reshape it into this format:
ID FGV34 WER45
123 1 0
566 1 1
dcast(df1, ID ~ codes)
That worked perfectly. Now I have a similar data frame, df2, which has just two columns, ID and codes:
ID codes
123 FGV34
123 FGV34
123 FGV34
566 WER45
566 FGV34
566 FGV34
When I run it through dcast():
1. I get a warning about value.var being overridden, and the codes column is used as value.var, which is okay.
2. The output format is completely different this time:
ID FGV34 WER45
123 FGV34 NA
566 FGV34 WER45
I have checked the data types of the columns in df1 and df2; they are the same for both ID and codes. I would like help getting output like before, with 0 or 1 instead of NA and the code name. I would also like to understand what changed to make dcast() behave differently.
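For reference, here is a minimal sketch of one way to get the 0/1 layout from df2. It assumes dcast() comes from the reshape2 package (the post does not say which package is loaded) and rebuilds df2 by hand from the rows shown above. When no aggregation is applied, dcast() puts the value.var values themselves into the cells, which would match the FGV34/WER45/NA output. Passing fun.aggregate = length explicitly makes it count rows instead. Whether this is exactly what differed between df1 and df2 is not confirmed by the post.

```r
library(reshape2)  # assumption: dcast() is reshape2::dcast

# df2 rebuilt from the rows shown in the question
df2 <- data.frame(
  ID = c(123, 123, 123, 566, 566, 566),
  codes = c("FGV34", "FGV34", "FGV34", "WER45", "FGV34", "FGV34"),
  stringsAsFactors = FALSE
)

# unique() drops repeated ID/code pairs so each count is at most 1;
# fun.aggregate = length counts rows per cell, and empty cells become 0
wide <- dcast(unique(df2), ID ~ codes,
              value.var = "codes", fun.aggregate = length)

print(wide)
#    ID FGV34 WER45
# 1 123     1     0
# 2 566     1     1
```

Naming value.var explicitly also avoids the message about the value column being guessed.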
A tidyr and dplyr solution might help. Try this: df %>% filter(!is.na(Codes)) %>% spread(Codes, ID) – deepseefan
Codes. – deepseefan