I want to find the most frequent value (the mode) of a categorical variable within each group. I tried the mlv function from the modeest package, but I am getting NAs.
user <- c("A","B","A","A","B","A","B","B")
color <- c("blue","green","blue","blue","green","yellow","pink","blue")
df <- data.frame(user,color)
df$color <- as.factor(df$color)
library(plyr)
library(dplyr)
library(modeest)
summary <- ddply(df,.(user),summarise,mode=mlv(color,method="mlv")[['M']])
Warning messages:
1: In discrete(x, ...) : NAs introduced by coercion
2: In discrete(x, ...) : NAs introduced by coercion
summary
user mode
1 A NA
2 B NA
What I need instead is this:
user mode
A blue
B green
What am I doing wrong? I tried other methods, as well as just mlv(x=color). According to the help pages of modeest, it should work for factors.
I don't want to use table(), as I need a simple function that I can use to create a summary table, like in this question: How to get the mode of a group in summarize in R, but for a categorical column.
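To make the goal concrete, here is a minimal sketch of the kind of helper I have in mind. The name cat_mode is hypothetical (it is not part of modeest), and the tie-breaking rule (first value seen wins) is just an assumption made for illustration. It uses only base R, so it does not rely on table() or on any package.

```r
# Hypothetical helper (not from modeest): most frequent value of a vector.
# Assumption: ties are resolved in favour of the value that appears first.
cat_mode <- function(x) {
  x <- as.character(x)                # work on labels rather than factor codes
  ux <- unique(x)                     # distinct values, in order of first appearance
  counts <- tabulate(match(x, ux), nbins = length(ux))  # count of each distinct value
  ux[which.max(counts)]               # which.max returns the first maximum
}

# Same example data as above
user  <- c("A","B","A","A","B","A","B","B")
color <- c("blue","green","blue","blue","green","yellow","pink","blue")
df <- data.frame(user, color)

# Grouped summary with base R, so the sketch has no package dependencies
result <- aggregate(color ~ user, data = df, FUN = cat_mode)
names(result)[2] <- "mode"
result
```

A function of this shape should also be usable in the grouped call from my attempt, for example ddply(df, .(user), summarise, mode = cat_mode(color)), since it returns a single value per group.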