3
votes

I've seen several questions about the order of x-axis marks, but none of them solved my problem. I'm trying to make a density plot that shows the distribution of people by percentile within each score, like this:

library(dplyr); library(ggplot2); library(ggthemes)  # theme_tufte() comes from ggthemes
ggplot(KA,aes(x=percentile,group=kscore,color=kscore))+
  xlab('Percentil')+ ylab('Frecuencia')+ theme_tufte()+ ggtitle("Prospectos")+
  scale_color_brewer(palette = "Greens")+geom_density(size=3)

But the x-axis marks get ordered like 1,10,100,11,12,..,2,20,21,..,99 instead of just 1,2,3,..,100, which is my desired output.

I fear this affects the whole plot, not just the labels.

1
Your x variable is a factor. You probably want it to be numeric. KA$percentile = as.numeric(as.character(KA$percentile)). - Gregor Thomas
Adding a dput(head(KA)) would help confirm this. - jeremycg
...but when your sorting is clearly alphabetical "1, 10, 100, 11, 12, ..., 2", confirmation is hardly necessary. - Gregor Thomas
And with 100 levels, dput(droplevels(head(KA))) would be better. - Gregor Thomas
Because the x axis is ordered alphabetically, you can check this post: stackoverflow.com/questions/12774210/… - mpalanco

1 Answer

1
votes

I'll turn my comment into an answer so this can be marked resolved:

Your x variable is (almost certainly) a factor. You probably want it to be numeric.

KA$percentile = as.numeric(as.character(KA$percentile))

When you're seeing weird stuff, it's good to check on your data. Running str(KA) is a good way to see what's there. If you just want to see classes, sapply(KA, class) is a nice summary.
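
As a rough illustration of those checks, here is a small sketch on a made-up data frame. KA_demo and its values are hypothetical stand-ins, since the real KA was never posted; the only assumption is that percentile ended up stored as a factor built from text.

```r
# Hypothetical stand-in for KA: percentile stored as a factor built from text
KA_demo <- data.frame(
  percentile = factor(c("1", "2", "10", "11", "100")),
  kscore     = factor(c("a", "a", "b", "b", "b"))
)

# Full structure: percentile shows up as a Factor, not num or int
str(KA_demo)

# Just the classes, one entry per column
col_classes <- sapply(KA_demo, class)
print(col_classes)

# Levels of a factor built from text are sorted as text,
# which is the same ordering the x axis showed in the question
percentile_levels <- levels(KA_demo$percentile)
print(percentile_levels)
```

If percentile_levels comes back in the "1", "10", "100", "11", ... order, that is a strong hint the axis is being drawn from factor levels rather than from numbers.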

A common R quirk: when converting from factor to numeric, go by way of character, or you risk ending up with just the internal level codes:

year_fac = factor(1998:2002)
as.numeric(year_fac) # not so good
# [1] 1 2 3 4 5
as.numeric(as.character(year_fac)) # what you want
# [1] 1998 1999 2000 2001 2002
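
To tie this back to the percentile column, here is a minimal sketch on a hypothetical factor (percentile_fac is a made-up stand-in for KA$percentile). It also shows the alternative idiom mentioned in R's ?factor help page, as.numeric(levels(f))[f], which should give the same result, plus a quick count of values that failed to convert.

```r
# Hypothetical stand-in for KA$percentile stored as a factor
percentile_fac <- factor(c("1", "10", "100", "11", "2"))

# Route 1: via character, as in the answer
via_character <- as.numeric(as.character(percentile_fac))

# Route 2: convert the levels once, then index by the factor's integer codes
via_levels <- as.numeric(levels(percentile_fac))[percentile_fac]

# Both routes should agree
same_result <- identical(via_character, via_levels)
print(same_result)

# Once numeric, sorting follows numeric order instead of text order
sorted_values <- sort(via_character)
print(sorted_values)

# Entries that are not valid numbers would become NA (with a warning),
# so it is worth counting them after the conversion
n_bad <- sum(is.na(via_character))
print(n_bad)
```

A non-zero n_bad on real data would suggest stray text in the column (for example a label such as "N/A"), which may also be why it was read in as a factor to begin with.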