147
votes

I'm using R and ggplot to draw a scatterplot of some data, all is fine except that the numbers on the y-axis are coming out with computer style exponent formatting, i.e. 4e+05, 5e+05, etc. This is unacceptable to me, so I want to get it to display them as 500,000, 400,000, and so on. Getting a proper exponent notation would also be acceptable.

The code for the plot is as follows:

p <- ggplot(valids, aes(x=Test, y=Values)) +
  geom_point(position="jitter") +
  facet_grid(. ~ Facet) +
  scale_y_continuous(name="Fluorescent intensity/arbitrary units") +
  scale_x_discrete(name="Test repeat") +
  stat_summary(fun.ymin=median, fun.ymax=median, fun.y=median, geom="crossbar")

Any help much appreciated.

5
Be careful of describing ggplot default options as "obviously unacceptable". You mean you have a personal preference for a different format. A number in the format 4e+05 is scientific notation, and would be the preferred formatting in a wide variety of applications.Andrie
4e+05 is not scientific notation, it is a computer approximation to scientific notation. It would not be acceptable in any print journal I can think of, so I consider it unacceptable for my dissertation.Jack Aidley

5 Answers

141
votes

Another option is to format your axis tick labels with commas is by using the package scales, and add

 scale_y_continuous(name="Fluorescent intensity/arbitrary units", labels = comma)

to your ggplot statement.

If you don't want to load the package, use:

scale_y_continuous(name="Fluorescent intensity/arbitrary units", labels = scales::comma)
72
votes

I also found another way of doing this that gives proper 'x10(superscript)5' notation on the axes. I'm posting it here in the hope it might be useful to some. I got the code from here so I claim no credit for it, that rightly goes to Brian Diggs.

fancy_scientific <- function(l) {
     # turn in to character string in scientific notation
     l <- format(l, scientific = TRUE)
     # quote the part before the exponent to keep all the digits
     l <- gsub("^(.*)e", "'\\1'e", l)
     # turn the 'e+' into plotmath format
     l <- gsub("e", "%*%10^", l)
     # return this as an expression
     parse(text=l)
}

Which you can then use as

ggplot(data=df, aes(x=x, y=y)) +
   geom_point() +
   scale_y_continuous(labels=fancy_scientific) 
45
votes
x <- rnorm(10) * 100000
y <- seq(0, 1, length = 10)
p <- qplot(x, y)
library(scales)
p + scale_x_continuous(labels = comma)
16
votes

I'm late to the game here but in-case others want an easy solution, I created a set of functions which can be called like:

 ggplot + scale_x_continuous(labels = human_gbp)

which give you human readable numbers for x or y axes (or any number in general really).

You can find the functions here: Github Repo Just copy the functions in to your script so you can call them.

12
votes

I find Jack Aidley's suggested answer a useful one.

I wanted to throw out another option. Suppose you have a series with many small numbers, and you want to ensure the axis labels write out the full decimal point (e.g. 5e-05 -> 0.0005), then:

NotFancy <- function(l) {
 l <- format(l, scientific = FALSE)
 parse(text=l)
}

ggplot(data = data.frame(x = 1:100, 
                         y = seq(from=0.00005,to = 0.0000000000001,length.out=100) + runif(n=100,-0.0000005,0.0000005)), 
       aes(x=x, y=y)) +
     geom_point() +
     scale_y_continuous(labels=NotFancy)