9
votes

I am using ggplot in Python to plot a histogram whose x variable is categorical, and I want to change the x-axis tick labels. Here is my code:

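import pandas
from ggplot import *

# Raw string so the backslashes in the Windows path are not treated as escapes
df = pandas.read_csv(r'C:\Users\...csv')

def plot_data(df):
    # data_by_group is assumed to be defined elsewhere (the grouped data)
    # The outer parentheses let the expression continue across lines
    plot = (ggplot(data_by_group, aes('x', 'y')) +
            geom_histogram(stat='bar') + ggtitle('title') +
            xlab('x-label') + ylab('y-label'))
    #x_ticklabels = ['a', 'b', 'c']
    return plot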

I would like to use the x_ticklabels on the x-axis instead of the numbers from the categorical variable.

Any ideas on how to do this?

Thank you


2 Answers

3
votes

There is a good example here (under "Setting tick mark labels") showing how to do this. Briefly, given a ggplot object "bp", you can control the tick labels by giving a label for each category in your data, like this:

bp + scale_x_discrete(breaks=c("ctrl", "trt1", "trt2"),
                  labels=c("Control", "Treat 1", "Treat 2"))

So in your case, I imagine you would do something along the lines of:

plot = ggplot(data_by_group, aes('x', 'y')) +
       geom_histogram(stat='bar') + ggtitle('title') + 
       xlab('x-label') + ylab('y-label') +
       scale_x_discrete(breaks=c(1, 2, 3),
                  labels=c("a", "b", "c"))
1
votes

To make the answer from MeloMCR work in Python, change:

    plot = ggplot(data_by_group, aes('x', 'y')) +
           geom_histogram(stat='bar') + ggtitle('title') +
           xlab('x-label') + ylab('y-label') +
           scale_x_discrete(breaks=c(1, 2, 3),
                            labels=c("a", "b", "c"))

to

    plot = (ggplot(data_by_group, aes('x', 'y')) +
            geom_histogram(stat='bar') + ggtitle('title') +
            xlab('x-label') + ylab('y-label') +
            scale_x_discrete(breaks=[1, 2, 3],
                             labels=["a", "b", "c"]))

The vector constructor c(a1, a2, a3, ...) is R syntax and is not recognized by Python, so use a Python list instead. The outer parentheses let the expression span several lines.
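
If the category codes and their display labels are kept together in one place, the two lists for breaks and labels can be derived from a single dictionary so they cannot get out of step. The following is only a minimal sketch in plain Python; the names tick_labels and breaks_and_labels are hypothetical and not part of ggplot, and the codes 1, 2, 3 are taken from the example above.

```python
# Hypothetical mapping from category codes to the labels to show on the axis.
tick_labels = {1: "a", 2: "b", 3: "c"}


def breaks_and_labels(mapping):
    """Return (breaks, labels) as two parallel lists, ordered by break value."""
    breaks = sorted(mapping)
    labels = [mapping[b] for b in breaks]
    return breaks, labels


breaks, labels = breaks_and_labels(tick_labels)
print(breaks)
print(labels)
```

The resulting lists could then be passed on as scale_x_discrete(breaks=breaks, labels=labels) in the plot expression.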