2
votes

I want to plot a histogram of my data, but I'm struggling with two issues.

First, how do I separate the bars (frequencies) for each break value? In other words, I have set breaks for the x-axis on a log scale and I want to plot bars only at these breaks. I don't want contiguous histogram bars (next to each other); I want gaps between them.

Second, I'm wondering how to apply a condition to the breaks. For example, I have breaks=c(0.1,0.2,0.5,1,2,5,10,30,40); how do I add one more break as a condition, something like breaks=c(0.1,0.2,0.5,1,2,5,10,30,40, "any value > 40")?

Here is my data:

structure(list(Time = c(0.08618, 0.086591, 0.086752, 0.18448, 
0.093463, 0.092634, 0.087419, 0.087307, 0.085734, 0.085272, 0.18448, 
0.085154, 0.085021, 0.084936, 0.091301, 0.177737, 0.18448, 0.089677, 
0.084906, 0.08614, 0.194328, 0.10183, 0.086494, 0.088581, 0.089195, 
0.089914, 0.090335, 0.086295, 0.086589, 0.10714, 0.265871, 0.315305, 
0.251465, 0.167559, 0.828143, 0.19883, 0.16173, 0.297092, 0.199025, 
0.196639, 0.20123, 0.206766, 0.205378, 0.490892, 0.226212, 11.197049, 
3.215287, 0.201566, 8.732194, 1.890716, 0.589986, 15.215162, 
0.196188, 0.219697, 9.816025, 0.290359, 0.233825, 3.230766, 4.605698, 
0.804751, 0.41611, 0.51733, 9.318433, 0.812274, 0.41187, 9.843202, 
0.607423, 0.823639, 932, 0.243041, 0.309908, 929, 0.70039, 0.706538, 
9.848918, 0.427812, 2.213476, 923, 3.428199, 921, 6.247575, 1.007718, 
918, 0.628396, 0.156748, 800, 914, 900, 890, 850, 650)), .Names = "Time", row.names = c(NA, 
-91L), class = "data.frame")

Here is my code:

 ggplot(DF, aes(x =Time))+
 geom_histogram(bin=0.1,position = "dodge", colour = "black", fill = "white")+
 scale_x_log10(breaks=c(0.1,0.2,0.5,1,2,5,10,20,30,40),expand=c(0.005,0.1))+
 scale_y_continuous(expand=c(0.04,0.3))

Below is what I'm getting...

[image: the histogram produced by the code above]

Update: I want to get something like this: [image: the desired plot]

I know this is a bar plot. However, I got it from Excel, which automatically calculates the histogram for a range of bins. I was hoping to do the whole thing in ggplot.
Any suggestions?

If you want gaps you need to use geom_bar instead, and that will require you to calculate the bins and the counts manually. - Christie Haskell Marsh
I got the graph that I want in Excel and was hoping to get it in ggplot2. Excel does the counting for me. - SimpleNEasy
If you post an image of your Excel graph, I can see if I can re-create it using ggplot. - Christie Haskell Marsh
Basically, Excel lets you set a range of bins and counts the frequency for each bin. Plotting the Excel results in ggplot would be simple, but I was hoping to do it all in ggplot instead of taking the results from Excel and drawing a bar plot. I guess there is no other option, unless there is a way of setting a range of bins that I'm not aware of. - SimpleNEasy
It would be much easier if you just posted an image of the plot you want to create. - Christie Haskell Marsh

2 Answers

1
votes

As far as I know, you can't have gaps between the bars of a histogram in ggplot2.

For your second question, this code:

ggplot(DF, aes(x = Time))+
  geom_histogram(binwidth = 0.1, colour = "black", fill = "white")+
  scale_x_log10(breaks = c(0.1,0.2,0.5,1,2,5,10,20,30,40,100),
                labels = c("0.1","0.2","0.5","1","2","5","10","20","30","40","> 100"),
                expand = c(0.005,0.1))+
  scale_y_continuous(expand = c(0.04,0.3))

gives this result: [image: histogram with the last x-axis tick labelled "> 100"]
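Note that relabelling a tick does not move any data: values above the last break are still drawn at their real positions. If the goal is a single catch-all bar for "any value > 40", one option is to fold those values onto one position before plotting. This is only a sketch under assumptions: the small `DF` below is a made-up sample rather than the question's data, and `cap` and `TimeCapped` are hypothetical names introduced for illustration.

```r
library(ggplot2)

# Made-up sample standing in for the question's DF.
DF <- data.frame(Time = c(0.09, 0.18, 0.83, 3.2, 9.8, 15.2, 650, 932))

# Assumed plotting position for the catch-all bin; any value above 40 works.
cap <- 100

# Fold every value above 40 onto `cap`; leave the others unchanged.
DF$TimeCapped <- ifelse(DF$Time > 40, cap, DF$Time)

p <- ggplot(DF, aes(x = TimeCapped)) +
  geom_histogram(binwidth = 0.1, colour = "black", fill = "white") +
  scale_x_log10(breaks = c(0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 30, 40, cap),
                labels = c("0.1", "0.2", "0.5", "1", "2", "5", "10", "20",
                           "30", "40", "> 40"))
print(p)
```

The bars are still contiguous histogram bins, so this addresses only the second question, not the gaps.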

1
votes

This uses your original breaks. I just calculated the counts manually.

brks<-c(0.1,0.2,0.5,1,2,5,10,30,40,"more")

# sum() of a logical vector counts the TRUEs, i.e. the rows falling in each bin
count<-rep(0,10)
count[1]<-sum(DF$Time<=0.1)
count[2]<-sum(DF$Time>0.1 & DF$Time<=0.2)
count[3]<-sum(DF$Time>0.2 & DF$Time<=0.5)
count[4]<-sum(DF$Time>0.5 & DF$Time<=1)
count[5]<-sum(DF$Time>1 & DF$Time<=2)
count[6]<-sum(DF$Time>2 & DF$Time<=5)
count[7]<-sum(DF$Time>5 & DF$Time<=10)
count[8]<-sum(DF$Time>10 & DF$Time<=30)
count[9]<-sum(DF$Time>30 & DF$Time<=40)
count[10]<-sum(DF$Time>40)

data<-data.frame("breaks"=brks,"count"=count)

ggplot(data,aes(x=breaks,y=count))+
  geom_bar(stat="identity")+
  scale_x_discrete(limits=c(0.1,0.2,0.5,1,2,5,10,30,40,"more"))

[image: bar plot of the manual counts]
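The same binning can be written more compactly with base R's cut() and table(), which is roughly what Excel does with its bin ranges. This is a minimal sketch, not a replacement for the code above: the short `DF` is a hypothetical stand-in built from a few of the question's values, and `edges`, `bin_labels`, `bins` and `counts` are names introduced for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the question's DF (a handful of its values).
DF <- data.frame(Time = c(0.08618, 0.18448, 0.265871, 0.828143, 1.890716,
                          3.215287, 9.816025, 15.215162, 650, 932))

# Upper bin edges from the question; -Inf and Inf leave both end bins open,
# so the last bin catches "any value > 40".
edges <- c(-Inf, 0.1, 0.2, 0.5, 1, 2, 5, 10, 30, 40, Inf)
bin_labels <- c("0.1", "0.2", "0.5", "1", "2", "5", "10", "30", "40", "more")

# cut() assigns each value to a right-closed interval (a, b];
# table() counts per bin and keeps empty bins at zero.
bins <- cut(DF$Time, breaks = edges, labels = bin_labels, right = TRUE)
counts <- as.data.frame(table(bins), responseName = "count")

# A discrete x-axis gives evenly spaced bars; width < 1 leaves gaps.
p <- ggplot(counts, aes(x = bins, y = count)) +
  geom_bar(stat = "identity", width = 0.6, colour = "black", fill = "white")
print(p)
```

Because `bins` is a factor whose levels follow `bin_labels`, the bars keep the intended order without an explicit scale_x_discrete(limits = ...).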

EDIT: Here's the plot with all of the options from your first attempt:

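# scale_x_log10() needs a numeric x, so the "more" bin is placed at 600 here
# (an assumption based on the breaks below; the bin holds every value above 40)
data$pos<-c(0.1,0.2,0.5,1,2,5,10,30,40,600)

ggplot(data,aes(x=pos,y=count))+
  geom_bar(stat="identity",colour = "black",fill = "white")+
  scale_x_log10(breaks=c(0.1,0.2,0.5,1,2,5,10,30,40,600),
                labels = c("0.1","0.2","0.5","1","2","5","10","30","40","> 600"),
                expand=c(0.005,0.1))+
  scale_y_continuous(expand=c(0.04,0.3))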

[image: the same bar plot on a log-scaled x-axis]

EDIT2: A wider plot, to put some distance between 30 and 40:

[image: wider version of the plot]