
I am new to R and I am trying to produce a histogram where the y-axis shows the sum of another column rather than a frequency count.

I have a data frame, mydata, with two columns, RATE and BALANCE. I would like to produce a histogram of RATE whose bars show the sum of BALANCE rather than the record count.

hist(mydata$RATE)  # only shows frequency; how do I get it to sum mydata$BALANCE?

I would like to produce a histogram that sums the BALANCE column rather than just doing a record count. Something like hist(mydata$RATE, mydata$BALANCE) is what I have in mind, but the hist function doesn't appear to take a sum parameter.


2 Answers


It sounds like you want a bar plot rather than a histogram. The barplot function should help.

First, as suggested by @DWin, create some reproducible data:

set.seed(1) # Sets the starting seed for pseudo-random number generation
mydata <- data.frame(RATE = sample(LETTERS[1:5], 100, replace = TRUE),
  BALANCE = rpois(100, 15) * 10)

Then create the summary data using the function tapply. This will calculate the sum of your BALANCE variable over each value of your RATE variable.

plotdata <- tapply(mydata$BALANCE, mydata$RATE, FUN = sum)

Then plot that using barplot:
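
A minimal sketch of that call, assuming the mydata and plotdata objects built above (they are recreated here so the snippet runs on its own; the axis labels are my own additions):

```r
# Recreate the example data so this block is self-contained
set.seed(1)
mydata <- data.frame(RATE = sample(LETTERS[1:5], 100, replace = TRUE),
                     BALANCE = rpois(100, 15) * 10)

# Sum BALANCE within each RATE value
plotdata <- tapply(mydata$BALANCE, mydata$RATE, FUN = sum)

# One bar per RATE value; bar height is the total BALANCE for that value
barplot(plotdata, xlab = "RATE", ylab = "Sum of BALANCE")
```

Each bar's height is the total BALANCE for one RATE value, so the bar heights add up to sum(mydata$BALANCE).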




That's not an example we can test to see whether it meets your expectations (and it's also not exactly clear what you mean by "taking a sum parameter"), but try this:

hist(cumsum(mydata$BALANCE))
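
Note that the line above bins the running total of BALANCE and does not use RATE at all. If RATE is numeric and the goal is a histogram whose bar heights are summed balances, one possible approach is to bin RATE with cut and sum BALANCE per bin. This is only a sketch under that assumption: the numeric RATE values below are made up for illustration, and the names breaks, bins, and binsums are my own.

```r
# Hypothetical data: RATE is numeric here (an assumption, not from the question)
set.seed(1)
mydata <- data.frame(RATE = runif(100, min = 0, max = 10),
                     BALANCE = rpois(100, 15) * 10)

# Reuse the bin edges hist() would choose, without drawing anything
breaks <- hist(mydata$RATE, plot = FALSE)$breaks

# Assign each RATE to a bin, then sum BALANCE within each bin
bins <- cut(mydata$RATE, breaks = breaks, include.lowest = TRUE)
binsums <- tapply(mydata$BALANCE, bins, FUN = sum)
binsums[is.na(binsums)] <- 0  # bins with no records get a sum of zero

# space = 0 removes the gaps between bars so the plot reads like a histogram
barplot(binsums, space = 0, xlab = "RATE", ylab = "Sum of BALANCE")
```

Taking the breaks from hist(..., plot = FALSE) keeps the bins identical to those of the plain frequency histogram, so the two plots can be compared bar for bar.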