I have a data.frame like so:

category count
A        11
B        1
C        45
A        1003
D        20
B        207
E        634
E        40
A        42
A        7
B        44
B        12

Each row represents a specific element with a category type and a count of that element. I would like to produce a frequency distribution of counts per category, but at the moment each category appears in several rows.

How do I get a table with the counts summed per category, so that each category appears only once? I.e., I want a table that looks like:

category count
A        11234
B        4005
C        100023
D        65567
E        54654
...      ...

I almost got there using lapply:

df.nrcounts <- lapply(unique(df.counts$category), 
  function(x) c(category=x, count=sum(subset(df.counts, category==x)$count)))

but I can't seem to coerce the output to a proper dataframe. I can't quite get my head around using the function.

1 Answer

aggregate(df.counts$count, by = list(df.counts$category), FUN = sum)

Or

library(data.table)
setDT(df.counts)[, list(count=sum(count)), by = category]
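
For reference, here is a minimal sketch of the base R route run on the sample rows from the question. It uses the formula interface to aggregate() rather than the `by = list(...)` call above, so the result keeps the column names `category` and `count`. It also shows one possible repair of the lapply attempt: c() can only hold a single type, so mixing the category with the numeric sum coerces the values, whereas returning a one-row data.frame per category and stacking the pieces with rbind keeps `count` numeric. The name `pieces` is only illustrative.

```r
# Sample data copied from the question
df.counts <- data.frame(
  category = c("A", "B", "C", "A", "D", "B", "E", "E", "A", "A", "B", "B"),
  count    = c(11, 1, 45, 1003, 20, 207, 634, 40, 42, 7, 44, 12),
  stringsAsFactors = FALSE
)

# Formula interface: sum `count` within each `category`
totals <- aggregate(count ~ category, data = df.counts, FUN = sum)
print(totals)

# Repaired lapply approach: build a one-row data.frame per category,
# then stack the rows with rbind
pieces <- lapply(unique(df.counts$category), function(x) {
  data.frame(category = x,
             count = sum(df.counts$count[df.counts$category == x]),
             stringsAsFactors = FALSE)
})
df.nrcounts <- do.call(rbind, pieces)
print(df.nrcounts)
```

On the twelve sample rows, both results should give A = 1063, B = 264, C = 45, D = 20, E = 674.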