1
votes

I have a data table df, I would like to group it by the first column, 'location', and compute the frequency of entries across all table, both columns and rows:

df

location   NN_1    NN_2   NN_3
    NYC    17      17      17
    NYC    17      16      1
    LA     1        1      10
    LA     16      10      1

which can be obtained by:

df <- structure(list(location = c("NYC", "NYC", "LA", "LA"), 
                     NN_1 = c(17, 17, 1, 16), 
                     NN_2 = c(17, 16, 1, 10), 
                     NN_3 = c(17, 1, 10, 1)),
                     class = "data.frame", 
                     row.names = c(NA, -4L))

I would like to count how many times 17 is repeated for a given location, like NYC:

output
location   NNs  freq
    NYC    17      4
    NYC    16      1
    NYC     1      1
    LA      1      3
    LA      16     1
    LA      10     2
2

2 Answers

2
votes

Possible tidyverse solution:

df %>% 
  gather("key", "NNs", 2:ncol(.)) %>% 
  group_by(location, NNs) %>% 
  summarize(freq = n()) %>% 
  arrange(desc(location), desc(NNs))
2
votes

Base R solution, effectively doing the same thing as the tidyverse solution. Put the first location column identifier against all the other columns, then tabulate:

as.data.frame(table(cbind(df[1], NNs=unlist(df[-1]))))
#  location NNs Freq
#1       LA   1    3
#2      NYC   1    1
#3       LA  10    2
#4      NYC  10    0
#5       LA  16    1
#6      NYC  16    1
#7       LA  17    0
#8      NYC  17    4