0
votes

I need to create a frequency table by extracting multiple variables from another dataframe.

For example I have a dataframe with the following info

Product Result Location Source Year Month

I want to create a frequency table based on positive detection (i.e. result 1 ) similar to a pivot table in excel. For example

Source Product Location 2008 2009 2010 2011

where the freq of detection for each year is calculated based on total count of the category.

The end result is I want to plot a facet grid based on x= year, y = freq, color = source, facet grid = sample

I could do the ggplot using count of 1s but how do i do it using freq instead so that the denominator can be taken into account?

My actual data has 20,000+ rows.

sorry i couldn't add in the codes somehow.

Thank you.

1
I'd just do a simple count using dplyr's group_by function. Put as many grouping fields in as you need. An example with the iris data set: df %>% group_by(Species, Petal.Width) %>% summarise(COUNT = n()) %>% ungroup() %>% mutate(PERCENT = COUNT/sum(COUNT))Ryan Morton
Thanks Ryan. I am very new to r and need more help from you. In your ans, does summarise (count = n()) calculate the count for the 'species' i.e the first input? How do i select those meeting the criteria of 'result =1' from another column? What do i put into ungroup()?Sherilyn
n() counts every instance of the combination (each species-petal.width combination). ungroup() removes the grouping so you can do calculations that should not be done on a group by basis. If you want to select only counts that equal 1, you could replace the mutate() with filter(COUNT == 1),Ryan Morton
Because i need to find the Freq of counts that equal to 1. Can i write: Df %>% group_by(species,petal.Width, location,year) %>% summarise (COUNT = n()) %>% mutate (Percent = (COUNT/sum(COUNT)) %>% filter (COUNT ==1)?Sherilyn
Nope, you're getting the procedure mixed up. If you need the percent of counts==1, I'd just do df %>% group_by(Species, Petal.Width) %>% summarise(COUNT = n()) %>% ungroup() %>% mutate(PERCENT = COUNT/sum(COUNT)) %>% filter(COUNT == 1) %>% summarise(COUNT_1 = sum(PERCENT)Ryan Morton

1 Answers

0
votes

Let's just do this as an answer and see if it's easier to read and work out. I run two tables to check my work. The first calculates the percent of all cases where the combination count == 1. The second is the table of all cases with their percentages. You'll see that df1 calculates a single number that equals the total of all combinations have a count of 1. If you need to do this operation across multiple years (or something else), just think through the group by function as appropriate (you can group, ungroup, group again, ungroup again until your heart is content):

library(dplyr)

df <- iris

df1 <- df %>%
  group_by(Species, Petal.Width) %>%
  summarise(COUNT = n()) %>%
  ungroup() %>%
  mutate(PERCENT = COUNT/sum(COUNT)) %>%
  filter(COUNT == 1) %>%
  summarise(COUNT_1 = sum(PERCENT))


df2 <- df %>%
  group_by(Species, Petal.Width) %>%
  summarise(COUNT = n()) %>%
  ungroup() %>%
  mutate(PERCENT = COUNT/sum(COUNT))

df1

as.data.frame(df2)