0
votes

I am working with the tapply function in R. I am trying to get tapply to return the same results as my sapply version (the one I am fairly sure is correct).

GOAL:

I am working with the state.x77 data and trying to find the median literacy rate of each region using the sapply and tapply functions.

CODE:

####Setting up the data
state.df = data.frame(state.x77, Region=state.region, Division=state.division)
state.by.region = split(state.df, f=state.region)
state.by.div = split(state.df, f=state.division)

####Tapply
tapply(state.df$Illiteracy, INDEX = state.region,FUN = function(v){
  li.rate = 100 - state.df$Illiteracy
  return(median(li.rate))
})

I see that I'm using different data frames for tapply. I think I SHOULD be using state.by.region, but I simply can't get it to work. The best I can come up with is:

tapply(state.by.region[,"Illiteracy"], INDEX = state.region, FUN = function(v){
  li.rate = 100 - state.by.region$Illiteracy
  return(median(li.rate))
})

What can I try next?

2 Answers

1
votes

In tapply's anonymous function, you should subtract v from 100, not state.df$Illiteracy. v holds only the Illiteracy values for the current Region, whereas state.df$Illiteracy is the whole column, so using it gives the same result for every region. You also don't need to split the data; you can pass the Region column directly as INDEX.

tapply(state.df$Illiteracy, INDEX = state.df$Region,FUN = function(v){
      li.rate = 100 - v
      return(median(li.rate))
})

#    Northeast         South North Central          West 
#        98.90         98.25         99.30         99.40 
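
To see why the version from the question does not vary by region, a small side-by-side sketch may help. It assumes only the built-in state datasets; the names whole_column and per_group are made up for illustration and do not come from the question.

```r
# Rebuild the data frame from the question (the state datasets ship with base R)
state.df <- data.frame(state.x77, Region = state.region, Division = state.division)

# Ignores v and reads the whole column, so every region gets the same number
whole_column <- tapply(state.df$Illiteracy, state.df$Region,
                       function(v) median(100 - state.df$Illiteracy))

# Uses v, the subset of Illiteracy that belongs to the current region
per_group <- tapply(state.df$Illiteracy, state.df$Region,
                    function(v) median(100 - v))

print(whole_column)
print(per_group)
```

The first call should print one repeated value (the median over all 50 states), while the second should vary by region.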
0
votes

Just adding another thought, since you said you thought you should be using "state.by.region". The documentation says that tapply takes a vector-like object, so you can move "state.by.region" outside the tapply and iterate over it with sapply. This gives the answer in a different form, but should still get you what you want.

sapply(state.by.region, 
       function(v) tapply(v$Illiteracy, INDEX = v$Region, function(y) median(100-y)))

#               Northeast South North Central West
# Northeast          98.9    NA            NA   NA
# South                NA 98.25            NA   NA
# North Central        NA    NA          99.3   NA
# West                 NA    NA            NA 99.4
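
If a plain named vector is preferred over the matrix with NAs, one possible simplification is to drop the inner tapply, since each element of state.by.region is already a single region's data frame. This is only a sketch, and a guess at what the sapply version mentioned in the question might look like; by_sapply and by_tapply are illustrative names.

```r
# Rebuild the objects from the question (the state datasets ship with base R)
state.df <- data.frame(state.x77, Region = state.region, Division = state.division)
state.by.region <- split(state.df, f = state.region)

# Each list element is one region's data frame, so no inner grouping is needed
by_sapply <- sapply(state.by.region, function(d) median(100 - d$Illiteracy))

# The tapply form, grouping the single column by Region
by_tapply <- tapply(state.df$Illiteracy, state.df$Region,
                    function(v) median(100 - v))

print(by_sapply)

# Compare values only: tapply returns a 1-d array, sapply a named vector
print(all.equal(as.numeric(by_sapply), as.numeric(by_tapply)))
```

Both forms group the same rows in the same factor-level order, so the comparison is expected to report that the values match.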