0
votes

I have a data frame "table" that has a column with name "ID". The values of "ID" go from 1 to 100. The data frame also has a column with name "weight".

I have a function "calc_mean" that has a variable to select the "ID":

calc_mean <- function(id=1:100)

When I call my function, I want to be able to pass a subset of these IDs, for example:

calc_mean(30:35)

This should calculate the mean of my column "weight" for those IDs, using the following code inside my function:

mean(table$weight[,id])

But I get the following error:

[1] NA
Warning message:
In mean.default(table$weight[, id]) :
  argument is not numeric or logical: returning NA

What is wrong?

Alternatively, I would be happy if I could subset the data frame "table" into another data frame called "table2" containing only the IDs that interest me. I thought of the following code:

for(i in id){
  table2 <- table[table$ID == i,]
}

followed by:

mean(table2$weight)

However, this gives me the error:

[1] NA
Warning message:
In mean.default(table2$weight) :
  argument is not numeric or logical: returning NA

What is wrong here?

=============================================================================

Sorry, I wanted to hide my real code first for certain reasons but will now show the real code:

pollutantmean <- function(dummy_dir, pollutant, id = 1:332) {
  pollutant <- c("sulfate", "nitrate")
  directory <- "C:\\Users\\kieken\\Dropbox\\science\\R programming\\specdata"
  setwd(directory)
  files <- list.files(directory)
  data.list <- lapply(files, read.csv)
  data.cat <- do.call(rbind, data.list)
  good <- complete.cases(data.cat)
  data.clean <- data.cat[good,]

  data.ID <- subset(data.clean, ID %in% id)
  mean(data.ID[,pollutant])

}
pollutantmean("specdata", "nitrate", 70:72)

This code is giving me the following error:

[1] NA
Warning message:
In mean.default(data.ID[, pollutant]) :
  argument is not numeric or logical: returning NA

2
I noticed you didn't post the function. - rawr
Please provide a reproducible example. It may be impossible to diagnose your problem if you don't post what we'd need to recreate your problem. - Will Beason
I removed the following code and it worked now: pollutant <- c("sulfate", "nitrate") - Chris De Corte
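The fix in the last comment is consistent with how mean() treats its argument: selecting two columns from a data frame returns a data frame, and mean() only accepts numeric or logical vectors. The sketch below reproduces this on a small made-up data frame; `dat` and the helper `pollutant_mean` are hypothetical names used only for illustration, not the asker's real data or function.

```r
# Hypothetical stand-in for the combined monitor data
dat <- data.frame(ID      = rep(1:3, each = 2),
                  sulfate = c(1, 2, 3, 4, 5, 6),
                  nitrate = c(2, 4, 6, 8, 10, 12))

both <- dat[, c("sulfate", "nitrate")]  # two names: result is still a data frame
one  <- dat[, "nitrate"]                # one name: result is a numeric vector

is.data.frame(both)
is.numeric(one)

# mean() on a data frame returns NA with the warning from the question;
# the warning is suppressed here only to keep the output quiet
suppressWarnings(mean(both))
mean(one)

# Sketch of a helper that leaves the `pollutant` argument untouched
pollutant_mean <- function(data, pollutant, id) {
  rows <- data$ID %in% id
  mean(data[rows, pollutant])
}
pollutant_mean(dat, "nitrate", 2:3)
```

With the overwriting line removed, `pollutant` stays a single column name, so the column selection yields a vector that mean() can handle.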

2 Answers

0
votes

This would have to calculate the mean of my column "weight" when writing the following code in my function:

mean(table$weight[,id])

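The comma here doesn't make sense. table$weight is a vector, so it takes a single index, not a row index and a column index. Hence you should use mean(table$weight[id]). Note that this selects by position, so it picks the intended IDs only if row i holds ID i.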
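As a minimal sketch of the difference, assuming a made-up data frame (called `tbl` here to avoid masking the base function table(), with arbitrary weights):

```r
# Hypothetical stand-in for the asker's data
tbl <- data.frame(ID = 1:100, weight = seq(50, 149))

w <- tbl$weight   # `$` extracts the column as a plain numeric vector
is.null(dim(w))   # a vector has no dim attribute, so w[, id] is not valid

id <- 30:35
by_position <- mean(w[id])               # elements 30 to 35 of the vector
by_value    <- mean(w[tbl$ID %in% id])   # elements whose ID is in 30:35
```

Both results agree here only because the rows happen to be ordered by ID; matching on the ID values does not rely on that ordering.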

for(i in id){
  table2 <- table[table$ID == i,]
}

followed by:

mean(table2$weight)

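Notice that on each iteration of the for loop you overwrite table2 with the rows for the current ID, so after the loop it holds only the rows for the last ID in id. To create the subset in one step you can either index by row position, which works only if row i holds ID i:

table2 <- table[id,]

or match on the ID values themselves:

table2 <- subset(table, ID %in% id)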
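The following sketch illustrates both points on made-up data; `tbl`, `loop_result` and `rev_tbl` are hypothetical names chosen for the example.

```r
# Hypothetical stand-in for the asker's data
tbl <- data.frame(ID = 1:100, weight = seq(50, 149))
id <- 30:35

# The loop from the question: each pass replaces the previous result
for (i in id) {
  loop_result <- tbl[tbl$ID == i, ]
}
nrow(loop_result)   # only the rows for the last ID survive

# One logical test over all rows keeps every requested ID
tbl2 <- subset(tbl, ID %in% id)
nrow(tbl2)
mean(tbl2$weight)

# Positional indexing depends on row order: with the rows reversed,
# rows 30 to 35 no longer hold IDs 30 to 35
rev_tbl <- tbl[nrow(tbl):1, ]
rev_tbl[id, "ID"]
subset(rev_tbl, ID %in% id)$ID
```

If the rows are not guaranteed to be sorted by ID, the `%in%` form is the safer of the two.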
1
votes

If you simply want to calculate the mean "weight" for a subset of "id", you can just use with(). The following code calculates the mean weight for a given range of "id":

# data example
table <- data.frame(id=1:100, weight=runif(100,60,95))

with(table, mean(weight[id %in% 30:35]))
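A possible way to wrap this in the calc_mean function from the question is sketched below. The argument is named `ids` rather than `id` on purpose: inside with(), a column called `id` would mask a function argument of the same name. The `data` argument and the deterministic weights are assumptions made for the example.

```r
# Hypothetical data, with fixed weights so the result is reproducible
tbl <- data.frame(id = 1:100, weight = seq(50, 149))

calc_mean <- function(ids = 1:100, data = tbl) {
  # `id` and `weight` are looked up in `data`;
  # `ids` is found in the function's own environment
  with(data, mean(weight[id %in% ids]))
}

calc_mean(30:35)   # mean weight for IDs 30 to 35
calc_mean()        # mean weight over all IDs
```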