I have the following function:
func <- function(scores, labels, thresholds) {
labels <- if (is.data.frame(labels)) labels else data.frame(labels)
sapply(thresholds, function(t) { sapply(labels, function(lbl) { sum(lbl[which(scores >= t)]) }) })
}
I also have the following that I'll pass into func.
> scores
[1] 0.187 0.975 0.566 0.793 0.524 0.481 0.005 0.756 0.062 0.124
> thresholds
[1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> var1
[1] 1 1 0 0 0 1 0 1 1 1
> df
var1 var2
1 1 0
2 1 1
3 0 0
4 0 0
5 0 0
6 1 1
7 0 1
8 1 1
9 1 1
10 1 0
Here are two different calls to func, one with labels as a vector and the other with labels as a data.frame:
> func(scores, var1, thresholds)
labels labels labels labels labels labels labels labels labels labels labels
6 5 3 3 3 2 2 2 1 1 0
> func(scores, df, thresholds)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
var1 6 5 3 3 3 2 2 2 1 1 0
var2 5 3 3 3 3 2 2 2 1 1 0
Why does "labels" get applied as a colname in the vector version, and "var1" and "var2" get applied as a rowname in the data.frame version?
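A minimal sketch that seems to reproduce the difference on a smaller scale, assuming the cause is how sapply simplifies its result: a vector when every element has length one, a matrix when every element has the same length greater than one. The names one_col, two_col, r1 and r2 are made up for illustration and are not part of the code above.

```r
# Hypothetical toy data: a one-column and a two-column data frame.
one_col <- data.frame(labels = c(1, 0, 1))
two_col <- data.frame(var1 = c(1, 0, 1), var2 = c(0, 1, 1))

# The inner sapply returns one named value per column;
# the outer sapply then simplifies the list of those results.
r1 <- sapply(1:3, function(i) sapply(one_col, sum))
r2 <- sapply(1:3, function(i) sapply(two_col, sum))

is.null(dim(r1))  # length-1 results are unlisted into a plain named vector
names(r1)         # the inner name "labels" is carried along for each element
dim(r2)           # length-2 results are bound into a 2 x 3 matrix
rownames(r2)      # the inner names become the row names
```

If that reading is right, the one-column case never becomes a matrix, so there are no rows for "labels" to name.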
What I'm looking for is the vector version to be more like:
> func(scores, var1, thresholds)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
labels 6 5 3 3 3 2 2 2 1 1 0
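One possible way to get that shape, sketched under the assumption that building the matrix explicitly is acceptable: keep the per-threshold results in a list with lapply and assemble the matrix by hand, so nothing is simplified away. func2, per_threshold and res are hypothetical names, not part of the original function.

```r
# Hypothetical variant of func that always returns a matrix.
func2 <- function(scores, labels, thresholds) {
  labels <- if (is.data.frame(labels)) labels else data.frame(labels)
  # Keep the per-threshold results as a list so no dimensions are dropped.
  per_threshold <- lapply(thresholds, function(t) {
    sapply(labels, function(lbl) sum(lbl[scores >= t]))
  })
  # One row per label column, one column per threshold (filled column by column).
  matrix(unlist(per_threshold), nrow = ncol(labels),
         dimnames = list(names(labels), NULL))
}

# Values copied from the printed vectors in the question.
scores <- c(0.187, 0.975, 0.566, 0.793, 0.524, 0.481, 0.005, 0.756, 0.062, 0.124)
thresholds <- seq(0, 1, 0.1)
var1 <- c(1, 1, 0, 0, 0, 1, 0, 1, 1, 1)

res <- func2(scores, var1, thresholds)
dim(res)       # 1 row, 11 columns
rownames(res)  # "labels"
```

With a data.frame input the same code should give one row per column, as in the df call.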
To create the variables above:
scores <- sample(seq(0, 1, 0.001), 10, replace = TRUE)
thresholds <- seq(0, 1, 0.1)
var1 <- sample(c(0, 1), 10, replace = TRUE)
var2 <- sample(c(0, 1), 10, replace = TRUE)
df <- data.frame(var1, var2)
labels to a data.frame use as.data.frame instead and see if that helps – Carles Mitjans
dput(varName) or simply something like scores <- c(0.187, 0.975, 0.566, 0.793, 0.524, 0.481, 0.005, 0.756, 0.062, 0.124). This makes it easier to replicate your problem and find a solution. – Barker
set.seed -- the actual values are irrelevant here. – user451151
which is unnecessary. I can just do sapply(thresholds, function(t) { sapply(labels, function(lbl) { sum(lbl[scores >= t]) }) }) – user451151