I have two vectors: 1) ~1000 sample means and 2) the corresponding ~1000 standard deviations of those means. I would like to create a kernel density plot of these data, using the sample means as the observations from which the density is estimated and the standard deviation of each mean as the bandwidth for that observation. The problem is that density only accepts a single value (a vector of length 1) for bw. For example:

plot(density(means,bw=error)) 

returns the following warnings:

1: In if (!is.finite(bw)) stop("non-finite 'bw'") :
  the condition has length > 1 and only the first element will be used
2: In if (bw <= 0) stop("'bw' is not positive.") :
  the condition has length > 1 and only the first element will be used
3: In if (!is.finite(from)) stop("non-finite 'from'") :
  the condition has length > 1 and only the first element will be used
4: In if (!is.finite(to)) stop("non-finite 'to'") :
  the condition has length > 1 and only the first element will be used

...and I get a plot that uses the first element of error as the bandwidth for all of my observations.

Any ideas on how I could implement a separate, user-defined bandwidth for each observation used to produce a kernel density plot?

1 Answer

It doesn't look like density supports this sort of bandwidth specification. I suppose you could roll your own:

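mydensity <- function(means, sds) {
  # Evaluation grid covering every observation plus or minus 3 standard deviations
  x <- seq(min(means - 3*sds), max(means + 3*sds), length.out=512)
  # At each grid point, average the normal densities centred on each mean
  y <- sapply(x, function(v) mean(dnorm(v, means, sds)))
  cbind(x, y)
}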
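
Written out, the function above evaluates an equal-weight mixture of normal densities, one per observation. The notation is introduced here for illustration: m_i and s_i stand for means[i] and sds[i], n is the number of observations, and \phi is the standard normal density.

```latex
\hat{f}(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{s_i}\,
             \phi\!\left(\frac{x - m_i}{s_i}\right)
```

If every s_i equals the same value h, this reduces to the usual fixed-bandwidth Gaussian kernel estimate with bandwidth h.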

This function will be a good deal slower than density itself, which uses a fast Fourier transform on a binned grid. Here it is at work, with small bandwidths on the left and large ones on the right:

set.seed(144)
means <- runif(1000)
sds <- ifelse(means < 0.5, 0.001, 0.05)
plot(mydensity(means, sds))

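
As a rough sketch of an alternative, the same estimate can be computed with outer() instead of sapply(), returning a list that plot() can draw as a line. The name vbw_density and its n argument are made up for this example and are not part of base R. Whether it is actually faster will depend on the data size, since it builds an n-by-length(means) matrix in memory.

```r
# Hypothetical helper (not part of base R): Gaussian kernel density estimate
# with one bandwidth per observation, evaluated on a regular grid of n points.
vbw_density <- function(means, sds, n = 512) {
  stopifnot(length(means) == length(sds), all(is.finite(sds)), all(sds > 0))
  x <- seq(min(means - 3 * sds), max(means + 3 * sds), length.out = n)
  # z[i, j] is the standardized distance from grid point i to observation j;
  # rep(sds, each = n) lines up sds[j] with column j of the matrix.
  z <- outer(x, means, "-") / rep(sds, each = n)
  # Column j holds the kernel of observation j; average the kernels row-wise.
  y <- rowMeans(dnorm(z) / rep(sds, each = n))
  list(x = x, y = y)
}

set.seed(144)
means <- runif(1000)
sds <- ifelse(means < 0.5, 0.001, 0.05)

# A finer grid than the default, so the narrow kernels (sd = 0.001) are resolved
d <- vbw_density(means, sds, n = 2048)

# Trapezoidal rule: the area under the estimate should be close to 1
area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)

plot(d, type = "l", xlab = "x", ylab = "Density")
```

Checking that area is close to 1 is a cheap sanity test. With only 512 grid points the spacing here is wider than the smallest bandwidth, so the left half of the curve looks jagged and the numerical area is less reliable.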