I have the following data frame:
dat <- read.table(text=" X prob
1 1 0.1
2 2 0.2
3 3 0.4
4 4 0.3", header=TRUE)
Is there a built-in function or an elegant way to calculate the mean and variance of a discrete random variable in R?
There is a weighted.mean function in base R, and the Hmisc package has a family of wtd.* functions.
> with(dat, weighted.mean(X, prob))
[1] 2.9
require(Hmisc)
> wtd.var(x=dat$X, weights=dat$prob)
[1] Inf
# Huh? On investigation, the weights argument is supposed to be replicate weights,
# so it's more appropriate to use normwt=TRUE
> wtd.var(x=dat$X, weights=dat$prob, normwt=TRUE)
[1] 1.186667
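Both results above are consistent with a frequency-weight variance of the form sum(w * (x - m)^2) / (sum(w) - 1). The sketch below reproduces them in base R under that assumption. The formula is inferred from the printed outputs rather than quoted from the Hmisc source, other Hmisc versions may compute this differently, and freq_wtd_var is a hypothetical helper name.

```r
dat <- data.frame(X = 1:4, prob = c(0.1, 0.2, 0.4, 0.3))

# Hypothetical helper (not part of Hmisc): variance with frequency weights,
# using the assumed divisor sum(w) - 1.
freq_wtd_var <- function(x, w) {
  m <- sum(w * x) / sum(w)            # weighted mean
  sum(w * (x - m)^2) / (sum(w) - 1)   # divisor is zero when the weights sum to 1
}

# Probabilities used as-is: the divisor sum(w) - 1 vanishes, so the result blows up.
raw <- freq_wtd_var(dat$X, dat$prob)

# Rescale the weights to sum to the number of rows, which is what normwt=TRUE
# is assumed to do here.
w_norm <- dat$prob * nrow(dat) / sum(dat$prob)
normalized <- freq_wtd_var(dat$X, w_norm)
print(normalized)
```

Under this reading, the normalized value is n / (n - 1) times the variance of the distribution itself, with n = 4 rows.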
The survey package by Thomas Lumley provides much more than this simple example illustrates. It has mechanisms for handling complex weighting schemes across a variety of statistical modeling procedures:
require(survey)
> dclus1<-svydesign(id=~1, weights=~prob, data=dat)
> v<-svyvar(~X, dclus1)
> v
variance SE
X 1.1867 0.7011
These are sample statistics, not the variance that would be calculated for an abstract random variable. The result is appropriate for a statistical system, but it might not be the answer expected for a probability homework question.
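If what is wanted is the mean and variance of the distribution itself, they follow directly from the definitions E[X] = sum(x * p) and Var(X) = sum((x - E[X])^2 * p). A minimal base R sketch, assuming prob holds probabilities that sum to 1 (the names mu, sigma2, sigma2_alt and ml are illustrative):

```r
dat <- data.frame(X = 1:4, prob = c(0.1, 0.2, 0.4, 0.3))

# Expected value: sum of x * p(x)
mu <- with(dat, sum(X * prob))

# Variance from the definition: probability-weighted squared deviations
sigma2 <- with(dat, sum((X - mu)^2 * prob))

# Same quantity via the shortcut E[X^2] - (E[X])^2
sigma2_alt <- with(dat, sum(X^2 * prob) - mu^2)

# stats::cov.wt with method = "ML" uses the weights as probabilities
# (no n - 1 style correction), so it returns the same mean and variance.
ml <- cov.wt(as.matrix(dat["X"]), wt = dat$prob, method = "ML")

print(c(mean = mu, variance = sigma2))
print(ml$cov)
```

For this data the mean is 2.9 and the variance is 0.89; the 1.1867 reported above is that value multiplied by 4/3.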