I want to sample only some elements of a vector from a mixture (weighted sum) of Gaussians that is specified by the component means and covariance matrices.
Specifically:
I'm imputing missing data using a Gaussian mixture model (GMM). I'm using sklearn and the following procedure:
- impute with the mean
- fit a GMM to get the means and covariances (for example, 5 components)
- take one of the samples and resample only its missing values; the other values stay the same
- repeat a few times
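To make the setup concrete, here is a minimal sketch of the first two steps. The toy data, the 20% missing rate, and the variable names are assumptions made up for illustration; only `SimpleImputer` and `GaussianMixture` come from sklearn.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 samples, 4 features, about 20% of entries missing.
X = rng.normal(size=(200, 4))
missing = rng.random(X.shape) < 0.2   # True where a value is missing
X_missing = X.copy()
X_missing[missing] = np.nan

# Step 1: impute each missing entry with its column mean.
X_filled = SimpleImputer(strategy="mean").fit_transform(X_missing)

# Step 2: fit a GMM with full covariance matrices (5 components, as in the example above).
gmm = GaussianMixture(n_components=5, covariance_type="full", random_state=0)
gmm.fit(X_filled)

# Mixture parameters needed for sampling later on.
weights, means, covs = gmm.weights_, gmm.means_, gmm.covariances_
print(weights.shape, means.shape, covs.shape)
```

The `missing` mask is kept around because the resampling step needs to know which entries of each row may be overwritten.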
I see two problems with this: (A) how do I sample from a mixture of Gaussians, and (B) how do I sample only part of the vector? I assume both can be solved at the same time. For (A), I could use rejection sampling or inverse transform sampling, but I feel there is a better way that uses the multivariate normal generators in numpy, or some other efficient method. For (B), do I just need to multiply the density of the sampled variable by a Gaussian that has the known values from the sample as its argument? Is that right?
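One possible way to handle (A) and (B) together, sketched here under assumptions rather than as a definitive answer: pick a single component with probability proportional to its weight times the likelihood of the observed entries, then draw the missing entries from that component's conditional Gaussian, using the standard formulas for conditioning a multivariate normal. The function name `sample_missing` and the two-component toy parameters are hypothetical; the code assumes full covariance matrices.

```python
import numpy as np
from scipy.stats import multivariate_normal

def sample_missing(x, miss, weights, means, covs, rng):
    """Resample the missing entries of x from a GMM, given the observed entries.

    x: (d,) vector; values at positions where miss is True are ignored.
    miss: (d,) boolean mask, True where the value is missing.
    weights: (K,), means: (K, d), covs: (K, d, d) are the GMM parameters.
    """
    obs = ~miss
    x_new = x.copy()
    if not miss.any():
        return x_new                      # nothing to resample
    K = len(weights)

    # (A) Choose one component. With observed entries, reweight each component
    # by the marginal likelihood of those entries; otherwise use the prior weights.
    if obs.any():
        log_r = np.array([
            np.log(weights[k])
            + multivariate_normal.logpdf(x[obs], means[k][obs],
                                         covs[k][np.ix_(obs, obs)])
            for k in range(K)
        ])
        r = np.exp(log_r - log_r.max())   # subtract the max for numerical stability
        r /= r.sum()
    else:
        r = weights / weights.sum()
    k = rng.choice(K, p=r)
    mu, S = means[k], covs[k]

    if not obs.any():
        # Everything is missing: plain draw from the chosen component.
        x_new[:] = rng.multivariate_normal(mu, S)
        return x_new

    # (B) Conditional Gaussian of the missing block given the observed block.
    S_mo = S[np.ix_(miss, obs)]
    S_oo = S[np.ix_(obs, obs)]
    A = np.linalg.solve(S_oo, S_mo.T).T   # equals S_mo @ inv(S_oo)
    cond_mean = mu[miss] + A @ (x[obs] - mu[obs])
    cond_cov = S[np.ix_(miss, miss)] - A @ S_mo.T
    cond_cov = (cond_cov + cond_cov.T) / 2   # remove tiny asymmetries from round-off
    x_new[miss] = rng.multivariate_normal(cond_mean, cond_cov)
    return x_new

# Hypothetical two-component mixture in 2D, well separated along the diagonal.
toy_weights = np.array([0.5, 0.5])
toy_means = np.array([[-5.0, -5.0], [5.0, 5.0]])
toy_covs = np.array([[[1.0, 0.8], [0.8, 1.0]]] * 2)

rng = np.random.default_rng(0)
x = np.array([5.2, np.nan])               # second entry is missing
miss = np.array([False, True])
x_sampled = sample_missing(x, miss, toy_weights, toy_means, toy_covs, rng)
print(x_sampled)
```

With parameters from a fitted sklearn model, the call would take `gmm.weights_`, `gmm.means_`, and `gmm.covariances_` in place of the toy arrays. No rejection sampling is needed, because every draw is an exact sample from the conditional mixture.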
I would prefer a solution in Python, but an algorithm or pseudocode would be sufficient.